How to Write Performant Queries

Few Words On Guidewire ORM

Query.make() is the heart of all database queries.  Queries are made against entities (e.g.  .eti, .etx) which are complex database objects.  xCenter uses an Object Relational Mapper (ORM) to retrieve and save all data from the Database Management System (DBMS).  By using an ORM, the query logic remains DBMS agnostic, so it does not matter if we are using MS SQL, Oracle, DB2 or H2, the query language is abstracted away from us.  Most of the SQL produced by the ORM is ANSI SQL with some exceptions for specific DBMS implementations.

Within an entity, there can be simple items such as columns or foreign keys (FK), or complicated ones like arrays or effective-dated properties.

Simple
Less Simple
Complicated

For this reason, figuring out an actual Database query can be tricky because columns may be added via implemented interfaces.  Thus we need an easy way to manipulate and query these entries and  Query.make provides that.

Retrieving First Result

Take a simple query like:

var jobNumber = "MyJobNumber"
var job = Query.make(Submission).compare("JobNumber", Equals, jobNumber)

By default, nothing happens because, at this point, we have only created a variable to the function.

From this variable, we can either join other tables, or simply select.  Once we select, we get back a collection; a collection of Submissions in the example above (technically it is an array of bean results of type Submission.)

Now we simply call FirstResult to operate on the first submission .

There are two main ways to get the first item: FirstResult and first().  In theory, these return the same thing but, in practice, they are very different depending on the collection type.  Always use FirstResult for database queries.

Performance Tip

Use FirstResult to force the database to use it’s “Top 1” capability at the server level.  This results in filtering then returning only the first row from the server.  One row rather than many… great win.

Consideration Tip – inspired by Jerry Brenner (Guidewire)

Let’s consider for a moment what is really needed here.  If we truly need a single result, we should build our query with enough filtering such that there is only one row returned, in which case we can, and should, use AtMostOneRow or first(). 

Why?

Let’s get a little geeky here.

FirstResult performs a Top 1, but since we include a Where clause (from the compare statement), it forces SQL to put an Order By ID (the Primary Key) on the query. Depending on previous queries with this table, or the size of the table, the SQL optimizer can switch between an Index Seek and a Table Scan to find the needed ID.

AtMostOneRow performs the same query without an Order By and without a Top 1, which forces an Index Seek.  Gosu then throws an error if more than one row is returned.

first() uses the same query, again without the Order By and Top 1, but won’t throw an error if more than one row is indeed returned.

Performance Tip

You should always consider moving your logic to the server via SQL vs pulling results into Gosu and manipulating in memory. The database is far more performant than in memory operations, and saves from unnecessarily moving data across the wire.

Avoid Mistakes By Strong Typing

Notice the makeup of the query.  

var job = gw.api.database.Query.make(Submission).compare("JobNumber", Equals, jobNumber)

The entity is not a string, but an actual entity name whereas the comparison refers to a “column” via the string representation.

We can also choose to strongly type our comparison by using a column reference inference.

var job = gw.api.database.Query.make(Submission).compare(Submission#JobNumber, Equals, jobNumber)

Notice the makeup of the Entity#Column.  This invokes a special reference api and is the preferred method for columns due to its type safety.

Performance Tip

Type safety generates compile-time errors, instead of runtime errors, to quickly identify hard to find syntax issues.

Never Again Join Data In Memory

By all means, avoid loading complete tables into Java/Gosu memory to only then link them in order to achieve your needs. Database JOIN operations, with appropriate constraints and indexes, are extremely fast. Let’s take the following example:

uses gw.api.database.Relop
uses gw.api.database.Query

var policyNumber = "MyPolicyNumber"
var policyQuery = Query.make(Policy)
policyQuery.withDistinct(true)
var periodTable = policyQuery.join(PolicyPeriod, "Policy")
periodTable.compare(PolicyPeriod#PolicyNumber, Relop.Equals,  policyNumber)
var resultPolicy =  policyQuery.select()
resultPolicy.each(elt -> print(elt.LatestPeriod.DisplayName))

SQL EQUIVALENT

Select DISTINCT PolicyNumber, pp.PeriodStart, pp.PeriodEnd,*

FROM dbo.pc_policy pol
Inner JOIN dbo.pc_policyperiod pp ON pol.ID = pp.PolicyID

WHERE pp.PolicyNumber = 'MyPolicyNumber'

Notice what is happening here.  When we perform the join using 

var periodTable = policyQuery.join(PolicyPeriod, "Policy")

we are relying on the ORM to understand the entity make-up and provide the join constraint for us.

Inner JOIN dbo.pc_policyperiod pp ON pol.ID = pp.PolicyID

This is the beauty of the ORM.  Once the entities are defined within the xCenter, SQL is easily created using simple entity references. Moreover, developers don’t have to worry about reviewing JOIN predicates as the ORM provides the most efficient SQL, such as removing unwanted duplicates from the cartesian product of the JOIN statement.

Notice that the PolicyPeriod eti has the mentioned reference which tells the ORM what column to use when joining the tables.

Subqueries Will Take Your Code To The Next Level

SQL subqueries allow you to further stay within the database and write highly performant operations. In the below example you can see a PolicyPeriod entity query is used as a subquery to a Policy entity one.

var policies : gw.api.database.IQueryBeanResult
gw.transaction.Transaction.runWithNewBundle(bundle -> {
  //grab a list of producer codes
  var producerCodes = User.util.CurrentUser.UserProducerCodes*.ProducerCode

  var periodsSubquery = Query.make(PolicyPeriod)
      .compare(PolicyPeriod#PeriodEnd, GreaterThanOrEquals, DateUtil.currentDate())
      .compare(PolicyPeriod#MostRecentModel, Equals, true)
      .compare(PolicyPeriod#Status, Equals, typekey.PolicyPeriodStatus.TC_BOUND)
      .compare(PolicyPeriod#CancellationDate, Equals, null)
      .compareIn(PolicyPeriod#ProducerCodeOfRecord, producerCodes)

  //find policies with a producercode of record that we want
  policies = Query.make(Policy).or(aQuery -> {
    aQuery.compareIn(Policy#ProducerCodeOfService, producerCodes)
    aQuery.subselect(Policy#ID, CompareIn, periodsSubquery, "Policy")
  }).withDistinct(true).select()
}, "aarmstrong")

policies.each(elt -> print(elt.LatestPeriod.DisplayName))

The periodsSubquery does not select anything itself.  It is used to produce the necessary SQL to retrieve the Policy results. 

Side note:  Notice that lambdas are used extensively with queries. I will tell you some more on this topic so stay tuned.

Performance Tip

Use compare filtering to limit the number of rows taken for subsequent processing. Compare clauses should be organized to filter out the biggest chunks of data early in the query.  Sequencing compares is important.

Make sure you use the retrieved data wisely. Note we have the print statement at the end which introduces an issue: it needs to hydrate the entire Policy to print the DisplayName since it is comprised of data from multiple tables. 

    .compare(PolicyPeriod#Status, Equals, typekey.PolicyPeriodStatus.TC_BOUND)

Notice that the value to compare is the real type.  For example, the Status uses the PolicyPeriodStatus typelist.  At the server, Status is compared to the actual typekey ID of 10, and not the string “Bound”.  The ORM translates types into their SQL equivalent.

Reducing Colum Count With Row Queries

Finally, you can reduce the column count in your query by specifying what you would like

var table = Query.make(Person)
var rows = table.select({
    QuerySelectColumns.pathWithAlias("FName", Paths.make(Person#FirstName)),
    QuerySelectColumns.path( Paths.make(Person#LastName))
}).orderBy(QuerySelectColumns.path(Paths.make(Person#LastName)))

rows.eachWithIndex(elt, idx -> {
    print("Contact: FName :${elt.getColumn('FName')}, LastName: ${elt.getColumn('Person.LastName')} ")
   //Equivalent using column ordinals
    print("Contact: FName :${elt.getColumn(0)}, LastName: ${elt.getColumn(1)} ")
})

SQL EQUIVALENT

SELECT gRoot.FirstName col0, gRoot.LastName
FROM pc_contact gRoot WHERE gRoot.Subtype IN (2,4,7,9) AND gRoot.Retired = 0
ORDER BY col1 ASC

One of the main reasons we use entities to query is the very simple case above.  Notice we are selecting from the entity Person, yet the SQL being written contains 4 subtypes and adds other ancillary columns (Retired) automatically.  This is the power of the ORM because we don’t care about all the subtypes stored in the same table.  We can simply query for a Person.

As of version 9.0 IMHO there is a somewhat cumbersome new method for dealing with individual columns in row queries.  As you can see in the query above, the interface uses QuerySelectColumns for various mechanisms dealing with columns.  The most straight forward of these is Paths which simply includes those columns in the query and excludes all others.

print("Contact: FName :${elt.getColumn('FName')}, LastName: ${elt.getColumn('Person.LastName')} ")})

And using pathWithAlias is similar to an SQL “as” statement allowing you to refer to that column in results via the alias, like I did with FName.

Note: The alias used in the query (FName) doesn’t make its way to SQL because it has no bearing on the query and is acted upon within the data once it reaches the xCenter.  However, the OrderBy does because that changes the actual SQL and the results returned.  It is recommended to put as much onus on the database as possible to boost performance.

Aggregate Functions Boost Performance

If you wish to use aggregate functions, you must go through the same interface but use the dbFunction to achieve the result.

var query = Query.make(Address)
var latestAddress = QuerySelectColumns.dbFunctionWithAlias("LatestAddress",
  DBFunction.Max(Paths.make(Address#CreateTime)))
var rowResults = query.withLogSQL(true).select({
  QuerySelectColumns.pathWithAlias("Country", Paths.make(Address#Country)),
  QuerySelectColumns.pathWithAlias("Address1", Paths.make(Address#AddressLine1)),
  latestAddress})
rowResults.each(elt -> print("Country: ${elt.getColumn("Country")}, AddressLine1: ${elt.getColumn("Address1")}"))

SQL EQUIVALENT

SELECT gRoot.Country col0, gRoot.AddressLine1 col1, MAX(gRoot.CreateTime) col2
FROM pc_address gRoot
WHERE gRoot.Retired = 0
GROUP BY gRoot.Country, gRoot.AddressLine1

Notice that to make this query work in SQL, a GroupBy is needed, which is automatically added.

Performance Tip

Anytime you need to aggregate data (e.g. max, min, sum, etc.), the SQL version is much more performant since its  correlates data in its native environment then return a single result.  To do the calculation in Gosu, the ORM must return all relevant rows, including irrelevant data.  For Gosu to handle this, it must calculate your single value from a single column from all the rows.

Remember AND/OR Are Also Available In SQL

gw.transaction.Transaction.runWithNewBundle(bundle -> {

 var activitiesQuery = Query.make(Activity)
 var accountQuery = activitiesQuery.join("Account")
  .compare(Account#AccountNumber, Relop.Equals, "MyAccountNumber").withLogSQL(true)

 activitiesQuery.withLogSQL(true)
   .and(andRes ->
     andRes.startsWith("Subject", "Sub", true)
   .or(orRes -> {
    orRes.compare(Activity#AssignedByUser, Equals, User.util.CurrentUser)
    orRes.compareIn(Activity#AssignedGroup, groups)
   })
 )
 var activies = activitiesQuery.select()
}, "aarmstrong")

While the syntax for and/or is a little backward from the way we tend to write this in SQL, it forces you to design your query structure into smaller, more manageable chunks.

By design, each Compare forces an And into the query.  If we need to incorporate additional complex and/or logic, we need to add restrictions to the query.

SQL EQUIVALENT

SELECT gRoot.Subject
FROM pc_activity gRoot
INNER JOIN pc_account account_0 ON account_0.ID = gRoot.AccountID
WHERE
   gRoot.Retired = 0
   AND (gRoot.Subject LIKE 'Sub%')
   AND (
     gRoot.AssignedByUserID = 58
     OR gRoot.AssignedGroupID = 21
   )
AND account_0.AccountNumber = 'MyAccountNumber'

Troy Stauffer
Senior Software Architect


Watch or read our other posts at Kimputing Blogs. You’ll find everything from Automated testing to CenterTest, Guidewire knowledge to general interest. We’re trying to help share our knowledge from decades of experience.

Similar Posts