Deeper Dive into Performant Queries

In another blog, we covered some basics on the query language and how to make it perform better. However, if you are like me, you want to understand more internals to be better informed when making decisions. This blog takes a deep dive into the inner workings of queries and caches used by Guidewire (GW). Let’s compare a set of Guidewire queries and the resulting SQL as a starting point.

Chapter 1.  Guidewire Query into SQL

Let’s use a fairly complex query in Guidewire as the starting point to understand how it becomes SQL. We’ll explain what happens along each step of the way.

Query Language Internals

Query Makeup

uses gw.api.database.Query
uses gw.api.util.DateUtil

var unionQuery : gw.api.database.IQueryBeanResult
print("start")
gw.transaction.Transaction.runWithNewBundle(bundle -> {
  //grab a list of producer codes that aarmstrong has access to
  var producerCodes = User.util.CurrentUser.UserProducerCodes*.ProducerCode
  //GW recommends avoiding splat(*.) operations.  For further geekiness, please refer
  //to my next article on when they are available and when to use them

  //get the most recent, active, non-cancelled PolicyPeriods for my producer codes
  var periodsSubquery = Query.make(PolicyPeriod)
      .compare(PolicyPeriod#PeriodEnd, GreaterThanOrEquals, DateUtil.currentDate())
      .compare(PolicyPeriod#MostRecentModel, Equals, true)
      .compare(PolicyPeriod#Status, Equals, typekey.PolicyPeriodStatus.TC_BOUND)
      .compare(PolicyPeriod#CancellationDate, Equals, null)
      .compareIn(PolicyPeriod#ProducerCodeOfRecord, producerCodes)

  //find policies with a producer code of service that we want
  var policies = Query.make(Policy)
        .compareIn("ProducerCodeOfService", producerCodes)
        .withDistinct(true)

  //find policies with a producer code of record that we want
  var policiesPeriods = Query.make(Policy)
        .subselect("ID", CompareIn, periodsSubquery, "Policy")
        .withDistinct(true)

  unionQuery = policies.union(policiesPeriods).select()
}, "aarmstrong")

unionQuery.each(elt -> print((elt as Policy).LatestPeriod.DisplayName))
print("end")

SQL Equivalent

SELECT /* KeyTable:pc_policy; */ DISTINCT gRoot.ID col1
FROM pc_policy gRoot
WHERE gRoot.Retired = 0
AND ((((gRoot.ProducerCodeOfServiceID IN (6,7,8,9,10,11))
        OR (gRoot.ID IN (SELECT gRoot0.PolicyID col0
                FROM pc_policyperiod gRoot0
                WHERE gRoot0.PeriodEnd >= '2021-03-25 12:56:26.2630000'
                        AND gRoot0.MostRecentModel = 1
                        AND gRoot0.Status = 10
                        AND gRoot0.CancellationDate IS NULL
                        AND gRoot0.ProducerCodeOfRecordID IN (6,7,8,9,10,11)
                        AND gRoot0.Retired = 0
                        AND gRoot0.TemporaryBranch = 0)))))

Results

Looking that the results, we get four columns of information: the policy number, effective date, expiration date, and job number.

start
4282963789, 03/16/2021, 09/16/2021, 0001154270
6380897076, 07/14/2020, 07/14/2021, 0000150489
6921869612, 01/15/2019, 07/15/2019, 0000657947
9827809010, 03/16/2021, 03/16/2022, 0001218921
9677250739, 01/15/2021, 01/15/2022, 0000960253
1703316074, 01/16/2021, 01/16/2022, 0001392158
7325988661, 01/16/2021, 01/16/2022, 0000246875
Unassigned, 04/27/2021, 04/27/2022, 0000206608
Unassigned, 04/27/2021, 04/27/2022, 0000344971
end

Let’s look closer into how this simple SQL returns these complex results.

Chapter 2. Three from one

Looking closely at the SQL select statement, SELECT /* KeyTable:pc_policy; */ DISTINCT gRoot.ID col1, the only thing actually returned is the policy ID.  Yet, the print statement, print((elt as Policy).LatestPeriod.DisplayName), actually reference two other tables, the PolicyPeriod table and the Job table.  Hang on, I only see Policy and LatestPeriod.  Where is the Job table? To understand this, we need to look at the DisplayName processor.

In the DisplayName processor for a PolicyPeriod, we see it prints not only the policyPeriod’s policy number, effective date, and expiration date, but also the job number.  PolicyPeriod.Job automatically joins us to the Job table which then uses its own DisplayName handler to display the job number.  Policy, PolicyPeriod, Job: three tables from one! 

*** More information about DisplayName handlers can be found in an upcoming article about splat(*.).

Chapter 3.  Efficient queries without joins!

I must admit, when I first began working with Guidewire, I thought this was incredibly inefficient.  As a DBA, I am appalled; how can they write efficient queries without joins?

To understand this, look at the database as mere tables and relate them to an entity.  Let’s look at our statement again:

SELECT /* KeyTable:pc_policy; */ DISTINCT gRoot.ID col1
FROM pc_policy gRoot

Notice they grab the ID from the pc_policy table using a Gosu query (as indicated by g[osu]Root).  In our case, we have six ID’s returned from the above query.

Note: Gosu queries are not stored in cache.

Then a select statement grabs the Policy entity data using the given ID’s.

SELECT /* Entity:pc_policy; */ TOP (6)
bRoot.ID col0, bRoot.PublicID col1, bRoot.BeanVersion col2, bRoot.CreateTime col3,
… (the rest left off for brevity)
FROM pc_policy bRoot
WHERE bRoot.ID IN (1,2,3,4,5,6)

Since we know all the policy ID’s, we can grab the data all at once using a bean query (as seen by b[ean]Root).

Note: Bean queries DO use local and global cache, which also uses more memory.

Remember, we have a union query, so we also need to grab policy period entity data.  Since each PolicyPeriod is tied to a different policy, Guidewire invokes six further single-item queries to get each PolicyPeriod.ID using the Policy ID.

SELECT /* KeyTable:pc_policyperiod, ParentTable:pc_policy:; */ aRoot.ID col0
FROM pc_policyperiod aRoot
WHERE aRoot.PolicyID = 2 AND aRoot.Retired = 0

Once we have that, we query the PolicyPeriod table using all needed columns.  Notice that they do not do a select *; they return only the columns required for the entity.

//Grab the entity columns
SELECT /* Entity:pc_policyperiod; */ TOP (1) bRoot.ID col0, bRoot.PublicID col1, bRoot.BeanVersion col2, bRoot.CreateTime col3, bRoot.UpdateTime col4,
… (the rest left off for brevity)
FROM pc_policyperiod bRoot
WHERE bRoot.ID in (2, 101)

Similar queries exist for the pc_job table using each job ID to retrieve the job entities.

Chapter 4.  the wonderful world of caching

So, what is going on with the print statement?  To efficiently traverse the object graph from a PolicyPeriod to the policy line, back to the branch, and so on, Guidewire caches the entity graph table by table.  By storing all needed columns for each table required for object hydration, no matter what is needed, it is readily available.  This also means that if re-running the same query,  Guidewire knows that we already asked for ID 1 and avoids unnecessarily going back to the database once the record is cached.

So, what if a change occurs to that table?  If it is cached, how will we get the new data?

There is a table called pc_message with an intricate internal messaging system for all entities.  When you update an entity, it also writes a record to the pc_message table with the row ID for that entity.  Then each query checks the timestamps between pc_message and the entity in cache.  If they match, it returns the cached information.  Otherwise, it performs another lookup returning the latest information and re-caching it. The pc_message table for each Center has foreign keys to each of its major entities.

SELECT /* ISNULL:pc_message.FrozenSetID:, KeyTable:pc_message; */ TOP (100000)
       qRoot.ID col0
,   qRoot.CreationTime col1
FROM pc_message qRoot
WHERE qRoot.Status = 1
       AND qRoot.PolicyPeriodID IS NOT NULL
       AND qRoot.DestinationID = 67
       AND qRoot.FrozenSetID IS NULL
       AND qRoot.SendOrder IN (SELECT MIN(qRoot0.SendOrder) col0
       FROM pc_message qRoot0
       WHERE qRoot0.PolicyPeriodID = qRoot.PolicyPeriodID
             AND qRoot0.DestinationID = 67)
ORDER BY col1 ASC
,        col0 ASC

Conclusion

Sometimes understanding what is going on helps when designing your systems.  I, personally, enjoy knowing the inner workings because it also helps me when building my own designs.  Until next time, enjoy!

Troy Stauffer
Senior Software Architect


Watch or read our other posts at Kimputing Blogs. You’ll find everything from Automated testing to CenterTest, Guidewire knowledge to general interest. We’re trying to help share our knowledge from decades of experience.

Similar Posts