Avoiding Compensation in WHERE Clauses and Predicates

When a database is queried, DataDirect XQuery® usually pushes queries to the database, where they are executed in SQL. However, some expressions, functions and casts are not supported by SQL, and are therefore performed in program memory, a strategy known as compensation. Unfortunately, when these operations are performed in main memory, they can not take advantage of indexes and other optimizations available in the database server. This can dramatically hurt performance.

The DataDirect XQuery® User's Guide and Reference clearly states which expressions, functions, and operators are compensated. Most of the commonly used functions that are compensated fall into the following categories:

  • String Functions that Use Pattern Matching
  • Date and Time Functions for XML Schema Date/Time Datatypes
  • Casts to XML Schema Datatypes not supported by the underlying database.
  • XML whitespace functions
  • FLWOR Expressions using the AT Clause
  • Numeric Predicates

When querying relational data, if a compensated function or operator is used in the WHERE clause of a FLWOR expression or in a predicate of a path expression, performance can suffer dramatically. For instance, in the following query, normalize-space() is compensated, and forces evaluation in memory; since this function must be called once for every row, the performance hit is very significant.

Example14. Avoid using compensated functions in a WHERE clause


     for $h in collection('stocks.dbo.historical')/historical 
     where normalize-space($h/ticker) = 'AMZN' 
     return $h
            

One way to avoid compensation is to use a SQL function instead of a compensated XQuery function, as in the following example.

Example 15. Using a SQL function instead of a compensated function

    declare function ddtek-sql:rtrim($in as xs:string) as xs:string external;

    for $h in collection('historical')/historical 
    where ddtek-sql:rtrim($h/ticker) = 'AMZN' 
    return $h           

When querying relational data, using numeric predicates or an AT clause in a FLWOR expression forces compensation, which can dramatically impact performance, especially when it forces a join to be done in memory. Consider the following query.

Example 16. Avoid AT clauses when querying relational data

     for $user at $i in collection("users")/users,
           $hold at $j in collection("holdings")/holdings
     return
          <holding>
              <userid>{ $i }</userid>
              <user>{ string($user/userid) }</user>
              <stockid>{ $j }</stockid>
              <stock>{ string($hold/stockticker) }</stock>
              <share>{ string($hold/shares }</share>
          </holding>                
            

This query uses AT clauses to identify the individual users and stocks, and these AT clauses force DataDirect XQuery® to do the joins in memory instead of letting the database engine do them. Not only is this a dramatic performance hit, it's also a bad way to identify the users and holdings — because rows in a relational table do not generally have a guaranteed stable order, these identifiers might change each time the query is run. It is generally much better to use the primary key from a table to identify the rows of that table.

Example 17. For identifiers, rely on the primary keys from relational tables

 (:-------------------------------------------------------------------
  :  userid is the primary key for collection("users")
  :  stockticker, userid is the primary key for collection("holdings")
  --------------------------------------------------------------------:)
                
     for $user in collection("users")/users,
         $hold in collection("holdings")/holdings
     return
          <holding>
              <user>{ string($user/userid) }</user>
              <stock>{ string($hold/stockticker) }</stock>
              <share>{ string($hold/shares }</share>
          </holding>                
            

Try DataDirect XQuery® Free!

Put the power, scalability, and performance of DataDirect XQuery® to work for you today! Our free trial lets you see for yourself how easy it is to build data integration applications that access relational, EDI, and other file formats as XML!

Download DataDirect XQuery® today!

Online Video Tutorials!

Our easy-to-follow online video tutorials are a great way to get acquainted with the many features of DataDirect XQuery®.

And if you like what you see, download a free copy today and try DataDirect XQuery® for yourself!

Back to School with DataDirect XQuery®

Getting your mind around the possibilities of a data integration technology as promising as XQuery can be difficult, but our XML developers Webinars will help you understand the power and versatility of XQuery, and our favorite XQuery processor, DataDirect XQuery®.

Stay Informed!

XQuery is one of the hottest XML technologies being developed today. Stay informed with vital news about standards, tools, and trends by signing up for the DataDirect XQuery® newsletter.

DataDirect XQuery FAQ

This informative DataDirect XQuery® FAQ answers frequently-asked questions about DataDirect XQuery®, including questions about performance, scalability, use-cases, resources, and more.

If you're more of a hands-on learner, then download a free copy and start exploring DataDirect XQuery® today!