Monday, May 05, 2014

Table macros

Table macros are a new Optiq feature (since release 0.6) that combine the efficiency of tables with the flexibility of functions.

Optiq offers a convenient model for presenting data from multiple external sources via a single, efficient SQL interface. Using adapters, you create a schema for each external source, and a table for each data set within a source.

But sometimes the external data source does not consist of a fixed number of data sets, known ahead of time. Consider, for example, Optiq’s web adapter, optiq-web, which makes any HTML table in any web page appear as a SQL table. Today you can create an Optiq model and define within it several tables.

Optiq-web’s home page shows an example where you can create a schema with tables “Cities” and “States” (based on the Wikipedia pages List of states and territories of the United States and List of United States cities by population) and execute a query to find out the proportion of the California’s population that live in cities:

SELECT COUNT(*) "City Count",

  SUM(100 * c."Population" / s."Population") "Pct State Population"

FROM "Cities" c, "States" s

WHERE c."State" = s."State" AND s."State" = 'California';
But what if you want to query a URL that isn’t in the schema? A table macro will allow you to do this:
SELECT * FROM TABLE(
  web(‘http://en.wikipedia.org/wiki/List_of_countries_by_population’));
web is a function that returns a table. That is, a Table object, which is the definition of a table. In Optiq, a table definition doesn’t need to be assigned a name and put inside a schema, although most do; this is a free-floating table. A table just needs to be able to describe its columns, and to be able to convert itself to relational algebra. Optiq invokes it while the query is being planned.

Here is the WebTableMacro class:

public class WebTableMacro {
  public Table eval(String url) {
    Map operands = new HashMap();
    operands.put(“url”, url);
    return new WebTable(operands, null);

  }

}
And here is how you define a WEB function based upon it in your JSON model:
{
  version: '1.0',
  defaultSchema: ‘ADHOC’,
  schemas: [
    {

      name: 'ADHOC',
      functions: [
        {

          name: ‘WEB’,
          className: 'com.example.WebTableMacro'
        }

      ]
    }
  ]


}
Table macros are a special kind of table function. They are defined in the same in the model, and invoked in the same way from a SQL statement. A table function can be used at prepare time if (a) its arguments are constants, and (b) the table it returns implements TranslatableTable. If it fails either of those tests, it will be invoked at runtime; it will still produce results, but will have missed out on the advantages of being part of the query optimization process.

What kind of advantages can the optimization process being? Suppose a web page that produces a table supports URL parameters to filter on a particular column and sort on another. We could write planner rules that push take a FilterRel or SortRel on top of a WebTableScan and convert them into a scan with extra URL parameters. A table that came from the web function would be able to participate in that process.

The name ‘table macros’ is inspired by Lisp macros — functions that are invoked at compile time rather than run time. Macros are an extremely powerful feature in Lisp and I hope they will prove to be a powerful addition to SQL. But to SQL users, a more familiar name might be ‘parameterized views’.

Views and table macros are both expanded to relational algebra before the query is optimized. Views are specified in SQL, whereas table macros invoke user code (it takes some logic to handle those parameters). Under the covers, Optiq’s views are implemented using table macros. (They always have been — we’ve only just got around to making table macros a public feature.)

To sum up. Table macros are powerful new Optiq feature that extend the reach of Optiq to data sources that have not been pre-configured into an Optiq model. They are a generalization of SQL views, and share with views the efficiency of expanding relational expressions at query compilation time, where they can be optimized. Table macros will help bring a SQL interface to yet more forms of data.

No comments: