Monday, January 26, 2009

SQLstream 2.0

SQLstream release 2.0 hit the streets today. It's actually our third release, but it's the first one we've made a fuss about. Since release 1.1, we have hardened the product at customer deployments, introduced 64 bit support to allow larger working memory, and added two major SQL extensions: streaming aggregation and user-defined transforms.

Streaming aggregation adds support for the GROUP BY construct to streaming SQL, and allows us to compute totals on a periodic basis, for example computing hourly subtotals. SQLstream acts as a continuously operating conduit between operational systems and the data warehouse, replacing the traditional batch-based ETL process and populating the fact table and aggregate tables simultaneously. This gives us a natural synergy with my other project, the Mondrian OLAP engine, and we have been getting good take-up among Pentaho's customers.

Streaming aggregation builds on relational operators in SQLstream 1.1 such as joins, windowed aggregations and unions. Those relational operators allowed you to build queries such as fraud detection, looking for anamolous rows in real time. Those were 'needle in the haystack' kinds of problem, and the new features also allow you to do build a high-performance 'water main' between your operational system and data warehouse.

User-defined transforms allow you to define new relational operators in Java and incorporate them into streaming SQL statements. SQLstream 1.1 had user-defined functions, to allow you to compute scalar quantities in Java, and syntactically, user-defined transforms are simply functions that have JDBC ResultSets and PreparedStatements as parameters. If a function has a PreparedStatement as a parameter, SQLstream lets you include it in the FROM clause of a SQL statement as a data source, alongside regular streams, tables, and views. Similarly, if a function has a ResultSet as a parameter, then you can pass in a cursor based on a SELECT statement as an argument.

User-defined transforms are an excellent example of our ongoing collaboration with the open-source Eigenbase project. User-defined transforms were originally developed for LucidDB to operate on traditional stored relational data, with SQL:2003-compliant syntax, and we extended them to handle streaming relational data, but keeping the syntax the same. For more about user-defined transforms in LucidDB, see the excellent documentation at Eigenbase.

Release 2.0 is a major milestone for SQLstream, and is the culmination of several years of development. It allows you to tackle in industry-standard SQL some application areas that previously required guile and custom coding. Go to www.sqlstream.com and see whether there is a fit with your real-time BI application.

1 comment:

Samir said...

Congratulations Julian!