Tuesday, May 26, 2009

Webinar: Eliminating MySQL Bottlenecks and Replication Issues using Real-Time Queries & Continuous ETL

Would you like to find out how to build a continuous ETL process integrating source systems, MySQL data warehouse, and Mondrian OLAP engine?

I'm going to be hosting a webinar tomorrow describing how to do this using SQLstream. (Basically a repeat of the webinar I gave at the MySQL conference this year, but many of you missed it.)

Join me and Damian Black, CEO of SQLstream, on the webinar at 11am PDT/2pm EDT tomorrow, Wednesday 27th May. To register for the webinar, visit https://www2.gotomeeting.com/register/668399275.

Tuesday, May 19, 2009

Explaining the structure of Mondrian schemas

There are some major schema changes coming in Mondrian 4.0, and I'm writing up specifications for these so that everyone knows what's coming and has a chance to influence it.

But before I do that, I thought I'd try to improve how we describe the structure of XML schemas in the present release, just a bit. I have tried a couple of things. First, I created an XML skeleton that shows which elements can occur inside which other elements:

[XML skeleton of the schema elements. It defines two shared fragments, relation and aggElement, and uses <SQL/> child elements in several places.]


You can see the full version in the Mondrian schema guide.

This approach shows where things are located, but it doesn't show how many of each element can belong to a particular parent element, or the order in which they are required. So, I wrote up a small BNF grammar and used Clapham to generate a railroad diagram. For comparison, the railroad diagram for the work-in-progress mondrian-4.0 schema is here.
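To make the difference concrete, here is a minimal Python sketch of what a grammar rule captures that a nesting skeleton cannot: the order of child elements and how many of each are allowed. The node classes, the helper functions and the CUBE_CHILDREN rule are all hypothetical, invented for illustration; the rule is not the real Mondrian schema, and this is not Clapham's input format or API.

```python
import re
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Element:
    name: str


@dataclass
class Seq:
    items: List["Node"]       # children must appear in this order


@dataclass
class Choice:
    options: List["Node"]     # exactly one of these


@dataclass
class Repeat:
    item: "Node"
    at_least_one: bool = False  # False -> "*", True -> "+"


@dataclass
class Opt:
    item: "Node"              # zero or one


Node = Union[Element, Seq, Choice, Repeat, Opt]


def to_bnf(node: Node) -> str:
    """Render a rule in a compact BNF-like notation."""
    if isinstance(node, Element):
        return node.name
    if isinstance(node, Seq):
        return " ".join(to_bnf(n) for n in node.items)
    if isinstance(node, Choice):
        return "(" + " | ".join(to_bnf(n) for n in node.options) + ")"
    suffix = "?" if isinstance(node, Opt) else ("+" if node.at_least_one else "*")
    inner = to_bnf(node.item)
    if isinstance(node.item, Seq):
        inner = "(" + inner + ")"
    return inner + suffix


def to_regex(node: Node) -> str:
    """Translate a rule to a regex over 'Name;Name;...' strings."""
    if isinstance(node, Element):
        return "(?:" + re.escape(node.name) + ";)"
    if isinstance(node, Seq):
        return "".join(to_regex(n) for n in node.items)
    if isinstance(node, Choice):
        return "(?:" + "|".join(to_regex(n) for n in node.options) + ")"
    suffix = "?" if isinstance(node, Opt) else ("+" if node.at_least_one else "*")
    return "(?:" + to_regex(node.item) + ")" + suffix


def children_are_valid(rule: Node, children: List[str]) -> bool:
    """Check a list of child element names against a rule."""
    text = "".join(name + ";" for name in children)
    return re.fullmatch(to_regex(rule), text) is not None


# Hypothetical rule, for illustration only (not the actual Mondrian schema).
CUBE_CHILDREN = Seq([
    Choice([Element("Table"), Element("View")]),
    Repeat(Element("Dimension"), at_least_one=True),
    Repeat(Element("Measure"), at_least_one=True),
    Opt(Element("NamedSet")),
])

print(to_bnf(CUBE_CHILDREN))
print(children_are_valid(CUBE_CHILDREN, ["Table", "Dimension", "Measure"]))
print(children_are_valid(CUBE_CHILDREN, ["Table", "Measure", "Dimension"]))
```

A skeleton would list Table, View, Dimension, Measure and NamedSet under the same parent and stop there. The rule additionally says that exactly one relation comes first, that dimensions precede measures, and that there is at least one of each, which is the information a railroad diagram draws as tracks and loops.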

Monday, May 11, 2009

Clapham: A railroad diagram generator

I don't work with the Oracle database very much anymore, and one thing I miss is their server documentation. I still have my old copy of the Oracle 7.3 SQL Language Reference, and sometimes I reach for it when the SQL:2008 standard has fuddled my brain and I want to be reassured that SQL can be simple, powerful and trustworthy. The calming effect is partly due to the authoritative tone, but the railroad diagrams describing the syntax of each command say 'Don't worry'.

For example, here is Oracle 10.2's CREATE TABLE:

[Image: railroad diagram of the Oracle 10.2 CREATE TABLE syntax]



Yes, railroad diagrams. You can easily get lost in something as large as the SQL language, with its hundreds of commands, keywords and unexpected clauses, and railroad diagrams are the map.

When it came to writing our documentation for SQLstream, we of course wanted to include railroad diagrams to illustrate our dialect of SQL. It's possible to construct the diagrams by hand, but it's tedious and error-prone, and it's difficult to get the diagrams to look consistent. Unbelievably, we couldn't find a tool to generate them, so we ended up drawing them by hand.

Now that I've gotten a little breathing room after the release of SQLstream 2.0, I have taken a couple of days to write an open-source railroad diagram generator. I've released it on SourceForge, and named it Clapham, after the South London town which is home to the most complicated railway junction you ever saw.

This has been a nice return to old-school open source, with its mantras "release early, release often" and "don't whine: contribute". The diagrams aren't yet as pretty as Oracle's, but we're getting there. Even though this is the very first release, and the project is barely alpha, it has already generated diagrams for LucidDB's not inconsiderable SQL grammar.

More details at the home page, and you can download release clapham-0.1.003 from SourceForge. Contributions welcome, of course.