
[From logicnest via infosthetics.]
Don't worry, this isn't turning into a humor site. Normal service will resume shortly. The last couple of days, I just needed a laugh.


"New and revolutionary solutions and methodologies are coming together to handle the exploding data volumes real-world systems are being required to store and serve up. Supporting ever-larger databases, with ever-increasing demands for getting "answers" faster, requires a new way to approach the problem."Weiss's solution uses improvements to storage management, but I doubt that it would satisfy Google's requirements, or even the price/performance requirements of a medium-sized internet media company.
function(arg {, arg}) OVER window-specificationand grouped aggregation, which is of the form
function(arg {, arg})and requires a GROUP BY clause. (If the GROUP BY clause is not present, 'GROUP BY ()' is assumed.)
SELECTbut this is illegal SQL. Why is it illegal? Because the LAST_VALUE function (like FIRST_VALUE and RANK) is a windowed aggregate function and is only meaningful on an ordered set.
day,
ticker,
SUM(shares) AS volume,
LAST_VALUE(price) AS closingPrice
FROM Trades
GROUP BY day, ticker
SELECTWith the OVER clause, LAST_VALUE is now a windowed aggregate function within the context of a GROUP BY query, which was previously illegal. Every windowed aggregate is applied to a window, so what is the window in this case? We want the window to contain all of the rows with the same day and ticker value, and to be sorted by timeOfDay. In other words, the window inherits the GROUP BY columns as its implicit PARTITION BY clause. It is as if they had written
day,
ticker,
SUM(shares) AS volume,
LAST_VALUE(price) OVER (ORDER BY timeOfDay) AS closingPrice
FROM Trades
GROUP BY day, ticker
LAST_VALUE(price) OVER (PARTITION BY day, ticker ORDER BY timeOfDay)Now, if you know that I work for SQLstream, you will guess that I am motivated to make this work for streaming queries. A streaming aggregation query over the Trades stream would look like this:
SELECT STREAMThis is identical to the traditional, non-streaming SQL above, except for the STREAM keyword that tells SQLstream that the result of the query should be a stream.
day,
ticker,
SUM(shares) AS volume,
LAST_VALUE(price) OVER (ORDER BY timeOfDay) AS closingPrice
FROM Trades
GROUP BY day, ticker
SELECT STREAMThis form uses SQLstream's system ROWTIME column and the 'FLOOR(datetime expression TO time unit)' operator, and so can dispense with the day and timeOfDay columns. Also, streams are ordered by ROWTIME by default, so we can abbreviate 'OVER (ORDER BY ROWTIME)' to 'OVER ()'. This form is more terse, and more typical of how the query would be written in a SQLstream application, but the previous form works also.
FLOOR(t.ROWTIME TO DAY),
ticker,
SUM(shares) AS volume,
LAST_VALUE(price) OVER () AS closingPrice
FROM Trades AS t
GROUP BY FLOOR(t.ROWTIME TO DAY), ticker