Wednesday, April 11, 2012

"Big Data" is dead... long live Big Data Architecture

Now that just about every data-management and business intelligence product claims that it handles "Big Data", the term is approaching zero information content.

So, I'm shorting the term "Big Data". In the next few months, the marketers will realize that their audience realizes that the term means nothing and, in accordance with Monash's First Law of Commercial Semantics, they'll start coming up with new terms.

Have any of those terms been spotted in the wild yet?

Though I'm still not clear what exactly Big Data is, I am fond of the term "Big Data Architecture". That term describes — fairly concisely, to the people who I want to understand me — the idea of a system where scalability is so important that it's best not to assume that there is only one of anything; where scalability is so important that it's worth revisiting all your assumptions; and where the raw performance of each component in the system is not paramount, because if the components can be composed in a scalable fashion, the system will meet its performance goals.

This architecture is going to be the standard for the kind of systems I build, so I think I'll be using the term "Big Data Architecture" for many years to come. If you can come up with a good alternative to that one, I might just buy you a pint.

6 comments:

dai clegg said...

my favorite definition of big data? 'just a bit bigger than your current infrastructure will handle'.

sounds trite, until you realize that it's recursive.

there's always a bit more - unless you're dealing in something like you describe as a big data architecture.

Gustavo Cabrera said...

Julian, please, I need your help. There's a problem that is driving me crazy. I asked Pentaho enterprise support for help, but they haven't given me a solution. It's about the Analyzer report plug-in: at certain time intervals Mondrian's cache is full, and when you make a query it doesn't give results unless you clear the Mondrian cache. What could it be? Could it be the operating system or network problems? Please give me a clue. I'm following you on Twitter and Google+. I'm Gustavo Cabrera, @gustavoacm7.

Julian Hyde said...

Gustavo,

This isn't the place for Mondrian support requests. Since you have a Pentaho subscription, escalate your support request. They will pull me in to help find the answer to your problem.

Julian

Gustavo Cabrera said...

I know, Julian, and I'm sorry, but this isn't new; it's been 3 months or more. I will take your advice. Thank you, and apologies for any inconvenience.

Gustavo

Paul Baclace said...

Big Data Architecture, I like the sound of that.

Paul Baclace said...

Big Data Architecture; that sounds better than any combination of 2 of the words. Meanwhile, I've been using distributed system architecture for a few decades.

The term Big Data has been credited to a paper from NASA about satellite data. True to form, their biggest problem is obscure metadata.

NASA, CERN, and SLAC have scientific big data, but in common practice, big data is anything that does not fit into a regular RDBMS.