NoSQL Stores and Aggregations

In the traditional database world we are comfortable running queries to compute statistics (sum/min/max/avg) and some percentiles. To do this efficiently we move up to the pre-aggregated world of OLAP, aggregating across dimensions. If we need ranges, histograms, or filters, we try to anticipate them, provide UI elements for them, and again push the queries down to the datastore. With the crop of in-memory columnar stores, we try to compute them on the fly. With MPP systems we are comfortable doing it when required.

Over time, a need has come up to define aggregations in a declarative manner.

The approach in Elasticsearch is a pretty nice addition. ES aggregations work across the cluster.

One way to understand it: yes, ES aggregations are still queries of the search kind, but the declarative model makes them easier to fathom.

The top_hits aggregation is coming in 1.3.
Parent/child support is not there yet.

It definitely looks like a little bit more than facets, as composability is the key.
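To make the composability point concrete, here is a minimal sketch of a declarative aggregation request body, built as a plain Python dict. The index fields "category" and "price" and the aggregation names are made up for illustration; only the terms, stats, and percentiles aggregation types come from ES itself. A metric aggregation is simply nested inside a bucket aggregation, which is what facets could not do.

```python
import json


def build_aggregation_request():
    """Return a search body with one bucket aggregation and nested metrics.

    Field names ("category", "price") and aggregation names are hypothetical.
    """
    return {
        "size": 0,  # skip the search hits, return only aggregation results
        "aggs": {
            "by_category": {
                # bucket aggregation: one bucket per distinct category value
                "terms": {"field": "category"},
                # sub-aggregations are computed inside each bucket
                "aggs": {
                    "price_stats": {"stats": {"field": "price"}},
                    "price_percentiles": {
                        "percentiles": {"field": "price", "percents": [50, 95]}
                    },
                },
            }
        },
    }


body = build_aggregation_request()
print(json.dumps(body, indent=2))
```

The printed JSON is what would be sent as the body of a search request. Adding another level, say a date histogram under each category, would just be one more nested "aggs" entry.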

With Cassandra, you do the extra work while writing. But generally it is not composable and can only do numeric functions. This requirement is tracked at a high level here.
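The "extra work while writing" idea can be sketched roughly as below. This is an in-memory Python illustration only, not Cassandra's API: the class WriteTimeRollup and its methods are hypothetical, and the dict stands in for whatever rollup table the application would maintain alongside the raw data.

```python
from collections import defaultdict


class WriteTimeRollup:
    """Hypothetical store that updates running aggregates on every write."""

    def __init__(self):
        self.events = []  # raw writes, kept as-is
        # one running (count, sum) pair per (day, category) key
        self.totals = defaultdict(lambda: {"count": 0, "sum": 0.0})

    def write(self, day, category, amount):
        # the extra work happens here, at write time
        self.events.append((day, category, amount))
        bucket = self.totals[(day, category)]
        bucket["count"] += 1
        bucket["sum"] += amount

    def average(self, day, category):
        # reads are cheap lookups, but only for the keys chosen up front
        bucket = self.totals.get((day, category))
        if bucket is None:
            return None
        return bucket["sum"] / bucket["count"]


store = WriteTimeRollup()
store.write("2014-06-01", "books", 10.0)
store.write("2014-06-01", "books", 20.0)
store.write("2014-06-01", "music", 5.0)
print(store.average("2014-06-01", "books"))
```

The limitation shows up quickly: the grouping key and the numeric functions are fixed when the data is written, so a new question such as an average per category across all days needs either another rollup or a scan of the raw events.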

Database engines more or less gave up this field, citing standards and issues of performance and consistency, rather than creating a decent enough declarative mechanism. With such a mechanism, 90% of queries in the DB world would become simpler. The materialized view was the last great innovation, and we stopped there. Product designers and engineers should be made to look at the queries people need to write to get simple things done.

 
