Poetry and Coursera

I love poems and short stories and pick them up from time to time when I get some solitude. Below are two poems (one by Wallace Stevens and another, in Hindi, by Bodhisattva), along with a link to a Coursera post about a poetry course.

Continual Conversation With A Silent Man (Wallace Stevens)

The old brown hen and the old blue sky,
Between the two we live and die–
The broken cartwheel on the hill.

As if, in the presence of the sea,
We dried our nets and mended sail
And talked of never-ending things,

Of the never-ending storm of will,
One will and many wills, and the wind,
Of many meanings in the leaves,

Brought down to one below the eaves,
Link, of that tempest, to the farm,
The chain of the turquoise hen and sky

And the wheel that broke as the cart went by.
It is not a voice that is under the eaves.
It is not speech, the sound we hear

In this conversation, but the sound
Of things and their motion: the other man,
A turquoise monster moving round.

A poetry course is on Coursera: http://blog.coursera.org/post/44385047250/my-coursera-experience-modern-poetry-for-a-physics

A poem about a poem by Bodhisattva (from jankipul)

कमजोर कविता
वह एक कमजोर कविता थी
कवि उसे छिपाता फिरता था
कुछ बहुत कमजोर विचार थे उसमें
कहने के लिए कुछ थोड़े शब्द थे घिसे पिटे
न अलंकार था न छंद न समास
नीरस सी जहाँ-तहाँ रूखे भाव उफनाए से दिखते थे उसमें।
उस कविता में न थे गंभीर तत्व
ऐसे ही थी जैसे बेस्वाद पापड़-चटनी-अंचार।
वह बहुत कमजोर कविता थी
जैसे असुंदर बीमार बेटी
जैसे कुरूप कर्कशा पत्नी
जैसे अपाहिज औलाद
जैसे मंद बुद्धि बैठोल पति
जैसे धूल धूसर जर्जर घर
जैसे फूटी ढिबरी
जैसे खेत ऊसर बंजर अनुर्वर
(read the rest on jankipul)

TechEd 2013 – Bengaluru & Pune

TechEd 2013 is here. Wow, ten years on, and I have been involved with TechEd India as a speaker and organizer. It has been a great ride, and for the last few years we have tried different things, the main theme being to invite speakers from outside the MS background.
This year again we have:
1. Rajat (ex-Vertica dev) and Siva (Greenplum query optimization guru) talking about a hosted data platform (Qubole): how their customers are using the platform and finding it useful, and the changes they are making to provide scale and performance. (Yes, that is big data coverage; we are covering HDInsight in the data track.)
2. S. Anand, a very well-known speaker and data scientist, sharing his wisdom on how data with the right kind of visualization drives pragmatic insights in compressed time.
3. Saurabh Gupta from HFI takes on UX and explains the misconceptions he has seen over the years.
4. Amit Bahree & Ramkumar from Avanade explain their take on developing multi-device applications using various toolsets. (We tried really hard to get Xamarin.)
5. Mandar Kulkarni from Netmagic will share his experience of managing and provisioning datacenters.
6. Pete Brown is travelling all the way from the US to share tips on building LOB apps on the Windows 8 platform.

We have a bunch of SharePoint/O365 sessions lined up, as we have new releases on both fronts.
1. Abhisek & Aniruddh will share what has changed in the new platform and what should/can be migrated.
2. Abhisek will also be doing a lap around O365, explaining the new feature sets relevant to the SI/ISV crowd.
3. Amartya from Infosys plans to share best practices in development/deployment for SharePoint, culled from years of sweat and blood. (Only in Bangalore.)

Then we have an Azure-related session by Sudhanshu Hate from Infosys on creating hybrid applications on the Azure platform. (Only in Pune.)

How can we forget the data platform?

Vinod Kumar delves into availability options in SQL Server 2012. He will go deep, as I know he was planning to write a book on the subject.

Dates -
Bengaluru – 18/19th March
Pune – 25/26th March

Website – https://india.msteched.com

What I could not achieve 

Functional language coverage :( – sadly, one of the most experienced people is in Pune, but we could not cover t&e. Clojure rocks! (BigML/Cascalog is elegant.)
JavaScript – again, CSS/JS is the long-term future, and I wish we could have covered it.
Fun stuff – Kinect-based fun stuff; again, lack of t&e.

Machine learning too was dropped, as the key person was offline :(

Hosted Hadoop platform – why Qubole is setting the pace

Disclaimer – I do not work for Qubole. I am also not an analyst. I am just fascinated by the database spring that we have been witnessing for the last couple of years. I maintain the hope of eventual cheaper/better/faster option(s). (update) Heck, I still remember what Data General used to offer: the hardware-to-software solution. The way we used to schedule jobs, ask for quotas and, damn, even lift the disks from the storage area to the compute area. That was compute/storage on demand too :) . Sir, you want to analyze the sales of quarter x? Please bring your data, schedule your job and wait at the console for your turn.

There are three kinds of approaches that vendors (new or old) have taken to Hadoop’s presence.

Traditional DB vendor with “integration story”

1. We will help you store/retrieve cold/processed data in/from HDFS; you can do your fancy jobs there, aggregate the data, and we can extract it back here. Our tools can help with dumping/extraction/cleanup. Our existing engine can serve your workloads much better.

2. We can run a query across two stores, relational & HDFS (using our own query mechanism or integrated into Hive).

Traditional DB or NewSQL vendor with “Memory is cheap”

Here the traditional DB is the dominant usage scenario, and the messaging remains: not everybody is FB/Twitter/Amazon to require these solutions.

1. We can add massive memory and still use the simple database without changes to the access/store models (will support Mohan, will support notions of buffer pool/locks). This will suffice for many, without sophisticated hardware (InfiniBand/storage magic) underneath.

2. The columnar access pattern dominates the workload, so let us optimize for it and compress/store those maps in memory.

3. Let us take a leap of faith and do away with the “buffer pool” and related latches/locks, but maintain parity with the SQL and ACID that developers understand.

Traditional DB vendors face challenges with:

- Horizontal scale-out based on data, as partitioned data requires an awkward compromise on the columns/keys.

- On the other hand, as shared-nothing scale-out happens, maintaining developer calm by providing minimum consistency (pushing changes in sync to x replicas, pushing reads to replicas) becomes an issue.
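
To make the partitioning compromise a little more concrete, here is a minimal Python sketch. It is my own illustration, not any vendor's implementation; the shard count, the column names and the helpers shard_for, partition and shards_to_scan are all hypothetical. Rows are hash-partitioned on one key column, so a filter on that column touches a single shard, while a filter on any other column has to fan out to every shard.

```python
import zlib
from collections import defaultdict

NUM_SHARDS = 4  # hypothetical cluster size, chosen only for illustration


def shard_for(key, num_shards=NUM_SHARDS):
    """Map a key value to a shard number with a stable hash (CRC32)."""
    return zlib.crc32(str(key).encode("utf-8")) % num_shards


def partition(rows, key_column, num_shards=NUM_SHARDS):
    """Distribute rows (dicts) across shards by hashing one chosen column."""
    shards = defaultdict(list)
    for row in rows:
        shards[shard_for(row[key_column], num_shards)].append(row)
    return shards


def shards_to_scan(filter_column, filter_value, key_column, num_shards=NUM_SHARDS):
    """Return the shards a simple equality filter must visit.

    A filter on the partition key is routed to exactly one shard.
    A filter on any other column cannot be routed, so every shard is scanned.
    """
    if filter_column == key_column:
        return [shard_for(filter_value, num_shards)]
    return list(range(num_shards))
```

Picking customer_id as the key makes per-customer lookups cheap but turns every per-region query into a full fan-out; picking region does the opposite. That is the awkward choice of columns/keys.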

Pure Hadoop based vendors 

- Get a more efficient filesystem, add a memory-based cache, add something more than just the MR pattern, compression at storage, HA (fixing it in innovative ways), improving operations (overcoming accidental deletes, differential backups, replication across DCs?).

- Push changes that benefit everybody into the main trunk in the public repository (YARN, for instance, or HBase).

Hosted Hadoop & services

1. Will help you create clusters via UX or command line, change settings and monitor conditions, based on a distribution X.

2. Will really go ahead and fix/add things that are missing and make the hosted platform more appealing.

3. Add security features (authentication/storage).

Qubole lies in the 2nd type of hosted vendors. Why did they attract so much love and respect from me personally? A vendor who goes and creates the following deserves all the kudos.

1. A way to bring quotas, Kill Mode and TestMode to the data extraction/massaging world; they know what happens in the real world (mistakes, learning on the job, bad data most of the time).

2. Missing features: upsert (how important is that for data movement!) and moving data out of Hive partitions (again solving practical issues).

3. Really taking advantage of the cloud vendor’s abilities: add/test hybrid and spot instances (bidding, timeouts for the instances, percentage of spot instances).
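
For readers who have not met the term, upsert means "update the row if the key already exists, insert it otherwise". Below is a minimal Python sketch of those semantics over an in-memory table. It is only an illustration under my own assumptions (the upsert function, the dict-based table and the column names are hypothetical) and says nothing about how Qubole implements the feature.

```python
def upsert(target, incoming, key):
    """Merge incoming rows into target, a dict that maps key value -> row.

    Existing keys are updated column by column; new keys are inserted.
    Returns a tuple (inserted_count, updated_count).
    """
    inserted = updated = 0
    for row in incoming:
        k = row[key]
        if k in target:
            # Key already present: overwrite only the columns that arrived.
            target[k] = {**target[k], **row}
            updated += 1
        else:
            # New key: insert a copy so later changes to `row` do not leak in.
            target[k] = dict(row)
            inserted += 1
    return inserted, updated


# Usage: a daily extract carrying one changed row and one new row.
table = {1: {"id": 1, "qty": 5, "city": "Pune"}}
counts = upsert(table, [{"id": 1, "qty": 7}, {"id": 2, "qty": 3}], key="id")
```

Without upsert, the same data movement usually means rewriting a whole table or partition just to change a handful of rows, which is why its absence hurts.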

Unsolicited advice

- If they add an on-premise option to work with traditional private cloud providers, this will end the search for other options.

- Working with ISVs to bundle it is an altogether different ballgame.

Another disclaimer – These are my own opinions as a humble data person and do not reflect those of my employer. I just look at what is delivered/documented in the public domain.

(update) This does not mean pure Hadoop vendors are not ahead in fixing enterprise issues/meeting requests; actually they are far ahead. It is the hosted platform that is the point of discussion in this 10 min post. Some of the pure Hadoop distribution vendors have the tougher task of thinking through what remains inside vs. what is available outside. Training/mentoring/competing with existing enterprise DB/app sales can’t be a long-term goal when people are focusing on “solutions” (http://www.theregister.co.uk/2012/11/11/police_ibm_analysis_crime_prevention/). This post also bypasses the excellent MPP systems and the innovative ways they integrate with HDFS or the Hadoop ecosystem, as I have never been able to look at them (forget access).

नको नको रे पावसा …!! – इंदिरा संत

The days when I long for rain are many. I believe I am happiest in a cloudy place, one where clouds of different shapes and colors run riot in the sky. That also means I love the rains, the ones that listen to you. Here’s a beautiful poem by Indira Sant. She converses with the rain and asks it to bring her lover back from afar with care, using lightning :) , and not to dirty her backyard.

नको नको रे पावसा …!!

 नको नको रे पावसा
असा अवेळी धिंगाणा
घर माझे चंदमौळी
आणि दारात सायली;
 
नको नाचू तडातडा
असा कौलारावरन,
तांब सतेलीपातेली
आणू भांडी मी कोठून?
 
नको करू झोंबाझोंबी
माझी नाजूक वेलण,
नको टाकू फुलमाळ
अशी मातीत लोटून;
 
आडदांडा नको येउं
झेपावत दारातून,
माझे नेसूचे जुनेरे
नको टांकू भिजवून;
 
किती सोसले मी तुझे
माझे एवढे ऐक ना,
वाटेवरी माझा सखा
तयाला माघारी आण ना;
 
वेशीपुढे आठ कोस
जा रे आडवा धावत,
विजेबा, कडाडून
मागे फिरव पंथस्थ;
 
आणि पावसा राजसा
नीट आण सांभाळून,
घाल कितीही धिंगाणा
मग मुळी न बोलेन;
 
पितळेची लोटीवाटी
तुझ्यासाठी मी मांडीन,
माझ्या सख्याच्या डोळयांत
तुझ्या विजेला पाजीन;
 
नको नको रे पावसा
असा अवेळी धिंगाणा
घर माझे चंदमौळी
आणि दारात सायली….
 
इंदिरा संत