What do ISVs trying to bring their solutions to Cloud want?

Easy to understand Billing model  

Make it easy to reason about the billing model – simpler than the raw "pay per use" meters exposed today. I need to use it every day, and it should just work without surprises. Do not expose the "you looked at me – y $, you asked for that – z $" model. Please provide a reliable billing API that I can use when building SaaS applications.

Tell me about your maintenance cycles (please)

For end customers using a solution, communication about downtime is essential. Ideally 24x7 operation is required, but we can craft a solution that delivers a minimum viable option at lower cost.

MultiTenancy

It means a DocumentDB/Aurora or Search service should have the ability to create "tiers" – free/shared instances where I can club together the folks on my "freemium tier" without paying production prices. As this is a very low-margin business, let us find ways to make it simpler. This is a little different from me creating a sharded instance.

Support for MultiCloud Libraries/stacks

We need support for jclouds, fog and libcloud across provisioning, monitoring and billing of all possible assets. We understand it will not be an ODBC-style standard, but something more workable. Provide deeper integration into Chef/Puppet/Ansible/Salt with better templating rather than promoting custom "provider models". Thanks for integrating with GitHub – push it as the place to store assets, so that config (testing/deployment) and everything else is coded up and stored in GitHub or something similar. Thanks for the support for Docker and CoreOS.

What, Azure is supported only for blobs in one of them (libcloud)? And no, PowerShell is awesome, but it is not everyone's favourite piping tool.
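
As an illustration of the kind of cross-cloud surface these libraries already expose, here is a minimal Apache Libcloud sketch using its built-in dummy driver (swap in a real provider driver and credentials for actual use):

```python
from libcloud.compute.types import Provider
from libcloud.compute.providers import get_driver

# The dummy driver ships with libcloud for experimentation; a real deployment
# would request e.g. Provider.EC2 or an Azure driver and pass credentials.
Driver = get_driver(Provider.DUMMY)
conn = Driver(0)

# The same list/create/destroy verbs work across providers.
for node in conn.list_nodes():
    print(node.name, node.state)
```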

Win-Win

I bring you x $, you provide me 0.20% of x. No, really – make the partnership work with real people rather than just words. Let us find a way to make adoption faster. Help us unseat the existing partner brokers who are deadweight – whose deployment/AMC (people/cost) models are a challenge in a pure cloud model. That air cover we talked about needs to be about partners, partners, partners. Help break the ice with the CIO's tech team. It is not about x% discounts on the platform. Focusing on annual sign-ups for certain software licenses will not open the door to a growing pie.

Here is a shout-out to Vijay, who joined MongoDB and correctly points out the "lack of lever" with both customer and seller – there is no complexity. http://andvijaysays.com/2014/03/25/are-we-there-yet-cant-wait-to-start-my-new-adventure/

In a cloud-based setup this simplicity is much more stark.

Support

Real support in terms of what does not work, rather than "green my scorecard – so just use it" (shoved down my throat). Own up to support issues and help bring down my costs while increasing your spread. Get folks who understand both business and technology (what people outside use is different from what you sell). Let us know what is coming down the line that can potentially make us a commodity. Be honest about it.

And no – unless I explicitly ask, do not push a service. I will pick unique services based on their strengths, honestly I will. I love the completely hands-off 99.99% SQL Azure, where I get backup and HA all at a great price. I wish that infrastructure were available for others to host things like databases.

Make it supportable

Other OSes are just as useful and widely deployed, so tools for picking up monitoring information should become better. IIS is a great tool, but so are nginx, Apache and their friends HAProxy, Squid and Varnish. Make "separation/divorce" easier: easier to withdraw data, easier to withdraw configuration settings. The UX should reflect what is possible through PowerShell, the CLI, and at worst language-specific REST bindings – preferably a language which runs on all platforms.


Azure throttling errors

Most cloud services provide elasticity, creating the illusion of unlimited resources. But many times hosted services need to push back on requests to provide good governance.

Azure does a good job of providing information about this throttling in various ways across services. One of the first services was SQL Azure, which provided error codes to help the client retry. Slowly, all services are now providing information when they throttle. Depending on whether you use the native API or the REST endpoint, you get this information in different ways. I am hoping that comprehensive information from services and underlying resources like network, CPU and memory starts percolating out, as it does for storage, so that clients and monitoring systems can manage workloads.
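
Throughout this post the client-side pattern is the same: treat the throttling status as a signal to back off and retry. A minimal Python sketch of that pattern against a generic REST endpoint (the URL is a placeholder, and this is not tied to any particular Azure SDK) looks like this:

```python
import time
import requests

def call_with_backoff(url, max_attempts=5):
    """Call a REST endpoint, backing off exponentially when throttled (HTTP 429/503)."""
    delay = 1.0
    for attempt in range(max_attempts):
        response = requests.get(url)
        if response.status_code not in (429, 503):
            return response                      # success, or a non-throttling error
        time.sleep(delay)                        # throttled: wait, then try again
        delay *= 2                               # exponential backoff
    raise RuntimeError("still throttled after %d attempts" % max_attempts)

# Example (hypothetical endpoint):
# resp = call_with_backoff("https://example.table.core.windows.net/mytable")
```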

Azure DocumentDB provides a throttling error (HTTP 429) and also the time after which to retry. It is definitely ahead of other services in providing this explicit information.
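
When the service tells you how long to wait, it is better to honor that hint than to guess. Here is a minimal Python sketch that sleeps for the server-suggested interval; the header names are assumptions to verify against the service documentation (DocumentDB's REST API reports the wait in milliseconds, other services use the standard Retry-After header in seconds):

```python
import time
import requests

def call_honoring_retry_after(url, headers=None, max_attempts=5):
    """Retry a throttled (HTTP 429) request, sleeping for the server-suggested interval."""
    for attempt in range(max_attempts):
        response = requests.get(url, headers=headers or {})
        if response.status_code != 429:
            return response
        # Prefer the millisecond hint if present, fall back to Retry-After (seconds).
        wait_ms = response.headers.get("x-ms-retry-after-ms")
        wait = float(wait_ms) / 1000.0 if wait_ms else float(response.headers.get("Retry-After", 1))
        time.sleep(wait)
    raise RuntimeError("request kept getting throttled")
```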

Azure Storage, on the other hand, provides this information to the native client so that it can back off and retry. It also pushes this information into metrics. A great paper exists which provides information about Azure Storage transactions and capacity.

SQL Azure throttling – SQL Azure was one of the first services to provide throttling information due to CRUD/memory pressure (error numbers 45168, 45169, 40615, 40550, 40549, 40551, 40554, 40552, 40553).
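
As a rough illustration of how a client could react to those error numbers, here is a hedged Python sketch using pyodbc; matching the numbers against the error text is a simplification of the well-known SQL Azure retry guidance, which inspects the native error number directly:

```python
import time
import pyodbc

# SQL Azure throttling error numbers mentioned above.
THROTTLING_ERRORS = {"45168", "45169", "40615", "40550", "40549",
                     "40551", "40554", "40552", "40553"}

def execute_with_retry(conn_str, sql, max_attempts=3):
    """Run a statement, retrying when the failure looks like SQL Azure throttling."""
    for attempt in range(max_attempts):
        try:
            with pyodbc.connect(conn_str) as conn:
                return conn.execute(sql).fetchall()
        except pyodbc.Error as exc:
            message = str(exc)
            # Crude check: does the error text contain one of the throttling numbers?
            if not any(code in message for code in THROTTLING_ERRORS):
                raise                              # not throttling, surface it
            time.sleep(2 ** attempt)               # back off and retry
    raise RuntimeError("throttled on every attempt")
```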

Azure Search returns HTTP 429/503 so that the client can take proper action.

Azure Scheduler returns HTTP 503 when it gets busy and expects the client to retry.

Azure Queue and Service Bus Queue both send back 503, which REST clients can take advantage of.

BizTalk Services returns "Server is busy. Please try again".

Over the years we have always asked customers to exploit Azure, and one of the ways to do that is to work with the hosted services and plan workloads by catching these kinds of errors. Some customers like SQL Azure's throttling so much that they wished they had some of those soft/hard throttling errors in their on-premise database.

Most Azure services do not charge when quotas are hit or requests are throttled; the idea is for the client to back off and try again. I hope the monitoring becomes better, though – for example, in the case of BizTalk Services a client should be able to query the "busy-ness", since it has to retry after the system becomes less busy. SQL Azure's retry logic has been well codified and understood over the years.

Just in case you are wondering whether other public cloud services also throttle: public cloud services are shared infrastructure and implement throttling for governance, and it is exposed in different ways. DynamoDB, for example, has a 400 series of error codes, specifically LimitExceededException, ProvisionedThroughputExceededException and ThrottlingException. Almost every service has a 400 series of errors with throttling as a specific exception.
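
To show the same idea outside Azure, here is a small boto3 sketch (the "events" table and item are made up for illustration) that treats DynamoDB's throttling exceptions as a cue to back off and retry:

```python
import time
import boto3
from botocore.exceptions import ClientError

THROTTLING_CODES = {"ProvisionedThroughputExceededException",
                    "ThrottlingException", "LimitExceededException"}

dynamodb = boto3.client("dynamodb")

def put_with_retry(item, max_attempts=5):
    """Write an item to a (hypothetical) 'events' table, backing off when throttled."""
    for attempt in range(max_attempts):
        try:
            return dynamodb.put_item(TableName="events", Item=item)
        except ClientError as exc:
            if exc.response["Error"]["Code"] not in THROTTLING_CODES:
                raise                      # a real error, not throttling
            time.sleep(2 ** attempt)       # back off and retry
    raise RuntimeError("DynamoDB kept throttling the request")

# put_with_retry({"id": {"S": "42"}, "payload": {"S": "hello"}})
```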



5 years and going – Adopting Cloud Azure

It has been nearly 5 years since we started working on the Azure cloud platform. We are a small team, and I finally thought we have worked enough to call ourselves a little knowledgeable about the various platform parts – ours and others', what works and what does not, and how to make the move. Microsoft Azure has evolved over the years to support these requirements. There is still a long way to go, but the path is right. We help customers prioritize the workloads they can move, as part of a comprehensive briefing on adopting the cloud platform as private, public or hybrid.

I have worked with customers who want to move everything to the cloud hoping it will mask on-premise challenges (scale, performance, monitoring), and with pragmatic folks who pick and choose workloads like email first (established enterprises' challenges are myriad – e.g. no email sending, only receiving). There are many folks who want to take advantage of local infrastructure and move forward from there. Some just pick the simplest and easiest workload – backup to the cloud – to get their feet wet. Others push dev/test to the cloud to minimize local requirements. The path varies for enterprises and ISVs, and we have a lot to offer.

When migrating applications to a cloud platform, simple evaluation measures are:

  1. Legal issues (any issue with putting data on the cloud; is encryption required, which implies key management)
  2. Performance requirements, verified by tests
    1. IO/network/CPU under the end-user workload – have a plan for these tests to ensure end-user and performance testing is done.
  3. Is the application end-of-life, or are the tools used no longer supported – like VB6, old client-server PowerBuilder applications or DOS/QT-based applications? They will provide a lot more roadblocks than progress.
  4. Which advantages of the cloud do you want to exploit – scale-out, elasticity? Is that possible with the existing applications, or should they be modified?
  5. Availability requirements (what are the availability requirements and what can you live with) – one data center vs. DR to others, data movement/deployment, warm/cold standby.
  6. Is the system in its current form able to meet the SLA? Otherwise modify/decouple it to achieve the availability. Even the simplest requirement – handling throttling and failures of the underlying infrastructure – requires change.
  7. Integration requirements with on-premise applications (authentication/antivirus), pushing/pulling of data.
  8. Operational stuff – how will you do ALM: deploy a set of software (OS + dependencies + app + topology), push patches/updates, and handle backup and DR for these applications.
  9. Monitoring requires a culture change, and we have seen developers jolted out of lethargy to adopt/learn new tools and work with admin folks to provide performance and availability SLAs using canaries, graceful degradation and failover frameworks (a minimal graceful-degradation sketch follows this list). Existing on-premise tools are becoming better, but cloud-based ones like New Relic/Boundary for backend applications and Gomez/Keynote for reachability/availability, correlated with frontend tools like Errorception, are easier to adopt.
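
As referenced in point 9, here is a minimal, framework-free sketch of graceful degradation (the cache and the fetch_primary callable are illustrative assumptions, not a specific product): if the primary dependency fails, serve the last known good value instead of an error.

```python
import time

_cache = {}  # last known good responses, keyed by request

def fetch_with_degradation(key, fetch_primary, timeout_seconds=2):
    """Try the primary dependency; on failure, degrade gracefully to the last cached value."""
    try:
        value = fetch_primary(key, timeout=timeout_seconds)
        _cache[key] = (value, time.time())         # remember the last good answer
        return value
    except Exception:
        if key in _cache:
            value, fetched_at = _cache[key]
            return value                            # stale but usable: degraded mode
        raise                                       # nothing cached: surface the failure

# Usage (hypothetical): fetch_with_degradation("user:42", my_profile_service.get)
```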

Non-functional stuff while doing the migration

  1. Chalk out the responsibilities of the people involved (application owner, services provider, vendor)
    1. Steps/goals of each milestone (performance/availability/workarounds)
    2. Chalk out support steps for each application (escalation steps)
      1. Placement of a cloud expert locally to handhold/support, plus dedicated support (concepts like storage/availability groups, or zones on other cloud platforms)
    3. Support for ISV apps by guiding them over time to the platform
    4. Chalk out how migration monitoring is done (easiest – daily/weekly reviews vs. fire-fighting meetings with owners from all stakeholders)
    5. Production monitoring for capacity/outages/testing of failover (tool-based testing – a Simian Army of your own; see the sketch after this list)
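
A "Simian Army of your own" can start very small. Here is an illustrative Python sketch (the instance list and stop_instance callable are placeholders for whatever management API you use) that randomly stops one instance so the team keeps failover honest:

```python
import random

def run_chaos_round(instance_ids, stop_instance, dry_run=True):
    """Pick one instance at random and stop it, to verify the system survives the loss."""
    victim = random.choice(instance_ids)
    print("chaos round: selected", victim)
    if dry_run:
        return victim                 # log only, useful while building confidence
    stop_instance(victim)             # placeholder for your cloud management API call
    return victim

# Usage (hypothetical):
# run_chaos_round(["web-01", "web-02", "worker-01"], stop_instance=my_cloud.stop_vm)
```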

I have seen customers surprised at the amount of work they sometimes need to do to adopt the cloud (availability/monitoring/performance on shared infrastructure). Since we get customers who have tried or used other cloud platforms, it is always fun and encouraging to see something like what Google Compute announced (transparent VM migration) or what AWS added (read replicas) – life in general is becoming easier for a developer. Although it means adapting and changing to a new platform, it is a sweet journey. With frameworks from Netflix/Twitter/LinkedIn and other folks, one can literally hit the ground running. This really is the best time to be a developer.

Normally our help is requested by the field-facing teams serving our customers, but do feel free to reach out to mtcbang at microsoft dot com for help in adopting the cloud platform (public/private/hybrid) or specific workloads like database, integration, SharePoint, security (device/application/infra), adopting BYOD and managing devices, or Windows 8/Phone.

There is another small announcement – we are looking for a person, preferably a woman, stationed out of Mumbai, as part of our team (http://www.microsoft.com/en-us/mtc/default.aspx). The requirements are very simple – listen to the customer, have empathy, understand the pain and resolve it using the right tools and technologies.

  • Could mean using/suggesting architecture changes – decouple/monitor/canary, shard the database at the app or DB level, use reactive patterns – or helping create a greenfield solution from the ground up (20-25% of the job)
  • Looking at code in JavaScript, C#, Java – choose your poison – and suggesting changes (all the associated tools/issues, right from IDE choices – WebStorm vs. X, Express vs. Y – to idiomatic style; know how to monitor running modern apps – resources like memory/IO/CPU/network – and identify patterns of problems)
  • Looking at performance/maintenance/availability issues around SQL Server/MySQL/Oracle (yes, we are OK with a person having Postgres/Sybase/Oracle experience, as the techniques remain the same – the tool names and methodology change a little)
  • Ability to pick up Redis/Cassandra/MongoDB/HBase – yes, we get many of those too.
  • At times be a generalist and use the capabilities of SharePoint to showcase how it helps enhance productivity with its arsenal. Yes, generic SharePoint knowledge and information about the associated tools is good.
  • Have an idea of what BI means – facts, dimensions – which tools one can use, including newer ones (Hadoop data lake + aggregations via jobs) plus near-real-time stuff (StreamInsight, Storm and friends).
  • Have a generic idea about messaging platforms (connect to sources/destinations via transforms, add routing/orchestration) – it is okay not to know this particular piece, but ideally you have been exposed to it once or twice in the field.
  • Have the basics in place – which data structure can be used on a mobile device constrained by memory/storage, for storage or query; what makes pragmatic sense when connecting two systems. This matters much more than a "particular way of product-feature" bent, as we need to think of different approaches and brainstorm the ideas with customers.
  • An open mind to pick up things right from PhantomJS, d3.js or scikit-learn to Optiq, and adopt/use them.

Again, if you know somebody or you are interested, please reach out @ [myid] at microsoft dot com – [govindk]. We do a whole lot of other things – like http://dreamadream.org/2013/02/funday-at-the-microsoft-technology-center-mtc-bangalore/ – and generally do not travel, are very flat and fun (shhh – we are known to do Monday movies, and beer/food at new joints). Basically a no-BS, do-your-thing workplace. You will work with some of the most fun people, from Vinod – http://blogs.extremeexperts.com/about/ (book author, well-known speaker and great friend) – to Anand – https://twitter.com/tweetmsanand (our private cloud, infrastructure and security herder).


Update – Article on Cloud and Hive

Suprotim and Sumit pushed me to publish an article on "Decision Making Pivots for Adoption of Cloud". It is basically the gist of the guiding principles we use to help customers migrate to the cloud. We have a few variations for enterprise strategy, where workloads like Exchange (email), SharePoint (collaboration) or CRM need to move to the cloud. We have a colleague, MS Anand, who helps customers on the private-cloud adoption front to create efficiencies out of existing infrastructure. Here is the document, which focuses on Azure and was part of the magazine: Azure Adoption – pivots to help make the right decision.

I just completed something else I promised Suprotim – an article comparing Hive for people who are used to SQL as the dialect for interacting with a database. Although a comparison of a database and Hive is not strictly an apples-to-apples comparison, I wanted to take an approach where understanding Big Data does not become a burden of learning MapReduce/HDFS and the overall Hadoop ecosystem. It is much easier to start by doing something very simple that we do with a regular data store, try to do it with Hive, and then start looking at the differences. It also helps in understanding why HDFS and MapReduce help address scale and availability for very large amounts of data. Although there are tools like Pig/Cascalog/Scalding/Cascading, I decided to focus on HiveQL as it is the closest to a SQL dialect, with the simple intention of not introducing many new things simultaneously. Once the article has been out in the magazine for a month, I plan to share it here again, or you can pick it up from http://www.dotnetcurry.com/magazine/dnc-magazine-issue2.aspx (updated – 1st Sep 2012) once it comes online.
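
To give a flavour of why HiveQL is a gentle entry point, here is a minimal sketch, assuming a reachable HiveServer2 endpoint, a hypothetical "sales" table and the community pyhive package (all assumptions, not part of the article), running a GROUP BY that would read the same against a regular SQL database:

```python
from pyhive import hive  # community client for HiveServer2

# Connect to a (hypothetical) HiveServer2 endpoint.
conn = hive.Connection(host="localhost", port=10000, database="default")
cursor = conn.cursor()

# A HiveQL statement that reads exactly like the SQL most of us already know.
cursor.execute("""
    SELECT country, COUNT(*) AS orders
    FROM sales
    GROUP BY country
    ORDER BY orders DESC
    LIMIT 10
""")

for country, orders in cursor.fetchall():
    print(country, orders)
```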

And if everything goes all right, with help and a push from Vinod & Pinal, I will devote my energies toward something more useful.

Update – 1st Sep 2012 – Things I have not covered in the Hive article:

* There is a tab for Disaster Recovery on hosted Hadoop on Azure which, when switched on, leaves a few things unclear:
  * WRT the NameNode – whether FsImage and EditLogs are backed up every x minutes to another "managed" location.
  * Is there a secondary NameNode that the log/checkpoint gets shipped to?
  * There is a secondary NameNode, but executing commands against it over RDP simply hangs.
  * WRT HDFS data – is the data snapshotted/backed up to Azure storage, taking advantage of the inherent replication there? (Updated – the preferred storage is Azure storage rather than local nodes.)
  * WRT Hive metadata – is it backed up to a "managed" location every x hours/minutes? (No clear idea.)

* If the NameNode crashes, it is not clear right now (WRT Hadoop on Azure) whether the AppFabric services inherent in Azure are used to detect the failure, bring the node back up and reuse the earlier "managed" location (using the -importCheckpoint option).

* Upgrade & rollback of the underlying version will be part of HOA's lifecycle management. The assumption here is that at present one version will be prevalent across tenants; upgrading individual clusters to a different version is not supported. (Updated – December 2013 – upgrade to a new version is supported.)

* Addition/deletion of nodes in an existing cluster (still a manual job)

* Adding incremental data (updated – via the normal import process)

* Adding a Fair Scheduler

* Monitoring job progress/cancellation (updated Dec 2013 – PowerShell based)

* Identifying bottlenecks in JVM/HDFS settings (completely roll your own)

* Dealing with hadoop fsck to identify bad/missing blocks and related issues (on your own; see the sketch below)

* Rebalancing the data (on your own)
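
Since both the fsck and rebalancing items above are "on your own", here is an illustrative Python sketch (the paths and threshold are assumptions) that shells out to the standard Hadoop commands to check block health and kick off a rebalance:

```python
import subprocess

def hdfs_health_report(path="/"):
    """Run 'hadoop fsck' and return its output so missing/corrupt blocks can be flagged."""
    result = subprocess.run(["hadoop", "fsck", path, "-files", "-blocks"],
                            capture_output=True, text=True)
    report = result.stdout
    if "CORRUPT" in report or "MISSING" in report:
        print("WARNING: fsck reports corrupt or missing blocks")
    return report

def rebalance(threshold_percent=10):
    """Kick off the standard HDFS balancer with a disk-usage threshold (in percent)."""
    subprocess.run(["hadoop", "balancer", "-threshold", str(threshold_percent)])

# Usage (on a node with the Hadoop client configured):
# print(hdfs_health_report("/user/hive/warehouse"))
# rebalance()
```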

Updated – 20th Sep 2012

Cloudera posted a wonderful article on using Flume, Oozie and Hive to analyze tweets: http://www.cloudera.com/blog/2012/09/analyzing-twitter-data-with-hadoop/
