• The seemingly immutable law of data gravity, which has kept most large-scale data stores tucked safely away behind the corporate firewalls, is no more.
• Cloud platform providers of all shapes and sizes are actively redefining such laws, showing that even the largest data warehouse can live happily in the clouds.
While attending the aptly named Domopalooza conference in Salt Lake City earlier this month, I was struck not so much by the number of concerts, ski parties, parties, and after-party parties put on by the host, cloud-borne BI vendor Domo. Oh, that sort of thing is quite normal for the unconventional vendor from American Fork, Utah. I was instead dumbstruck to learn of the vendor’s seemingly crazy, all-you-can-eat cloud business model. That’s right. Domo doesn’t care how much data you dump into its proprietary data warehouse or how many calculations, transformations, joins, etc. you perform upon said data. There’s just one price to pay, and that’s a simple per-user fee.
Given the prevalence of the utility pricing model for cloud services, which promotes the idea that you should only pay for what you use, how can Domo make money at scale when it’s the one footing the bill for more than 100 trillion rows of data? (In case you’re wondering, yes, that’s the current size of Domo’s Amazon-based data warehouse.) Well, that is an as-yet-unanswered question for Domo as it pushes further into the enterprise marketplace, where petabyte-scale data lakes are quite common. But that’s pure economics. Considered architecturally, Domo, as an Amazon Web Services (AWS) customer, is living proof that the seemingly immutable law of data gravity, which has kept most large-scale data stores tucked safely away behind the corporate firewall, is no more.
Cloud platform providers of all shapes and sizes are actively redefining such laws, showing that even the largest data warehouse can live happily in the clouds. There are many well-documented reasons for moving data to the cloud: increased security, guaranteed availability, capital expenditure reduction, etc. The only problem is getting it there. After all, data transmission is itself a form of friction, which is itself a form of cost. For vendors like Domo, which live off real-time data feeds from the likes of Twitter, Salesforce.com, Box, Marketo, Twilio, et al., this is not a problem. But for data-heavy, on-premises data centers, moving a multi-petabyte data warehouse to the cloud in one sitting can be both expensive and dangerous.
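To put a rough number on that friction, here is a back-of-the-envelope sketch of how long a single petabyte would take to push over a dedicated network link. The link speeds (1 and 10 Gbit/s), the decimal definition of a petabyte, and the function name transfer_days are all assumptions chosen for illustration; real-world transfers would also contend with protocol overhead, shared bandwidth, and retries.

```python
PETABYTE = 10**15          # bytes, decimal definition (an assumption)
SECONDS_PER_DAY = 86_400


def transfer_days(data_bytes: float, link_bits_per_second: float,
                  utilization: float = 1.0) -> float:
    """Days needed to move data_bytes over a link at the given utilization.

    utilization is the fraction of the nominal link speed actually
    achieved (1.0 means a perfectly saturated link, which is optimistic).
    """
    seconds = data_bytes * 8 / (link_bits_per_second * utilization)
    return seconds / SECONDS_PER_DAY


if __name__ == "__main__":
    # Hypothetical link speeds, in gigabits per second.
    for gbps in (1, 10):
        days = transfer_days(PETABYTE, gbps * 10**9)
        print(f"{gbps} Gbit/s link: about {days:.1f} days per petabyte")
```

Even under these idealized assumptions, each petabyte ties up a fully saturated 1 Gbit/s link for roughly three months, which helps explain the appeal of shipping data on a physical appliance.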
That’s why we’re starting to see some unique “lift and shift” solutions put forward by pure cloud platform providers like Amazon and Google as a means of enticing large enterprise customers to make the figurative leap from data center closet to public cloud. For customers with more than 12 terabytes of data to lift, Amazon offers a physical appliance (Amazon Snowball), which can be leased for a measly $250, plus shipping and handling of course. The data transfer itself is free. For a more measured, long-term migration path, Google is rolling out BigQuery Data Transfer Service, which works as a free (for now) managed ETL service. And there are many third-party vendors like Iron Mountain getting in on the act with full-service data center recovery and migration services.
Of course, none of that takes into account the on-premises software feeding into and running on this lifted and shifted data. Re-architecting legacy code and closing down physical data centers take time. While cloud-ready Evernote needed only 70 days to push three petabytes of data to Google Cloud Platform, for example, it took Netflix seven years to fully make the leap to AWS. Regardless of the time involved, it is clear that the gravity of data has shifted to the cloud. If you need further proof, consider Teradata. The 36-year-old company that pioneered data warehousing recently announced Teradata IntelliCloud, which effectively hybridizes its on-premises offerings (both software and appliance) with public cloud platforms (AWS now and Microsoft Azure later), where it can compete directly with Amazon Redshift and Microsoft Azure SQL Data Warehouse. Clearly, the battle for the best data warehouse will take place in the clouds.