Amazon Web Services Summit: Real-time IoT Insights via SQL? Say Hello to Kinesis Analytics
August 17, 2016
- IoT platforms from the major cloud providers are quickly emerging, bringing the economies of scale only available in the public cloud to bear on the task of instrumenting business.
- No stranger to the public cloud, Amazon announced a new streaming analytics service that will open up IoT to a broad range of ISVs and enterprises by blending the familiarity of SQL with both speed and scale.
While enduring some pretty extreme heat (an index of over 110 degrees) in New York City late last week, I had the opportunity to attend the Amazon Web Services (AWS) Summit, where I took in a particularly interesting keynote address given by the CTO and VP of Amazon.com, Dr. Werner Vogels. During his speech, Dr. Vogels trumpeted a familiar idea about digital transformation (basically, that companies abandon analog methods in favor of digital ones) and the reasons why this is necessary not just to compete, but to remain in business.
Interestingly, a real driver of that idea emerged just before Dr. Vogels’s speech, when Amazon announced Kinesis Analytics. This new service adds an interesting and necessary twist to the company’s end-to-end IoT solution stack, which now looks like this:
- Device Management – Amazon IoT
- Collection – Amazon Kinesis Analytics
- Storage – Amazon S3
- Processing – Amazon Elastic MapReduce + Spark/Hive
- Analysis – Amazon Redshift + Amazon QuickSight
The great thing here is that, at the outset during data ingestion, Kinesis Analytics makes use of the tried and true ANSI-compliant SQL standard. With SQL, developers can build IoT applications using what they already know. And since clickstreams often contain both structured and unstructured data that’s not always well documented, Kinesis Analytics actually looks over the data stream itself and ferrets out an appropriate schema, assuming the stream has some structure to be found. Developers can use this suggested schema, or they can use a purpose-built editor to extend it or build their own. This alone will open up IoT analytics to ISVs and enterprise customers not blessed with a sizable data science team.
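To make the schema discovery idea a little more concrete, here is a minimal Python sketch of how a schema might be suggested from a handful of sample JSON records. This is purely illustrative and is not how Kinesis Analytics does it internally. The function names (`infer_type`, `suggest_schema`), the field names, and the coarse type labels are all my own assumptions.

```python
import json
from collections import defaultdict


def infer_type(value):
    """Map a Python value to a coarse, SQL-like column type (hypothetical labels)."""
    # bool must be checked before int, because bool is a subclass of int.
    if isinstance(value, bool):
        return "BOOLEAN"
    if isinstance(value, int):
        return "INTEGER"
    if isinstance(value, float):
        return "DOUBLE"
    return "VARCHAR"


def suggest_schema(sample_lines):
    """Scan sample JSON records and suggest one column type per field."""
    seen = defaultdict(set)
    for line in sample_lines:
        record = json.loads(line)
        for field, value in record.items():
            if value is not None:  # nulls tell us nothing about the type
                seen[field].add(infer_type(value))

    schema = {}
    for field, types in seen.items():
        if len(types) == 1:
            schema[field] = next(iter(types))
        elif types == {"INTEGER", "DOUBLE"}:
            schema[field] = "DOUBLE"  # widen mixed numerics
        else:
            schema[field] = "VARCHAR"  # fall back to text on any other conflict
    return schema


# Made-up sample records standing in for a device stream.
sample = [
    '{"device_id": "sensor-1", "temp": 71, "ok": true}',
    '{"device_id": "sensor-2", "temp": 72.5, "ok": false}',
]
schema = suggest_schema(sample)
print(schema)
```

A developer could then accept a suggestion like this as-is or override individual columns, which is roughly the workflow the schema editor is meant to support.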
Actually, I take that back. The real magic happens when you start up your new streaming service. Let’s say you’re pushing data into Amazon S3. Once you instantiate a Kinesis Analytics stream feeding into S3, Amazon automatically scales (both up and down) to accommodate your stream, promising sub-second query response regardless of the amount of data streaming into Kinesis Analytics. The trick here, as with any and all AWS apps, is predicting and controlling costs. Hopefully we’ll see Amazon applying its own software (e.g., Amazon Predictive Analytics) to the problem of customer billing.
Once you set up Kinesis Analytics (two versions: Stream or Firehose), you have to specify a destination for the resulting data. Right now, the options are Amazon S3, Redshift and Elasticsearch. You can even throw the output at another Kinesis Analytics Stream; imagine the piping possibilities there. More destinations are on the way from Amazon. And I would imagine that Kinesis Analytics will itself learn to ingest a broader array of sources (right now, it prefers JSON or CSV) and sit between many disparate Amazon data processing and storage offerings.
On that front, there are many opportunities awaiting the Kinesis Analytics offering. I would like to see the company’s emerging Machine Learning service tied into Kinesis Analytics, for starters. Another interesting combination might include AWS Lambda, which would allow businesses to define and run business rules against a real-time analytics stream without any real overhead in terms of architecture.
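As a rough idea of what such a business rule could look like, here is a small Python sketch of a Lambda-style handler. It assumes the event shape Lambda uses for Kinesis stream records (a `Records` list with base64-encoded payloads under `kinesis.data`). The rule itself, the `TEMP_LIMIT` threshold, and the `device_id` and `temp` fields are hypothetical.

```python
import base64
import json

TEMP_LIMIT = 100.0  # hypothetical business rule: flag readings above this value


def handler(event, context):
    """Return the IDs of devices whose readings break the (made-up) rule."""
    alerts = []
    for record in event.get("Records", []):
        # Kinesis payloads arrive base64-encoded inside the Lambda event.
        payload = base64.b64decode(record["kinesis"]["data"])
        reading = json.loads(payload)
        if reading.get("temp", 0) > TEMP_LIMIT:
            alerts.append(reading["device_id"])
    return {"alerts": alerts}


def make_record(reading):
    """Build a fake Kinesis record for local testing (not an AWS API)."""
    data = base64.b64encode(json.dumps(reading).encode("utf-8")).decode("ascii")
    return {"kinesis": {"data": data}}


fake_event = {
    "Records": [
        make_record({"device_id": "sensor-1", "temp": 71}),
        make_record({"device_id": "sensor-2", "temp": 104.2}),
    ]
}
result = handler(fake_event, None)
print(result)
```

The appeal is that the rule lives in a few lines of code with no servers to provision; everything around it (stream, trigger, scaling) would be the platform's job.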
Actually, even if Amazon doesn’t tie these products together tightly (by their very nature, they’re interoperable right now), customers can mix and match services to fit their needs. Still, I’d like to see a productized real-time data discovery solution built on this foundation, something that could populate an Amazon QuickSight dashboard in real time via Kinesis Analytics, and do so for specific industries and use cases out of the box. Maybe next summer.