Let’s Drain the Database Swamp! (Okay, Just Kidding)

B. Shimmin

Summary Bullets:

  • Enterprise buyers looking to simplify their data integration woes through centralization are missing the value inherent in diversity.
  • Database diversity (indeed, diversity across all workloads) should be not only welcomed but actively sought after as a means of blending opportunity with capability.

Back in the ‘90s, the average enterprise maintained not one, not two, but seven databases: one for transactional information, one for data mining cubes, one for server logs, etc. Today, that number has grown dramatically thanks to the proliferation of NoSQL-style databases built to handle unstructured, semi-structured and polymorphic data. Add to this the ever-expanding list of data storage options across public cloud data platforms, and you’ve got an honest-to-goodness embarrassment of riches.

Twenty years ago, the idea of maintaining multiple databases was seen as a risk and a surefire way to stifle innovation. The whole idea works against a company’s desire to constrain database management costs and avert the many dangers that come with a splintered data layer. In short, it’s hard to lock down and integrate data across multiple, disparate platforms. You either live with the chaos or you invest heavily in a centralized data management scheme (e.g., a full-scale data warehouse or data lake) whose cost and complexity could, on their own, block ideas like digital transformation.

Fast forward to today, and quite frankly, nothing has changed on that front, except that this diversity now works in our favor. Let me explain. First, there’s nothing wrong with diversity. Actually, there is no other option. There is no single database to rule them all. Unlike programming languages, which are somewhat geared toward developer taste and experience, databases are at the end of the day best suited to one workload over another. Need ad hoc reporting? A good old-fashioned RDBMS like IBM DB2 will do nicely. Need to work with unstructured data like log files? Pick up a schemaless NoSQL database like MongoDB. Need to mix and match transactional and analytical workloads? Go with a NewSQL database like SAP HANA.

So, what’s the upside of having to manage SQL, NoSQL and NewSQL databases? Well, balancing workloads across disparate platforms and databases carries with it a number of advantages. Maintaining a diverse database portfolio offers flexibility in accommodating new business patterns and absorbing organizational change. Employing a multi- and hybrid cloud data layer also creates resilience in terms of both redundancy and scale-out/up capabilities. Lastly and most importantly, diversity engenders opportunity. But for any of that to play out, there’s been one missing element that has emerged as a viable option only in the last year or two: the combination of metadata and artificial intelligence (AI).

If you build a metadata layer that spans both database and data warehouse, and if you apply machine learning (ML) predictive modeling to that metadata, you can start to do some very interesting things. You can better govern the use of data sources as they evolve, dynamically applying routines such as data masking, for instance (something that might come in handy this May, when the General Data Protection Regulation (GDPR) comes into force). You could also predict and scale data resources to match anticipated needs, moving workloads and the data itself to and fro as needed. You could troubleshoot performance issues by looking at problems in the context of the business itself. And you could speed integration and development initiatives by augmenting and automating data preparation routines.
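To make the data masking idea a little more concrete, here is a minimal Python sketch of a metadata layer that sits above several data sources and decides at read time which columns to mask. Everything in it is hypothetical and for illustration only: the names `ColumnMeta`, `MetadataCatalog`, `mask_value` and `read_row`, the `personal_data` tag, and the source names are all assumptions, not any vendor's API. The tags are assigned by hand here; in the scenario described above, an ML model would propose them.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnMeta:
    """Metadata about one column in one data source (hypothetical model)."""
    source: str              # e.g. "crm_rdbms" or "logs_nosql" (made-up names)
    table: str
    column: str
    tags: frozenset = frozenset()  # e.g. {"personal_data"}; assigned by hand here


class MetadataCatalog:
    """A toy metadata layer that spans any number of sources."""

    def __init__(self):
        self._columns = {}

    def register(self, meta):
        self._columns[(meta.source, meta.table, meta.column)] = meta

    def tags_for(self, source, table, column):
        # Columns the catalog has never seen carry no tags.
        meta = self._columns.get((source, table, column))
        return meta.tags if meta else frozenset()


def mask_value(value):
    """Replace a value with a short, non-reversible token (assumed policy)."""
    digest = hashlib.sha256(str(value).encode("utf-8")).hexdigest()
    return "masked:" + digest[:8]


def read_row(catalog, source, table, row, mask_tags=frozenset({"personal_data"})):
    """Return a copy of `row` with every column carrying a masked tag hidden."""
    result = {}
    for column, value in row.items():
        if catalog.tags_for(source, table, column) & mask_tags:
            result[column] = mask_value(value)
        else:
            result[column] = value
    return result
```

Registering an `email` column with the `personal_data` tag and then passing a row through `read_row` would return the row with the email replaced by a token, while untagged columns pass through unchanged. The point of the design is that the masking decision lives in the metadata rather than in each database, so the same policy follows the data wherever it is stored.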

The trouble lies in setting up these ML models and in creating a consistent metadata layer above the bubbling chaos of database diversity. That will take some effort. Fortunately, many data and analytics players, including SAP, Oracle and IBM, are setting up what they term data hubs as a means of establishing a lightweight metadata layer. Other companies specializing in data processing and integration, like Informatica, are looking to set up an even broader layer of AI-infused metadata that can unify database, analytics, cloud and application workloads.

However, we do not expect to see such solutions sweep across the buyer landscape right away. They are not plug and play. Potential buyers first need to build an understanding of their data assets and invest in the data sciences before they take the plunge into AI-informed data unification. As with anything worthwhile, doing so will take time and effort.

What do you think?
