July 1, 2020

Data is dirty. Can we clean it up?

In 2020, most businesses are trying to use data insights to drive decision making, but there’s so much data that it becomes a massive challenge to draw valuable conclusions.

Most businesses have far more data than they know how to use. In fact, most companies effectively use less than 5% of all their enterprise data. Companies see and hear about the dramatic impact machine learning can have, but that impact only materializes when the underlying data is in good order, which is often not the case.

When you look below the surface, the data of the world’s richest, largest corporations is generally still a mess. This brings up a central problem: gathering data is really hard, but most people still think it needs to be done before any real insights can be gleaned. This creates an inverted pyramid, with companies seeing their data scientists spending anywhere from 50% to 80% of their effort on data gathering and cleansing rather than building models, gaining insights and solving critical business issues. We need to unlearn that way of thinking and approach data differently.

The data conundrum

Historically, we’ve been taught that data must be put into a lake, and from there, value can be extracted via queries. We’re then sold a few hundred different tools and services to integrate with and learn, along with the promise that they will sift through the mess and enable innovation. That rarely happens.

Then along come consultants and “solutions” that digitize these tools to make them just a little easier to use, and call the result innovation. The vast majority of the time, these so-called innovations don’t solve the core data-availability problem or the time-to-value problem. Often, the proposed “solution” is in fact to hire outsourced software engineers and data scientists. In certain instances, these solutions do work and the results for businesses can be lucrative. But faster speeds, new data feeds and better tools typically aren’t solving the big problems that most CEOs face, especially in today’s environment.

Despite the excitement Silicon Valley and the rest of the world’s AI entrepreneurs have created around the industry, the world doesn’t always need another AI tool. What is needed now are practical AI and ML solutions that deliver real business value: reduced spend, engaged customers and increased revenue. That is where the real value is today.

Most machine learning efforts fail

Artificial intelligence and machine learning have occupied an outsized portion of the modern technology conversation. We hear about AI and ML so regularly that you’d think they’re the silver bullet for every business inefficiency, but the truth is that many efforts are still coming up short.

Most data “solutions” and companies provide only surface-level insights on quick timeframes. If you want deep insights, current approaches typically require long timelines, tremendous expense and high risk. Traditionally, it takes a lot of work to investigate the expensive and complex organizational challenges that companies most need help solving.

Bringing innovation into the boardroom or across business units requires more business depth than most companies can supply, with or without better tools. AI and ML are hard enough: depending on which statistic you read, between 65% and 85% of all AI efforts fail.

The hard work needs to be done. ML benefits need to be brought to the C-level and into the boardroom. There is opportunity for businesses that are willing to roll up their sleeves and simply get busy.

Forget the hype – do the work

The hype surrounding AI, ML and data is not about to go away. Businesses need to see through this hype and know that when data is used to solve surface-level problems, all you get are surface-level results. True transformation occurs when you combine powerful AI and ML with a deep understanding of your data.

Unfortunately, this hype affects how corporations choose to spend on and commit to data solutions. On one hand, companies may see the true value of their data yet still fall into the easy trap of downplaying its importance. In many cases, they give it less interest and attention than their fixed or human assets.

On the other hand, corporations may listen to the hype and assume that data projects always carry high costs, long data-preparation timelines and a high likelihood of failed implementation, and therefore long lead times before their data delivers any return. They make that assumption without realizing that there are solutions and products out there that are in fact very low risk and can be transformative to the organization.

Data is an invaluable asset for saving an organization time, money and human energy and opening up potential new avenues of business, especially when disruption occurs.

Follow these basic principles:

  • Understand the hype and gain clarity on what is real and proven versus what is not
  • Understand the effort you need to spend on your own (first-party) data and where you can use third-party data, paying particular attention to partners and technologies that make this process fast and easy
  • If you have made data lake investments, look for quick time-to-value ways to unfreeze the data, such as using proven, pre-built machine learning (ML) models that have been applied to use cases across multiple industries
  • Spend time diligently upfront to flip the usual pyramid and avoid the hurdles others went through, so that your data scientists or partners spend 80% of their time creating value for your organization and only 20% on data, or so that your organization can see transformative results in weeks rather than months


It’s easy for companies to downplay the importance of their data, giving it less interest and attention than their fixed or human assets. Why is that? Because you can’t see data when it is sitting in a data lake.

But we’ve discovered in the last few years that, when applied to the right challenges within a business, data is invaluable for saving an organization time, money and human energy.

That’s why ElectrifAi is helping customers across healthcare, government, financial services, telecommunications and other sectors change the way they do business through proven, pre-built machine learning models that reduce cost, increase customer engagement and drive revenue.

We’re here to help you make sense of all of this and deliver excellent results.