Five Common Myths of Big Data and Analytics Solutions


Posted on June 2, 2015 by TheStorageChap in Big Data and Analytics, Data Lake

Helping the business realise the benefits of a data-driven strategy is at the top of many CIOs’ agendas, but as I talk to customers about their Big Data and Analytics aspirations I consistently hear five assumptions that are a little off-base. Here are my “Five Common Myths of Big Data”. I would love to hear your thoughts and comments on whether you’ve experienced any of these expectations within your business or beyond, or whether you would add to the list.

1. The Business knows the Outcomes that they want.

The biggest challenge for most organisations considering the use of Big Data and Analytics is understanding what I refer to as the “Art of the Possible”. There are so many opportunities related to the use of Big Data and Analytics that it can be difficult to identify them and then prioritise them. It’s rare that an organisation will implement Big Data Analytics for a single outcome, though a single use case might catalyse an initial investment or project.

2. Your Data is available at the Push of a Button.

Once you have identified the business outcomes you wish to achieve through the use of Big Data and Analytics, the second myth is that the data is simply ready for you to start analysing. Whilst an ever-increasing number of data sources are becoming available in the connected world in which we live, the reality is that, of all the steps in the Data Analytics Lifecycle, Data Preparation is the most intensive and time-consuming. Data needs to be cleaned, opened up from existing applications, secured and moved to a ‘Data Lake’ platform, so that analytics can deliver insights without degrading the performance of your enterprise applications or the end-user experience.
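To give a feel for why the cleaning step alone takes time, here is a minimal sketch in Python. The field names (`customer_id`, `email`) and the cleaning rules are purely hypothetical and not taken from any particular product. Real preparation work would also cover extracting, securing and moving the data.

```python
from typing import Iterable

# Hypothetical schema: fields every usable record must contain.
REQUIRED_FIELDS = ("customer_id", "email")


def clean_records(records: Iterable[dict]) -> list[dict]:
    """Trim whitespace, normalise case, drop incomplete rows and duplicates."""
    seen = set()
    cleaned = []
    for record in records:
        # Strip stray whitespace from every string value.
        row = {k: v.strip() if isinstance(v, str) else v for k, v in record.items()}
        # Skip rows where any required field is missing or empty.
        if any(not row.get(field) for field in REQUIRED_FIELDS):
            continue
        row["email"] = row["email"].lower()
        # Keep only the first occurrence of each customer.
        if row["customer_id"] in seen:
            continue
        seen.add(row["customer_id"])
        cleaned.append(row)
    return cleaned


raw = [
    {"customer_id": "C1", "email": " Alice@Example.com "},
    {"customer_id": "C1", "email": "alice@example.com"},  # duplicate customer
    {"customer_id": "C2", "email": ""},                   # missing email
    {"customer_id": "C3", "email": "bob@example.com"},
]
cleaned = clean_records(raw)
print(cleaned)
```

Even this toy example needs three separate rules for four rows of data. Multiply that across dozens of source systems and the effort involved in Data Preparation becomes easier to appreciate.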

3. We will use our existing IT Platforms.

This idea is simply not realistic. Analysts estimate that there will be 40ZB of data by 2020, a figure they have compared to 57 times the number of grains of sand on the earth. Where are we supposed to store all that data? Probably not in your existing block storage arrays. And even if you could, running I/O-intensive analytics on production systems would probably give your IT teams and employees a very bad headache…
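For a sense of scale, the 40ZB figure can be turned into a quick back-of-the-envelope calculation. The sketch below assumes decimal (SI) units and a hypothetical 4TB drive size chosen only for illustration. It also works out how many grains of sand the “57 times” comparison implies, assuming one byte per grain.

```python
# Back-of-the-envelope scale check for the 40ZB estimate (SI units assumed).
ZETTABYTE = 10**21  # bytes in one zettabyte
TERABYTE = 10**12   # bytes in one terabyte

total_bytes = 40 * ZETTABYTE

# Hypothetical drive capacity, used only to make the number tangible.
DRIVE_TB = 4
drives_needed = total_bytes // (DRIVE_TB * TERABYTE)

# If 40ZB is 57 times the number of grains of sand, the comparison
# implies roughly this many grains (counting one byte per grain).
implied_grains = total_bytes / 57

print(f"Total data:     {total_bytes:.2e} bytes")
print(f"Drives needed:  {drives_needed:,} x {DRIVE_TB}TB")
print(f"Implied grains: {implied_grains:.2e}")
```

On these assumptions the raw capacity alone works out to ten billion 4TB drives, before any allowance for redundancy or growth.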

Analysts also believe that by 2020 at least 50% of the 40ZB of data will be held in a solution platform not in place within the enterprise today, most probably one based on a Hadoop variant. Storage technologies other than traditional block arrays are better suited to the requirements of these third platform workloads.

4. Combining it all together is Easy.

You know (some of) the outcomes you’re after, you have the data and you know the infrastructure you are going to use.

The fourth myth is that bringing it all together is easy. Combining the required components and standing up the required infrastructure can be a laborious and painstaking task, and turning them into a fully functional whole is rarely a straightforward process.

Fortunately, as Big Data and Analytics becomes a mainstream requirement of every organisation, IT vendors have created integrated systems for Big Data, Analytics and Application Development. These flexible platforms use a variety of best-of-breed third-party software to let you ingest, store, analyse and surface your data in a tailor-made fashion for whoever needs it, while allowing developers to create applications that can act on that data in real time.

5. Once you have technology in place you are ready to begin.

Of course, the reality is that technology is only part of the overall solution, and people and process are equally, if not more, important. In today’s digitised world, data is our most valuable asset and we must exploit that opportunity so that our respective organisations can thrive.

Like the early gold rush pioneers, if you pull together home-grown equipment with unskilled labour and mine in the wrong area, you will probably end up with questionable results. However, if you mine the right data, using proven processes, with skilled data scientists and an industrial or enterprise-ready platform, the results can be very apparent.


What do you think? Leave a comment or tweet me @TheStorageChap.