by Steve Thompson
The people who would argue that point are a rapidly dwindling group. But there’s a vast difference between recognizing the benefits of Big Data and realizing the benefits of Big Data.
Unfortunately, lots of companies, particularly mid-size and smaller companies, find themselves on the outside looking in. Like a kid who can’t afford the price of a ticket to a ballgame, they find themselves peeking through the cracks of a virtual fence to see what they’re missing.
That virtual fence represents the cost of tapping into the benefits of Big Data. It’s a fence that has kept many companies out of the game.
That fence is about to come down.
You’ve probably heard some of the statistics¹ about the astounding speed with which data is generated nowadays:
Not all of that data is useful, of course. Part of the challenge of making the most of Big Data involves sorting the wheat from the chaff. If data doesn’t conform to the following five Vs, it doesn’t really do anything for your business use case:
Selectively capturing Big Data that conforms to the five Vs, and using the data to build a Hadoop data lake – that’s how many companies have found success with Big Data.
A data lake enables a single source of truth. It enables data governance and data management. It supports predictive analytics and business intelligence.
Many companies, though, have been unable to benefit from Big Data because of the prohibitive costs involved. Multi-million dollar investments in hardware and software have been required to play in the Big Data sandbox.
But that is changing, thanks to the cloud.
Quite simply, cloud-based data ingestion is a game changer. It’s disruptive technology in the sense that it’s going to change the way things are done and change the way people do business.
Cloud-based data ingestion eliminates the need to invest millions of dollars in hardware and software. Instead, you just use a cloud-based Hadoop cluster and data ingestion engine as you need it.
Whether you use Amazon EC2 on AWS, Microsoft Azure, a Cloudera cluster, or Hortonworks, you can now adopt a pay-per-use strategy for data ingestion. Data ingestion can be performed on demand, on a schedule, or in response to events. And you pay for it only when you use it. Clusters can grow dynamically as needed, and then be shut down when they are no longer needed. And the data can be stored in the cloud and archived as needed (e.g., in EBS and S3 storage).
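To make the pay-per-use idea concrete, here is a minimal back-of-the-envelope sketch in Python. Everything in it is an assumption made for illustration: the `IngestionRun` class, the node counts, the run durations, and the hourly rate are hypothetical and do not reflect any provider's actual pricing. It simply compares paying for a cluster only while ingestion runs are active against keeping a peak-sized cluster running all month.

```python
from dataclasses import dataclass


@dataclass
class IngestionRun:
    """One ingestion job on a short-lived cluster (hypothetical model)."""
    trigger: str   # "on-demand", "scheduled", or "event-driven"
    nodes: int     # cluster size while the run is active
    hours: float   # how long the cluster stays up for this run


def pay_per_use_cost(runs, rate_per_node_hour):
    """Cost when a cluster exists only for the duration of each run."""
    return sum(r.nodes * r.hours * rate_per_node_hour for r in runs)


def always_on_cost(nodes, hours_in_period, rate_per_node_hour):
    """Cost of keeping a fixed-size cluster running for the whole period."""
    return nodes * hours_in_period * rate_per_node_hour


# Illustrative numbers only; not real pricing or real workloads.
RATE_PER_NODE_HOUR = 0.50   # assumed price per node per hour
HOURS_PER_MONTH = 720       # 30 days x 24 hours

runs = [
    IngestionRun("scheduled", nodes=4, hours=2.0),
    IngestionRun("on-demand", nodes=8, hours=1.5),
    IngestionRun("event-driven", nodes=2, hours=0.5),
]

# An always-on cluster has to be sized for the largest run.
peak_nodes = max(r.nodes for r in runs)

pay_per_use_total = pay_per_use_cost(runs, RATE_PER_NODE_HOUR)
always_on_total = always_on_cost(peak_nodes, HOURS_PER_MONTH, RATE_PER_NODE_HOUR)

print(f"Pay-per-use: ${pay_per_use_total:.2f}")
print(f"Always-on:   ${always_on_total:.2f}")
```

The gap between the two totals depends entirely on how often ingestion actually runs, so the numbers here only illustrate the shape of the comparison, not a forecast for any real workload.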
Quite simply, cloud-based data ingestion makes the benefits of Big Data available to virtually every company at a fraction of the cost, on a pay-per-use model.
New services and frameworks such as RCG|enable™ Data simplify cloud-based data ingestion.
RCG|enable™ Data is a service that delivers a data ingestion platform that’s compatible with many different open source products and technologies. It runs on the gateway on the edge node and enables data ingestion, which is why we call it RCG|enable™ Data.
With RCG|enable™ Data you can connect to a wide range of different technologies and data formats, such as:
You can use RCG|enable™ Data with structured, semi-structured, and unstructured data. And you can perform tasks with a drag-and-drop user interface that enables quick and easy management across all these different technologies. The native MPP capabilities of the cluster are then used to run the ingestion jobs with MapReduce, YARN, or Spark.
The cloud now makes the potential of Big Data available and affordable to businesses of all sizes. And tools such as RCG|enable™ Data provide a means of harnessing that potential.
Early adopters of technology typically pay a pretty penny to play in a new sandbox. Early-stage technology is expensive; that’s the way it has always been.
Not many businesses could afford to invest millions in a room-sized UNIVAC computer in the 1950s, for example. Those that were able to afford the revolutionary technology could benefit greatly. Others simply had to wait for the day when the new technology became more affordable.
But that day always comes. New technology always becomes more affordable, while also becoming more capable.
And thanks to the advent of cloud-based data ingestion and tools such as RCG|enable™ Data, that day has now come for the age of Big Data.
1. Forbes. (2015, Sept. 30). "Big Data: 20 Mind-Boggling Facts Everyone Must Read." Retrieved from https://www.forbes.com/sites/bernardmarr/2015/09/30/big-data-20-mind-boggling-facts-everyone-must-read