When data becomes really big, it is sometimes faster to load it onto a truck and send it down the highway than to transfer it online. That’s how the 100 petabytes of satellite images of Earth that DigitalGlobe collected over 17 years are being moved.
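To see why a truck can beat a cable, here is a rough back-of-the-envelope sketch. It assumes decimal petabytes and a network link that is fully utilized around the clock; the link speeds are hypothetical and are not taken from the DigitalGlobe case.

```python
# Rough estimate: how long would it take to push 100 PB through a network link?
# Assumptions: decimal petabytes (10**15 bytes) and a link that stays fully
# utilized the whole time. The link speeds below are hypothetical.
PETABYTE_BYTES = 10**15
SECONDS_PER_DAY = 86_400


def transfer_days(size_petabytes: float, link_gbps: float) -> float:
    """Return the days needed to move the data at a sustained link speed."""
    bits = size_petabytes * PETABYTE_BYTES * 8
    seconds = bits / (link_gbps * 10**9)
    return seconds / SECONDS_PER_DAY


for gbps in (1, 10, 100):
    days = transfer_days(100, gbps)
    print(f"{gbps:>3} Gbps: {days:,.0f} days (~{days / 365:.1f} years)")
```

Under these assumptions, even a sustained 10 Gbps link would need about 926 days, or roughly two and a half years, to move 100 petabytes.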
The fact that data is being transported in Amazon’s “Snowmobile” truck says a lot about just how big data has grown. And just how much is that? Well, global IP traffic has skyrocketed from 96,054 petabytes a month in 2016 to 150,910 petabytes a month in 2018 and is estimated to hit 278,108 petabytes a month in 2021.
According to an IDC analysis, the world created 16.3 zettabytes of data in 2017. By 2025, that amount is expected to be ten times higher. That’s why businesses cannot afford to overlook critical trends in data management and big data analytics. So how should you go about riding this wave?
1. Growing demand for data science and analytics talent
According to a recent Glassdoor report, data scientist is “the best job in America in 2018”. At the beginning of the year, there were 4,524 job openings with a median base salary of $110,000. Candidates follow the market closely, and it should come as no surprise that new ways to become a data scientist are emerging. As LinkedIn’s 2017 U.S. Emerging Jobs Report states, there are 9.8 times more Machine Learning Engineers working today than in 2012.
But there is one challenge – being a data scientist takes more than just knowing how to code in Python. The job requires both specific tech skills and a “data-oriented” mindset. Although new ways of becoming a data scientist are emerging, there is a growing gap between the demand for data scientists and the actual supply of this rare species – and that doesn’t appear likely to change anytime soon.
Any company seeking to boost its business processes with AI and machine learning should bear in mind how challenging hiring a data scientist can be. Enlisting the support of an experienced mentor to build a team internally, from the ground up, may be more cost-effective and much faster than acquiring specialists.
2. Augmented analytics on the rise
According to Gartner, augmented analytics is “an approach that automates insights using machine learning and natural language generation” that “marks the next wave of disruption in the data and analytics market”. With augmented analytics, companies can use new technologies to efficiently harvest the insights from data, both obvious and less obvious ones.
According to Forbes Insights, 69% of leading-edge companies believe that augmented intelligence will improve customer loyalty and 50% of all companies surveyed think that these technologies will improve customer experience. What’s more, 60% of companies believe that augmented analytics will be crucial in helping them obtain new customers. Clearly, this is not a technology to overlook or underestimate in future investments. Given the volume of data being produced today and the number of possible correlations it gives rise to, the days of inspecting it manually are numbered.
3. Edge computing speeds up data transfer
Just as the number of connected devices is growing, so too is the need for computing power required to analyze data. There will be no fewer than 30 billion data-producing devices in 2020. With the decentralization of data comes a plethora of challenges, including delays in data transfer and huge pressure exerted on central infrastructure.
To reduce the amount of data that has to be transferred via the Internet, companies perform more and more computing near the edge of the network, close to the source of the data. Consider autonomous vehicles, each of which, according to GE, generates 40 terabytes of data for every eight hours of driving. It would be neither practical nor cheap – and certainly not safe – to send all that data to an external processing facility.
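A quick calculation shows what streaming that volume would mean in practice. The sketch below assumes decimal terabytes and a perfectly even data rate over the whole drive, which real sensors would not produce; it only converts the figure quoted above into a sustained bandwidth.

```python
# Convert "40 terabytes per eight hours of driving" into a sustained data rate.
# Assumptions: decimal terabytes (10**12 bytes) and an even rate over the drive.
def sustained_gbps(terabytes: float, hours: float) -> float:
    """Return the average bandwidth, in gigabits per second, for the volume."""
    bits = terabytes * 10**12 * 8
    seconds = hours * 3600
    return bits / seconds / 10**9


print(f"{sustained_gbps(40, 8):.1f} Gbps")  # prints "11.1 Gbps"
```

Every single vehicle would need an uninterrupted uplink of about 11 Gbps, which is why processing the data on board is the more realistic option.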
4. Convergence, or merging the past, present and future world in one dashboard
Merging data from various sources is nothing new – companies have been using weather forecasts or combining historical data with sales predictions for a long time. New technologies, however, are bringing data merging into the mainstream, and there’s a growing number of use cases to exemplify just why this is happening.
One inspiring example comes from Stella Artois, the producer of apple and pear cider. Analysis of its historical data showed the company that its sales grow when the temperature rises above a certain threshold.
So it decided to run an outdoor campaign on digital billboards on a cost-per-minute basis, triggered only by specific conditions – proper temperature, sunny weather and no clouds. The company reported a YOY sales increase north of 65% during the period when its weather-responsive campaign ran, efficiently harnessing the power of its data.
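A trigger of this kind can be sketched as a simple rule evaluated against the current weather reading. The example below is only an illustration: the `Weather` fields, the threshold values and the `should_show_ad` function are all hypothetical, and the actual rules used in the campaign are not given here.

```python
from dataclasses import dataclass


@dataclass
class Weather:
    """A single weather reading for the billboard's location (hypothetical)."""
    temperature_c: float
    cloud_cover_pct: float  # 0 = clear sky, 100 = fully overcast
    is_raining: bool


# Hypothetical thresholds; the campaign's real values are not known.
MIN_TEMPERATURE_C = 20.0
MAX_CLOUD_COVER_PCT = 10.0


def should_show_ad(weather: Weather) -> bool:
    """Show the ad only when it is warm, dry and practically cloudless."""
    return (
        weather.temperature_c >= MIN_TEMPERATURE_C
        and weather.cloud_cover_pct <= MAX_CLOUD_COVER_PCT
        and not weather.is_raining
    )


print(should_show_ad(Weather(temperature_c=24.0, cloud_cover_pct=5.0, is_raining=False)))
print(should_show_ad(Weather(temperature_c=14.0, cloud_cover_pct=80.0, is_raining=True)))
```

Because the billboard time is bought per minute, a check like this would run on every new weather reading, so the budget is spent only while the conditions hold.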
We can thank Herradura Tequila, a premium liquor brand, for another fascinating example. The company partnered with Foursquare to gain information about potential customers from their check-ins. What’s more, Herradura’s producer was able to obtain information about other places customers liked to visit.
With the information it gathered, the company was able to send targeted ads to people with a similar profile. Combining the historical data with the profile and geolocation resulted in a 23% incremental rise in visits to places selling Herradura among people who had been exposed to the ads, when compared to the control group.
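An incremental rise of this kind is usually computed by comparing the visit rate of the exposed group with that of the control group. The sketch below shows one common way to do it; the function name and the counts are made up for illustration (chosen so that the result matches the 23% quoted above) and are not Herradura’s actual data.

```python
def incremental_lift(
    exposed_visits: int,
    exposed_size: int,
    control_visits: int,
    control_size: int,
) -> float:
    """Return the relative lift of the exposed group's visit rate over control."""
    exposed_rate = exposed_visits / exposed_size
    control_rate = control_visits / control_size
    if control_rate == 0:
        raise ValueError("control group has no visits; lift is undefined")
    return exposed_rate / control_rate - 1


# Made-up counts: 123 of 1,000 exposed people visited vs. 100 of 1,000 in control.
lift = incremental_lift(123, 1000, 100, 1000)
print(f"Incremental lift: {lift:.0%}")
```

The control group matters because it captures the visits that would have happened without any ads, so only the difference is credited to the campaign.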
5. New data types – Excel is not enough (not even close)
With voice-activated personal assistants or image recognition technology going mainstream, companies will gain access to new, unstructured types of data.
Given the sheer volume and nature of such data, which flows from diverse and unstructured sources, it needs to be processed with machine learning algorithms. Manual processing would simply be too time-consuming and ineffective.
deepsense.ai’s cooperation with global market research firm Nielsen provides a fine example of such algorithms at work. With the power of deep learning, deepsense.ai built an app that swiftly recognizes the ingredient fields on various FMCG products and then uploads the information into a structured, centralized database. With an accuracy of 90%, the solution gives researchers a new tool to deliver faster analysis of the retail market.
The possibilities are countless, be they visual quality control based on IP cameras or logo detection and brand visibility analytics tools – you name it. As automating quality testing with machine learning may increase defect detection rates up to 90%, the possible savings are impressive.
Big data is quite a common term in modern business. But harvesting the power of the information becomes more problematic as the amount collected continues to grow.