Meet our client
INDUSTRY: Retail / Research
How we did it
Retail research leader Nielsen provides detailed market research data on various industries to a worldwide client base. For the FMCG industry, the company calculates each product’s market share from its shelf presence. It supplements this information with detailed data such as product ingredients, allergens, and barcodes.
Thousands of new FMCG products flood the market every month. Nielsen harvests the product information directly from images of the packaging. Manually extracting that information, writing it down, and entering it into a database takes about 30 minutes per product. The ingredients can be written in any number of languages, and a single ingredient is sometimes represented by two different words because producers are not consistent. This is not a challenge for a human, but any system that was not based on machine learning found the task impossible.
Handling this task required a complex solution made up of multiple steps performed by several neural networks. Each network performs a separate task that involves image recognition and natural language processing.
The delivered system is based on deep neural networks that process the original, unedited image of the product taken from the shelf. First, a Region Proposal Network, which indicates “where to look”, is trained to recognize the frame containing the ingredients by spotting relevant words and commas.
In the next step, performed by the Detector Network, the system identifies and corrects the extracted text before filling in the table with the product's contents.
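The idea of correcting extracted text can be illustrated with a far simpler stand-in than the network described here. The sketch below assumes a small, hypothetical vocabulary of known ingredient names and uses fuzzy string matching from Python's standard library to snap misread tokens to the closest known word. The vocabulary, the function names, and the cutoff value are illustrative assumptions and are not part of the delivered system.

```python
import difflib

# Hypothetical vocabulary; a real system would need a far larger, multilingual list.
KNOWN_INGREDIENTS = ["sugar", "wheat flour", "palm oil", "cocoa butter", "salt"]


def correct_token(token, vocabulary=KNOWN_INGREDIENTS, cutoff=0.8):
    """Return the closest known ingredient, or the cleaned token if none is close enough."""
    cleaned = token.strip().lower()
    matches = difflib.get_close_matches(cleaned, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else cleaned


def correct_ingredient_text(raw_text):
    """Split extracted label text on commas and correct each piece independently."""
    return [correct_token(part) for part in raw_text.split(",") if part.strip()]
```

With this vocabulary, a misread string such as "Sugar, whea7 flour, pa1m oil" is mapped back to the known names, while an unknown ingredient is passed through unchanged in lowercase.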
The neural network then identifies the ingredients and enters them into a spreadsheet by their respective categories. The final result relies on two deep neural networks working together, trained and feature-engineered to avoid mistakes and deliver accurate information.
The app reduces the time needed to gather and validate the data to less than 2 minutes. The deep neural network-based system locates the list of ingredients on the label, scans it, and puts it into the proper column in the database. At the same time, the data is cleaned of noise and redundancies, yielding standardized information.
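As a rough illustration of what removing redundancies and standardizing can mean for an ingredient list, the sketch below maps different words for the same ingredient to one canonical name and drops duplicates. The synonym table and the function name are hypothetical; the delivered system relies on trained neural networks rather than a hand-written lookup.

```python
# Hypothetical synonym table mapping label variants to one canonical name.
SYNONYMS = {
    "sucrose": "sugar",
    "zucker": "sugar",
    "sucre": "sugar",
    "sodium chloride": "salt",
}


def standardize(ingredients, synonyms=SYNONYMS):
    """Normalize spacing and case, apply synonyms, and drop duplicates in order."""
    seen = set()
    result = []
    for item in ingredients:
        # Collapse repeated whitespace and lowercase the name.
        name = " ".join(item.lower().split())
        # Replace a known variant with its canonical name.
        name = synonyms.get(name, name)
        # Skip empty entries and names that were already emitted.
        if name and name not in seen:
            seen.add(name)
            result.append(name)
    return result
```

A lookup table like this only covers variants someone has already listed, which is one reason a learned model is a better fit for labels in many languages.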
Applied to CPG product labels and compared to simple optical character recognition, this approach allows us to retain the ‘context’ of the specific region of interest. This makes the difference between ‘extracting raw text’ from the image versus ‘understanding’ image content allowing for a better accuracy and, consequently, higher quality results and higher automation rates. Nielsen is investing significantly in “smart” automation in the area of reference data which is one of the company’s key assets and the glue that enables internal and external data exchanges.
Alessandro Zolla, vice president of Technology at Nielsen