Customized image recognition system for retail analytics

93% reduction in data gathering time, achieved by cutting manual work

Meet our client

INDUSTRY: Retail / Research

CUSTOMER: Nielsen Holdings

How we did it

Retail research leader Nielsen provides detailed market research data on various industries to a worldwide client base. For the FMCG industry, the company calculates each product’s market share from its shelf presence. It supplements this information with detailed data such as product ingredients, allergens, and barcodes. To streamline the process of gathering data, Nielsen looked for a customized AI-powered system.

The challenge

Nielsen harvests product information directly from images of packages. The ingredients can be written in any number of languages, and sometimes a single ingredient appears under two different names because producers are not consistent. Manually extracting the information, writing it down, and entering it into a database takes about 30 minutes per product. With thousands of products to analyze, manual data collection is not efficient.

The solution

Handling the task described above required delivering a complex, customized solution consisting of multiple steps performed by several neural networks. Each network performs a separate task that involves image recognition and natural language processing.
The delivered system is based on deep neural networks that process the original, unedited image of the product taken from the shelf. The system is trained to recognize the frame containing the ingredients by spotting relevant words and commas, using a Region Proposal Network that points to “where to look”.
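The idea of spotting relevant words and commas can be illustrated with a toy heuristic. This is only a sketch and not the delivered system: it assumes the candidate regions have already been turned into text, and it replaces the trained Region Proposal Network with a hand-written score. The cue-word list and the function names are hypothetical.

```python
import re

# Hypothetical cue words; a trained network would learn such signals from data.
CUE_WORDS = {"ingredients", "contains", "zutaten", "ingrédients"}


def ingredient_score(text: str) -> float:
    """Score a text region by cue-word hits plus comma density."""
    tokens = re.findall(r"\w+", text.lower())
    if not tokens:
        return 0.0
    cue_hits = sum(1 for token in tokens if token in CUE_WORDS)
    comma_density = text.count(",") / len(tokens)
    return cue_hits + comma_density


def pick_ingredient_region(regions):
    """Return the candidate region that looks most like an ingredient list."""
    return max(regions, key=ingredient_score)


# Made-up label fragments for illustration only.
regions = [
    "Best before: see lid",
    "Ingredients: sugar, cocoa butter, milk powder, soy lecithin",
    "Store in a cool, dry place",
]
best = pick_ingredient_region(regions)
print(best)
```

An ingredient list tends to be a long, comma-separated run of short phrases, which is why comma density is a useful signal even when the cue word is in an unfamiliar language.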
In the next step, performed by the Detector Network, the system reads and corrects the extracted text before filling in the table of the product's contents.
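The correction step can be sketched with fuzzy matching against a known vocabulary. This is an assumption made for illustration: the source does not say how the Detector Network corrects text, and the vocabulary, the cutoff value, and the function name below are hypothetical. The sketch uses `difflib.get_close_matches` from the Python standard library.

```python
import difflib

# Hypothetical vocabulary of known ingredient names.
KNOWN_INGREDIENTS = ["sugar", "cocoa butter", "milk powder", "soy lecithin", "salt"]


def correct_ingredient(raw: str, cutoff: float = 0.8) -> str:
    """Snap a noisy token to the closest known ingredient, if one is close enough."""
    cleaned = raw.strip().lower()
    matches = difflib.get_close_matches(cleaned, KNOWN_INGREDIENTS, n=1, cutoff=cutoff)
    # Keep the cleaned original when nothing in the vocabulary is similar enough.
    return matches[0] if matches else cleaned


# Made-up text with two typical character-level recognition errors.
ocr_text = "sugar, cocoa butler, milk powdor, soy lecithin"
corrected = [correct_ingredient(part) for part in ocr_text.split(",")]
print(corrected)
# ['sugar', 'cocoa butter', 'milk powder', 'soy lecithin']
```

A higher cutoff makes the correction more conservative: fewer wrong substitutions, but more misread tokens left untouched.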
The neural network then identifies the ingredients and enters them into a spreadsheet under their respective categories. The final result relies on the cooperation of two deep neural networks, trained and feature-engineered to avoid mistakes and deliver accurate information.

The effect

The delivered system increases the throughput of the data extraction pipeline to thousands of images per minute. The deep neural networks locate the list of ingredients on the label, scan it, and put it into the proper column in the database. At the same time, the data is cleaned of noise and redundancies, yielding standardized information.
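Removing redundancies and standardizing names can be sketched as a small normalization pass. This is a minimal illustration under stated assumptions, not the delivered implementation: the synonym table, the allergen set, the record layout, and the placeholder barcode are all hypothetical.

```python
# Hypothetical synonym table: maps alternative names to one canonical name.
SYNONYMS = {"sucrose": "sugar", "soya lecithin": "soy lecithin"}
# Hypothetical allergen list used to fill a separate column.
ALLERGENS = {"milk powder", "soy lecithin", "peanuts"}


def standardize(ingredients):
    """Normalize case and whitespace, map synonyms, and drop duplicates in order."""
    seen = set()
    result = []
    for item in ingredients:
        name = " ".join(item.lower().split())
        name = SYNONYMS.get(name, name)
        if name and name not in seen:
            seen.add(name)
            result.append(name)
    return result


def to_record(barcode, ingredients):
    """Build one database row with ingredients and allergens in separate columns."""
    names = standardize(ingredients)
    return {
        "barcode": barcode,
        "ingredients": names,
        "allergens": [name for name in names if name in ALLERGENS],
    }


# Placeholder barcode and made-up ingredient strings.
record = to_record("0000000000000", ["Sugar", "sucrose", " Soya  Lecithin", "milk powder"])
print(record)
```

Here "Sugar" and "sucrose" collapse into a single "sugar" entry, which mirrors the case of one ingredient appearing under two different names on the same label.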


Applied to CPG product labels and compared to simple optical character recognition, this approach allows us to retain the ‘context’ of the specific region of interest. This makes the difference between ‘extracting raw text’ from the image versus ‘understanding’ image content allowing for a better accuracy and, consequently, higher quality results and higher automation rates. Nielsen is investing significantly in “smart” automation in the area of reference data which is one of the company’s key assets and the glue that enables internal and external data exchanges.

Alessandro Zolla, vice president of Technology at Nielsen
