Assessing content quality

ML-based content management process automation

Meet our client



How we did it

The COVID-19 pandemic has resulted in significant changes in how teaching is done. In response to new educational challenges, many online learning platforms have been developed. Brainly is contributing to the learning revolution by providing a collaborative platform for students, parents, and teachers to ask and answer homework questions. The platform has elements of gamification in the form of motivational points and ranks. It encourages users to engage in the online community by answering other users’ questions.

The challenge

Brainly, one of the fastest-growing and most popular learning platforms, was looking for a team of AI experts to come aboard and help build a system that measures the quality of its questions and answers and automates part of the content management process. With more than 350 million monthly users, automating question validation was crucial to building a valuable knowledge base of the highest-quality content. An additional challenge was to find a team of experts who would smoothly join Brainly's ranks and, apart from delivering successful project results, fit well into the organizational culture.

Our team has worked with Brainly’s data scientists on machine learning models that enhance and automate the content management process, ensure the quality and safety of Brainly’s Knowledge Base, and enable new analytics and product features. The system classifies questions created by Brainly users as good or bad quality and provides predictions of question quality.

Our first step was to create a dataset based on data from Brainly. Unfortunately, we were unable to use information from the moderation process as ground truth because it was too inconsistent. Instead, we created a simplified question quality taxonomy and used it to label an unbiased sample of questions from Brainly. We used that dataset as a benchmark to test and validate both existing solutions and solutions we developed ourselves.

Based on our research, we decided to split question quality issues into two groups: issues that can be detected using pre-trained models, and issues for which no ready-made solution could be found. In the first phase, we focused on analyzing solutions for the first group. The second phase has just started and will focus on collecting adequate datasets and then training in-house, state-of-the-art Transformer-based text classifiers to solve Brainly-specific issues.
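To make the benchmarking step more concrete, here is a minimal sketch of how a candidate detector could be scored against a hand-labeled sample of questions. Everything in it is an illustrative assumption rather than the project's actual code: the `LabeledQuestion` type, the `evaluate` helper, the toy `too_short` heuristic, and the sample questions are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable


@dataclass(frozen=True)
class LabeledQuestion:
    """One benchmark entry: question text plus a human quality label (hypothetical schema)."""
    text: str
    is_low_quality: bool


def evaluate(classifier: Callable[[str], bool],
             benchmark: Iterable[LabeledQuestion]) -> Dict[str, float]:
    """Compare classifier predictions with benchmark labels; return precision and recall."""
    tp = fp = fn = 0
    for question in benchmark:
        predicted_low = classifier(question.text)
        if predicted_low and question.is_low_quality:
            tp += 1
        elif predicted_low and not question.is_low_quality:
            fp += 1
        elif not predicted_low and question.is_low_quality:
            fn += 1
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"precision": precision, "recall": recall}


def too_short(text: str) -> bool:
    """Toy stand-in for a real detector: flag questions with fewer than three words."""
    return len(text.split()) < 3


# Tiny made-up benchmark, for illustration only.
BENCHMARK = [
    LabeledQuestion("help", True),
    LabeledQuestion("What is the derivative of x squared?", False),
    LabeledQuestion("asdf qwer", True),
    LabeledQuestion("Explain why the sky appears blue during the day", False),
    LabeledQuestion("pls answer question 5 from my book", True),
]

METRICS = evaluate(too_short, BENCHMARK)
print(METRICS)
```

Because the same labeled sample is reused for every candidate, pre-trained models and in-house classifiers can be compared on equal terms.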

The effect

The system will ultimately help automate the reporting of low-quality content in real time and rank the moderation queue so that moderators can focus on the riskiest content first. Moreover, it will provide a curated, higher-quality knowledge base for training additional ML algorithms for other use cases, such as recommender systems or information retrieval. On top of that, our people have become an important part of Brainly’s team, supporting not only platform development but also team spirit and motivation.
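As a rough illustration of risk-based queue ranking, the sketch below sorts queued questions by a predicted risk score and flags the highest-scoring ones for automatic reporting. The `QueueItem` type, the `rank_queue` helper, the 0.9 threshold, and the scores are assumptions made for this example, not details of the production system.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class QueueItem:
    """A question awaiting moderation (hypothetical schema)."""
    question_id: int
    risk_score: float  # assumed model output in [0, 1]; higher means riskier


def rank_queue(items: List[QueueItem],
               report_threshold: float = 0.9) -> Tuple[List[QueueItem], List[QueueItem]]:
    """Return the queue sorted riskiest-first, plus the items at or above the threshold."""
    ranked = sorted(items, key=lambda item: item.risk_score, reverse=True)
    auto_reported = [item for item in ranked if item.risk_score >= report_threshold]
    return ranked, auto_reported


# Made-up scores, for illustration only.
QUEUE = [QueueItem(101, 0.12), QueueItem(102, 0.95), QueueItem(103, 0.57)]
RANKED, REPORTED = rank_queue(QUEUE)
print([item.question_id for item in RANKED])
print([item.question_id for item in REPORTED])
```

In practice the threshold would be tuned on labeled data so that automatic reports stay precise, while the ranking alone already helps moderators see the riskiest items first.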
