Research & Development Hub

There is no development without research

Research is an essential part of the company’s development. Our projects bring fresh ideas to AI research, an area of human thought and production we believe is worth contributing to. In collaboration with top universities, scientific institutions and global corporations, we have made crucial contributions to the field of reinforcement learning.

TrelBERT: A pre-trained encoder for Polish Twitter

A research project, presented at the EACL 2023 Workshop
  • In this paper, the researchers present TrelBERT – the first Polish language model suited to application in the social media domain. TrelBERT is based on an existing general-domain model called HerBERT and has been adapted to the language of social media by pre-training it further on a collection of almost 100 million messages taken from Polish Twitter.
  • To evaluate TrelBERT, the researchers used KLEJ, a benchmark of Polish NLP tasks analogous to the well-known English GLUE benchmark. Of particular interest was the cyberbullying detection task, in which TrelBERT outperformed all competitors and currently holds the top spot on the KLEJ cyberbullying detection leaderboard.

Fast and Precise: Adjusting the Planning Horizon with Adaptive Subgoal Search

Joint research with University of Warsaw, Ideas NCBR, Jagiellonian University, KAUST, Google Research & Stanford University, Polish Academy of Sciences; presented at ICLR 2023
  • The researchers proposed Adaptive Subgoal Search (AdaSubS), a search algorithm that adjusts the planning horizon to match the local complexity of the problems solved.
  • Complex reasoning problems contain states that vary in the computational cost required to determine the right action plan. AdaSubS takes advantage of this property by generating diverse sets of subgoals at different distances. A verification mechanism swiftly filters out unreachable subgoals, letting the search focus on more feasible ones. In this way, AdaSubS combines the efficiency of planning with longer-term subgoals with the fine control of shorter-term ones, and thus scales well to difficult planning problems.
  • The research shows that AdaSubS significantly surpasses hierarchical planning algorithms on three complex reasoning tasks: Sokoban, the Rubik’s Cube, and the inequality-proving benchmark INT.
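
The idea described above can be illustrated with a small, simplified sketch. It is not the authors' implementation: the grid maze, the `HORIZONS` values, the naive `propose_subgoals` generator, and the breadth-first `verify` function are all hypothetical stand-ins for the learned components, and the Manhattan distance replaces a learned value function. The sketch only shows the control flow: try the longest horizon first, discard subgoals the verifier rejects, and fall back to shorter horizons when needed.

```python
import heapq
from collections import deque
from itertools import count

# Toy maze (assumption for illustration): '#' is a wall, S is start, G is goal.
GRID = [
    "S..#...",
    ".#.#.#.",
    ".#...#.",
    ".###.#.",
    ".....#G",
]
HORIZONS = (4, 2, 1)  # assumed subgoal distances, longest first


def find(ch):
    for r, row in enumerate(GRID):
        if ch in row:
            return (r, row.index(ch))
    raise ValueError(ch)


def is_free(cell):
    r, c = cell
    return 0 <= r < len(GRID) and 0 <= c < len(GRID[0]) and GRID[r][c] != "#"


def propose_subgoals(state, k):
    """Stand-in for a learned generator: naive jumps of length k (may be invalid)."""
    r, c = state
    return [(r + k, c), (r - k, c), (r, c + k), (r, c - k)]


def verify(state, subgoal, k):
    """Stand-in for a learned verifier: exact breadth-first search limited to k steps."""
    if not is_free(subgoal):
        return False
    frontier, seen = deque([(state, 0)]), {state}
    while frontier:
        cell, dist = frontier.popleft()
        if cell == subgoal:
            return True
        if dist == k:
            continue
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if is_free(nxt) and nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, dist + 1))
    return False


def adaptive_subgoal_search(start, goal, max_iters=10_000):
    """Return a list of subgoals from start to goal, or None if none is found."""
    if start == goal:
        return [start]

    def h(cell):  # Manhattan distance, standing in for a learned value function
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    tie = count()
    # Queue entries: (horizon index, heuristic, tie-breaker, state).
    # A lower horizon index means a longer horizon, so long jumps are tried first.
    queue = [(0, h(start), next(tie), start)]
    parent = {start: None}
    for _ in range(max_iters):
        if not queue:
            return None
        level, _, _, state = heapq.heappop(queue)
        k = HORIZONS[level]
        for sg in propose_subgoals(state, k):
            if sg in parent or not verify(state, sg, k):
                continue  # already known, or filtered out as unreachable
            parent[sg] = state
            if sg == goal:
                path = [sg]
                while parent[path[-1]] is not None:
                    path.append(parent[path[-1]])
                return path[::-1]
            heapq.heappush(queue, (0, h(sg), next(tie), sg))
        if level + 1 < len(HORIZONS):
            # Revisit the same state later with the next shorter horizon.
            heapq.heappush(queue, (level + 1, h(state), next(tie), state))
    return None


if __name__ == "__main__":
    print(adaptive_subgoal_search(find("S"), find("G")))
```

In this toy maze the verifier rejects, for example, the four-step jump from the start along the top row, because a wall blocks it; the search then proceeds through subgoals that pass verification.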

Subgoal Search For Complex Reasoning Tasks

Joint research with University of Warsaw, University of Toronto Vector Institute, Polish Academy of Sciences, University of Oxford; presented at NeurIPS 2021
  • Humans excel in solving complex reasoning tasks through a mental process of moving from one idea to a related one. Inspired by this, the authors propose the Subgoal Search (kSubS) method.
  • A key component of kSubS is a learned subgoal generator that produces a diversity of subgoals that are both achievable and closer to the solution. Using subgoals reduces the search space and induces a high-level search graph suitable for efficient planning.
  • In this paper, the authors implement kSubS using a transformer-based subgoal module coupled with the classical best-first search framework. kSubS achieves strong results, including state-of-the-art on INT, within a modest computational budget.
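
A minimal sketch of best-first search over subgoals, under stated assumptions, may help make the description above concrete. The integer puzzle, `reachable_within`, and `toy_subgoal_generator` are hypothetical: the generator here simply enumerates states within `k` primitive steps and keeps the `n` closest to the target, standing in for the transformer-based subgoal module, and the absolute distance to the target stands in for a learned value function.

```python
import heapq
from itertools import count

# Primitive actions of a toy integer puzzle (assumption for illustration).
ACTIONS = (lambda x: x + 1, lambda x: x - 1, lambda x: x * 2)


def reachable_within(state, k):
    """All states reachable from `state` in at most k primitive actions."""
    frontier, seen = {state}, {state}
    for _ in range(k):
        frontier = {act(s) for s in frontier for act in ACTIONS} - seen
        seen |= frontier
    return seen - {state}


def toy_subgoal_generator(state, target, k, n):
    """Stand-in for a learned generator: the n candidates closest to the target."""
    candidates = reachable_within(state, k)
    return sorted(candidates, key=lambda s: abs(target - s))[:n]


def subgoal_best_first_search(start, target, k=3, n=4, max_expansions=1000):
    """Best-first search on the high-level graph whose edges are subgoal jumps."""
    tie = count()
    queue = [(abs(target - start), next(tie), start)]
    parent = {start: None}
    for _ in range(max_expansions):
        if not queue:
            return None
        _, _, state = heapq.heappop(queue)
        if state == target:
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1]
        for sg in toy_subgoal_generator(state, target, k, n):
            if sg not in parent:
                parent[sg] = state
                heapq.heappush(queue, (abs(target - sg), next(tie), sg))
    return None


if __name__ == "__main__":
    print(subgoal_best_first_search(1, 37))
```

Each step in the returned path is a subgoal up to `k` primitive actions away from the previous one, so the high-level path is much shorter than the underlying sequence of actions.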

Continual World: A Robotic Benchmark For Continual Reinforcement Learning

Joint research with DeepMind, Polish Academy of Sciences, Jagiellonian University; presented at NeurIPS 2021
  • Continual learning (CL) – the ability to continuously learn, building on previously acquired knowledge – is a natural requirement for long-lived autonomous reinforcement learning (RL) agents.
  • In response to the issues related to building such agents, the authors advocate prioritizing forward transfer and propose Continual World, a benchmark consisting of realistic and meaningfully diverse robotic tasks built on top of Meta-World as a testbed.
  • The presented benchmark aims to provide a meaningful and computationally inexpensive challenge for the community and thus help better understand the performance of existing and future solutions.
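
To make the notion of forward transfer more tangible, here is a small sketch under an explicit assumption: forward transfer on a task is measured as the normalized difference between the area under the success-rate curve of the continual learner and that of a reference run trained on the task from scratch. The function names and the evenly spaced evaluation points are hypothetical choices for illustration, not the benchmark's API.

```python
def area_under_curve(success_rates):
    """Mean success rate over evenly spaced evaluation points, each in [0, 1]."""
    if not success_rates:
        raise ValueError("success_rates must not be empty")
    return sum(success_rates) / len(success_rates)


def forward_transfer(continual_curve, reference_curve):
    """Assumed metric: (AUC - AUC_ref) / (1 - AUC_ref).

    Positive values mean earlier tasks helped learning on this task,
    negative values mean they hurt it.
    """
    auc = area_under_curve(continual_curve)
    auc_ref = area_under_curve(reference_curve)
    if auc_ref >= 1.0:
        raise ValueError("reference curve leaves no room for improvement")
    return (auc - auc_ref) / (1.0 - auc_ref)


if __name__ == "__main__":
    # Hypothetical curves: the continual learner reaches high success sooner.
    print(forward_transfer([0.5, 1.0], [0.0, 0.5]))
```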

Off-Policy Correction For Multi-Agent Reinforcement Learning

Joint research with Google Research, Polish Academy of Sciences, University of Warsaw; presented at AAMAS 2022 (extended abstract), NeurIPS 2021 workshop
  • Multi-agent reinforcement learning (MARL) provides a framework for problems involving multiple interacting agents.
  • In this work, the authors propose MA-Trace, a new on-policy actor-critic algorithm, which extends V-Trace to the MARL setting.
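
For readers unfamiliar with V-Trace, the sketch below computes single-agent V-Trace value targets for one trajectory, following the standard recursive formulation with truncated importance weights. It is not MA-Trace itself, and the multi-agent extension is not shown; the function name, argument layout, and default thresholds are assumptions made for illustration.

```python
def vtrace_targets(rewards, values, bootstrap_value, ratios,
                   gamma=0.99, rho_bar=1.0, c_bar=1.0):
    """Single-agent V-Trace targets for one trajectory of length n.

    rewards[t]      reward received after acting in state t
    values[t]       current value estimate V(x_t)
    bootstrap_value value estimate for the state after the last step
    ratios[t]       importance ratio pi(a_t | x_t) / mu(a_t | x_t)
    """
    n = len(rewards)
    next_values = list(values[1:]) + [bootstrap_value]
    targets = [0.0] * n
    acc = 0.0  # holds v_{t+1} - V(x_{t+1}) while iterating backwards
    for t in reversed(range(n)):
        rho = min(rho_bar, ratios[t])  # truncated weight for the TD error
        c = min(c_bar, ratios[t])      # truncated weight for the trace
        delta = rho * (rewards[t] + gamma * next_values[t] - values[t])
        acc = delta + gamma * c * acc
        targets[t] = values[t] + acc
    return targets


if __name__ == "__main__":
    # With all ratios equal to 1 (on-policy data) the targets reduce to n-step returns.
    print(vtrace_targets([1.0, 1.0, 1.0], [0.0, 0.0, 0.0], 0.0,
                         [1.0, 1.0, 1.0], gamma=1.0))
```

When the behaviour policy and the target policy coincide, the correction has no effect; when a ratio is zero, the corresponding target falls back to the current value estimate.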

CARLA Real Traffic Scenarios – novel training ground and benchmark for autonomous driving

Joint research with Volkswagen, Google Research, University of Warsaw, Polish Academy of Sciences, Jagiellonian University; presented at NeurIPS 2020 workshop
  • The research paper introduces interactive traffic scenarios in the CARLA
    simulator, which are based on real-world traffic.
  • The CARLA Real Traffic Scenarios (CRTS) is intended to be a training and
    testing ground for autonomous driving systems.
  • The work shows how to obtain competitive policies and experimentally
    evaluates how observation types and reward schemes affect the
    training process and the resulting agent’s behavior.

Structure and randomness in planning and reinforcement learning

Joint research with University of Warsaw, Gdańsk University of Technology, Polish Academy of Sciences; presented at IJCNN 2021, DRL Workshop, NeurIPS 2020
  • The research paper presents a novel method, Shoot Tree Search (STS), which makes it possible to more explicitly control the balance between the depth and breadth of the search needed for planning in large state spaces.
  • The algorithm can be understood as an interpolation between two celebrated search mechanisms: MCTS and random shooting. It also lets the user control the bias-variance trade-off, akin to TD(n), but in the tree search context. In experiments on challenging domains, the authors show that STS can get the best of both worlds, consistently achieving higher scores.

Applying deep learning to right whale photo identification

Joint paper with National Oceanic and Atmospheric Administration, University of Warsaw
  • The research paper presents the winning solution of a computer vision competition organized by the NOAA Fisheries on the data science platform.
  • The solution automatically identifies individual whales with 87% accuracy. It uses a series of convolutional neural networks to locate the region of interest in an image; rotate, crop, and standardize the photograph to a uniform size and orientation; and then identify the correct individual whale from these passport-like photographs.

Model-Based Reinforcement Learning for Atari

Joint research with Google Brain, the University of Warsaw and the University of Illinois at Urbana-Champaign
  • Trained a number of action-conditioned video models which are used as neural simulators of Atari environments
  • Trained Atari agents using learned neural simulators and tested the performance of the agents in original environments
  • Compared the performance of our agents with that of agents trained using two model-free algorithms, Rainbow and PPO, at 100K and 500K interactions with the environment
  • Invited presentations at the University of Oxford, DeepMind and Google Brain Zurich

Parallel training of Atari games

Joint research with Intel
  • Conducted parallel training of Atari games on one of the largest European supercomputers
  • Held the world record in parallel training of Atari games for two months in 2018 (until it was beaten by DeepMind)
  • Presented work at the leading European HPC conference and were cited by DeepMind

Sim2Real consisting of training in Unreal Engine 4 and deployment on a real car

Joint research with a leading car manufacturer
  • Modeled a dozen real-life routes in Unreal Engine 4
  • Parallelized computations in order to generate hundreds of millions of frames using Unreal Engine 4 and one of the biggest European supercomputers
  • Models we trained were deployed on real autonomous vehicles multiple times in 2018 and 2019

Reinforcement Learning and Theorem Proving

A research project
  • The first time reinforcement learning has been convincingly applied to solving general mathematical problems on a large scale
  • Published a paper in collaboration with researchers from the Technical University in Prague and the University of Innsbruck
  • Presented at the main track of NeurIPS 2018

Expert-augmented actor-critic for ViZDoom and Montezuma’s Revenge

A research project
  • Achieved state-of-the-art results in environments known for challenging exploration
  • Presented at the Deep RL and Imitation Learning Workshops at NeurIPS 2018

Learning to Run challenge solutions

A research project
  • Trained walking gaits with reinforcement learning methods
  • Took 6th place out of 400+ teams in the Learning to Run challenge at NIPS 2017

Hierarchical Reinforcement Learning with Parameters

A research project
  • Developed an original RL algorithm for hierarchical reinforcement learning; presented at Google Headquarters in Mountain View during the Robot Learning conference, 2017
