University of Warsaw researchers and partners launch reinforcement learning project powered by Google’s TensorFlow Research Cloud

Researchers from the University of Warsaw and Google AI, together with their collaborators, are taking on a new reinforcement learning challenge on Cloud TPU hardware accelerators. The goal of the experiment is to train an artificial intelligence end to end to play video games, with the whole training process running inside a computation graph.

A team from the University of Warsaw, made up of Piotr Miłoś, Błażej Osiński and Henryk Michalewski, has started a collaboration on reinforcement learning research with Łukasz Kaiser from the Google Brain team and with other researchers. This project is connected to a research program on RL that started last year.
In the experiment, an artificial intelligence will be trained end to end to play video games fully inside a computation graph. If the game simulator is also part of the graph, tasks such as training AI to play video games could run even faster than the results achieved last year. The intent is to run the training process entirely on Cloud TPUs, which are new machine learning accelerators designed by Google. This will save the time previously spent on communication between the accelerators and a host computer.
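
The idea of keeping the simulator and the agent in one graph can be illustrated with a minimal sketch. It is written in JAX rather than TensorFlow, and it is not the team's actual setup: the one-dimensional "game", the linear policy, the learning rate and all function names are hypothetical. The toy simulator is also differentiable, which real video games generally are not, so the gradient step here is only a stand-in for a proper RL update. What the sketch does show is that a whole batch of episodes, including every simulator step, can be compiled into a single function call with no per-step round trip to the host.

```python
import jax
import jax.numpy as jnp

NUM_STEPS = 20         # episode length (assumed for illustration)
LEARNING_RATE = 0.001  # assumed for illustration


def env_step(position, action):
    """Hypothetical toy simulator: move a point and reward staying near 0."""
    next_position = position + action
    reward = -(next_position ** 2)
    return next_position, reward


def policy(params, position):
    """Hypothetical linear policy with actions squashed into [-1, 1]."""
    return jnp.tanh(params["w"] * position + params["b"])


def rollout_return(params, start_position):
    """Unroll policy and simulator together with lax.scan.

    The loop over time steps becomes part of the compiled graph,
    so no step requires a call back to Python on the host.
    """
    def step(position, _):
        action = policy(params, position)
        next_position, reward = env_step(position, action)
        return next_position, reward

    _, rewards = jax.lax.scan(step, start_position, None, length=NUM_STEPS)
    return jnp.sum(rewards)


@jax.jit
def train_step(params, start_positions):
    """One update: batched rollouts plus a gradient step, compiled as a unit."""
    def loss_fn(p):
        returns = jax.vmap(lambda s: rollout_return(p, s))(start_positions)
        return -jnp.mean(returns)

    loss, grads = jax.value_and_grad(loss_fn)(params)
    new_params = jax.tree_util.tree_map(
        lambda p, g: p - LEARNING_RATE * g, params, grads
    )
    return new_params, loss


params = {"w": jnp.array(0.0), "b": jnp.array(0.0)}
start_positions = jnp.linspace(-2.0, 2.0, 8)

losses = []
for _ in range(50):
    params, loss = train_step(params, start_positions)
    losses.append(float(loss))

print("first loss:", losses[0], "last loss:", losses[-1])
```

Each call to `train_step` crosses the host boundary only once per update, however many simulator steps the episodes contain. That is the kind of saving the paragraph above describes.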

The main experiments are being run on Cloud TPUs via the TensorFlow Research Cloud (TFRC) program. They are supported by Google Warsaw’s Antonio Gulli, Ignacy Kowalczyk and Maciej Pytel, who are helping the team deploy its experiments on the Google Cloud Platform. TFRC provides ML researchers with access to second-generation Cloud TPUs, each of which provides 180 teraflops of machine learning acceleration.

Henryk Michalewski, a research team leader on the project, offered his appreciation. “Many thanks to Google for sharing early access to Cloud TPUs with us, as well as to Antonio Gulli and Ignacy Kowalczyk for providing the Google Cloud Platform power to deploy our experiments. With such a strong infrastructure, we’re perfectly equipped to tackle our ambitious goal and leverage the research on reinforcement learning efficiency we started last year.”

In 2017, Michalewski’s team published a paper describing a new method they had developed to train a robotic arm to grip a can of Coke. Their work was recognized by the General Chairs of the Conference on Robot Learning (CoRL) as one of 11 noteworthy papers in the reinforcement learning and robotics category. The team presented the paper in November at Google’s headquarters in Mountain View. The method could be used, for example, to train humanoid robots to combine single steps into a walk or a run.


