September brought us two interesting AI-related stories, both with a surprising social context.
Despite its enormous impact on our daily lives, Artificial Intelligence (AI) is often still regarded as too hermetic and obscure for ordinary people to understand. As a result, an increasing number of people use Natural Language Processing-powered personal assistants, yet only a tiny fraction try to understand how they work and how to use them effectively. For most users, these assistants remain something of a black box.
Making the field more comprehensible and accessible is one aspect of AI researchers’ mission. That’s why recent research from OpenAI is so interesting.
Hide-and-Seek – the reinforcement learning way
Reinforcement learning has delivered inspiring and breathtaking results. The technique is used to train the models behind autonomous cars and to control sophisticated devices such as automated arms and robots. Unlike a supervised learning model, a reinforcement learning model learns by interacting with its environment. The scientist can shape its behavior through a scheme of rewards and punishments, a mechanism close to the one humans use to learn.
Reinforcement learning has been used to create formidable agents that go toe-to-toe with human masters in chess, Go and StarCraft. Now OpenAI, the company behind the GPT-2 model and several other breakthroughs in AI, has created agents that play a version of hide-and-seek, that most basic and ageless of children’s games.
OpenAI researchers divided the agents into two teams, hiders and seekers, and placed them in a closed environment with walls and movable objects like boxes, walls and ramps. Either team could “lock” these items to make them unmovable for the opposing team. The teams developed a series of strategies and counter-strategies in a bid to successfully hide from or seek out the other team. The strategies included:
- Running – the first and least sophisticated ability, allowing the hiders to avoid the seekers.
- Blocking passages – the hider could block passages with a box in order to build a safe shelter.
- Using a ramp – to get past a wall or a box, the seekers learned to use a ramp to jump over the obstacle or climb onto a box and spot the hiders.
- Blocking the ramp – to prevent the seekers from using the ramp to climb the box, the hiders could block access to the ramp. This required a great deal of teamwork, which the researchers had not explicitly encouraged in any way.
- Box surfing – a strategy developed by the seekers, who were essentially exploiting a bug in the system. The seekers not only jumped onto a box using a ramp that had been blocked by the hiders, but also devised a way to move the box while standing on it.
- All-block – the ultimate hider-team teamwork strategy of blocking all the objects on the map and building a shelter.
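To make the idea of learning through rewards and punishments more concrete, here is a minimal sketch of tabular Q-learning, a classic textbook reinforcement learning algorithm. It is not the method OpenAI used for hide-and-seek, and everything in it is an assumption made for illustration: the toy corridor environment, the function names `train_q_table` and `greedy_policy`, the reward values and the hyperparameters. The agent starts at the left end of a short corridor and is rewarded only when it reaches the right end.

```python
import random


def train_q_table(n_states=5, episodes=500, alpha=0.5, gamma=0.9,
                  epsilon=0.1, seed=0):
    """Tabular Q-learning on a toy corridor (hypothetical environment).

    States are 0..n_states-1; the agent starts at 0 and the goal is the
    rightmost state. Returns a table q[state][action] with action 0 = left
    and action 1 = right.
    """
    rng = random.Random(seed)
    moves = (-1, +1)  # action 0 steps left, action 1 steps right
    q = [[0.0, 0.0] for _ in range(n_states)]
    goal = n_states - 1

    for _ in range(episodes):
        state = 0
        while state != goal:
            # Explore with probability epsilon, otherwise act greedily.
            if rng.random() < epsilon:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1

            # The environment responds: walls clamp the agent to the corridor.
            next_state = min(max(state + moves[action], 0), goal)

            # Reward for reaching the goal, a small punishment for any other step.
            reward = 1.0 if next_state == goal else -0.01

            # Q-learning update: nudge the estimate toward the observed reward
            # plus the discounted value of the best action in the next state.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Best known action for every non-goal state."""
    return ["left" if left > right else "right" for left, right in q[:-1]]


if __name__ == "__main__":
    print(greedy_policy(train_q_table()))
```

Nobody tells the agent that walking right is the correct behavior. It discovers this purely from the reward signal, in the same way the hiders and seekers discovered their strategies, although in a vastly simpler world with a single agent and no opponent.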