Key findings from the International Conference on Learning Representations (ICLR)

Trends and fads in machine learning – topics on the rise and in decline in ICLR submissions

October 24, 2019/in Blog posts, Machine learning /by Michał Kustosz and Błażej Osiński

ICLR (The International Conference on Learning Representations) is one of the most important international machine learning conferences. Its popularity is growing fast, putting it on a par with conferences such as ICML, NeurIPS and CVPR.

The 2020 conference is slated for April 26th, but the submission deadline has already come and gone: 2585 publicly available papers were submitted, about a thousand more than were featured at the 2019 conference.

The second law of paper-dynamics tells us that the number of submitted papers will reach 100,000 within 24 years. That's some serious growth!

We analyzed abstracts and keywords of all the ICLR papers submitted within the last three years to see what’s trending and what’s dying out. Brace yourselves! This year, 28% of the papers used or claimed to introduce state-of-the-art algorithms, so be prepared for a great deal of solid machine learning work!

“Deep learning” – have you heard about it?

To say you use deep learning in Computer Vision or Natural Language Processing is like saying fish live in water. Deep learning has revolutionized machine learning and become its underpinning. It's present in almost all fields of ML, including less obvious ones like time series analysis or demand forecasting. This may be why the number of references to deep learning in keywords actually fell – from 19% in ‘18 to just 11% in ‘20. It's just too obvious to acknowledge.

Deep learning

A revolution in network architecture?

One of the hottest topics this year turned out to be Graph Neural Networks. A GNN is a deep learning architecture for graph-structured data. These networks have proved tremendously helpful in applications in medicine, social network classification and modeling the behavior of dynamic interacting objects. The rise of GNNs is unprecedented, from 12 papers mentioning them in ‘18 to 111 in ‘20!
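To give a flavor of what such architectures do, here is a minimal, generic message-passing layer in PyTorch – an illustration of the idea only, not any particular ICLR submission. Each node aggregates its neighbors' feature vectors through the adjacency matrix and passes the result through a learned transformation:

import torch
import torch.nn as nn

class SimpleGraphConv(nn.Module):
    """A minimal message-passing layer: aggregate neighbor features, then transform."""
    def __init__(self, in_features, out_features):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)

    def forward(self, node_features, adjacency):
        # node_features: (num_nodes, in_features), adjacency: (num_nodes, num_nodes)
        adjacency = adjacency + torch.eye(adjacency.size(0))   # add self-loops
        degree = adjacency.sum(dim=1, keepdim=True)            # normalize by node degree
        aggregated = (adjacency @ node_features) / degree
        return torch.relu(self.linear(aggregated))

# Toy graph: 4 nodes with 3 features each.
adj = torch.tensor([[0., 1., 0., 0.],
                    [1., 0., 1., 1.],
                    [0., 1., 0., 0.],
                    [0., 1., 0., 0.]])
x = torch.randn(4, 3)
layer = SimpleGraphConv(3, 8)
print(layer(x, adj).shape)  # torch.Size([4, 8])

Stacking a few such layers lets information flow across the graph, which is what makes GNNs useful for the applications mentioned above.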

graph neural network

All Quiet on the GAN Front

The next topic has been extremely popular in recent years. But what has been called ‘the coolest idea in machine learning in the last twenty years’ has quickly become heavily exploited. Generative Adversarial Networks can learn to mimic any distribution of data, creating impressive, never-before-seen artificial images. Yet they are on the decline, despite being prevalent in the media (deep fakes).
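For readers new to the idea, the adversarial setup itself is compact: a generator turns noise into samples while a discriminator learns to tell real from fake. Below is a toy sketch of that training loop on one-dimensional data – an illustration only, far from a production image GAN:

import torch
import torch.nn as nn

# Toy task: learn to mimic samples drawn from a normal distribution centered at 4.
def real_samples(n):
    return 4 + 1.5 * torch.randn(n, 1)

generator = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))
discriminator = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1), nn.Sigmoid())
g_opt = torch.optim.Adam(generator.parameters(), lr=1e-3)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=1e-3)
bce = nn.BCELoss()

for step in range(2000):
    real = real_samples(64)
    fake = generator(torch.randn(64, 8))

    # The discriminator learns to label real samples 1 and generated samples 0.
    d_loss = bce(discriminator(real), torch.ones(64, 1)) + \
             bce(discriminator(fake.detach()), torch.zeros(64, 1))
    d_opt.zero_grad()
    d_loss.backward()
    d_opt.step()

    # The generator learns to make the discriminator output 1 for its fakes.
    g_loss = bce(discriminator(fake), torch.ones(64, 1))
    g_opt.zero_grad()
    g_loss.backward()
    g_opt.step()

# After training, generated samples should cluster around 4.
print(generator(torch.randn(1000, 8)).mean().item())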

generative adversarial networks

Leave designing your machine learning to… machines

Finding the right architecture for your neural network can be a pain in the neck. Fear not, though: Neural Architecture Search (NAS) will save you. NAS is a method of building network architecture automatically rather than handcrafting it. It has been used in several state-of-the-art algorithms improving image classification, object detection or segmentation models. The number of papers on NAS increased from a mere five in ‘18 to 47 in ‘20!

neural architecture

Reinforcement learning – keeping stable

The percentage of papers on reinforcement learning has remained more or less constant. Interest in the topic remains significant – autonomous vehicles, AlphaStar’s success in playing StarCraft, and advances in robotics were all widely discussed this year. RL is a stable branch of machine learning, and for good reason: future progress is widely anticipated.

reinforcement learning

What’s next?

That was just a sample of machine learning trends. What will be on top next year? Even the deepest neural network cannot predict it. But interest in machine learning is still on the rise, and the researchers are nothing if not creative. We shouldn’t be surprised to hear about groundbreaking discoveries next year and a 180-degree change in the trends.

To see a full analysis of the trends in papers across the last three conferences, click the image below:

ICLR most popular keywords papers

 

AI Monthly Digest #13 – an unexpected twist for the stock image market

October 7, 2019/in Blog posts, Machine learning, AI Monthly Digest /by Konrad Budek and Arkadiusz Nowaczynski

September brought us two interesting AI-related stories, both with a surprising social context.

Despite its enormous impact on our daily lives, Artificial Intelligence (AI) is often still regarded as too hermetic and obscure for ordinary people to understand. As a result, an increasing number of people use Natural Language Processing-powered personal assistants, yet only a tiny fraction try to understand how they work and how to use them effectively. This makes them somewhat of a black box.

Making the field more comprehensible and accessible is one aspect of  AI researchers’ mission. That’s why research recently done by OpenAI is so interesting.

Hide-and-Seek – the reinforcement learning way

Reinforcement learning has delivered inspiring and breathtaking results. The technique is used to train the models behind autonomous cars and to control sophisticated devices like automated arms and robots.

Unlike in supervised learning, a reinforcement learning model learns by interacting with the environment. The scientist can shape its behavior by applying a policy of rewards and punishments. The mechanism is close to that which humans use to learn.
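As a bare-bones illustration of that loop – a toy, made-up one-dimensional environment with tabular Q-learning, not the setup used in the research described below – rewards and punishments gradually shape which action the agent prefers in each state:

import random

class ToyEnvironment:
    """A made-up 1-D world: start at 0, reach +5 for a reward, hit -5 and fail."""
    def reset(self):
        self.position = 0
        return self.position

    def step(self, action):  # action: -1 (step left) or +1 (step right)
        self.position += action
        done = abs(self.position) >= 5
        reward = 1.0 if self.position >= 5 else (-1.0 if done else 0.0)
        return self.position, reward, done

q = {}                                # estimated value of each (state, action) pair
alpha, gamma, epsilon = 0.5, 0.9, 0.1
env = ToyEnvironment()

for episode in range(500):
    state, done = env.reset(), False
    while not done:
        # Mostly exploit current estimates, sometimes explore.
        if random.random() < epsilon:
            action = random.choice([-1, 1])
        else:
            action = max([-1, 1], key=lambda a: q.get((state, a), 0.0))
        next_state, reward, done = env.step(action)
        # Q-learning update: the reward (or punishment) nudges the value estimate.
        best_next = max(q.get((next_state, a), 0.0) for a in [-1, 1])
        target = reward + (0.0 if done else gamma * best_next)
        old = q.get((state, action), 0.0)
        q[(state, action)] = old + alpha * (target - old)
        state = next_state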

Reinforcement learning has been used to create superhuman agents that go toe-to-toe against human masters in chess, Go and StarCraft. Now OpenAI, the company behind the GPT-2 model and several other breakthroughs in AI, has created agents that play a version of hide-and-seek, that most basic and ageless of children’s games.

OpenAI researchers divided the agents into two teams, hiders and seekers, and provided them with a closed environment containing walls and movable objects like boxes and ramps. Either team could “lock” these items to make them unmovable for the opposing team. The teams developed a set of strategies and counter-strategies in a bid to successfully hide from or seek out the other team. The strategies included:

  • Running – the first and least sophisticated ability, enabling one to avoid the seekers.
  • Blocking passages – the hider could block passages with a box in order to build a safe shelter.
  • Using a ramp – to get over a wall or a box, the seeker team learned to use a ramp to jump over an obstacle or climb onto a box and spot the hider.
  • Blocking the ramp – to prevent the seekers from using the ramp to climb the box, the hiders could block access to the ramp. The process required a great deal of teamwork, which was not supported by the researchers in any way.
  • Box surfing – a strategy developed by seekers who were basically exploiting a bug in the system. The seekers not only jumped on a box using a ramp that had been blocked by the hiders, but also devised a way to move it while standing on it.
  • All-block – the ultimate hider-team teamwork strategy of blocking all the objects on the map and building a shelter.

The research delivered, among other benefits, a mesmerizing visual of little agents running around.

Why does it matter?

The research itself is neither groundbreaking nor breathtaking. From a scientific and developmental point of view, it looks like little more than elaborate fun. Yet it would be unwise to consider the project insignificant.

AI is still considered a hermetic and difficult field. Showing the results of training in the form of friendly, entertaining animations is a way to educate society on the significance of modern AI research.

Also, the animations give journalists something inspiring to write about and may lead young people to take an interest in AI-related career paths. So while the research has brought little if any new knowledge, it could well end up spreading knowledge of what we already know.

AI-generated stock photos available for free

Generative Adversarial Networks have proved to be insanely effective at delivering convincing images not only of hamburgers and dogs, but also of human faces. The pace of progress is breathtaking indeed. Not even a year ago, the eerie “first AI-generated portrait” was sold at auction for nearly half a million dollars.

Now, generating faces of non-existent people is as easy as generating any other fake image – a cat, hamburger or landscape. To prove that the technology works, the team behind the 100K faces project delivered a hundred thousand AI-generated faces to use for any stock purpose, from business brochures to flyers to presentations. Future use cases could include on-the-go image generators that, powered by a demand forecasting tool, provide the image that best suits demand.

More information on the project can be found on the team’s Medium page.

Why does it matter?

The images added to the free images bank are not perfect. With visible flaws in a model’s hair, teeth or eyes, some are indeed far from it. But that’s nothing a skilled graphic designer can’t handle. Also, there are multiple images that look nearly perfect – especially when there are no teeth visible in the smile.

Many photos are good enough to provide a stock photo as a “virtual assistant” image or to fulfill any need for a random face. This is an early sign that professional models and photographers will see the impact of AI in their daily work sooner than expected.

5 examples of the versatility of computer vision algorithms and applications

July 25, 2019/in Blog posts, Machine learning /by Konrad Budek

Computer vision enables machines to perform once-unimaginable tasks like diagnosing diabetic retinopathy as accurately as a trained physician or supporting engineers by automating their daily work. 

Recent advances in computer vision are providing data scientists with tools to automate an ever-wider range of tasks. Yet companies sometimes don’t know how best to employ machine learning in their particular niche. The most common problem is understanding how a machine learning model will perform its task differently than a human would.

What is computer vision?

Computer vision is an interdisciplinary field that enables computers to understand, process and analyze images. The algorithms it uses can process both videos and static images. Practitioners strive to deliver a computer version of human sight while reaping the benefits of automation and digitization. Sub-disciplines of computer vision include object recognition, anomaly detection, and image restoration. While modern computer vision systems rely first and foremost on machine learning, there are also trigger-based solutions for performing simple tasks.

The following case studies show computer vision in action.

Diagnosing diabetic retinopathy

Diagnosing diabetic retinopathy usually takes a skilled ophthalmologist. Obesity is on the rise globally, and with it the threat of diabetes. As the World Bank indicates, obesity is a threat to world development – among Latin America’s countries, only Haiti has an average adult Body Mass Index below 25 (the upper limit of the healthy weight range). With rising obesity comes a higher risk of diabetes – obesity is believed to account for 80-85% of the risk of developing type 2 diabetes. The result is a skyrocketing need for proper diagnostics.

What is the difference between these two images?


The one on the left has no signs of diabetic retinopathy, while the other one has severe signs of it.

By applying algorithms to analyze digital images of the retina, deepsense.ai delivered a system that diagnosed diabetic retinopathy with the accuracy of a trained human expert. The key was in training the model on a large dataset of healthy and non-healthy retinas.

AI movie restoration

The algorithms trained to find the difference between healthy and diseased retinas are equally capable of spotting blemishes on old movies and making the classics shine again.

Recorded on celluloid film, old movies are endangered by two factors: the fading technology needed to play the tapes and the nature of the tape itself, which degrades with age. Moreover, digitizing a movie is no guarantee of flawlessness, as the process itself introduces new damage.

However, when trained on two versions of a movie – one with digital noise and one that is perfect – the model learns to spot the disturbances and remove them during the AI movie restoration process.
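A minimal sketch of that idea – an illustrative toy with a made-up damage function, not deepsense.ai's production model – pairs artificially damaged frames with their clean originals and trains a small convolutional network to undo the damage:

import torch
import torch.nn as nn

# A small fully convolutional network: damaged frame in, restored frame out.
restorer = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 32, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 3, kernel_size=3, padding=1),
)
optimizer = torch.optim.Adam(restorer.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

def add_synthetic_damage(frames):
    """Simulate noise and scratches; real pipelines use richer degradations."""
    noisy = frames + 0.1 * torch.randn_like(frames)
    noisy[:, :, :, ::16] = 1.0          # a white vertical "scratch" every 16 columns
    return noisy.clamp(0, 1)

for step in range(100):
    clean = torch.rand(8, 3, 64, 64)    # stand-in for a batch of clean frames
    damaged = add_synthetic_damage(clean)
    loss = loss_fn(restorer(damaged), clean)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()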

Digitizing industrial installation documentation

Another example of the push towards digitization comes via industrial installation documentation. Like films, this documentation is riddled with inconsistencies in the symbols used, which can get lost in the myriad of lines and annotations that end up on the drawings and must be made sense of by humans. Digitizing a document that takes a skilled engineer up to ten hours of painstaking work can be reduced to a mere 30 minutes thanks to machine learning.

Building digital maps from satellite images

Despite their seeming similarities, satellite images and fully-functional maps that deliver actionable information are two different things. The differences are never as clear as during a natural disaster such as a flood or hurricane, which can quickly, if temporarily, render maps irrelevant.

deepsense.ai has also used image recognition technology to develop a solution that instantly turns satellite images into maps, replete with roads, buildings, trees and the countless obstacles that emerge during a crisis situation. The model architecture we used to create the maps is similar to those used to diagnose diabetic retinopathy or restore movies.

Check out the demo:


Click below to play with our demo:

https://sat-segmentation.demo.deepsense.ai/

Aerial image recognition

Computer vision techniques can work as well on aerial images as they do on satellite images. deepsense.ai delivered a computer vision system that supports the US NOAA in recognizing individual North Atlantic Right whales from aerial images.

With only about 411 whales alive, the species is highly endangered, so it is crucial that each individual be recognizable so its well-being can be reliably tracked. Before deepsense.ai delivered its AI-based system, identification was handled manually using a catalog of the whales. Tracking whales from aircraft above the ocean is monumentally difficult as the whales dive and rise to the surface, the telltale patterns on their heads obscured by rough seas and other forces of nature.

Bounding box produced by the head localizer

These obstacles made the process both time-consuming and prone to error. deepsense.ai delivered an aerial image recognition solution that improves identification accuracy and takes a mere 2% of the time the NOAA once spent on manual tracking.

The deepsense.ai takeaway

As the above examples show, computer vision is today an essential component of numerous AI-based solutions. When combined with natural language processing, it can be used to read the ingredients from product labels and automatically sort them into categories. Alongside reinforcement learning, computer vision powers today’s groundbreaking autonomous vehicles. It can also support demand forecasting and function as a part of an end-to-end machine learning manufacturing support system.

The key difference between human vision and computer vision is the domain of knowledge behind data processing. Machines find no difference in the type of image data they process, be it images of retinas, satellite images or documentation – the key is in providing enough training data to allow the model to spot if a given case fits the pattern. The domain is usually irrelevant.


AI movie restoration – Scarlett O’Hara HD

July 18, 2019/in Blog posts, Machine learning /by Konrad Budek

With convolutional neural networks and state-of-the-art image recognition techniques it is possible to make old movie classics shine again. Neural networks polish the image, reduce the noise and apply colors to the aged images. 

The first movies were created in the late nineteenth century with celluloid photographic film used in conjunction with motion picture cameras.

Skip ahead to 2018, when the global movie industry was worth $41.7 billion. Serving entertainment, cultural and social purposes, films are a hugely important heritage to protect. And that’s not always easy, especially considering that modern movies are produced and screened digitally, with the technology of celluloid tape fading into obsolescence.

Challenges in film preservation

The challenge and importance of preserving the cultural heritage of old movies has been underscored by numerous organizations, including the European Commission, which noted that a lack of proper devices to play aging technology on could make it impossible to watch old films.

In deepsense.ai’s experience with restoring film, the first challenge is to remove distortions. Classics are usually recorded in low resolution while the original tapes are obviously aged and filled with noise and cracks. Also, the transition process from celluloid tape to digital format usually damages the material and results in the loss of quality.

By using AI-driven solutions, specifically supervised learning techniques, deepsense.ai’s team removed the cracks and black spots from the digitized version of a film. The model we produced uses deep neural networks trained on a movie with cracks and flaws added manually for training purposes. With films available in both original and artificially damaged form, the system learned to remove the flaws. An example of generated noise applied to the classic Polish movie “Rejs”, along with the neural network’s output, is displayed below.

The example clearly shows that our neural network can process and restore even thoroughly damaged source material and make it shine again. The network starts to produce low-quality predictions only when the images are so darkened and blurred that the human eye can barely recognize people in the film.

How to convert really old movies into HD

A similar training technique was applied to deliver a neural network used to improve the quality of an old movie. The goal was to deliver missing details and “pump up” the resolution from antiquated to HD quality.

The key challenge lay in reproducing the details, which was nearly impossible. Due to technological development, people find it difficult to watch video of lower quality than what they are used to.

The model was trained by downscaling an HD movie and then conducting a supervised training to deliver the missing details.

Move your mouse cursor over the image to see the difference.


The model performs well thanks to the wide availability of training data. The team could downscale the resolution of any movie, provide the model with the original version and let the neural network learn how to forge and inject the missing detail into the film.
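A sketch of how such training pairs can be produced – purely illustrative, with assumed tensor shapes – is to downscale each original frame and keep the original as the target:

import torch
import torch.nn.functional as F

def make_training_pair(hd_frame, factor=4):
    """hd_frame: (1, 3, H, W) tensor in [0, 1]; returns (degraded input, original target)."""
    # Downscale, then upscale back to the original size: fine details are irreversibly lost.
    low_res = F.interpolate(hd_frame, scale_factor=1 / factor,
                            mode='bilinear', align_corners=False)
    degraded = F.interpolate(low_res, size=hd_frame.shape[-2:],
                             mode='bilinear', align_corners=False)
    return degraded, hd_frame

frame = torch.rand(1, 3, 256, 256)
blurred, target = make_training_pair(frame)
# A super-resolution network is then trained to map `blurred` back toward `target`.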

A key misconception about delivering HD versions of old movies is that the neural network will recover the missing details from the original. In fact, there is no way to reclaim lost details, because they were never on the originally registered material. The neural network produces them on the fly, with the same techniques that Thispersondoesnotexist and similar Generative Adversarial Network-based tools use.

So, the source material is enriched with details that only resemble reality, but are in fact not real ones. This can be a challenge (or a problem) if the material is to be used for forensic purposes or detailed research. But when it comes to delivering the movies for entertainment or cultural ends, the technique is more than enough.

Coloring old movies

Another challenge comes with producing color versions of movie classics, technically reviving them for newer audiences. The process was long handled by artists applying color to every frame. The first film colored this way was the British silent movie “The Miracle” (1912).

Because there are countless color movies to draw on, providing a rich training set, a deep neural network can vastly reduce the time required to revive black and white classics. Yet the process is not fully automatic. In fact, putting color on the black and white movie is a titanic undertaking. Consider Disney’s “Tron,” which was shot in black and white and then colored by 200 inkers and painters from Taiwan-based Cuckoo’s Nest Studio.

When choosing colors, a neural network tends to play it safe. An example of how this can be problematic would be when the network misinterprets water as a field of grass. It would do that because it is likely more common for fields than for lakes to appear as a backdrop in a film. 

By manually applying colored pixels to single frames, an artist can suggest what colors the AI model should choose.

There is no way to determine the real color of a scarf or a shirt an actor or actress was wearing when a film rendered in black and white was shot. After all these years, does it even matter? In any case, neural networks employ the LAB color standard, leveraging lightness (L) to predict the two remaining channels (A and B respectively).
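In code, that amounts to feeding the network a single L channel and asking it for the two missing ones. A minimal sketch (assuming the RGB-to-LAB conversion is done beforehand, for instance with scikit-image's rgb2lab; real colorization networks are far larger):

import torch
import torch.nn as nn

# Input: the lightness (L) channel only; output: the predicted A and B channels.
colorizer = nn.Sequential(
    nn.Conv2d(1, 32, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 32, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 2, kernel_size=3, padding=1),
)

lightness = torch.rand(4, 1, 128, 128)     # stand-in for L channels of 4 frames
predicted_ab = colorizer(lightness)        # shape: (4, 2, 128, 128)

# Training minimizes the distance between predicted and true A/B channels.
true_ab = torch.rand(4, 2, 128, 128)
loss = nn.MSELoss()(predicted_ab, true_ab)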

Transcription and face recognition

Last but not least, transcribing dialogue makes analysis and research much easier – be it for linguistic or cultural studies purposes. With facial recognition software, the solution can attribute all of the lines delivered to the proper characters.

The speech-to-text function processes the sound and transcribes the dialogue while the other network checks which of the people in the video moves his or her lips. When combined with image recognition, the model can both synchronize the subtitles and provide the name of a character or actor speaking.

While the output still needs to be reviewed by a human, the approach vastly reduces the time required for transcription. Done traditionally, transcription takes at least the duration of the recording and then needs to be validated. The machine transcribes an hour-long movie in a few seconds.

Summary

Using machine learning-based techniques to restore movies takes less time and effort than other methods. It also makes efforts to preserve our cultural heritage more successful and ensures films remain relevant. Machine learning in business gets huge recognition, but ML-based techniques remain a novel way to serve the needs of culture and art. deepsense.ai’s work has proven that AI in art can serve multiple purposes, including promotion and education. Maybe using it in art and culture will be one of 2020’s AI trends.

Reviving and digitizing classics improves access to cultural goods and ensures those works remain available, so future generations will, thanks to AI, enjoy the Academy Award-winning movies of the past as much as, if not more than, we do now.

Outsmarting failure. Predictive maintenance powered by machine learning

November 13, 2018/in Blog posts, Data science, Machine learning /by Konrad Budek

Since the days of the coal-powered industrial revolution, manufacturing has become machine-dependent. As the fourth industrial revolution approaches, factories can harness the power of machine learning to reduce maintenance costs.

The internet of things (IoT) is nothing new for industry. The number of cellular-enabled factory automation devices reached 270 000 worldwide in 2012 and is expected to rise to a staggering 820 000 in 2018. Machines are present in every stage of the production process, from assembly to shipment. Although automation makes industry more efficient, with rising complexity it also becomes more vulnerable to breakdowns, as service is both time-consuming and expensive.

Four levels of predictive maintenance

According to PricewaterhouseCoopers, there are four levels of predictive maintenance.

1. Visual inspection, where the output is entirely based on the inspector’s knowledge and intuition
2. Instrument inspection, where conclusions are a combination of the specialist’s experience and the instrument’s read-outs
3. Real-time condition monitoring that is based on constant monitoring with IoT and alerts triggered by predefined conditions
4. AI-based predictive analytics, where the analysis is performed by self-learning algorithms that continuously tweak themselves to the changing conditions

As the study indicates, a good number of the companies surveyed by PwC (36%) are now on level 2 while more than a quarter (27%) are on level 1. Only 22% had reached level 3 and 11% level 4, which is basically level 3 on machine learning steroids. The PwC report states that only 3% use no predictive maintenance at all.

Staying on track

According to the PwC data, the rail sector is the most advanced sector of those surveyed with 42% of companies at level 4, compared to 11% overall.

One of the most prominent examples is Infrabel, the state-owned Belgian company, which owns, builds, upgrades and operates a railway network which it makes available to privately-owned transportation companies. The company spends more than a billion euro annually to maintain and develop its infrastructure, which contains over 3 600 kilometers of railway and some 12 000 civil infrastructure works like crossings, bridges, and tunnels. The network is used by 4 200 trains every day, transporting both cargo and passengers.


The company faces both technical and structural challenges. Among them is its aging technical staff, which is shrinking.

At the same time, the density of railroad traffic is increasing – the number of daily passengers has risen by 50% since 2000, reaching 800 000. What’s more, the growing popularity of high-speed trains is exerting ever greater strain on the rails and other infrastructure.

To face these challenges, the company has implemented monitoring tools, such as sensors for monitoring overheating tracks, cameras which inspect the pantographs and meters to detect drifts in power consumption, which usually occur before mechanical failures in switches. All of the data is collected and analyzed by a single tool designed to apply predictive maintenance. Machine learning models are a component of that tool.

As sounding brass

Mueller Industries (Memphis, Tennessee) is a global manufacturer and distributor of copper, brass, aluminum and plastic products. The predictive maintenance solution the company uses is based on sound analysis. Every machine can be characterized by the sound it makes, and any change in that sound may be a sign of impending malfunction. The machine’s sound and vibrations are analyzed in real time by a cloud-based machine learning solution that seeks patterns in the gathered data.

Both the amount and the nature of the data collected render it impossible for a human to analyze, but a machine learning-powered AI solution handles it with ease. The devices gather data from ultrasonic and vibration sensors and analyze them in real time. Unlike experience-based analytics, using the devices requires little to no training and can be done on the go.
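A rough sketch of that kind of pipeline – a generic illustration on synthetic signals, not Mueller Industries' actual system – is to summarize each recording with a few spectral features and flag recordings that look unlike the machine's normal sound:

import numpy as np
from sklearn.ensemble import IsolationForest

def spectral_features(signal):
    """Summarize a vibration/sound clip by the energy in a few coarse frequency bands."""
    spectrum = np.abs(np.fft.rfft(signal))
    return np.array([band.mean() for band in np.array_split(spectrum, 8)])

# Recordings of a healthy machine (here just synthetic sine waves plus noise).
rng = np.random.default_rng(0)
t = np.linspace(0, 1, 10_000)
healthy = [np.sin(2 * np.pi * 50 * t) + 0.1 * rng.normal(size=t.size) for _ in range(200)]

detector = IsolationForest(contamination=0.01, random_state=0)
detector.fit([spectral_features(s) for s in healthy])

# A new clip with an unexpected high-frequency component should be flagged as -1 (anomaly).
suspect = np.sin(2 * np.pi * 50 * t) + 0.5 * np.sin(2 * np.pi * 900 * t)
print(detector.predict([spectral_features(suspect)]))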

Endless possibilities

With the power of machine learning enlisted, handling the tremendous amounts of data generated by the sensors in modern factories becomes a much easier task. It allows companies to detect failures before they paralyze production, thus saving time and money. What’s more, the data gathered can be used to further optimize performance, including by searching for bottlenecks and managing workflows.

That’s why 98% of industrial companies expect to increase efficiency with digital technologies.

A comprehensive guide to demand forecasting

May 28, 2019/in Blog posts, Data science, Machine learning, Popular posts /by Konrad Budek and Piotr Tarasiewicz

Everything you need to know about demand forecasting – from the purpose and techniques to the goals and pitfalls to avoid.

Essential since the dawn of commerce and business, demand forecasting enters a new era of big-data rocket fuel.

What is demand forecasting?

The term couldn’t be clearer: demand forecasting forecasts demand. The process of predicting the future involves processing historical data to estimate the demand for a product. An accurate forecast can bring significant improvements to supply chain management, profit margins, cash flow and risk assessment.

What is the purpose of demand forecasting?

Demand forecasting is done to optimize processes, reduce costs and avoid losses caused by freezing up cash in stock or being unable to process orders due to being out of stock. In an ideal world, the company would be able to satisfy demand without overstocking.

Demand forecasting techniques

Demand forecasting is an essential component of every form of commerce, be it retail, wholesale, online, offline or multichannel. It has been present since the very dawn of civilization when intuition and experience were used to forecast demand.


More recent techniques combine intuition with historical data. Modern merchants can dig into their data in a search for trends and patterns. At the pinnacle of these techniques are machine learning models for demand forecasting, including gradient boosting and neural networks, which are currently the most popular approaches and outperform classic statistics-based methods.

The basis of more recent demand forecasting techniques is historical data from transactions. These are data that sellers collect and store for fiscal and legal reasons. Because they are also searchable, these data are the easiest to use.
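As an illustrative baseline on synthetic data – not Sybilla itself, and with column names and features chosen purely for the example – the usual recipe is to aggregate sales per day, derive lag and calendar features, and fit a gradient boosting regressor:

import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor

# Toy transaction history: daily units sold for a single product, busier on weekends.
dates = pd.date_range("2018-01-01", periods=500, freq="D")
rng = np.random.default_rng(0)
units = 20 + 5 * (dates.dayofweek >= 5) + rng.poisson(3, size=len(dates))
history = pd.DataFrame({"date": dates, "units": units})

# Feature engineering: calendar features plus lagged demand.
history["dayofweek"] = history["date"].dt.dayofweek
history["month"] = history["date"].dt.month
for lag in (1, 7, 14):
    history[f"lag_{lag}"] = history["units"].shift(lag)
history = history.dropna()

features = ["dayofweek", "month", "lag_1", "lag_7", "lag_14"]
train, test = history.iloc[:-30], history.iloc[-30:]

model = GradientBoostingRegressor(random_state=0)
model.fit(train[features], train["units"])
forecast = model.predict(test[features])
print("MAE on the last 30 days:", np.abs(forecast - test["units"]).mean())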


How to choose the right demand forecasting method – indicators

As always, selecting the right technique depends on various factors, including:

  • The scale of operations – the larger the scale, the more challenging processing the data becomes.
  • The organization’s readiness – even large companies can operate (efficiency aside) on fragmented and messy databases, so the technological and organizational readiness to apply more sophisticated demand forecasting techniques is another challenge.
  • The product – it is easier to forecast demand for an existing product than for a newly introduced one. When considering the latter, it is crucial to form a set of assumptions to work from. Gathering as much information about the product as possible is the first step, as it allows the company to spot the similarities between particular goods and search for correlations in the buying patterns. Spotting an accessory that is frequently bought along with the main product is one example.

How demand forecasting can help a business

Demand forecasting, and the sales forecasting that follows from it, is crucial to shaping a company’s logistics policy and preparing it for the immediate future. Among the main advantages of demand forecasting are:

  • Loss reduction – any demand that was not fulfilled should be considered a loss. Moreover, the company freezes its cash in stock, thus reducing liquidity.
  • Supply chain optimization – behind every shop there is an elaborate logistics chain that generates costs and needs to be managed. The bigger the organization, the more sophisticated and complicated its inventory management must be. When demand is forecast precisely, managing and estimating costs is easier.
  • Increased customer satisfaction – there is no bigger disappointment for consumers than going to the store to buy something only to return empty-handed. For a business, the worst-case scenario is for said consumers to swing over to the competition to make their purchase there. Companies reduce the risk of running out of stock–and losing customers–by making more accurate predictions.
  • Smarter workforce management – hiring temporary staff to support a demand peak is a smart way for a business to ensure it is delivering a proper level of service.
  • Better marketing and sales management – depending on the upcoming demand for particular goods, sales and marketing teams can shift their efforts to support cross- and upselling of complementary products.
  • Supporting expert knowledge – models can be designed to build predictions for every single product, regardless of how many there are. In small businesses, humans handle all predictions, but when the scale of the business and the number of goods rises, this becomes impossible. Machine learning models extend expert knowledge and are proficient at big data processing.

How to start demand forecasting – a short guide

Building a demand forecasting tool or solution requires, first and foremost, data to be gathered.

While the data will eventually need to be organized, simply procuring it is a good first step. It is easier to structure and organize data and make them actionable than to collect enough data fast. The situation is much easier when the company employs an ERP or CRM system, or some other form of automation, in their daily work. Such systems can significantly ease the data gathering process and automate the structuring.


The next step is building testing scenarios that allow the company to test various approaches and their impact on business efficiency. The first solution is usually a simple one, and is a good benchmark for solutions to come. Each subsequent iteration should be tested to see whether it performs better than the previous one.

Historical data is usually all one needs to launch a demand forecasting project; obviously, there is significantly less data about the future. But sometimes it is available, for example:

  • Short-term weather forecasts – the information about upcoming shifts in weather can be crucial in many businesses, including HoReCa and retail. It is quite intuitive to cross-sell sunglasses or ice cream on sunny days.
  • The calendar – Black Friday is a day like no other. The same goes for the upcoming holiday season or other events that are tied to a given date.

Sources of data that originate from outside the company make predictions even more accurate and provide better support for making business decisions.

Common pitfalls to avoid when building a demand forecasting solution

There are numerous pitfalls to avoid when building a demand forecasting solution. The most common of them include:

  • Data not connected with marketing and ad history – a successful promotion results in a significant change in the data, so having information about why it was a success makes predictions more accurate. Without that information, a machine learning model could misattribute the changes and make false predictions based on wrong assumptions.
  • New products with no history – when new products are introduced, demand must still be estimated, but without the help of historical data. The good news here is that great strides have been made in this area, and techniques such as product DNA can help a company uncover similar products in its past or current portfolio. Having data on similar products can boost the accuracy of predictions for new products.
  • The inability to predict the weather – weather drives demand in numerous contexts and product areas and can sometimes be even more important than the price of a product itself! (yes, classical economists would be very upset). The good news is that even if you are unable to predict the weather, you can still use it in your model to explain historical variations in demand.
  • Lacking information about changes – in an effort to support both short- and long-term goals, companies constantly change their offering and websites. When information about those changes is not annotated in the data, the model encounters sudden dips and shifts in demand with no apparent reason. In reality, the cause is usually a minor change like reorganizing the inventory or removing a section from the website.
  • Inconsistent portfolio information – predictions can be done only if the data set is consistent. If any of the goods in a portfolio have undergone a name or ID change, it must be noted in order not to confuse the system or miss out on a valuable insight.
  • Overfitting the model – a vicious problem in data science. A model is so good at working on the training dataset that it becomes inflexible and produces worse predictions when new data is delivered. Avoiding overfitting is down to the data scientists.
  • Inflexible logistics chain – the more flexible the logistics process is, the better and more accurate the predictions will be. Even the best demand forecasting model is useless when the company’s logistics is a fixed process that allows no space for changes.


Summary

Demand and sales forecasting is a crucial part of any business. Traditionally it has been done by experts, based on know-how honed through experience. With the power of machine learning it is now possible to combine the astonishing scale of big data with the precision and cunning of a machine-learning model. While the business community must remain aware of the multiple pitfalls it will face when employing machine learning to predict demand, there is no doubt that it will endow demand forecasting with awesome power and flexibility.

AI Monthly Digest #8 – new AI applications for music and gaming

May 9, 2019/in Blog posts, Machine learning, AI Monthly Digest /by Konrad Budek and Arkadiusz Nowaczynski

The April edition of AI Monthly Digest looks at how AI is used in entertainment, for both research and commercial purposes.

After its recent shift from non-profit to for-profit, OpenAI continues to build a significant presence in the world of AI research. It is involved in two of five stories chosen as April’s most significant.

AI Music – spot the discord…

While machine learning algorithms are getting increasingly better at delivering convincing text or gaining superior accuracy in image recognition, machines struggle to understand the complicated patterns behind music. In its most basic form, music is built upon repetitive motifs that return across sections of various lengths – a recurring part of one song or the leading theme of an entire movie, opera or computer game.

Machine learning-driven composing is comparable to natural language processing – the short parts are done well but the computer gets lost when it comes to keeping the integrity of the longer ones. April brought us two interesting stories regarding different approaches to ML-driven composition.

OpenAI developed MuseNet, a neural network that produces music in a few different styles. Machine learning algorithms were used to analyze the style of various classical composers, including Chopin, Bach, Beethoven and Rachmaninoff. The model was further fed rock songs by Queen, Green Day and Nine Inch Nails and pop music by Madonna, Adele and Ricky Martin, to name a few. The model learned to mimic the style of a particular artist and infuse it with twists. If the user wants to spice up the Moonlight Sonata with a drum, the road is open.

OpenAI has rolled out an early version of the model and it performs better when the user is trying to produce a consistent piece of music, rather than pair up a disparate coupling of Chopin and Nine Inch Nails-style synthesizers.

OpenAI claims that music is a great tool with which to evaluate a model’s ability to maintain long-term consistency, mainly thanks to how easy it is to spot discord.

…or embrace it

While OpenAI embraces harmony in music, Dadabots has taken the opposite tack. Developed by CJ Carr and Zack Zukowski, the Dadabots model imitates rock, particularly metal bands. The team has put their model on YouTube to deliver technical death metal as an endless live stream – the Relentless Doppelganger.

While it is increasingly common to find AI-generated music on Bandcamp, putting a 24/7 death metal stream on YouTube is undoubtedly something new.

Fans of the AI-composed death metal have given the music rave reviews. As The Verge notes, the creation is “Perfectly imperfect” thanks to its blending of various death metal styles, transforming vocals into a choir and delivering sudden style-switching.

It appears that bare-metal has ushered in a new era in technical death metal.

Why does it matter?

Researchers behind the Relentless Doppelganger remark that developing music-making AI has mainly been based on classical music, which is heavily reliant on harmony, while death metal, among others, embraces the power of chaos. It stands to reason, then, that the music generated is not perfect when it comes to delivering harmony. The effect is actually more consistent with the genre’s overall sound. What’s more, Dadabots’ model delivers not only instrumentals, but also vocals, which would be unthinkable with classical music. Of course, the special style of metal singing called growl makes most of the lyrics incomprehensible, so little to no sense is actually required here.


From a scientific point of view, OpenAI delivers much more significant work. But AI is working its way into all human activity, including politics, social problems and policy and art. From an artistic point of view, AI-produced technical death metal is interesting.

It appears that when it comes to music, AI likes it brutal.

AI in gaming goes mainstream

Game development has a long and uneasy tradition of delivering computer players to allow users to play in single-player mode. There are many forms of non-ML-based AI present in video games. They are usually based on a set of triggers that initiate a particular action the computer player takes. What’s more, modern, story-driven games rely heavily on scripted events like ambushes or sudden plot twists.

This type of AI delivers an enjoyable level of challenge but lacks the versatility and viciousness of human players coming up with surprising strategies to deal with. Also, the goal of AI in single-player mode is not to dominate the human player in every way possible.


The real challenge in all of this comes from developing bots, or the computer-controlled players, to deliver a multiplayer experience in single-player mode. Usually, the computer players significantly differ from their human counterparts and any transfer from single to multiplayer ends with shock and an instant knock-out from experienced players.

To deliver bots that behave in a more human way yet provide a bigger challenge, Milestone, the company behind MotoGP 19, turned to reinforcement learning to build computer players to race against human counterparts. The artificial intelligence controlling opponents is codenamed A.N.N.A. (Artificial Neural Network Agent).

A.N.N.A. is a neural network-based AI that is not scripted directly but created through reinforcement learning. This means developers describe an agent’s desired behaviour and then train a neural network to achieve it. Agents created in this way show more skilled and realistic behaviors, which are high on the wish list of Moto GP gamers.

Why does it matter?

Applying ML-based artificial intelligence in a mainstream game is the first step in delivering a more realistic and immersive game experience. Making computer players more human in their playing style makes them less exploitable and more flexible.

The game itself is an interesting example. It is common in RL-related research to apply this paradigm to strategy games, be it chess, Go or StarCraft II. In this case, the neural network controls a digital motorcycle. Racing provides a closed game environment with a limited number of variables to control. Thus, racing in a virtual world is a perfect environment in which to deploy ML-based solutions.

In the end, it isn’t the technology but rather gamers’ experience that is key. Will reinforcement learning bring a new paradigm of embedding AI in games? We’ll see once gamers react.

Bittersweet lessons from OpenAI Five

Defense of the Ancients 2 (DOTA 2) is a highly popular multiplayer online battle arena game in which two teams, each consisting of five players, fight for control of a map. The game blends tactical, strategic and action elements and is one of the most popular esports titles.

OpenAI Five is the neural network that plays DOTA 2, developed by OpenAI.

The AI agent beat world champions from Team OG during the OpenAI Five Finals on April 13th. It was the first time an AI-controlled player had beaten a pro player team during a live stream.

Why does it matter?

Although the project seems similar to Deepmind’s AlphaStar, there are several significant differences:

  • The model was trained continuously for almost a year instead of starting from zero knowledge for each new experiment – the common way of developing machine learning models is to design the entire training procedure upfront, launch it and observe the result. Every time a novel idea is proposed, the learning algorithm is modified accordingly and a new experiment is launched starting from scratch to get a fair comparison between various concepts. In this case, researchers decided not to run training from scratch, but to integrate ideas and changes into the already trained model, sometimes doing elaborate surgery on their artificial neural network. Moreover, the game received a number of updates during the training process. Thus, the model was forced at some points not to learn a new fact, but to update its knowledge. And it managed to do so. The approach enabled the team to massively reduce the computing power over the amount it had invested in training previous iterations of the model.
  • The model effectively cooperated with human players – The model was available publicly as a player, so users could play with it, both as ally and foe. Despite being trained without human interaction, the model was effective both as an ally and foe, clearly showing that AI is a potent tool to support humans in performing their tasks — even when that task is slaying an enemy champion.
  • The research done was somewhat of a failure – The model performs well, even if building it was not the actual goal. The project was launched to break a previously unbroken game by testing and looking for new approaches. The best results were achieved by providing more computing power and upscaling the neural network. Despite delivering impressive results for OpenAI, the project did not lead to the expected breakthroughs and the company has hinted that it could be discontinued in its present format. A bitter lesson indeed.

Blurred computer vision

Computer vision techniques deliver astonishing results. They have sped up the diagnosing of diabetic retinopathy, built maps from satellite images and recognized particular whales from aerial photography. Well-trained models often outperform human experts. Given that they don’t get tired and never lose their focus, why shouldn’t they?

But there remains room for improvement in machine vision, as researchers from KU Leuven University in Belgium report. They delivered an image that fooled an algorithm, rendering the person holding a card with the printed image virtually invisible to a machine learning-based detection system.

Why does it matter?

As readers of William Gibson’s novel Zero History will attest, images devised to fool AI are nothing new. Delivering a printable image that confounds an algorithm highlights the serious interest malicious players have in interfering with AI.

Examples may include images produced to fool AI-powered medical diagnostic devices for fraudulent reasons or sabotaging road infrastructure to render it useless for autonomous vehicles.

AI should not be considered a black box and algorithms are not unbreakable. As always, reminders of that are welcome, especially as responsibility and transparency are among the most significant AI trends for 2019.


Keras vs. PyTorch: Alien vs. Predator recognition with transfer learning

October 3, 2018/in Blog posts, Deep learning, Machine learning /by Piotr Migdal, Patryk Miziuła and Rafał Jakubanis

In our previous post, we gave you an overview of the differences between Keras and PyTorch, aiming to help you pick the framework that’s better suited to your needs. Now, it’s time for a trial by combat. We’re going to pit Keras and PyTorch against each other, showing their strengths and weaknesses in action. We present a real problem, a matter of life-and-death: distinguishing Aliens from Predators!

Predator and Alien
Image taken from our dataset. Both Predator and Alien are deeply interested in AI.

We perform image classification, one of the computer vision tasks deep learning shines at. As training from scratch is unfeasible in most cases (as it is very data hungry), we perform transfer learning using ResNet-50 pre-trained on ImageNet. We get as practical as possible, to show both the conceptual differences and conventions.

At the same time we keep the code fairly minimal, to make it clear and easy to read and reuse. See notebooks on GitHub, Kaggle kernels or Neptune versions with fancy charts.

Wait, what’s transfer learning? And why ResNet-50?

In practice, very few people train an entire Convolutional Network from scratch (with random initialization), because it is relatively rare to have a dataset of sufficient size. Instead, it is common to pretrain a ConvNet on a very large dataset (e.g. ImageNet, which contains 1.2 million images with 1000 categories), and then use the ConvNet either as an initialization or a fixed feature extractor for the task of interest.– Andrej Karpathy (Transfer Learning – CS231n Convolutional Neural Networks for Visual Recognition)

Transfer learning is a process of making tiny adjustments to a network trained on a given task so that it can perform another, similar task. In our case we work with the ResNet-50 model trained to classify images from the ImageNet dataset. That is enough to learn a lot of textures and patterns that may be useful in other visual tasks, even ones as alien as this Alien vs. Predator case. That way, we use much less computing power to achieve a much better result.

In our case we do it the simplest way:

  • keep the pre-trained convolutional layers (so-called feature extractor), with their weights frozen,
  • remove the original dense layers, and replace them with brand-new dense layers we will use for training.

So, which network should be chosen as the feature extractor?

ResNet-50 is a popular model for ImageNet image classification (AlexNet, VGG, GoogLeNet, Inception, Xception are other popular models). It is a 50-layer deep neural network architecture based on residual connections, which are connections that add modifications with each layer, rather than completely changing the signal.
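The core building block is simple enough to show in a few lines: the block's output is its input plus a learned correction, so each layer refines the signal rather than replacing it. A simplified sketch (ResNet-50 itself stacks deeper "bottleneck" variants of this idea):

import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Simplified residual block: output = input + learned correction."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        correction = self.bn2(self.conv2(torch.relu(self.bn1(self.conv1(x)))))
        return torch.relu(x + correction)   # the skip connection

block = ResidualBlock(64)
print(block(torch.randn(1, 64, 56, 56)).shape)  # torch.Size([1, 64, 56, 56])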

ResNet was the state-of-the-art on ImageNet in 2015. Since then, newer architectures with higher scores on ImageNet have been invented. However, they are not necessarily better at generalizing to other datasets (see the Do Better ImageNet Models Transfer Better? arXiv paper).

Ok, it’s time to dive into the code.

Let the match begin!

We do our Alien vs. Predator task in seven steps:

  0. Prepare the dataset
  1. Import dependencies
  2. Create data generators
  3. Create the network
  4. Train the model
  5. Save and load the model
  6. Make predictions on sample test images

We supplement this blog post with Python code in Jupyter Notebooks (Keras-ResNet50.ipynb, PyTorch-ResNet50.ipynb). This environment is more convenient for prototyping than bare scripts, as we can execute it cell by cell and peek into the output.

All right, let’s go!

0. Prepare the dataset

We created a dataset by performing a Google Search with the words “alien” and “predator”. We saved JPG thumbnails (around 250×250 pixels) and manually filtered the results. Here are some examples:

We split our data into two parts:

  • Training data (347 samples per class) – used for training the network.
  • Validation data (100 samples per class) – not used during the training, but needed in order to check the performance of the model on previously unseen data.

Keras requires the datasets to be organized in folders in the following way:

|-- train
    |-- alien
    |-- predator
|-- validation
    |-- alien
    |-- predator

If you want to see the process of organizing data into directories, check out the data_prep.ipynb file. You can download the dataset from Kaggle.

1. Import dependencies

First, the technicalities. We assume that you have Python 3.5+, Keras 2.2.2 (with TensorFlow 1.10.1 backend) and PyTorch 0.4.1. Check out the requirements.txt file in the repo.

So, first, we need to import the required modules. We separate the code in Keras, PyTorch and common (one required in both).

COMMON

import numpy as np
import matplotlib.pyplot as plt
from PIL import Image
%matplotlib inline

KERAS

import keras
from keras.preprocessing.image import ImageDataGenerator
from keras.applications import ResNet50
from keras.applications.resnet50 import preprocess_input
from keras import Model, layers
from keras.models import load_model, model_from_json

PYTORCH

import torch
from torchvision import datasets, models, transforms
import torch.nn as nn
from torch.nn import functional as F
import torch.optim as optim

We can check the frameworks’ versions by typing keras.__version__ and torch.__version__, respectively.

2. Create data generators

Normally, the images can’t all be loaded at once, as doing so would be too much for the memory to handle. At the same time, we want to benefit from the GPU’s performance boost by processing a few images at once. So we load images in batches (e.g. 32 images at once) using data generators. Each pass through the whole dataset is called an epoch.

We also use data generators for preprocessing: we resize and normalize images to make them as ResNet-50 likes them (224 x 224 px, with scaled color channels). And last but not least, we use data generators to randomly perturb images on the fly:

Performing such changes is called data augmentation. We use it to show a neural network which kinds of transformations don’t matter. Or, to put it another way, we train on a potentially infinite dataset by generating new images based on the original dataset.

Almost all visual tasks benefit, to varying degrees, from data augmentation during training. For more info about data augmentation, see how it can be applied to plankton photos or how to use it in Keras. In our case, we randomly shear, zoom and horizontally flip our aliens and predators.

Here we create generators that:

  • load data from folders,
  • normalize data (both train and validation),
  • augment data (train only).

KERAS

train_datagen = ImageDataGenerator(
    shear_range=10,
    zoom_range=0.2,
    horizontal_flip=True,
    preprocessing_function=preprocess_input)

train_generator = train_datagen.flow_from_directory(
    'data/train',
    batch_size=32,
    class_mode='binary',
    target_size=(224,224))

validation_datagen = ImageDataGenerator(
    preprocessing_function=preprocess_input)

validation_generator = validation_datagen.flow_from_directory(
    'data/validation',
    shuffle=False,
    class_mode='binary',
    target_size=(224,224))

PYTORCH

normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225])

data_transforms = {
    'train':
        transforms.Compose([
            transforms.Resize((224,224)),
            transforms.RandomAffine(0, shear=10, scale=(0.8,1.2)),
            transforms.RandomHorizontalFlip(),
            transforms.ToTensor(),
            normalize]),
    'validation':
        transforms.Compose([
            transforms.Resize((224,224)),
            transforms.ToTensor(),
            normalize])}

image_datasets = {
    'train':
        datasets.ImageFolder('data/train', data_transforms['train']),
    'validation':
        datasets.ImageFolder('data/validation', data_transforms['validation'])}

dataloaders = {
    'train':
        torch.utils.data.DataLoader(
            image_datasets['train'],
            batch_size=32,
            shuffle=True,
            num_workers=4),
    'validation':
        torch.utils.data.DataLoader(
            image_datasets['validation'],
            batch_size=32,
            shuffle=False,
            num_workers=4)}

In Keras, you get built-in augmentations and the preprocess_input function, which normalizes images fed to ResNet-50, but you have no control over their order. In PyTorch, you have to normalize images manually, but you can arrange the augmentations in any way you like.

There are also other nuances: for example, Keras by default fills the rest of the augmented image with the border pixels (as you can see in the picture above) whereas PyTorch leaves it black. Whenever one framework deals with your task much better than the other, take a closer look to see if they perform preprocessing identically; we bet they don’t.
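
If you want the two pipelines to behave more alike, Keras lets you control this particular behavior explicitly – a small sketch of the train generator with black padding (matching PyTorch), assuming the rest of the settings stay as above:

train_datagen = ImageDataGenerator(
    shear_range=10,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='constant',   # fill newly revealed pixels with a constant value...
    cval=0,                 # ...namely black, instead of repeating border pixels
    preprocessing_function=preprocess_input)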

3. Create the network

The next step is to import a pre-trained ResNet-50 model, which is a breeze in both cases. We freeze all the ResNet-50’s convolutional layers, and only train the last two fully connected (dense) layers. As our classification task has only 2 classes (compared to 1000 classes of ImageNet), we need to adjust the last layer.

Here we:

  • load pre-trained network, cut off its head and freeze its weights,
  • add custom dense layers (we pick 128 neurons for the hidden layer),
  • set the optimizer and loss function.

KERAS

conv_base = ResNet50(include_top=False,
                     weights='imagenet')

for layer in conv_base.layers:
    layer.trainable = False

x = conv_base.output
x = layers.GlobalAveragePooling2D()(x)
x = layers.Dense(128, activation='relu')(x)
predictions = layers.Dense(2, activation='softmax')(x)
model = Model(conv_base.input, predictions)

optimizer = keras.optimizers.Adam()
model.compile(loss='sparse_categorical_crossentropy',
              optimizer=optimizer,
              metrics=['accuracy'])

PYTORCH

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

model = models.resnet50(pretrained=True).to(device)

for param in model.parameters():
    param.requires_grad = False

model.fc = nn.Sequential(
    nn.Linear(2048, 128),
    nn.ReLU(inplace=True),
    nn.Linear(128, 2)).to(device)

criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.fc.parameters())

We load the ResNet-50 from both Keras and PyTorch without any effort. They also offer many other well-known pre-trained architectures: see Keras’ model zoo and PyTorch’s model zoo.  So, what are the differences?

In Keras we may import only the feature-extracting layers, without loading extraneous data (include_top=False). We then create a model in a functional way, using the base model’s inputs and outputs. Then we use model.compile(…) to bake into it the loss function, optimizer and other metrics.

In PyTorch, the model is a Python object. In the case of models.resnet50, dense layers are stored in model.fc attribute. We overwrite them. The loss function and optimizers are separate objects. For the optimizer, we need to explicitly pass a list of parameters we want it to update.
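
A quick way to confirm which parameters the PyTorch optimizer will actually update is to list those with gradients enabled (a small diagnostic sketch):

trainable = [name for name, param in model.named_parameters()
             if param.requires_grad]
print(trainable)   # should list only the fc.* parameters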

Predator's wrist computer
Frame from ‘AVP: Alien vs. Predator’: Predators’ wrist computer. We’re pretty sure Predator could use it to compute logsoftmax.

In PyTorch, we have to explicitly specify what we want to load onto the GPU using the .to(device) method. We have to write it each time we intend to put an object on the GPU (if one is available). Well…

Layer freezing works in a similar way in both frameworks. However, the Batch Normalization layer of Keras is broken (as of the current version; thanks to Przemysław Pobrotyn for bringing up this issue). That is – some layers get modified anyway, even with trainable = False.

Keras and PyTorch deal with log-loss in a different way.

In Keras, a network predicts probabilities (has a built-in softmax function), and its built-in cost functions assume they work with probabilities.

In PyTorch we have more freedom, but the preferred way is to return logits. This is done for numerical reasons: performing softmax and then log-loss means doing unnecessary log(exp(x)) operations. So, instead of using softmax, we use LogSoftmax (with NLLLoss) or combine the two into the single nn.CrossEntropyLoss loss function.
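
The relationship between these PyTorch pieces is easy to verify on random data – a small sketch:

logits = torch.randn(4, 2)            # a batch of 4 predictions for 2 classes
labels = torch.tensor([0, 1, 1, 0])

loss_combined = nn.CrossEntropyLoss()(logits, labels)
loss_split = nn.NLLLoss()(nn.LogSoftmax(dim=1)(logits), labels)
print(loss_combined.item(), loss_split.item())   # the two values match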

4. Train the model

OK, ResNet is loaded, so let’s get ready to space rumble!

Predators' mother ship
Frame from ‘AVP: Alien vs. Predator’: the Predators’ Mother Ship. Yes, we’ve heard that there are no rumbles in space, but nothing is impossible for Aliens and Predators.

Now, we proceed to the most important step – model training. We need to pass data, calculate the loss function and modify network weights accordingly. While we already had some differences between Keras and PyTorch in data augmentation, the length of code was similar. For training… the difference is massive. Let’s see how it works!

Here we:

  • train the model,
  • measure the loss function (log-loss) and accuracy for both training and validation sets.

KERAS

history = model.fit_generator(
    generator=train_generator,
    epochs=3,
    validation_data=validation_generator)

PYTORCH

def train_model(model, criterion, optimizer, num_epochs=3):
    for epoch in range(num_epochs):
        print('Epoch {}/{}'.format(epoch+1, num_epochs))
        print('-' * 10)

        # Each epoch runs a training phase followed by a validation phase.
        for phase in ['train', 'validation']:
            if phase == 'train':
                model.train()   # e.g. batch norm uses batch statistics
            else:
                model.eval()    # e.g. batch norm uses running statistics

            running_loss = 0.0
            running_corrects = 0

            for inputs, labels in dataloaders[phase]:
                inputs = inputs.to(device)
                labels = labels.to(device)

                outputs = model(inputs)
                loss = criterion(outputs, labels)

                # Backpropagation and weight updates happen only during training.
                if phase == 'train':
                    optimizer.zero_grad()
                    loss.backward()
                    optimizer.step()

                # Accumulate statistics for the epoch summary.
                _, preds = torch.max(outputs, 1)
                running_loss += loss.item() * inputs.size(0)
                running_corrects += torch.sum(preds == labels.data)

            epoch_loss = running_loss / len(image_datasets[phase])
            epoch_acc = running_corrects.double() / len(image_datasets[phase])

            print('{} loss: {:.4f}, acc: {:.4f}'.format(phase,
                                                        epoch_loss,
                                                        epoch_acc))
    return model

model_trained = train_model(model, criterion, optimizer, num_epochs=3)

In Keras, the model.fit_generator performs the training… and that’s it! Training in Keras is just that convenient. And as you can find in the notebook, Keras also gives us a progress bar and a timing function for free. But if you want to do anything nonstandard, then the pain begins…
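
To be fair, many common extras still fit the high-level API via callbacks – for example, keeping the best model seen during training (a sketch; the checkpoint path is just an example):

checkpoint = keras.callbacks.ModelCheckpoint(
    'models/keras/best.h5',     # example path
    monitor='val_loss',
    save_best_only=True)

history = model.fit_generator(
    generator=train_generator,
    epochs=3,
    validation_data=validation_generator,
    callbacks=[checkpoint])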

Predators' shuriken
Predator’s shuriken returning to its owner automatically. Would you prefer to implement its tracking ability in Keras or PyTorch?

PyTorch is at the other end of the spectrum. Everything is explicit here. You need more lines of code to construct the basic training loop, but you can freely change and customize everything you want.

Let’s shift gears and dissect the PyTorch training code. We have nested loops, iterating over:

  • epochs,
  • training and validation phases,
  • batches.

The epoch loop does nothing but repeat the code inside. The training and validation phases are done for three reasons:

  • Some special layers, like batch normalization (present in ResNet-50) and dropout (absent in ResNet-50), work differently during training and validation. We set their behavior by model.train() and model.eval(), respectively.
  • We use different images for training and for validation, of course.
  • The most important and least surprising thing: we train the network during the training phase only. The magic commands optimizer.zero_grad(), loss.backward() and optimizer.step() (in this order) do the job. If you know what backpropagation is, you'll appreciate their elegance.

We take care of computing and printing the epoch losses and accuracies ourselves.

5. Save and load the model

Saving

Once our network is trained, often with high computational and time costs, it's good to keep it for later. Broadly, there are two ways of saving a model:

  • saving the whole model architecture and trained weights (and the optimizer state) to a file,
  • saving the trained weights to a file (keeping the model architecture in the code).

It's up to you which way you choose.

Here we:

  • save the model.

KERAS

# architecture and weights to HDF5
model.save('models/keras/model.h5')

# architecture to JSON, weights to HDF5
model.save_weights('models/keras/weights.h5')
with open('models/keras/architecture.json', 'w') as f:
    f.write(model.to_json())

PYTORCH

torch.save(model_trained.state_dict(),'models/pytorch/weights.h5')
Alien is evolving
Frame from ‘Alien: Resurrection’: Alien is evolving, just like PyTorch.

One line of code is enough in both frameworks. In Keras you can either save everything to an HDF5 file, or save the weights to HDF5 and the architecture to a readable JSON file. By the way: you can then load the model and run it in the browser.

Currently, PyTorch creators recommend saving the weights only. They discourage saving the whole model because the API is still evolving.
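
For completeness, saving the whole PyTorch model is also a one-liner, though, as noted, it is discouraged – the entire object is pickled, so it stays tied to your current code and library version (the path is just an example):

# Discouraged: serializes the full model object, not just the weights.
torch.save(model_trained, 'models/pytorch/model_full.pth')
model_restored = torch.load('models/pytorch/model_full.pth')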

Loading

Loading models is as simple as saving. You should just remember which saving method you chose and the file paths.

Here we:

  • load the model.

KERAS

# architecture and weights from HDF5
model = load_model('models/keras/model.h5')

# architecture from JSON, weights from HDF5
with open('models/keras/architecture.json') as f:
    model = model_from_json(f.read())
model.load_weights('models/keras/weights.h5')

PYTORCH

model = models.resnet50(pretrained=False).to(device)
model.fc = nn.Sequential(
    nn.Linear(2048, 128),
    nn.ReLU(inplace=True),
    nn.Linear(128, 2)).to(device)
model.load_state_dict(torch.load('models/pytorch/weights.h5'))

In Keras we can load a model from a JSON file, instead of creating it in Python (at least when we don't use custom layers). This kind of serialization makes it convenient for transferring models.

In PyTorch, the model is defined by plain Python code, so we pretty much have to re-create it in code before loading the weights.

Loading model weights is similar in both frameworks.

6. Make predictions on sample test images

All right, it’s finally time to make some predictions! To fairly check the quality of our solution, we ask the model to predict the type of monsters from images not used for training. We can use the validation set, or any other image.

Here we:

  • load and preprocess test images,
  • predict image categories,
  • show images and predictions.

COMMON

validation_img_paths = ["data/validation/alien/11.jpg",
                        "data/validation/alien/22.jpg",
                        "data/validation/predator/33.jpg"]
img_list = [Image.open(img_path) for img_path in validation_img_paths]

KERAS

validation_batch = np.stack([preprocess_input(np.array(img.resize((224, 224))))
                             for img in img_list])

pred_probs = model.predict(validation_batch)

PYTORCH

validation_batch = torch.stack([data_transforms['validation'](img).to(device)
                                for img in img_list])

pred_logits_tensor = model(validation_batch)
pred_probs = F.softmax(pred_logits_tensor, dim=1).cpu().data.numpy()

COMMON

fig, axs = plt.subplots(1, len(img_list), figsize=(20, 5))
for i, img in enumerate(img_list):
    ax = axs[i]
    ax.axis('off')
    ax.set_title("{:.0f}% Alien, {:.0f}% Predator".format(100*pred_probs[i,0],
                                                          100*pred_probs[i,1]))
    ax.imshow(img)

Prediction, like training, works in batches (here we use a batch of 3, though we could just as well use a batch of 1). In both Keras and PyTorch we need to load and preprocess the data. A rookie mistake is to forget the preprocessing step (including color scaling). The model is still likely to work, but its predictions will be worse (since it effectively sees the same shapes, only with different colors and contrasts).

In PyTorch there are two more steps, as we need to:

  • convert logits to probabilities,
  • transfer data to the CPU and convert to NumPy (fortunately, the error messages are fairly clear when we forget this step).

And this is what we get:

It works!

And how about other images? If you can’t come up with anything (or anyone) else, try using photos of your co-workers. :)
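
For a single arbitrary image, the same pipeline works with a batch of one – a minimal Keras sketch (the file name is a placeholder):

img = Image.open('my_photo.jpg').convert('RGB')   # hypothetical image path
batch = np.expand_dims(
    preprocess_input(np.array(img.resize((224, 224)))), axis=0)
probs = model.predict(batch)
print("{:.0f}% Alien, {:.0f}% Predator".format(100 * probs[0, 0],
                                               100 * probs[0, 1]))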

Conclusion

As you can see, Keras and PyTorch differ significantly in terms of how standard deep learning models are defined, modified, trained, evaluated, and exported. For some parts it’s purely about different API conventions, while for others fundamental differences between levels of abstraction are involved.

Keras operates on a much higher level of abstraction. It is much more plug&play, and typically more succinct, but at the cost of flexibility.

PyTorch provides more explicit and detailed code. In most cases that means debuggable and flexible code, with only a small overhead. Yet, training is way more verbose in PyTorch. It hurts, but at times it provides a lot of flexibility.

Transfer learning is a big topic. Try tweaking your parameters (e.g. dense layers, optimizer, learning rate, augmentation) or choose a different network architecture.
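
As a starting point for such tweaks, the learning rate is set in a single place in each framework (a sketch; 1e-4 is just an example value):

# Keras
optimizer = keras.optimizers.Adam(lr=1e-4)

# PyTorch
optimizer = optim.Adam(model.fc.parameters(), lr=1e-4)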

Have you tried transfer learning for image recognition? Consider the list below for some inspiration:

  • Chihuahua vs. muffin, sheepdog vs. mop, shrew vs. kiwi (already serves as an interesting benchmark for computer vision)
  • Original images vs. photoshopped ones
  • Artichoke vs. broccoli vs. cauliflower
  • Zerg vs. Protoss vs. Orc vs. Elf
  • Meme or not meme
  • Is it a picture of a bird?
  • Is it huggable?

Pick Keras or PyTorch, choose a dataset and let us know how it went in the comments section below :)


AI Monthly Digest #7 – machine mistakes, and the hard way to profit from non-profit

April 11, 2019/in Blog posts, Machine learning, AI Monthly Digest /by Konrad Budek and Arkadiusz Nowaczynski

March saw some major events concerning top figures of the ML world, including OpenAI, Yann LeCun, Geoffrey Hinton and Yoshua Bengio.

The past month was also the backdrop for inspiring research on how machines think and how different they can be from humans when provided with the same conditions and problem to solve.

OpenAI goes from non-profit to for-profit

OpenAI was initially a non-profit organization focused on pushing the boundaries of artificial intelligence, in the same way that open-source organizations are able to deliver the highest-class software. The Mozilla Foundation and the Linux Foundation, the non-profit powerhouses behind popular products, are the best examples.

Yet unlike popular software, whose development is powered by human talent alone, AI requires not only brilliant minds but also a gargantuan amount of computing power. The cost of reproducing the GPT-2 model is estimated at around $50,000 – and that is just a single experiment. Getting the job done requires a small battalion of research scientists tuning hyperparameters, debugging and testing approaches and ideas.


Staying on technology's cutting edge pushed the organization toward a for-profit model to fund the computing power and attract top talent, as the company notes on its website.

Why does it matter?

First of all, OpenAI was a significant player on the global AI map despite being a non-profit organization. Establishing a for-profit arm creates a new strong player that can develop commercial AI projects.

Moreover, the problem lies in the need for computing power, marking a new age of development challenges. In the traditional software development world, a team of talented coders is everything one needs. When it comes to delivering AI models, that is apparently not enough.

The bitter lesson of ML’s development

OpenAI's transition could be seen as a single event concerning only one organization. It could be, that is, were it not being discussed by the godfathers of modern machine learning.


Richard Sutton is one of the most renowned and influential researchers of reinforcement learning. In a recent essay, he remarked that most advances in AI development are powered by access to computing power, while the importance of expert knowledge and creative input from human researchers is losing significance.

Moreover, numerous attempts have been made to enrich machine learning with expert knowledge. Usually, such efforts yielded short-term gains of little significance in the broader context of AI's development.

Why does it matter?

The opinion would seem to support the general observation that computing power is the gateway to pushing the boundaries of machine learning and artificial intelligence. That power, combined with relatively simple machine learning techniques, frequently challenges the established ways of solving problems. The RL-based agents playing Go, chess or StarCraft are only the most obvious examples.

Yann LeCun, Geoffrey Hinton and Yoshua Bengio awarded the Turing Award

The Association for Computing Machinery, the world’s largest organization of computing professionals, announced that this year’s Turing Award went to three researchers for their work on advancing and popularizing neural networks. Currently, the researchers split their time between academia and the private sector, with Yann LeCun being employed by Facebook and New York University, Geoffrey Hinton working for Google and the University of Toronto, and Yoshua Bengio splitting his time between the University of Montreal and his company Element AI.

Why does it matter?

Named after Alan Turing, a giant of mathematics and the godfather of modern computer science, the Turing Award has been called IT's Nobel Prize. While there is no actual Nobel Prize in IT, its specialists have the Turing Award instead.

Nvidia creates a wonder brush – AI that turns a doodle into a landscape

Nvidia has shown how an AI-powered editor swiftly transforms simple, childlike images into near-photorealistic landscapes. While the technology isn’t exactly new, this time the form is interesting. It uses Generative Adversarial Networks and amazes with the details it can muster – if the person drawing adds a lake near a tree, the water will reflect it.

Why does it matter?

Nvidia does a great job of spreading knowledge about machine learning. Further applications in image editing will no doubt be forthcoming, automating the work of illustrators and graphic designers. But for now, it is amazing to behold.

So do you think like a computer?

While machine learning models are superhumanly effective at image recognition, when they do fail, their predictions are usually surprising, to say the least. Until recently, it was believed that people are unable to predict how a computer will misinterpret an image. Moreover, this thoroughly inhuman way of recognizing images is prone to mistakes – it is possible to prepare an artificial image that effectively fools the AI behind the image recognition and, for example, convinces the model that a car is in fact a bush.


The confusion about the machines identifying objects usually comes from the fact that most AI models are narrow AI. The systems are designed to work in a closed environment and solve a narrow problem, like identifying cars or animals. Consequently, the machine has a narrow catalog of entities to name.

To check if humans are able to understand how the machine is making its mistakes, the researchers provided volunteers with images that had already fooled AI models together with the names the machines were able to choose from for those images. In those conditions, people provided the same answers as the machines 75% of the time.

Why does it matter?

A recent study from Johns Hopkins University shows that computers are becoming increasingly human even in their mistakes, and that surprising outcomes are a consequence of the extreme narrowness of the artificial mind. A typical preschooler has an incomparably larger vocabulary and store of experience than even the most powerful neural network, so the likelihood of a human finding a more accurate association for an image is many times greater.

Again, the versatility and flexibility of the human mind is the key to its superiority.


With your head in the clouds – how to harness the power of artificial intelligence

April 4, 2019/in Blog posts, Machine learning /by Paweł Osterreicher and Konrad Budek

To fully leverage the potential of artificial intelligence in business, access to computing power is key. That doesn’t mean, however, that supercomputers are necessary for everyone to take advantage of the AI revolution.

In January 2019 the world learned of the AI model known as AlphaStar, which had vanquished the world’s reigning Starcraft II champion. Training the model by having it square off against a human would have required around 400 years of non-stop play. But thanks to enormous computing power and some neural networks, AlphaStar’s creators needed a mere two weeks.

AlphaStar and similar models of artificial intelligence can be compared to a hypothetical chess player that achieves a superhuman level by playing millions of opponents – a number humans, due to our biological limitations, cannot hope to match. So that they need not spend years training their models, programmers provide them access to thousands of chessboards and allow them to play huge numbers of games at the same time. This parallelization can be achieved by properly building the architecture of the solution on a supercomputer, but it can also be much simpler: by ordering all these chess boards as instances (machines) in the cloud.


Building such models comes with a price tag few can afford as easily as Google, the owner of AlphaStar and AlphaGo. And given these high-profile experiments, it should come as no surprise that the business world still believes technological experiments require specialized infrastructure and are expensive. Sadly, this perception extends to AI solutions.

A study done by McKinsey and Company, "AI adoption advances, but foundational barriers remain", provides practical confirmation of this thesis. 25% of the companies McKinsey surveyed indicated that a lack of adequate infrastructure was a key obstacle to developing AI in their own company. Only the lack of an AI strategy and of people and support in the organization were bigger obstacles – that is, fundamental issues that precede any infrastructure discussion.

So how does one reap the benefits of artificial intelligence and machine learning without having to run multi-million dollar projects? There are three ways.

  1. Precisely determine the scope of your AI experiments. A precise selection of business areas to test models and clearly defined success rates is necessary. By scaling down the case being tested to a minimum, it is possible to see how a solution works while avoiding excessive hardware requirements.
  2. Use mainly cloud resources, which generate costs only when used and, thanks to economies of scale, keep prices down.
  3. Create a simplified model using cloud resources, thus combining the advantages of the first two approaches.

Feasibility study – even on a laptop

Even the most complicated problem can be tackled by reducing it to a feasibility study, the maximum simplification of all elements and assumptions. Limiting unnecessary complications and variables will remove the need for supercomputers to perform the calculations – even a decent laptop will get the job done.

An example of a challenge

In a bid to boost its cross-sell/up-sell efforts, a company that sells a wide variety of products in a range of channels needs an engine that recommends its most relevant products to customers.

A sensible approach to this problem would be to first identify a specific business area or line, analyze the products then select the distribution channel. The channel should provide easy access to structured data. The ideal candidate here is online/e-commerce businesses, which could limit the group of customers to a specific segment or location and start building and testing models on the resulting database.

Such an approach would enable the company to avoid gathering, ordering, standardizing and processing the huge amounts of data that complicate the process and account for most of the infrastructure costs. Preparing a prototype of the model still makes it possible to efficiently enter the testing and verification phase of the solution's operation.


So narrowed down, the problem can be solved by even a single data scientist equipped with a good laptop. Another practical application of this approach comes via Uber. Instead of creating a centralized analysis and data science department, the company hired a machine learning specialist for each of its teams. This led to small adjustments being made to automate and speed up daily work in all of its departments. AI is perceived here not as another corporate-level project, but as a useful tool employees turn to every day.

This approach fits into the widely used (and for good reason) agile paradigm, which aims to create value iteratively by collecting feedback as quickly as possible and including it in the 'agile' product construction process.

When is power really necessary?

Higher computing power is required in the construction of more complex solutions whose tasks go far beyond the standard and which will be used on a much larger scale. The nature of the data being processed can pose yet another obstacle, as the model deepsense.ai developed for the National Oceanic and Atmospheric Administration (NOAA) will illustrate.

The company was tasked by the NOAA with recognizing individual North Atlantic right whales in aerial photographs. With only 450 individual whales left, the species is on the brink of extinction, and only by tracking the fate of each animal separately is it possible to protect them and provide timely help to injured whales.

Is that Thomas and Maddie in the picture?

The solution called for three neural networks that would perform different tasks consecutively – finding the whale’s head in the photo, cropping and framing the photo, and recognizing exactly which individual was shown.

These tasks required surgical precision and consideration of all of the available data – which, as you'd rightly imagine, came from a very limited number of photos. Such demands complicated the models and required considerable computing power. The solution ultimately automated the process of identifying individuals swimming the high seas, cutting the human work time from a few hours spent poring over the catalog of whale photos to about thirty minutes of checking the results the model returned.

Reinforcement learning (RL) is another demanding category of AI. RL models are taught through interactions with their environment and a set of punishments and rewards. This means that neural networks can be trained to solve open problems and operate in variable, unpredictable environments based on pre-established principles, the best example being controlling an autonomous car.


In addition to the power needed to train the model, autonomous cars require a simulated environment to be run in, and creating that is often even more demanding than the neural network itself. Researchers from deepsense.ai and Google Brain are working on solving this problem through artificial imagination, and the results are more than promising.

Travel in the cloud

Each of the above examples requires access to large-scale computing power. When the demand for such power jumps, cloud computing holds the answer. The cloud offers three essential benefits in supporting the development of machine learning models:

  • A pay-per-use model keeps costs low and removes barriers to entry – as the name suggests, the user pays only for the resources actually used, which drives down costs significantly. The cloud is estimated to be as much as 30% cheaper to implement and maintain than owned infrastructure. After all, it removes the need to invest in equipment that can handle complicated calculations. Instead, you pay only for a 'training session' for a neural network. Once the calculations are done, the network is ready for operation, and running machine learning-based solutions isn't all that demanding. Consider, for example, that AI is used in mobile phones to adjust technical settings and make photos the best they can be – even a phone's processor can handle running AI models.
  • Easy development and expansion opportunities – transferring company resources to the cloud brings a wide range of complementary products within reach. Cloud computing makes it possible to easily share resources with external platforms, and thus even more efficiently use the available data through the visualization and in-depth analysis those platforms offer. According to a RightScale report, State of the Cloud 2018, as an organization’s use of the cloud matures, its interest in additional services grows. Among enterprises describing themselves as cloud computing “beginners,” only 18 percent are interested in additional services. The number jumps to 40 percent among the “advanced” cohort. Whatever a company’s level of interest, there is a wide range of solutions supporting machine learning development awaiting them – from simple processor and memory hiring to platforms and ready-made components for the automatic construction of popular solutions.
  • Permanent access to the best solutions – by purchasing specific equipment, a company will be associated with a particular set of technologies. Companies looking to develop machine learning may need to purchase special GPU-based computers to efficiently handle the calculations. At the same time, equipment is constantly being developed, which may mean that investment in a specific infrastructure is not the optimal choice. For example, Google’s Tensor Processing Unit (TPU) may be a better choice than a GPU for machine learning applications. Cloud computing allows every opportunity to employ the latest available technologies – both in hardware and software.

Risk in the skies?

The difficulties that have put a question mark over migrating to the cloud are worth elaborating here. In addition to the organizational maturity mentioned earlier and the fear of high costs, the biggest problem is that regulations in some industries and jurisdictions limit full control over one's own data: not all data can be freely sent (even for a moment) to foreign data centers. Mixed solutions that simultaneously use both the cloud and the organization's own infrastructure are useful here. The data are divided into those that must remain in the company and those that can be sent to the cloud.

In Poland, this problem will be addressed by the recently appointed Operator Chmury Krajowej (National Cloud Operator), which will offer cloud computing services fully implemented in data centers located in Poland. This could well convince some industries and the public sector to open up to the cloud.

Matching solutions

Concerns that a company lacks the computational resources needed to use machine learning are in most cases unjustified. Many models can be built using ordinary commodity hardware that is already in the enterprise. And when a system is to be built that calls for long-term training of a model and large volumes of data, the power and infrastructure needed can be easily obtained from the cloud.

Machine learning is currently one of the hottest business trends. And for good reason. Easy access to the cloud democratizes computing power and thus also access to AI – small companies can use the enormous resources available every bit as quickly as the market’s behemoths. Essentially, there’s a feedback loop between cloud computing and machine learning that cannot be ignored.

The article was originally published in Harvard Business Review.
