AI Monthly Digest #2 – the fakeburger, BERT for NLP and machine morality
Fake images of hamburgers, the autonomous trolley problem, Google’s BERT for NLP and more stories from October, curated by deepsense.ai’s team, right here in our AI Monthly Digest.
October brought important developments in machine learning and sparked interesting discussions about machine morality. deepsense.ai’s Arkadiusz Nowaczyński and Konrad Budek chose the five stories below.
1. Enter the fakeburger – DeepMind managed to produce convincing images of hamburgers, animals and landscapes
Renowned AI company DeepMind has produced synthetic photos of hamburgers, landscapes and animals after training on the ImageNet dataset. In most cases, even as a team, we found it difficult to determine whether a picture depicted a real or a fake burger.
This was not the first time a neural network was used to create a convincing fake photo – take a look at NVIDIA’s one hour of imaginary celebrities.
We now have the ability to produce realistic images after training on the ImageNet dataset, famous for advancing the state of the art in image classification. DeepMind researcher Andrew Brock achieved the breakthrough with highly tuned Generative Adversarial Networks (GANs). A GAN pits a generator, which produces artificial sample images, against a discriminator, which distinguishes between fake and real-world examples. The GANs here are scaled up, leading to the impressive results: larger networks and training batch sizes (2048) vastly improve quality over previous work. Google’s Tensor Processing Units (TPUs) made training feasible at such a scale.
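To make the generator–discriminator interplay concrete, here is a minimal toy GAN training loop in PyTorch. It is our illustration only, not DeepMind’s BigGAN code: BigGAN adds class conditioning, far larger convolutional networks and TPU-scale batches, and all sizes and hyperparameters below are assumed purely for readability.

```python
# A minimal GAN sketch (our simplification, not BigGAN).
import torch
import torch.nn as nn

latent_dim, image_dim = 128, 64 * 64 * 3  # assumed toy sizes

generator = nn.Sequential(           # maps random noise to a fake image
    nn.Linear(latent_dim, 1024), nn.ReLU(),
    nn.Linear(1024, image_dim), nn.Tanh(),
)
discriminator = nn.Sequential(       # scores how "real" an image looks
    nn.Linear(image_dim, 1024), nn.LeakyReLU(0.2),
    nn.Linear(1024, 1),
)

opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4)
loss_fn = nn.BCEWithLogitsLoss()

def train_step(real_images):
    batch = real_images.size(0)
    fake_images = generator(torch.randn(batch, latent_dim))

    # Discriminator: push real scores toward 1, fake scores toward 0.
    opt_d.zero_grad()
    d_loss = (loss_fn(discriminator(real_images), torch.ones(batch, 1)) +
              loss_fn(discriminator(fake_images.detach()), torch.zeros(batch, 1)))
    d_loss.backward()
    opt_d.step()

    # Generator: try to make the discriminator label fakes as real.
    opt_g.zero_grad()
    g_loss = loss_fn(discriminator(fake_images), torch.ones(batch, 1))
    g_loss.backward()
    opt_g.step()
```

The two networks are trained in alternation: as the discriminator gets better at spotting fakes, the generator is forced to produce ever more convincing images.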
Even though the model isn’t perfect, this is a remarkable step towards generating realistic-looking photos with neural networks. To read more about BigGAN and the creation of fakeburger images, check out the arXiv paper.
2. AI-generated portrait sold for $432,500 at auction
In contrast to the fakeburgers and fake dogs built by DeepMind, the de Belamy family consists of people with eerie, AI-generated faces that are clearly non-human. All of the de Belamy family images were generated by Obvious, a group of French AI engineers and artists.
The images were produced using a GAN (Generative Adversarial Network), in the same manner DeepMind used to produce its images of hamburgers. The model was fed 15,000 portraits painted over the last 600 years and attempted to build new ones from that data.
It is easy to see that the images are no match for the old masters’ paintings, even considering the variety of styles represented by the artificial artist. Nevertheless, the portrait of Edmond de Belamy sold for nearly half a million dollars.
3. Machine morality – biases and automated discrimination
The transparency of machine learning models and artificial intelligence again came to the fore in October, igniting discussion across the Internet. The first important issue was gender bias in an AI model Amazon developed to preprocess resumes sent to the company. Trained on 10 years’ worth of resumes, the system unintentionally learned to favor male candidates, as tech is dominated by men. The company changed its model but, as the MIT Technology Review states, it is no longer certain of the system’s neutrality in other areas.
Given the soaring AI adoption rate, discussion about machine ethics and transparency is necessary. According to data from Deloitte, the number of machine learning project implementations doubled between 2017 and 2018, and is expected to quadruple from there by 2020. As machines themselves are incapable of being racist, misogynist or biased, creating proper datasets and designing evaluation processes that spot hidden bias are crucial to building models that best and most fairly serve companies and society alike.
The data science community has not left the problem unaddressed: Google AI has launched a Kaggle competition aiming to produce image recognition models that perform well on photos taken in geographical regions other than those the training data came from. The competition is being held as a competition track at the NIPS 2018 conference, which is better attended than even Comic-Con.
This all goes to show that experts have been right from the start — AI is yet another tool that needs to be constantly evaluated and developed if it is to achieve its goals.
4. Machine morality again – the autonomous trolley problem
The topic of machine bias is even more important considering the rise of autonomous cars. A global study shows that people from various social and cultural backgrounds differ in their perception of “a lesser evil” when it comes to the “trolley problem” in autonomous cars. In a nutshell, if a car has to choose whether to hit an elderly person or a child, whom should it choose? A group of people or a single person? A pregnant woman or a medical doctor?
The study shows that choices vary by country, and the differences are more than significant. When facing the extreme situation of a car accident, human drivers make up their minds autonomously and are solely responsible for their choices. An autonomous car, on the other hand, will carry the AI model its manufacturer has provided. Are the legal system and society ready to transfer the responsibility for driving, and the choices it necessitates, from the driver to a non-human machine? Should the AI model fit the culture it operates in or follow some other code? These questions have yet to be answered.
5. Google’s BERT for NLP – new state-of-the-art in language modeling
Natural Language Processing may enter a new era with Google’s Bidirectional Encoder Representations from Transformers (BERT).
For now, NLP practitioners continue to use pre-trained word embeddings as initialization or input features in custom architectures for specific tasks. BERT, a model that can be pre-trained on a large text corpus and then fine-tuned for various downstream NLP tasks, may change that. It might repeat what we have seen in Computer Vision over the last couple of years, where fine-tuning models pre-trained on ImageNet has proved a great success.
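For readers unfamiliar with that Computer Vision pattern, here is a minimal fine-tuning sketch. The choice of ResNet-18 and the 10-class head are our assumptions for illustration; BERT fine-tuning follows the same recipe, with text in place of images.

```python
# Fine-tuning sketch: reuse pre-trained weights, swap in a new task head.
import torch.nn as nn
from torchvision import models

model = models.resnet18(pretrained=True)        # weights pre-trained on ImageNet
model.fc = nn.Linear(model.fc.in_features, 10)  # fresh head for a 10-class task
# The whole network (or just the new head) is then trained on the
# downstream dataset, typically with a small learning rate.
```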
BERT is a multi-layer bidirectional encoder taken from the Transformer architecture, which was introduced in Attention Is All You Need. The pre-training procedure is entirely unsupervised and pursues two objectives: filling in randomly masked gaps in the input sequence, and classifying whether two input sentences are consecutive sentences cut from a larger text. During fine-tuning, predictions can be made for entire sequences or for each input token separately.
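The toy sketch below illustrates how training data for those two objectives can be built. It is our simplification, not Google’s actual pipeline: the 15% masking rate matches the paper, but the tokenization and helper names are assumed for illustration.

```python
# Toy data preparation for BERT's two pre-training objectives (simplified).
import random

MASK, MASK_PROB = "[MASK]", 0.15

def mask_tokens(tokens):
    """Objective 1: hide ~15% of tokens; the model must predict the originals."""
    inputs, labels = [], []
    for tok in tokens:
        if random.random() < MASK_PROB:
            inputs.append(MASK)
            labels.append(tok)        # loss is computed only where we masked
        else:
            inputs.append(tok)
            labels.append(None)       # no loss on unmasked positions
    return inputs, labels

def next_sentence_pair(sentences, i):
    """Objective 2: build (sentence_a, sentence_b, is_next) examples.
    Half the time b is the true follow-up sentence, half the time a random one.
    Assumes i is not the last index of the corpus."""
    a = sentences[i]
    if random.random() < 0.5:
        return a, sentences[i + 1], True
    return a, random.choice(sentences), False
```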
The study shows that fine-tuning a pre-trained BERT model set new state-of-the-art results on 11 benchmark tasks.
Google has released an official implementation of BERT for NLP, available on GitHub.
And now for some bonus information:
Paul Romer, winner of this year’s Nobel Prize in Economics, is a 62-year-old former World Bank chief economist, writer and user of the Python programming language. He is also a firm supporter of making research open and clear, so he shares his findings via Jupyter notebooks and makes his data available for everyone to process and interpret.
His example shows that combining knowledge of economics and science with the proper toolset for one’s daily work can lead to a rewarding career.