AI Monthly Digest #12 - the shadow of malicious use


September 6, 2019/in Data science, AI Monthly Digest /by Konrad Budek and Arkadiusz Nowaczynski

This edition marks a full year of AI Monthly Digest, in which we have been bringing readers carefully selected and curated news from the world of AI and machine learning (ML) that deepsense.ai's team considers important, inspiring and entertaining.

Our aim is to deliver information that even people not directly involved in AI and ML will find interesting. The digest is curated by data scientists, who ensure that what's included isn't just hot air or marketing mumbo-jumbo, but significant news that will shape the global machine learning and reinforcement learning world.

This edition focuses on natural language processing, as the GPT-2 model remains a prominent element of AI-related discourse. It also contrasts the enthusiasm of ML developers with concerns expressed by a renowned professor of psychology.

1200 questions to ask

In natural language processing, a computer needs to generate natural-sounding text in response to a human. That is troublesome at best, especially when a longer text or speech is required.

While these problems are being tackled in various ways, the current gold standard for measuring progress is to run the newest solution against a benchmark. Building a good benchmark, however, is a challenge in itself, to put it mildly.

To tackle it, researchers from the University of Maryland created a set of over 1200 questions that are easy for a human to answer and nearly impossible for a machine. Jumping from "easy" to "impossible" is sometimes a matter of very subtle changes. As the researchers put it:

if the author writes “What composer’s Variations on a Theme by Haydn was inspired by Karl Ferdinand Pohl?” and the system correctly answers “Johannes Brahms,” the interface highlights the words “Ferdinand Pohl” to show that this phrase led it to the answer. Using that information, the author can edit the question to make it more difficult for the computer without altering the meaning of the question. In this example, the author replaced the name of the man who inspired Brahms, “Karl Ferdinand Pohl,” with a description of his job, “the archivist of the Vienna Musikverein,” and the computer was unable to answer correctly. However, expert human quiz game players could still easily answer the edited question correctly.

Capitalizing on this knowledge, researchers will be able to deliver better benchmarks for models and thus determine which parts of a question confuse the computer.
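To see why a subtle paraphrase can break a system, consider a deliberately oversimplified sketch: a "QA system" that answers by matching trigger phrases, a stand-in for the shallow surface patterns real models can over-rely on. The code and data below are purely illustrative, not the Maryland researchers' actual interface:

```python
# Toy QA "system" that answers by matching surface phrases against a small
# fact base -- a deliberately oversimplified stand-in for the shallow
# pattern matching that real QA models can over-rely on.

FACTS = {
    "karl ferdinand pohl": "Johannes Brahms",
    "variations on a theme by haydn": "Johannes Brahms",
}

def toy_answer(question):
    """Return an answer if any known trigger phrase occurs in the question."""
    q = question.lower()
    for phrase, answer in FACTS.items():
        if phrase in q:
            return answer
    return None  # the system is stumped

original = ("What composer's Variations on a Theme by Haydn "
            "was inspired by Karl Ferdinand Pohl?")
# Same meaning, but the giveaway name is replaced by a description of his job:
edited = ("Which composer was inspired by the archivist "
          "of the Vienna Musikverein?")

print(toy_answer(original))  # Johannes Brahms -- the phrase gives it away
print(toy_answer(edited))    # None -- the paraphrase defeats surface matching
```

A real neural model fails for subtler reasons than literal string matching, but the failure mode is analogous: the answer hinges on a memorized surface cue rather than on understanding the question.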

Why does it matter

With each and every breakthrough, researchers get closer to delivering human-level natural language processing. At the same time, however, it is increasingly hard to determine whether a neural network understands the processed text or is just getting better fitted to the benchmark. Were the latter the case, the model would outperform existing solutions while showing no significant improvement in real-life performance.

An example with detailed explanations is available in the video below.

A benchmark updated with these 1200 questions delivers significantly more precise information on a model's ability to process language, and helps spot its drawbacks.

Large GPT-2 released

GPT-2 is probably the hottest of 2019's AI trends, especially considering its groundbreaking results and the controversial decision NOT to make the full model public. Instead, OpenAI, the research lab behind the model, decided to cooperate with selected institutions to find ways to harden the model against potential misuse.

And the threat is serious. According to research published in Foreign Affairs, readers consider GPT-2-written texts nearly as credible and convincing as those written by journalists and published in The New York Times (72% compared to 83%). The articles are thus good enough to be dangerous as a weapon of mass disinformation or a fake news factory – AI can produce a limitless amount of credible-looking text with virtually no effort.

To strike a balance between supporting the global science of AI and protecting the model from being used for maleficent ends, OpenAI is releasing it in iterations, starting with a small version and ultimately aiming to make the full model public with the threat of misuse minimized.
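For context, the "small" and "large" labels in the staged release correspond to model sizes. A back-of-the-envelope parameter count can be reconstructed from the published GPT-2 configurations (layer counts and hidden sizes); the 12·L·d² term is a standard transformer estimate that ignores biases and layer norms, so treat the figures as rough:

```python
# Rough parameter-count estimates for the staged GPT-2 releases, computed
# from the published configurations (number of layers, hidden size).
# 12 * L * d^2 covers the attention (4d^2) and MLP (8d^2) weights per block;
# embeddings add (vocab + context) * d. Biases and layer norms are ignored.

VOCAB, CTX = 50257, 1024  # GPT-2 vocabulary size and context length

def approx_params(n_layer, d_model):
    blocks = 12 * n_layer * d_model ** 2
    embeddings = (VOCAB + CTX) * d_model  # token + position embeddings
    return blocks + embeddings

for name, layers, d in [("small", 12, 768), ("medium", 24, 1024),
                        ("large", 36, 1280), ("full", 48, 1600)]:
    print(f"GPT-2 {name}: ~{approx_params(layers, d) / 1e6:.0f}M parameters")
```

The estimate lands close to the commonly cited sizes (roughly 124M for the small model and about 1.5B for the full one), which is why the staged release is often described in terms of parameter counts.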

Why does it matter

As the research published in Foreign Affairs states, the model produces texts that an unskilled reader will find comparable to journalist-written ones. Image recognition models already outperform human experts at their tasks, but natural language – with all its cultural context, humor and irony – once seemed protected by the unassailable fortress of the human mind.

The GPT-2 model has apparently cracked the gates, and with business applications in sight it may be on the road to delivering human-like performance. The technology just needs to be controlled so it does not fall into the wrong hands.

What is this GPT-2 all about?

The GPT-2 model is, as stated above, one of the hottest topics in AI in 2019. But even specialists can find it hard to grasp the nitty-gritty of how the model works. To make the matter clearer, Jay Alammar has prepared a comprehensive guide to the technology.

Why does it matter

The guide is good enough to allow a person with limited or no knowledge of the matter to understand the nuances of the model. For a moderately skilled data scientist with sufficient computing power and a dataset, it is even sufficient to reproduce the model – for example, to support demand forecasting with NLP. It lets a data scientist broaden his or her knowledge with one comprehensive article – a convenient format indeed.
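As a taste of what the guide covers: the heart of GPT-2 is masked (causal) self-attention, in which each position may only attend to earlier positions – this is what lets the model generate text left to right. A single attention head can be sketched in NumPy; the shapes and weights below are toy values, not the real model's:

```python
import numpy as np

def masked_self_attention(x, Wq, Wk, Wv):
    """Single-head causal self-attention, the building block Alammar's
    guide illustrates. x: (seq_len, d_model); W*: (d_model, d_head)."""
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(k.shape[-1])
    # Causal mask: position i may only attend to positions <= i,
    # so generation cannot peek at future tokens.
    mask = np.triu(np.ones_like(scores, dtype=bool), k=1)
    scores[mask] = -np.inf
    # Numerically stable softmax over each row of scores.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

rng = np.random.default_rng(0)
seq_len, d_model = 4, 8
x = rng.normal(size=(seq_len, d_model))
Wq, Wk, Wv = (rng.normal(size=(d_model, d_model)) for _ in range(3))
out = masked_self_attention(x, Wq, Wk, Wv)
print(out.shape)  # (4, 8)
```

Note that the first output position can only attend to itself, so its output is exactly its own value vector – a direct consequence of the causal mask the guide spends much of its length explaining.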

Doing research is one thing, but sharing the knowledge it affords is a whole different story.

Malicious use, you say?

Jordan Peterson is a renowned professor and psychologist who studies the structure of myth and its role in shaping social behavior. If not quite a household name, he is certainly a public figure and a well-known speaker.

Using deep neural networks, AI researcher Chris Vigorito launched notjordanpeterson.com, a website that let any user type in arbitrary text and have it read aloud in a neural network-generated imitation of Jordan Peterson's voice. As was the case with Joe Rogan, the output was highly convincing, mirroring his manner of speaking, breathing and natural pauses.

The network was trained on 20 hours of transcribed Jordan Peterson speeches – an easy amount to obtain for a public speaker. The amount of work was considerable, but not overwhelming.

Why does it matter

The creation of the neural network is not as interesting as Jordan Peterson's response. He wrote a blog post entitled "I didn't say that", in which he calls the situation "very strange and disturbing". In the post, he notes that while it was fun to hear himself singing popular songs, the prospect of becoming an unwitting part of a scam is all too real. With computing power available at ever more affordable prices and algorithms getting better and less data-hungry, the threat of this technology being used for malicious ends is growing. If you'd like to know just how malicious he means, I'll leave you with this to consider:

I can tell you from personal experience, for what that’s worth, that it is far from comforting to discover an entire website devoted to allowing whoever is inspired to do so produce audio clips imitating my voice delivering whatever content the user chooses—for serious, comic or malevolent purposes. I can’t imagine what the world will be like when we will truly be unable to distinguish the real from the unreal, or exercise any control whatsoever on what videos reveal about behaviors we never engaged in, or audio avatars broadcasting any opinion at all about anything at all. I see no defense, and a tremendously expanded opportunity for unscrupulous troublemakers to warp our personal and collective reality in any manner they see fit.
