How artificial intelligence can fight hate speech in social media

January 24, 2019/in Data science, Machine learning /by Konrad Budek

With social media users numbering in the billions, all hailing from various backgrounds and bringing diverse moral codes to today’s wildly popular platforms, a space for hate speech has emerged. Internet service providers have responded by employing AI-powered solutions to address this insidious problem.

Hate speech is a serious issue: it undermines the principles of democratic society and the rules of public debate. Legal views on the matter vary, but on the internet any statement that transgresses the hate-speech standards established by a given platform (Facebook, Twitter, Wikipedia, etc.) may be banned from publication. To get around such bans, numerous groups have launched their own platforms on which to exchange ideas. Definitions of hate speech stricter than the law requires are common: they make users feel safe, which is paramount for social media sites whose income often depends on their users' presence. And that's where building machine learning models that spot hate speech comes in.

What is hate speech?

The definition of hate speech varies by country and may apply to various aspects of language. Laws commonly prohibit directing hateful or defamatory language at another's religion, ethnicity or sexuality. Many countries penalize anyone who incites violence or genocide. Additionally, many legislatures ban the symbols of totalitarian regimes and limit the freedom of assembly for movements espousing ideologies like fascism or communism.


In its most common form, hate speech attacks a person or group based on race, religion, ethnicity, national origin, disability, gender or sexual orientation. As to what is legal, the devil, as usual, is in the details. The balance between freedom of speech and the protection of minority rights makes it difficult to produce a strict definition of hate speech. The problem has certainly grown with the rise of social media: Facebook's 2.27 billion active users, who come from various backgrounds and bring diverse moral codes to the platform, have unwittingly provided a space for hate speech to emerge. Due to the international and flexible nature of the internet, battling online hate speech is a complex task involving many parties. Finally, there is a documented link between offensive name-calling and higher rates of suicide within migrant groups.

Why online hate speech is a problem

As a study from the Pew Research Center indicates, 41% of American adults have experienced some form of online harassment. The most common forms are offensive name-calling (experienced by 27%) and purposeful embarrassment (22%). A significant number of American adults have also experienced physical threats, sustained harassment, stalking and sexual harassment (10%, 7%, 7% and 6%, respectively).
Hate speech itself has serious consequences for online behavior and general well-being: 95% of Americans consider it a serious problem. More than one in four (27%) have reported deciding not to post something after encountering hate speech directed at another user, and 13% have stopped using a given online platform after witnessing harassment. Ironically, though often protected as a form of free speech, hate speech has resulted in muting more than a quarter of internet users.

Who should address the issue

Both public opinion and current practice point to online platforms as the ones responsible for tackling their users' hate speech. According to the Pew Research Center report cited above, 79% of Americans say that online service and social network providers are responsible for addressing harassment. In Germany, companies may face fines of up to €50 million if they fail to remove illegal material, including fake news and hate speech, within 24 hours.


Hate speech is not always as blatant as calling people names. It can come in many subtler forms, posing as neutral statements or even concern. That's why more sophisticated AI models, capable of recognizing even the subtlest forms of hate speech, are called for.

How those models should be built

When building a machine learning-powered hate speech detector, the first challenge is to build and label the dataset. Given that the difference between hate speech and non-hate speech is highly contextual, constructing the definition and managing the dataset is a huge challenge. Whether a statement counts as hate speech may depend on:

  • The context of the discussion – historical texts full of outdated expressions may be automatically (yet falsely) classified as hate speech.
    • Example: Mark Twain’s novels use insulting language; citing them may set off hate-speech alarm bells.
  • How the language is used – in many countries, hate speech used for artistic purposes is tolerated.
    • Example: hip-hop often uses misogynistic language, while heavy metal (especially its more extreme sub-genres) is rife with anti-religious lyrics.
  • The speaker’s relationship to the targeted group – members of a group are afforded more liberty in using aggressive or insulting language when addressing other members of that group than outsiders are.
    • Example: the term “sans-culottes” was originally coined to ridicule the opponents of conservatives. It literally meant “without breeches” and was aimed at the working class, whose members wore long trousers instead of the fashionable knee-breeches. The term went on to enter the vernacular of the working classes in spite of its insulting origins.
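The contextual factors above suggest that a useful training set needs more than raw text and a binary label. A minimal sketch of what an annotation record might look like — all field names here are hypothetical, not a real deepsense.ai schema:

```python
from dataclasses import dataclass

# Hypothetical annotation record: the contextual factors listed above
# become explicit fields that annotators fill in and models can use.
@dataclass
class AnnotatedComment:
    text: str                 # the raw comment
    label: str                # e.g. "hate", "offensive", "neutral"
    discussion_context: str   # e.g. "historical quote", "song lyrics"
    speaker_in_group: bool    # is the speaker a member of the targeted group?
    is_sarcastic: bool        # annotator's judgement on irony/sarcasm

# A Mark Twain citation would be labeled neutral despite its wording,
# because the context field records that it is a historical quote.
example = AnnotatedComment(
    text="Quoted passage from a 19th-century novel ...",
    label="neutral",
    discussion_context="historical quote",
    speaker_in_group=False,
    is_sarcastic=False,
)
print(example.label)  # "neutral"
```

Capturing context as structured fields also lets annotators disagree about the text while agreeing on the facts of the situation, which makes labeling disputes easier to resolve.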

Irony and sarcasm pose yet another challenge. According to Poe’s law, without smileys or other overt signals from the writer, ironic statements made online are indistinguishable from serious ones. In fact, the now-ubiquitous emoticon was proposed at Carnegie Mellon University precisely to avoid such misunderstandings.
Once the dataset is ready, building a blacklist of the most common epithets and hate-related slurs may be helpful: in our in-house benchmarks, automation-based blacklist models spotted online hate speech about 60% of the time. Building supervised and unsupervised learning models that spot new combinations of harmful words, or new uses of existing ones, may raise that effectiveness further. Hate speech is dynamic and evolves rapidly as new forms and insults emerge. By keeping an eye on conversations and general discourse, machine learning models can spot suspicious new phrases and alert administrators.
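A blacklist baseline of the kind described above can be very simple. The sketch below uses placeholder terms rather than a real lexicon; the key detail is matching on word boundaries, so that an innocent word containing a slur as a substring does not trigger a false alarm:

```python
import re

# Placeholder terms standing in for a real slur lexicon.
BLACKLIST = ["slur1", "slur2"]

# Compile word-boundary patterns so substrings inside longer,
# innocent words do not match.
PATTERNS = [
    re.compile(rf"\b{re.escape(term)}\b", re.IGNORECASE)
    for term in BLACKLIST
]

def blacklist_flag(comment: str) -> bool:
    """Return True if any blacklisted term appears as a whole word."""
    return any(p.search(comment) for p in PATTERNS)

print(blacklist_flag("An innocent sentence"))        # False
print(blacklist_flag("Some text with SLUR1 in it"))  # True
```

A baseline like this catches only verbatim matches — which is exactly why it tops out around the 60% mark mentioned above and needs learned models on top of it to handle novel spellings and euphemisms.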

A formula of hate

An automated machine learning model is able to spot the patterns of hate speech based on word vectors and the positions of words with certain connotations. Thus, it is easier to spot emerging hate speech that went undetected earlier, as current politics or social events may trigger new forms of online aggression.
Unfortunately, people spreading hate have shown serious determination to beat automated detection systems, combining common ways of fooling machines (such as acronyms and euphemisms) to keep perpetuating hate.
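The word-vector idea above can be illustrated with a toy nearest-neighbour search: words used in similar contexts end up with similar vectors, so a fresh euphemism can surface as the closest neighbour of a known slur. The vectors below are made up for the example; a real system would use embeddings trained on the platform's own conversations:

```python
import math

# Toy 3-dimensional "embeddings"; all values are invented.
vectors = {
    "known_slur":  [0.9, 0.1, 0.0],
    "euphemism_x": [0.8, 0.2, 0.1],  # hypothetical new coinage
    "weather":     [0.0, 0.1, 0.9],  # unrelated everyday word
}

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

# Rank every other word by similarity to a known slur; the closest
# candidate is flagged for human review, not banned automatically.
seed = vectors["known_slur"]
candidates = {w: cosine(seed, v) for w, v in vectors.items() if w != "known_slur"}
suspicious = max(candidates, key=candidates.get)
print(suspicious)  # "euphemism_x"
```

Flagging for review rather than auto-banning matters here: embedding neighbours are suggestive, not conclusive, and a human still has to judge the context.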

Challenges and concerns

One of the main concerns in building machine learning models is finding a balance between the model’s vigilance and the number of false positives it returns. Given the uneasy relationship between hate speech and freedom of speech, a model that produces too many false positives may be seen by users as an instrument of censorship.
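The vigilance vs. false-positive trade-off usually comes down to a single decision threshold on the model's score. The numbers below are invented purely to show the mechanics:

```python
# Invented (score, is_actually_hate) pairs standing in for model output.
scored = [
    (0.95, True), (0.80, True), (0.60, False), (0.40, False), (0.30, True),
]

def flagged(threshold):
    """Comments the model would flag at a given decision threshold."""
    return [(s, y) for s, y in scored if s >= threshold]

for t in (0.5, 0.25):
    hits = flagged(t)
    false_pos = sum(1 for _, y in hits if not y)
    missed = sum(1 for s, y in scored if y and s < t)
    print(f"threshold={t}: flagged={len(hits)}, "
          f"false positives={false_pos}, missed hate={missed}")
```

Lowering the threshold from 0.5 to 0.25 catches the hateful comment the stricter setting missed, but doubles the number of innocent comments flagged — the censorship concern in miniature. Where to place the threshold is ultimately a policy decision, not a modeling one.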
Another challenge is building the dataset and labeling the data used to train the model. While the machines themselves are neutral, the person responsible for the dataset may be biased, or may be pressured to profile the hate speech recognition model. A model could thus be built to purposefully produce false positives in order to reduce the prevalence of certain views in a discussion.
