Big data with Spark for data scientists

Machine learning training


Python syntax

Skills your team will gain

An understanding of challenges in optical character recognition problems.

Experience in creating OCR solutions using modern deep learning methods.


1 day


Part 1

Introduction to Spark

  • MapReduce paradigm in Spark
  • Broadcasts and accumulators
  • Caching
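As a taste of the MapReduce paradigm listed above, here is a minimal plain-Python sketch of a word count split into map, shuffle, and reduce phases. It does not use the Spark API; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) and the sample sentences are assumptions made purely for illustration.

```python
from collections import defaultdict
from functools import reduce


def map_phase(lines):
    # Emit a (word, 1) pair for every word in every line.
    return [(word.lower(), 1) for line in lines for word in line.split()]


def shuffle_phase(pairs):
    # Group values by key, mimicking the shuffle between map and reduce.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    # Combine each key's values with an associative function (addition here).
    return {key: reduce(lambda a, b: a + b, values) for key, values in groups.items()}


def word_count(lines):
    # Hypothetical helper chaining the three phases together.
    return reduce_phase(shuffle_phase(map_phase(lines)))


counts = word_count(["spark makes big data simple", "big data with spark"])
```

In Spark the same three phases run distributed across a cluster; this single-process version only shows the shape of the computation.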

Part 2

Spark SQL

  • Dataframes
  • RDDs vs Dataframes vs Datasets
  • User-defined functions

Part 3

Data science in Spark – Spark MLlib

  • Machine learning pipelines
  • Data preparation
  • Linear and logistic regression
  • Random forests
  • Evaluation, cross validation
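To illustrate the idea behind the cross-validation topic above, the following is a small plain-Python sketch of how k-fold splitting partitions sample indices. It is not Spark's own API; the helper name `k_fold_indices` and the example sizes are hypothetical.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for each of the k folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, validation
        start += size


folds = list(k_fold_indices(10, 3))
```

Each sample lands in exactly one validation fold, so every observation is used for both training and evaluation across the k rounds.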

