Apache Spark Basics – Quick Start to Spark

Get introduced to Apache Spark Basics

Course Code : 1339

$1995

Overview

Apache Spark Basics introduces participants to the Spark environment. This fast-paced, two-day course covers Spark's benefits, features, common uses and tools. Throughout the course, participants work in a hands-on learning environment.

Course Delivery

This course is available in the following formats:

Live Classroom
Duration: 3 days

Live Virtual Classroom
Duration: 3 days

What You'll Learn

  • Where Spark fits into the Big Data ecosystem
  • How to use core Spark features for critical data analysis
  • Key Spark technologies such as the Spark shell for interactive data analysis, Spark internals, RDDs, DataFrames and Spark SQL

Outline

  • Background and history
  • Spark and Hadoop
  • Spark concepts and architecture
  • Spark ecosystem (core, Spark SQL, MLlib, streaming)
  • Spark in local mode
  • Spark web UI
  • Spark shell
  • Analyzing dataset – part 1
  • Inspecting RDDs
  • Partitions
  • RDD operations / transformations
  • RDD types
  • MapReduce on RDD
  • Caching and persistence
  • Sharing cached RDDs
  • DataFrames
  • DataFrames DDL
  • Spark SQL
  • Defining tables and importing datasets
  • Queries

Prerequisites

Before taking this course, participants should complete the Java Programming Fundamentals (for Java training), Introduction to Python Programming (for Python training) and Introduction to SQL courses, or have equivalent knowledge and skills. For SQL, basic familiarity is enough; in-depth skills are not required.

Who Should Attend

This is an introductory-level course, geared toward developers and architects seeking to become proficient in Spark tools and technologies. Participants should be experienced developers who are comfortable programming in Java, Scala or Python. Participants should also be able to navigate the Linux command line and have basic knowledge of Linux editors (such as vi or nano) for editing code.

This course is highly recommended for:

  • Lead data scientists
  • Spark developers
  • Software developers
  • Big Data scientists
  • Software architects
  • Java developers
  • Application developers
  • Full stack developers
  • Python developers
