Advanced Apache Spark

Duration: 3.0 days

Method: Instructor led, Hands-on workshops

Price: $1920.00

Course Code: AP3001


Description

Advanced Apache Spark teaches attendees advanced Spark skills. Attendees discover how to integrate Spark with Cassandra, plan cluster resources, develop data workflows, measure performance, and more.

Objectives

Upon successful completion of this course, the student will be able to:

  • Build on Spark fundamentals to gain a deeper understanding of Spark internals
  • Apply operational tweaks to get maximum performance from Spark
  • Use GraphX for graph processing and MLlib for machine learning

Prerequisites

This course is intended for developers who have taken an Introduction to Spark course or who have equivalent experience.

Topics

  • I. Introduction
    • Spark integration with Cassandra (other compatible NoSQL implementations can be substituted if supported)
    • Advanced Spark SQL and Spark Streaming
    • Implementing Spark on DataStax and Hortonworks
    • Cluster resource requirements
    • Debugging/troubleshooting Spark apps
    • Developing data workflows
    • Performance metrics
    • Case studies