Analyzing Large Data Sets with Apache Spark

What’s included
$14.99 / $24.99
Get ready for your exam by enrolling in our comprehensive training course. This course includes a full set of instructional videos designed to equip you with in-depth knowledge essential for passing the certification exam with flying colors.
Pay once, own it forever
Video Courses
Getting Started with Spark
Lectures | Duration |
---|---|
1. Introduction | 2m 16s |
2. How to Use This Course | 1m 41s |
3. [Activity] Getting Set Up: Installing Python, a JDK, Spark, and its Dependencies. | 14m 50s |
4. [Activity] Installing the MovieLens Movie Rating Dataset | 3m 35s |
5. [Activity] Run your first Spark program! Ratings histogram example. | 4m 52s |
Spark Basics and Simple Examples
Lectures | Duration |
---|---|
1. Introduction to Spark | 10m 11s |
2. The Resilient Distributed Dataset (RDD) | 12m 17s |
3. Ratings Histogram Walkthrough | 13m 33s |
4. Key/Value RDDs, and the Average Friends by Age Example | 16m 13s |
5. [Activity] Running the Average Friends by Age Example | 5m 39s |
6. Filtering RDDs, and the Minimum Temperature by Location Example | 8m 10s |
7. [Activity] Running the Minimum Temperature Example, and Modifying it for Maximums | 5m 8s |
8. [Activity] Running the Maximum Temperature by Location Example | 3m 21s |
9. [Activity] Counting Word Occurrences using flatMap() | 7m 28s |
10. [Activity] Improving the Word Count Script with Regular Expressions | 4m 44s |
11. [Activity] Sorting the Word Count Results | 7m 44s |
Advanced Examples of Spark Programs
Lectures | Duration |
---|---|
1. [Activity] Find the Most Popular Movie | 5m 52s |
2. [Activity] Use Broadcast Variables to Display Movie Names Instead of ID Numbers | 8m 23s |
3. Find the Most Popular Superhero in a Social Graph | 4m 29s |
4. [Activity] Run the Script - Discover Who the Most Popular Superhero is! | 6m |
5. Superhero Degrees of Separation: Introducing Breadth-First Search | 7m 54s |
6. Superhero Degrees of Separation: Accumulators, and Implementing BFS in Spark | 6m 44s |
7. [Activity] Superhero Degrees of Separation: Review the Code and Run it | 9m 14s |
8. Item-Based Collaborative Filtering in Spark, cache(), and persist() | 10m 12s |
9. [Activity] Running the Similar Movies Script using Spark's Cluster Manager | 10m 54s |
10. [Exercise] Improve the Quality of Similar Movies | 2m 58s |
Running Spark on a Cluster
Lectures | Duration |
---|---|
1. Introducing Elastic MapReduce | 5m 8s |
2. [Activity] Setting up your AWS / Elastic MapReduce Account and Setting Up PuTTY | 9m 55s |
3. Partitioning | 4m 21s |
4. Create Similar Movies from One Million Ratings - Part 1 | 5m 12s |
5. [Activity] Create Similar Movies from One Million Ratings - Part 2 | 11m 27s |
6. Create Similar Movies from One Million Ratings - Part 3 | 3m 28s |
7. Troubleshooting Spark on a Cluster | 3m 43s |
8. More Troubleshooting, and Managing Dependencies | 5m 47s |
SparkSQL, DataFrames, and Datasets
Lectures | Duration |
---|---|
1. Introducing SparkSQL | 6m 8s |
2. Executing SQL commands and SQL-style functions on a DataFrame | 8m 16s |
3. Using DataFrames instead of RDDs | 5m 52s |
Other Spark Technologies and Libraries
Lectures | Duration |
---|---|
1. Introducing MLlib | 8m 10s |
2. [Activity] Using MLlib to Produce Movie Recommendations | 2m 56s |
3. Analyzing the ALS Recommendations Results | 4m 53s |
4. Using DataFrames with MLlib | 7m 31s |
5. Spark Streaming and GraphX | 7m 36s |