The course is designed to be accessible to anyone with a reasonable knowledge of basic Java. You will need to be able to write classes and create objects. Our Java Fundamentals course covers all the Java knowledge you need for this course.
Important note for Windows users: Hadoop is difficult to install on Windows, so in the course we show you how to set up a virtual machine running Linux. No prior knowledge of Linux is needed.
Having problems? Check the errata for this course.
| # | Chapter | Duration | Description |
|---|---|---|---|
| 1 | Welcome (preview) | 10m 49s | A brief overview chapter, with a preview of the work we're going to be doing. |
| 2 | Introducing Hadoop | 16m 12s | An overview of what Hadoop is and an introduction to the concept of map-reduce. |
| 3 | The map-reduce programming model (preview) | 20m 45s | A deeper look at the map-reduce programming model. |
| 4 | Operating modes & installation environment | 25m 10s | Understanding the operating modes of Hadoop and getting ready to install (including setting up a virtual machine if needed). |
| 5 | Installing Hadoop | 40m 0s | Installing Hadoop and configuring it for both standalone and pseudo-distributed modes. |
| 6 | Writing our first map-reduce job | 52m 36s | Using a generic map-reduce template to create a real Hadoop job. |
| 7 | HDFS (preview) | 24m 49s | Understanding the Hadoop file system and how to copy files into and out of it from the command line. |
| 8 | Running in Pseudo-Distributed Mode | 11m 26s | Running larger jobs in pseudo-distributed mode. Viewing the Hadoop Web User Interface. |
| 9 | Map-reduce process flow 1 | 40m 36s | Look at the steps in a map-reduce job in more detail. Learn about the shuffle process and adding a combiner class. |
| 10 | Map-reduce process flow 2 | 14m 38s | An exercise to practice with the full map-reduce workflow. |
| 11 | Enhancing Map and Reduce | 23m 41s | An overview of the built-in map and reduce functions, and learning to create custom key and value data types. |
| 12 | Job Configuration | 25m 11s | Understanding Hadoop file formats, and using the tool runner template to set command line parameters. |
| 13 | Case Study 1 - Part 1 | 53m 8s | An explanation of the first major case study, using real-world data, together with a walkthrough of the first two tasks. |
| 14 | Case Study 1 - Part 2 | 9m 16s | Walkthrough of task 3 in our case study. |
| 15 | Case Study 1 - Part 3 | 9m 13s | Walkthrough of task 4 in our case study. |
| 16 | Chaining Multiple Map-Reduce Jobs | 27m 27s | Learning to automate the chaining of jobs with the JobControl object. Using the sequence file format. |
| 17 | Pre and Post Processing | 47m 39s | Using the ChainMapper and ChainReducer objects to add additional Map steps. |
| 18 | Optimising Map-Reduce jobs | 29m 46s | Looking at multiple ways to improve the efficiency of Map-Reduce jobs. |
| 19 | Log Files & Counters | 36m 28s | Learning to use log files and counters as a tool to debug map-reduce code. |
| 20 | Working with relational databases | 56m 11s | Reading from and writing to relational databases using JDBC. |
| 21 | Unit testing | 40m 56s | Using JUnit to test map-reduce code with the MRUnit library. |
| 22 | Secondary Sorting | 36m 11s | Understanding how to sort the values before the reduce phase. |
| 23 | Joining data | 51m 56s | Joining two data sets with a reduce-side join. |
| 24 | Using Amazon Elastic Map Reduce | 40m 38s | Using the Amazon EMR cloud-based Hadoop platform to run map-reduce jobs. |
| 25 | Case Study 2 | 42m 45s | Our second major case study, based on a real-world use of Hadoop. |
| 26 | Course Summary | 14m 47s | Review of what we've learned, and ideas for where to go next. |