Hadoop

Course Fees: 990 USD / 30,000 INR

VirtualBox/VMware
  1. Basics
  2. Installations
  3. Backups
  4. Snapshots
Linux
  1. Basics
  2. Installations
  3. Commands
Hadoop
  1. Why Hadoop?
  2. Scaling
  3. Distributed Framework
  4. Hadoop vs. RDBMS
  5. Brief history of Hadoop
Hadoop Setup
  1. Pseudo-distributed mode
  2. Cluster mode
  3. IPv6
  4. Installation of Java and Hadoop
  5. Hadoop configuration
  6. Hadoop Processes (NN, SNN, JT, DN, TT)
  7. Temporary directory
  8. UI
  9. Common errors when running a Hadoop cluster, and their solutions
HDFS: Hadoop Distributed File System
  1. HDFS Design, Architecture, and Concepts
  2. Interacting with HDFS using the command line
  3. Interacting with HDFS using Java APIs
  4. Dataflow
  5. Blocks
  6. Replicas
Hadoop Processes
  1. NameNode
  2. Secondary NameNode
  3. JobTracker
  4. TaskTracker
  5. DataNode
MapReduce
  1. Developing a MapReduce Application
  2. Phases in the MapReduce Framework
  3. MapReduce Input and Output Formats
  4. Advanced Concepts
  5. Sample Applications
  6. Combiner
Joining datasets in MapReduce jobs
  1. Map-side join
  2. Reduce-side join
MapReduce customization
Hadoop Programming Languages:
HIVE
  • Introduction
  • Installation and Configuration
  • Interacting with HDFS using HIVE
  • MapReduce Programs through HIVE
  • HIVE Commands
  • Loading, Filtering, Grouping…
  • Data types, Operators…
  • Joins, Groups…
  • Sample programs in HIVE
PIG
  • Basics
  • Installation and Configurations
  • Interacting with HDFS using PIG
  • Commands…
NoSQL Database Concepts
The Motivation for Hadoop
  1. Problems with traditional large-scale systems
  2. Requirements for a new approach
Hadoop: Basic Concepts
  1. An Overview of Hadoop
  2. The Hadoop Distributed File System
  3. Hands-On Exercise
  4. How MapReduce Works
  5. Hands-On Exercise
  6. Anatomy of a Hadoop Cluster
  7. Other Hadoop Ecosystem Components
Writing a MapReduce Program
  1. The MapReduce Flow
  2. Examining a Sample MapReduce Program
  3. Basic MapReduce API Concepts
  4. The Driver Code
  5. The Mapper
  6. The Reducer
  7. Hadoop's Streaming API
  8. Using Eclipse for Rapid Development
  9. Hands-On Exercise
  10. The New MapReduce API
Common MapReduce Algorithms
  1. Sorting and Searching
  2. Indexing
  3. Machine Learning With Mahout
  4. Term Frequency – Inverse Document Frequency
  5. Word Co-Occurrence
  6. Hands-On Exercise