
Have Questions? 8373 99 4242 or 8860 93 4343



Hadoop Developer Training & Certification Course


 
1. Hadoop Ecosystem & Cluster

Available versions: Hadoop 1.x & 2.x
Available Distributions of Hadoop (Cloudera, Hortonworks)
Hadoop Projects & Components
Architecture of Hadoop & Planning for cluster
The Hadoop Distributed File System (HDFS)
Cluster Daemons & Their Functions

  • Name Node
  • Secondary Name Node
  • Data Nodes
  • Application Master and Task Tracker

YARN Responsibilities
Deployment of Hadoop Cluster

2. Cloudera Sandbox or Quick Start

Installation of Cloudera QuickStart
Differences between sandbox and distributed environments
Overview of Apache Hue

3. Map-Reduce, Map-Reduce Streaming (in Java)

All Map-Reduce API Concepts
Architecture of Map-Reduce
Writing Map-Reduce Drivers, Mappers, and Reducers in Java
Speeding Up Hadoop Development by Using Eclipse
Differences between the Old and New Map-Reduce APIs
Writing Mappers and Reducers with the Streaming API
Common questions raised about Map-Reduce
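
As a rough illustration of the Streaming API topic above, here is a minimal word-count sketch in Python. It assumes the usual Streaming convention of tab-separated key/value text records; the function names (mapper, reducer, run_local) are hypothetical and chosen only for this example, and the whole job is simulated in one process rather than run on a cluster.

```python
from itertools import groupby
from operator import itemgetter


def mapper(lines):
    """Emit one 'word<TAB>1' record per word, as a Streaming mapper would print."""
    for line in lines:
        for word in line.strip().lower().split():
            yield f"{word}\t1"


def reducer(sorted_records):
    """Sum the counts for each word.

    The input must already be sorted by key, which is what a reducer
    receives after the shuffle/sort phase.
    """
    pairs = (record.split("\t", 1) for record in sorted_records)
    for word, group in groupby(pairs, key=itemgetter(0)):
        total = sum(int(count) for _, count in group)
        yield f"{word}\t{total}"


def run_local(lines):
    """Simulate map -> sort -> reduce in a single process (assumption: no cluster)."""
    return list(reducer(sorted(mapper(lines))))


if __name__ == "__main__":
    for record in run_local(["the cat", "The hat"]):
        print(record)
```

In an actual Streaming job the mapper and reducer would typically be two separate scripts that read records from standard input and print them to standard output, passed to the Hadoop Streaming jar through its -mapper and -reducer options.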

4. HBase: The Hadoop Database

Problems with RDBMS
Introduction to HBase
Non-RDBMS, Not-Only SQL or No-SQL
Installation & Deployment of HBase
Types of CRUD & Batch Operations
Filters, Counters, Pool
REST Interface & Web UI

5. Hadoop Shell and Commands

Hadoop Developer commands using shell
Map-Reduce job deployment
Oozie workflow design
Job design for different components

6. HCatalog or MetaStore Tables

Introduction to Apache HCatalog
Creating tables using HCatalog
Bulk uploads using MetaStore Tables
Working with semi-structured data
Integration of HCatalog with Hive
Hive SQL query analysis


7. Hive

Problems with No-SQL Databases
Introduction to & Installation of Hive
Hive Schema and Data Storage
Data Types & Introduction to SQL
Hive-SQL: DML & DDL
Hive-SQL: Views & Indexes
Explain and use the various Hive file formats
Use Hive to run SQL-like queries to perform data analysis
Use Hive to join data sets using a variety of techniques, including Map-side joins and Sort-Merge-Bucket joins
Integration to HBase & Cassandra
Sentiment Analysis and N-Grams
Hive Thrift Service

8. Flume

Installation of Flume
Ingesting Data from External Sources with Flume
Configuration of Flume
REST Interfaces
Best Practices for Importing Data

9. Sqoop

Installation of Sqoop
Ingesting Data from External (RDBMS) Sources with Sqoop
Ingesting Data from/to Relational Databases with Sqoop
Integration of Sqoop and HBase
Integration of Sqoop and Hive
Best Practices for Importing Data

10. Conclusion & FAQs
Note:
  • Every Topic has practical session
  • Hadoop uses different components, which are discussed in the relevant sessions:
  • Hue
  • Cloudera Manager
  • Zookeeper
  • Oozie
  • etc.
Prerequisites

This course is best suited to developers and engineers who have at least a little programming experience. Knowledge of Java is not mandatory: any programming language can be used with Hadoop, but experience in at least one language is required to complete the hands-on exercises.

Fees: 14,500 Rs/-
Duration: 1 Month

1. Why Spark?

Problems with Traditional Large-Scale Systems
Introducing Spark
Spark Basics

2. What is Apache Spark?

Using the Spark Shell
Resilient Distributed Datasets (RDDs)
Functional Programming with Spark
Working with RDDs

3. RDD Operations

Key-Value Pair RDDs
MapReduce and Pair RDD Operations
The Hadoop Distributed File System

4. Overview

A Spark Standalone Cluster
The Spark Standalone Web UI
Parallel Programming with Spark

5. RDD Partitions and HDFS Data Locality

Working With Partitions
Executing Parallel Operations
Caching and Persistence

6. RDD Lineage

Caching Overview
Distributed Persistence
Writing Spark Applications

 

7. Spark Applications vs. Spark Shell

Creating the SparkContext
Configuring Spark Properties
Building and Running a Spark Application
Logging
Spark, Hadoop, and the Enterprise Data Center

8. Spark Streaming Overview

Example: Streaming Word Count
Other Streaming Operations
Sliding Window Operations
Developing Spark Streaming Applications
Common Spark Algorithms

9. Shark, Spark SQL

Implement SparkSQL queries to perform several computations

10. Apache Drill – Replacement of Map-Reduce

Installation of Drill
Query data using Apache Drill
Query data from Hadoop/HDFS file system
Drill & HBase integration
Drill & Hive integration & Replacement

Fees: 5,000 Rs/- (Extra)
Duration: 1 Month

Developers
1). File Management

Managing HDFS files with command line
Creating HDFS Snapshots to back up important enterprise datasets
Installation and Configuration of Cloudera/Hortonworks ODBC driver on Windows/Mac
Sorted insert for circular linked list

2). Data Visualization

QlikView - Business Discovery and Visualizing Your Data Using QlikView
Tableau - Visualize Data with Tableau

3). Sentiment, Predictive, and Sensor Data Analysis

Learning Labs of Apache NiFi
Process Data with Apache Hive
Loading and Querying Data with Hadoop
Website Clickstream Data Analysis in QlikView
Refine and Visualize Server Log Data
Social Media and Customer Sentiment Analysis
Analyze Machine and Sensor Data

Contact Us

  • Address: 4-B Pusa Road, Near Karol Bagh Metro Station, Delhi-110005, India

  • Phone: +91-8373994242

  • Email: contact@mappingminds.org
