
Have Questions? 8373 99 4242 or 8860 93 4343



Hadoop Administrator Certification Course


 
1. The Motivation & Limitation for Hadoop

Motivation of Hadoop
Big data features and challenges
Problems with Traditional Large-Scale Systems
Why Hadoop & Hadoop Fundamental Concepts
Comparison between Hadoop and RDBMS
Is Hadoop replacing RDBMS?
History of Hadoop with Hadoopable problems
Limitation of Hadoop

2. Hadoop Ecosystem & Cluster

Available Versions: Hadoop 1.x & 2.x
Available Distributions of Hadoop (Cloudera, Hortonworks)
Hadoop Projects & Components
Architecture of Hadoop & Planning for cluster
The Hadoop Distributed File System (HDFS)
Cluster Daemons & Its Functions

  • Name Node
  • Secondary Name Node
  • Data Nodes
  • Application Master and Task Tracker
  • Namespace federation

YARN Responsibilities
Deployment of Hadoop Cluster

3. Linux Essentials

Installation of Linux (Red Hat)
Basic Linux configurations
Basic Linux commands

  • Passwordless SSH
  • IP address and hostname
  • Firewall and SELinux
  • Yum and creating a yum repository
  • NTP configuration

4. Planning Your Hadoop Cluster

Installation Prerequisites
General Planning Considerations
Choosing the Right Hardware
Network Considerations
Configuring Nodes
Planning for Cluster Management

5. Installation & Deployment of Hadoop

Deployment Types
Setting up Cloudera repository
Installation for Cloudera Manager
Installing Hadoop (Cloudera)
Setting up Cloudera Hadoop environment
Specifying the Hadoop Configuration
Performing Initial HDFS Configuration
Performing Initial YARN and MapReduce Configuration
Hadoop Logging & Cluster Monitoring

6. Third-Party Vendor Solutions
  • Cloudera Manager
  • Ambari
  • HUE

7. Configuration of services
  • Configuring Services
  • Configuring HDFS
  • Configuring Hadoop Operating System (YARN) & MapReduce
  • Configuring ZooKeeper
  • Configuring Hive
  • Configuring Pig
  • Configuring Schedulers
  • Hadoop Logging
8. Advanced Cluster Configuration
  • Advanced Configuration Parameters
  • Configuring Hadoop Ports
  • Explicitly Including and Excluding Hosts
  • Rack Awareness and Topology
  • Name Node Federation Architecture
  • Name Node High-Availability (HA) Architecture
9. Hadoop Security
  • Why Hadoop Security Is Important
  • Hadoop’s Security System Concepts
  • What Kerberos Is and How It Works
  • Securing a Hadoop Cluster with Kerberos
10. Managing and Scheduling Jobs
  • Managing Running Jobs
  • Scheduling Hadoop Jobs
  • Configuring the Fair Scheduler
11. Cluster Maintenance
  • Checking HDFS Status
  • Copying Data between Clusters
  • Adding and Removing Cluster Nodes
  • Rebalancing the Cluster
  • Cluster Upgrading
12. Sqoop, Flume & HDFS Client
  • Sqoop & Flume installation
  • Ingesting Data from External (RDBMS) Sources with Sqoop
  • Ingesting Data from/to Relational Databases with Sqoop
  • Ingesting Data from External Sources with Flume
  • Integration of Sqoop and HBase
  • Integration of Flume and HBase
  • Integration of Sqoop and Hive
  • Best Practices for Importing Data
13. Conclusion & FAQs
Note:
  • Every topic has a practical session
  • Hadoop uses several supporting components, which are discussed in the relevant sessions:
  • Hue
  • Cloudera Manager
  • ZooKeeper
  • Oozie
  • etc.

Prerequisites

This course is best suited to developers and engineers who have at least a little programming experience. Knowledge of Java is not mandatory. Any programming language can be used with Hadoop, and experience in one is required to complete the hands-on exercises.

Fees: Rs. 11,500/-
Duration: 1 Month

 

Setup of Data Centers
1). Cloudera

CDH 5 Installation for Apache Hadoop developers and system administrators interested in Hadoop installation.
Describes installation and configuration of Cloudera CDH 5.x on multiple machines.
Deploy all 21 components, such as HBase, Hive, Sqoop and Flume, on data center machines.
Learning Labs: Planning & Deployment, Monitoring, Performance Tuning, Security using Kerberos, HDFS High Availability using Quorum Journal Manager (QJM), and Oozie, HCatalog/Hive Administration.

2). Hortonworks

HDP 2.x Installation for Apache Hadoop developers and system administrators interested in Hadoop installation.
Describes installation and configuration of Hortonworks HDP 2.x on multiple machines.
Deploy components such as HBase, Hive, Sqoop and Flume on data center machines.
Learning Labs: Planning & Deployment, Monitoring, Performance Tuning, Security using Kerberos, HDFS High Availability using Quorum Journal Manager (QJM), and Oozie, HCatalog/Hive Administration.
Introduction to Apache Ambari for deploying and managing Apache Hadoop
Securing your Hadoop infrastructure with Apache Knox

Note: The Hortonworks deployment is the same as the Cloudera one, only with a different flavour of Hadoop.

3). Apache Hadoop

Set up a Hadoop cluster of at least 3-4 nodes:
Node 1 - Name Node, Other Master Services
Node 2 - Secondary Name Node, Resource Manager
Node 3 - Data Node, Task Tracker
Node 4 - Data Node, Task Tracker
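
The four-node layout above can be written down as data and sanity-checked before any machine is configured. The following is a minimal Python sketch under stated assumptions: the hostnames node1 to node4 and both helper functions are hypothetical and not part of the course material, and the script only checks the plan; it does not talk to Hadoop.

```python
# Planned role assignment for a small cluster.
# Hostnames (node1..node4) are hypothetical placeholders; the roles
# mirror the node list above ("Other Master Services" is omitted).
CLUSTER_LAYOUT = {
    "node1": ["NameNode"],
    "node2": ["SecondaryNameNode", "ResourceManager"],
    "node3": ["DataNode", "TaskTracker"],
    "node4": ["DataNode", "TaskTracker"],
}


def hosts_with_role(layout, role):
    """Return the sorted hostnames that are planned to run the given daemon."""
    return sorted(host for host, roles in layout.items() if role in roles)


def check_layout(layout):
    """Return a list of problems found in the plan (empty if none)."""
    problems = []
    # Each master daemon should be assigned to exactly one host.
    for role in ("NameNode", "SecondaryNameNode", "ResourceManager"):
        count = len(hosts_with_role(layout, role))
        if count != 1:
            problems.append(f"expected exactly one {role}, found {count}")
    # At least one worker host is needed to store data.
    if not hosts_with_role(layout, "DataNode"):
        problems.append("no DataNode hosts defined")
    return problems


if __name__ == "__main__":
    print("Worker hosts:", hosts_with_role(CLUSTER_LAYOUT, "DataNode"))
    print("Problems:", check_layout(CLUSTER_LAYOUT) or "none")
```

The list returned by `hosts_with_role(CLUSTER_LAYOUT, "DataNode")` is the kind of host list an administrator would maintain for the worker nodes when adding or removing machines.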

HDFS High Availability using Quorum Journal Manager (QJM)
Hive installation on HDFS
Security implementation with Kerberos
Apache Ambari to add new nodes to your existing cluster

Contact Us

  • Address: 4-B Pusa Road, Near Karol Bagh Metro Station, Delhi-110005, India

  • Phone: +91-8373994242

  • Email: contact@mappingminds.org
