- Copyright 2015
- Edition: 2nd
- Downloadable Video
- ISBN-10: 0-13-405240-4
- ISBN-13: 978-0-13-405240-3
7+ Hours of Video Instruction
Hadoop Fundamentals LiveLessons provides users, developers, and administrators with a practical introduction to the many facets of Hadoop Versions 1 and 2.
Description
Hadoop Fundamentals LiveLessons provides 7+ hours of video introduction to the Apache Hadoop (Versions 1 and 2) Big Data ecosystem. The tutorial includes background information and explains the core components of Hadoop, including the Hadoop Distributed File System (HDFS), MapReduce, the new YARN resource manager, and YARN Frameworks. In addition, it demonstrates how to use Hadoop at several levels, including the native Java interface, C++ pipes, and the universal streaming program interface. Examples include how to use benchmarks and high-level tools, including the Apache Pig scripting language, the Apache Hive "SQL-like" interface, Apache Flume for streaming input, Apache Sqoop for import and export of relational data, and Apache Oozie for Hadoop workflow management. The steps for easily installing a working Hadoop system on a desktop/laptop, in a Cloud environment, and on a local stand-alone cluster are presented. Installation and administration examples using the powerful Ambari GUI are also included. All software used in these LiveLessons is open source and freely available for your use and experimentation. Finally, the important differences between the Hadoop Version 1 MapReduce and the new Hadoop Version 2 MapReduce Framework are highlighted.
Skill Level
What You Will Learn
- Hadoop design and components
- How the MapReduce process works in Hadoop
- The differences between and advantages of Hadoop Versions 1 and 2
- Key aspects of the new YARN design and Frameworks
- How to use, administer, and program HDFS
- How to run and administer Hadoop programs
- How to write basic MapReduce programs
- How to install Hadoop on a laptop/desktop, on a cluster, or in the Cloud
- How to run Apache Pig, Hive, Flume, Sqoop, and Oozie applications
- How to install and administer Hadoop with the Apache Ambari GUI tool
Who Should Take This Course
- Users, developers, and administrators who are interested in learning the fundamental aspects and operations of the open source Hadoop ecosystem
Course Requirements
- Basic understanding of programming and development
- A working knowledge of Linux systems and tools
- Familiarity with Bash, Java, and C++
Table of Contents
Lesson 1: Background Concepts "Background Concepts" provides background concepts for Hadoop and big data. You learn Hadoop history and design principles along with an introduction to the MapReduce paradigm and the new Hadoop Version 2 YARN architecture. The various components and new YARN Frameworks in the Hadoop ecosystem are introduced.
Lesson 2: Running Hadoop on a Desktop or Laptop "Running Hadoop on a Desktop or Laptop" shows you how to install a real Hadoop working installation in a virtual Linux sandbox. All software is freely available, can be easily installed on a desktop or laptop computer, and can be used for many of the examples in this tutorial. In addition, a more detailed, single-machine installation example using the Apache Software Foundation Version is provided.
Lesson 3: The Hadoop Distributed File System "The Hadoop Distributed File System (HDFS)" introduces you to the distributed storage system of Hadoop. In this lesson, you learn HDFS design basics, how to perform basic file operations, and how to use HDFS in programs. In addition, many of the new features in HDFS--including high availability, federations, snapshots, and NFSv3 access--are presented.
Lesson 4: Hadoop MapReduce "Hadoop MapReduce" presents the MapReduce methodology in more detail using simple command line examples. You also learn how to run a Java MapReduce application on a Hadoop cluster and then learn each step of the full Hadoop MapReduce process.
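As a rough illustration of the paradigm the lesson walks through, the classic word-count job can be modeled in plain Python. This is a local sketch of the map, shuffle/sort, and reduce phases only, not the Hadoop Java API:

```python
from itertools import groupby
from operator import itemgetter

def map_phase(lines):
    # Map: emit a (word, 1) pair for every word in the input.
    for line in lines:
        for word in line.split():
            yield (word.lower(), 1)

def shuffle_phase(pairs):
    # Shuffle/sort: group pairs by key, as Hadoop does between map and reduce.
    return groupby(sorted(pairs), key=itemgetter(0))

def reduce_phase(grouped):
    # Reduce: sum the counts emitted for each word.
    for word, group in grouped:
        yield (word, sum(count for _, count in group))

lines = ["see spot run", "run spot run"]
counts = dict(reduce_phase(shuffle_phase(map_phase(lines))))
print(counts)  # {'run': 3, 'see': 1, 'spot': 2}
```

In a real Hadoop job each phase runs in parallel across the cluster, with the framework handling the shuffle between map and reduce tasks.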
Lesson 5: Hadoop MapReduce Examples "Hadoop MapReduce Examples" teaches you how to write MapReduce programs in almost any language using the Streaming and Pipes interface. You also learn how to run a “grep”-like Hadoop application and some basic debugging techniques. The new Version 2 MapReduce Framework is also introduced, along with examples (including Hadoop Benchmarks), a walk-through of the new Jobs Manager GUI, and new log management tools.
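The Streaming interface mentioned above works with any program that reads records on stdin and writes tab-separated key/value lines to stdout. A minimal word-count mapper/reducer pair in Python might look like the sketch below; with Hadoop these would normally be two scripts passed via the `-mapper` and `-reducer` options, but the same stages can be tested locally with a shell pipe (the `wc.py` name is hypothetical):

```python
import sys

def mapper(stream):
    # Streaming mapper: yield "word<TAB>1" for each word in the input.
    for line in stream:
        for word in line.split():
            yield f"{word.lower()}\t1"

def reducer(stream):
    # Streaming reducer: input arrives sorted by key, so all counts for
    # the same word are adjacent and can be summed with a running total.
    current, total = None, 0
    for line in stream:
        word, count = line.strip().rsplit("\t", 1)
        if word != current:
            if current is not None:
                yield f"{current}\t{total}"
            current, total = word, 0
        total += int(count)
    if current is not None:
        yield f"{current}\t{total}"

if __name__ == "__main__":
    # Local simulation: python wc.py map < input.txt | sort | python wc.py reduce
    stage = mapper if sys.argv[1] == "map" else reducer
    for record in stage(sys.stdin):
        print(record)
```

The `sort` in the local pipe stands in for Hadoop's shuffle/sort phase, which guarantees the reducer sees its keys grouped together.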
Lesson 6: Higher Level Tools "Higher Level Tools" shows you how to use Apache Pig for scripting, Apache Hive for SQL-like access, Apache Flume for capturing streaming input, Apache Sqoop for relational data input/output, and Apache Oozie for workflow management. Each sublesson teaches you the various execution modes and commands needed to use the tools.
Lesson 7: Setting Up Hadoop in the Cloud "Setting Up Hadoop in the Cloud" demonstrates the simple steps needed to start a Hadoop cluster in the Cloud using an open source tool called Apache Whirr.
Lesson 8: Setting Up Hadoop on a Local Cluster "Setting Up Hadoop on a Local Cluster" teaches you how to install Hadoop on a basic four-node cluster. You learn the steps needed to configure, install, start, test, and monitor a fully functional Hadoop cluster. In addition, the new open source Apache Ambari installation and administration tool is explained and demonstrated as part of a real Hadoop deployment.
About LiveLessons Video Training The LiveLessons Video Training series publishes hundreds of hands-on, expert-led video tutorials covering a wide selection of technology topics designed to teach you the skills you need to succeed. This professional and personal technology video series features world-leading author instructors published by your trusted technology brands: Addison-Wesley, Cisco Press, IBM Press, Pearson IT Certification, Prentice Hall, Sams, and Que. Topics include: IT Certification, Programming, Web Development, Mobile Development, Home and Office Technologies, Business and Management, and more. View all LiveLessons on InformIT at:
http://www.informit.com/livelessons