Practical Hadoop Ecosystem - Deepak Vohra

Practical Hadoop Ecosystem (eBook)

A Definitive Guide to Hadoop-Related Frameworks and Tools

Deepak Vohra (Author)

eBook Download: PDF
2016 | 1st ed.
XX, 421 pages
Apress (Publisher)
978-1-4842-2199-0 (ISBN)
EUR 56.99 incl. VAT
(CHF 55.65)
The eBook is sold by Lehmanns Media GmbH (Berlin) at the price in euros incl. VAT.
  • Download available immediately
This book is a practical guide to using the Apache Hadoop projects, including MapReduce, HDFS, Apache Hive, Apache HBase, Apache Kafka, Apache Mahout, and Apache Solr. From setting up the environment to running sample applications, each chapter is a practical tutorial on using an Apache Hadoop ecosystem project. While several books on Apache Hadoop are available, most focus on the main projects, MapReduce and HDFS, and none discusses the other Apache Hadoop ecosystem projects and how they all work together as a cohesive big data development platform.


What you'll learn
  • How to set up the environment in Linux for Hadoop projects using the Cloudera Hadoop Distribution CDH 5
  • How to run a MapReduce job
  • How to store data with Apache Hive and Apache HBase
  • How to index data in HDFS with Apache Solr
  • How to develop a Kafka messaging system
  • How to develop a Mahout User Recommender System
  • How to stream logs to HDFS with Apache Flume
  • How to transfer data from a MySQL database to Hive, HDFS, and HBase with Sqoop
  • How to create a Hive table over Apache Solr

Who this book is for:

The primary audience is Apache Hadoop developers. Prerequisite knowledge of Linux and some knowledge of Hadoop are required.


Deepak Vohra is a coder, developer, programmer, book author and technical reviewer.


Introduction

1. HDFS and MapReduce
  • Hadoop Distributed FileSystem
  • MapReduce Frameworks
  • Setting the Environment
  • Hadoop Cluster Modes
  • Running a MapReduce Job with MR1 Framework
  • Running MR1 in Standalone Mode
  • Running MR1 in Pseudo-Distributed Mode
  • Running MapReduce with YARN Framework
  • Running YARN in Pseudo-Distributed Mode
  • Running Hadoop Streaming

Section II Storing & Querying

2. Apache Hive
  • Setting the Environment
  • Configuring Hadoop
  • Configuring Hive
  • Starting HDFS
  • Starting the Hive Server
  • Starting the Hive CLI
  • Creating a Database
  • Using a Database
  • Creating a Managed Table
  • Loading Data into a Table
  • Creating a Table Using LIKE
  • Adding Data with INSERT INTO TABLE
  • Adding Data with INSERT OVERWRITE
  • Creating a Table Using AS SELECT
  • Altering a Table
  • Truncating a Table
  • Dropping a Table
  • Creating an External Table

3. Apache HBase
  • Setting the Environment
  • Configuring Hadoop
  • Configuring HBase
  • Configuring Hive
  • Starting HBase
  • Starting HBase Shell
  • Creating an HBase Table
  • Adding Data to an HBase Table
  • Listing All Tables
  • Getting a Row of Data
  • Scanning a Table
  • Counting Number of Rows in a Table
  • Altering a Table
  • Deleting a Row
  • Deleting a Column
  • Disabling and Enabling a Table
  • Truncating a Table
  • Dropping a Table
  • Finding if a Table Exists
  • Creating a Hive External Table

Section III Bulk Transferring & Streaming

4. Apache Sqoop
  • Installing MySQL Database
  • Creating MySQL Database Tables
  • Setting the Environment
  • Configuring Hadoop
  • Starting HDFS
  • Configuring Hive
  • Configuring HBase
  • Importing into HDFS
  • Exporting from HDFS
  • Importing into Hive
  • Importing into HBase

5. Apache Flume
  • Setting the Environment
  • Configuring Hadoop
  • Configuring HBase
  • Starting HDFS
  • Configuring Flume
  • Running a Flume Agent
  • Configuring Flume for HBase Sink
  • Streaming MySQL Log to HBase Sink

Section IV Serializing

6. Apache Avro
  • Setting the Environment
  • Creating an Avro Schema
  • Creating a Hive Managed Table
  • Creating a Hive (version prior to 0.14) External Table Stored as Avro
  • Creating a Hive (version 0.14 and later) External Table Stored as Avro
  • Transferring MySQL Table Data as Avro Data File with Sqoop

7. Apache Parquet
  • Setting the Environment
  • Creating an Oracle Database Table
  • Exporting Oracle Database to a CSV File
  • Importing the CSV File in MongoDB
  • Exporting MongoDB Document as CSV File
  • Importing a CSV File to Oracle Database

Section V Messaging & Indexing

8. Apache Kafka
  • Setting the Environment
  • Starting the Kafka Server
  • Creating a Topic
  • Starting a Kafka Producer
  • Starting a Kafka Consumer
  • Producing and Consuming Messages
  • Streaming Log Data to Apache Kafka with Apache Flume
      • Setting the Environment
      • Creating Kafka Topics
      • Configuring Flume
      • Running Flume Agent
      • Consuming Log Data as Kafka Messages

9. Apache Solr
  • Setting the Environment
  • Configuring the Solr Schema
  • Starting the Solr Server
  • Indexing a Document in Solr
  • Deleting a Document from Solr
  • Indexing a Document in Solr with Java Client
  • Searching a Document in Solr
  • Creating a Hive Managed Table
  • Creating a Hive External Table
  • Loading Hive External Table Data
  • Searching Hive Table Data Indexed in Solr

Section VI Machine Learning

10. Apache Mahout
  • Setting the Environment
  • Starting HDFS
  • Setting the Mahout Environment
  • Running a Mahout Classification Sample
  • Running a Mahout Clustering Sample
  • Developing a User Based Recommender System
      • The Sample Data
      • Setting the Environment
      • Creating a Maven Project in Eclipse
      • Creating a User Based Recommender
      • Creating a Recommender Evaluator
      • Running the Recommender
      • Choosing a Recommender Type
      • Choosing a User Similarity Measure
      • Choosing a Neighborhood Type
      • Choosing a Neighborhood Size for NearestNUserNeighborhood
      • Choosing a Threshold for ThresholdUserNeighborhood
      • Running the Evaluator
      • Choosing the Split between Training Percentage and Test Percentage

Publication date (per publisher) 30.9.2016
Additional info XX, 421 p. 311 illus., 293 illus. in color.
Place of publication Berkeley
Language English
Subject area Mathematics / Computer Science > Computer Science > Databases
Mathematics / Computer Science > Computer Science > Networks
Keywords Apache • Apache Hadoop • Apache HBase • Big Data • Cloud • Database • Framework • Hadoop • HBase • Open Source
ISBN-10 1-4842-2199-0 / 1484221990
ISBN-13 978-1-4842-2199-0 / 9781484221990
PDF (watermark)
Size: 26.1 MB

DRM: Digital watermark
This eBook contains a digital watermark and is therefore personalized to you. If the eBook is improperly passed on to third parties, it can be traced back to the source.

File format: PDF (Portable Document Format)
With its fixed page layout, PDF is particularly suitable for technical books with columns, tables, and figures. A PDF can be displayed on almost all devices but is of only limited suitability for small displays (smartphone, eReader).

System requirements:
PC/Mac: You can read this eBook on a PC or Mac. You need a PDF viewer, e.g. Adobe Reader or Adobe Digital Editions.
eReader: This eBook can be read on (almost) all eBook readers. However, it is not compatible with the Amazon Kindle.
Smartphone/Tablet: Whether Apple or Android, you can read this eBook. You need a PDF viewer, e.g. the free Adobe Digital Editions app.

Buying eBooks from abroad
For tax law reasons, we can sell eBooks only within Germany and Switzerland. Regrettably, we cannot fulfill eBook orders from other countries.
