
Using Flume

Flexible, Scalable, and Reliable Data Streaming
Book | Softcover
230 pages
2014
O'Reilly Media (publisher)
978-1-4493-6830-2 (ISBN)
CHF 49.95 incl. VAT
Looking to use Apache Flume in your environment? This ultimate reference guide not only shows operations engineers how to configure, deploy, and monitor Flume, but also provides developers with detailed explanations and examples for writing Flume plugins and custom components for their specific use cases. This book includes:
  • In-depth explanations of how different Flume components work
  • Detailed examples for customizing Flume using your own code
  • Real-world examples on capacity planning, configuring, and deploying Flume

How can you get your data from frontend servers to Hadoop in near real time?

With this complete reference guide, you’ll learn Flume’s rich set of features for collecting, aggregating, and writing large amounts of streaming data to the Hadoop Distributed File System (HDFS), Apache HBase, SolrCloud, Elastic Search, and other systems.

Using Flume shows operations engineers how to configure, deploy, and monitor a Flume cluster, and teaches developers how to write Flume plugins and custom components for their specific use cases.

You’ll learn about Flume’s design and implementation, as well as various features that make it highly scalable, flexible, and reliable. Code examples and exercises are available on GitHub.

Topics included:
  • Learn how Flume provides a steady rate of flow by acting as a buffer between data producers and consumers
  • Dive into key Flume components, including sources that accept data and sinks that write and deliver it
  • Write custom plugins to customize the way Flume receives, modifies, formats, and writes data
  • Explore APIs for sending data to Flume agents from your own applications (a minimal sketch follows this list)
  • Plan and deploy Flume in a scalable and flexible way—and monitor your cluster once it’s running
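
As a taste of the API mentioned above (covered in depth in Chapter 7, Getting Data into Flume, along with the embedded agent and log4j appenders), here is a minimal sketch, not taken from the book, that uses the Flume client SDK to send a single event to an agent over Avro RPC. The host collector.example.com and port 41414 are placeholder assumptions for an agent running an Avro source:

    import java.nio.charset.StandardCharsets;

    import org.apache.flume.Event;
    import org.apache.flume.EventDeliveryException;
    import org.apache.flume.api.RpcClient;
    import org.apache.flume.api.RpcClientFactory;
    import org.apache.flume.event.EventBuilder;

    public class FlumeRpcExample {
      public static void main(String[] args) throws EventDeliveryException {
        // Placeholder endpoint: a Flume agent with an Avro source on this host/port.
        RpcClient client = RpcClientFactory.getDefaultInstance("collector.example.com", 41414);
        try {
          // Build a Flume event from a plain string body.
          Event event = EventBuilder.withBody("hello from my application", StandardCharsets.UTF_8);
          // Send the event; the agent's channel buffers it until a sink writes it out.
          client.append(event);
        } finally {
          client.close();
        }
      }
    }

The same append call works whether the receiving agent writes to HDFS or HBase or forwards to another agent further down the pipeline; only the agent's configuration changes.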

Hari Shreedharan is a PMC member and committer on the Apache Flume project. As a PMC member, he is involved in making decisions on the direction of the project. Hari is also a Software Engineer at Cloudera, where he works on Apache Flume and Apache Sqoop and helps customers successfully deploy and manage Flume and Sqoop on their clusters by resolving any issues they face. Hari completed his bachelor's degree at Malaviya National Institute of Technology, Jaipur, India, and his master's in Computer Science at Cornell University in 2010.

Chapter 1. Apache Hadoop and Apache HBase: An Introduction
HDFS
Apache HBase
Summary
References
Chapter 2. Streaming Data Using Apache Flume
The Need for Flume
Is Flume a Good Fit?
Inside a Flume Agent
Configuring Flume Agents
Getting Flume Agents to Talk to Each Other
Complex Flows
Replicating Data to Various Destinations
Dynamic Routing
Flume’s No Data Loss Guarantee, Channels, and Transactions
Agent Failure and Data Loss
The Importance of Batching
What About Duplicates?
Running a Flume Agent
Summary
References
Chapter 3. Sources
Lifecycle of a Source
Sink-to-Source Communication
HTTP Source
Spooling Directory Source
Syslog Sources
Exec Source
JMS Source
Writing Your Own Sources*
Summary
References
Chapter 4. Channels
Transaction Workflow
Channels Bundled with Flume
Summary
References
Chapter 5. Sinks
Lifecycle of a Sink
Optimizing the Performance of Sinks
Writing to HDFS: The HDFS Sink
HBase Sinks
RPC Sinks
Morphline Solr Sink
Elastic Search Sink
Other Sinks: Null Sink, Rolling File Sink, Logger Sink
Writing Your Own Sink*
Summary
References
Chapter 6. Interceptors, Channel Selectors, Sink Groups, and Sink Processors
Interceptors
Channel Selectors
Sink Groups and Sink Processors
Summary
References
Chapter 7. Getting Data into Flume*
Building Flume Events
Flume Client SDK
Embedded Agent
log4j Appenders
Summary
References
Chapter 8. Planning, Deploying, and Monitoring Flume
Planning a Flume Deployment
Deploying Flume
Monitoring Flume
Summary
References

Additional info: black & white illustrations
Place of publication: Sebastopol
Language: English
Dimensions: 178 x 233 mm
Weight: 387 g
Binding: paperback
Subject area: Mathematics / Computer Science › Computer Science › Databases
Keywords: Databases • Hadoop
ISBN-10: 1-4493-6830-1 / 1449368301
ISBN-13: 978-1-4493-6830-2 / 9781449368302
Condition: New