
LinkedIn Open-Sources Dr. Elephant Hadoop, Spark Tuning Tool

Thursday, April 14, 2016


FREMONT, CA: LinkedIn has open-sourced Dr. Elephant, a performance monitoring and tuning tool that helps Hadoop and Spark users understand, analyze, and improve their workflows.

Dr. Elephant is a performance monitoring and tuning tool for Hadoop and Spark that automatically gathers a job's metrics, runs analysis on them, and presents the results in a simple way for easy consumption. The goal of the tool is to improve developer productivity and increase cluster efficiency by making jobs easier to tune. It analyzes Hadoop and Spark jobs using a set of pluggable, configurable, rule-based heuristics that provide insights on how a job performed, and then uses the results to suggest how to tune the job so that it runs more efficiently.
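To make "configurable, rule-based heuristic" concrete, here is a minimal Python sketch of one such rule. It is an illustration only, not Dr. Elephant's actual code: the severity names, the `severity_for` and `task_skew_heuristic` functions, and the threshold values are all assumptions chosen for the example.

```python
import statistics
from bisect import bisect_right
from enum import IntEnum


class Severity(IntEnum):
    """Ordered severity scale; the level names are assumptions for this sketch."""
    NONE = 0
    LOW = 1
    MODERATE = 2
    SEVERE = 3
    CRITICAL = 4


def severity_for(value, thresholds):
    """Return the severity for a metric given four ascending thresholds.

    A value below thresholds[0] is NONE; a value at or above thresholds[3]
    is CRITICAL.
    """
    if len(thresholds) != 4 or list(thresholds) != sorted(thresholds):
        raise ValueError("expected four ascending thresholds")
    return Severity(bisect_right(thresholds, value))


# Hypothetical configuration: slowest task runtime divided by the median runtime.
SKEW_THRESHOLDS = [2.0, 4.0, 8.0, 16.0]


def task_skew_heuristic(task_runtimes_s, thresholds=SKEW_THRESHOLDS):
    """Rule: flag jobs where one task runs far longer than the typical task."""
    median = statistics.median(task_runtimes_s)
    ratio = max(task_runtimes_s) / median if median > 0 else 0.0
    severity = severity_for(ratio, thresholds)
    suggestion = (
        "Task runtimes look balanced."
        if severity == Severity.NONE
        else f"Slowest task took {ratio:.1f}x the median; check for skewed input."
    )
    return severity, suggestion
```

Calling `task_skew_heuristic` with a list of per-task runtimes in seconds returns a severity and a one-line suggestion. Because the thresholds are passed in as data, a rule like this could be retuned per cluster without changing its logic.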

LinkedIn's employees have varying levels of Hadoop experience and use different frameworks to run their Hadoop jobs. As the number of Hadoop users grew, holding regular sessions for different users on distinct frameworks no longer worked. LinkedIn could not verify that jobs were achieving optimal performance or guarantee performance coverage, which is why it needed to standardize and automate the process.

Hadoop is an open-source software framework for the distributed storage and processing of large datasets, built from a number of components that interact with each other. Apache Spark is a fast engine for big data processing, with built-in modules for streaming, SQL, machine learning, and graph processing.

How Dr. Elephant Works

At regular intervals, Dr. Elephant gets a list of all recently succeeded and failed applications from the YARN resource manager. The metadata for each application (namely, the job counters, configurations, and task data) is fetched from the Job History server. Once it has all the metadata, Dr. Elephant runs a set of heuristics on it and generates a diagnostic report on how the individual heuristics and the job as a whole performed. Each result is then tagged with one of five severity levels to indicate potential performance problems.
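The cycle described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the project's implementation: `AppData`, `Report`, `run_once`, the counter name, and the severity names are hypothetical, the two fetch functions stand in for calls to the YARN resource manager and the Job History server, and taking the job's overall severity as the worst individual result is an assumption the article does not confirm.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Callable, Dict, Iterable, List


class Severity(IntEnum):
    """Five ordered levels; the names are assumptions for this sketch."""
    NONE = 0
    LOW = 1
    MODERATE = 2
    SEVERE = 3
    CRITICAL = 4


@dataclass
class AppData:
    """Metadata fetched for one finished application (hypothetical shape)."""
    app_id: str
    counters: Dict[str, float]
    conf: Dict[str, str]
    tasks: List[dict]


# A heuristic is any function that inspects the metadata and returns a severity.
Heuristic = Callable[[AppData], Severity]


@dataclass
class Report:
    app_id: str
    per_heuristic: Dict[str, Severity]

    @property
    def overall(self) -> Severity:
        # Assumption: the job is rated by its worst individual heuristic result.
        return max(self.per_heuristic.values(), default=Severity.NONE)


def analyze(app: AppData, heuristics: Dict[str, Heuristic]) -> Report:
    """Run every registered heuristic against one application."""
    return Report(app.app_id, {name: rule(app) for name, rule in heuristics.items()})


def run_once(
    list_recent_apps: Callable[[], Iterable[str]],
    fetch_metadata: Callable[[str], AppData],
    heuristics: Dict[str, Heuristic],
) -> List[Report]:
    """One polling cycle: list finished apps, fetch their metadata, analyze each."""
    return [analyze(fetch_metadata(app_id), heuristics) for app_id in list_recent_apps()]


def too_many_tasks(app: AppData) -> Severity:
    """Example rule with a hypothetical cutoff on task count."""
    return Severity.MODERATE if len(app.tasks) > 3 else Severity.NONE


def has_failed_tasks(app: AppData) -> Severity:
    """Example rule reading a hypothetical 'failed_tasks' counter."""
    return Severity.SEVERE if app.counters.get("failed_tasks", 0) > 0 else Severity.NONE
```

Passing the two fetch functions in as arguments keeps the analysis independent of where the data comes from, so the same cycle can be exercised with in-memory stubs. New rules are added by registering another entry in the `heuristics` dictionary, which is one simple way to get the pluggable behavior the article describes.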

LinkedIn uses Dr. Elephant for many different use cases, including monitoring how a flow is performing on the cluster, understanding why a flow is running slowly, determining what can be tuned to improve a flow and how, comparing a flow against previous executions, and troubleshooting.

Apart from adding and improving heuristics and extending the tool to newer job types, LinkedIn plans to add job-specific tuning suggestions based on real-time metrics, visualizations of jobs' cluster resource usage and trends, better Spark integration, and integration with more schedulers.
