MapR Announces Dedicated Spark Distribution for Advanced Big Data Analytics
By apacciooutlook | Monday, December 03, 2018
SAN JOSE, CA: MapR Technologies, a provider of Apache Hadoop and big data analytics, announces a new enterprise-grade Apache Spark distribution. The offering packages a complete Spark stack engineered to support advanced analytic applications, together with patented innovations in the MapR platform and open source projects that complement Spark.
The new distribution includes Spark's in-memory processing capabilities for big data, enabling faster application development and allowing code reuse across batch, interactive, and streaming applications. The distribution aims to help companies transform the way they leverage their big data.
The distribution enables advanced analytics such as batch processing, machine learning, procedural SQL, and graph computation. Because Spark runs seamlessly on MapR’s platform, it inherits platform features such as web-scale storage, high availability, mirroring, snapshots, and NFS. This makes MapR a stable, production-ready platform for Spark workloads, both on premises and in the cloud.
The Spark-focused distribution can include product extensions such as real-time streaming and operational analytic capabilities, with MapR-Streams, MapR-DB, and Hadoop available as add-ons. Spark’s most popular use cases include building data pipelines and developing advanced analytical applications that leverage machine learning.
MapR will also include the Spark distribution in its Quick Start Solution offerings, which provide pre-built templates, configuration, and installation.
“We’ve built this new distribution to make it easier for customers that leverage the power of Spark for their big data initiatives,” says Anoop Dawar, vice president of product management, MapR Technologies. “We’ve seen significant growth of customers deploying Spark as their primary compute engine. We believe this gives our customers a converged compute and storage engine for batch, analytics, and real-time processing that helps build and deploy applications rapidly.”