Cloudera, Inc., the provider of the leading
modern platform for machine learning and advanced analytics, announced the
release of Cloudera Altus, a Platform-as-a-Service (PaaS) offering that makes
it easier to run large-scale data processing applications on the public cloud. The
initial Altus service helps data engineers use on-demand infrastructure to
speed the creation and operation of elastic data pipelines that power
sophisticated, data-driven applications.
Data engineering applications like
ETL (Extract, Transform and Load) or batch scoring are often large,
batch-oriented workloads that run for a fixed period of time and help companies
extract critical insights from raw data. Organizations can gain significant
flexibility and efficiency advantages by running these pipelines on elastic
infrastructure. Enterprises want to leverage cloud infrastructure alongside
familiar large-scale data processing tools and technologies.
The Cloudera Altus Data Engineering
service simplifies the development and operation of elastic data pipelines,
putting data engineering jobs front and center and abstracting away infrastructure
management and operations that can be both time-consuming and complex. Altus
also reduces the risk associated with cloud migrations. It provides users with
familiar tools packaged in an open, unified, enterprise-grade platform service
that delivers common storage, metadata, security, and management across
multiple data engineering applications.
“Data engineering workloads are
foundational for today’s data-driven applications,” said Charles Zedlewski,
senior vice president of Products at Cloudera. “Altus simplifies the process of
building and running elastic data pipelines while preserving portability and
making it easy to incorporate data engineering elements into more complex BI,
data science and real-time applications.”
Cloudera makes it easy,
cost-effective, and convenient to deploy these workloads on cloud providers,
such as Amazon Web Services (AWS), taking advantage of cloud elasticity,
low-cost storage and compute options, and rapid provisioning to deliver a
modern data service that can tackle even the most challenging business
problems. Cloud object stores such as Amazon Simple Storage Service (Amazon S3)
are becoming increasingly popular for their resiliency, scalability, and
relatively low cost.
According to IDC, public cloud
deployments are now at 12% of the overall worldwide business analytics software
market and are expected to grow at a 25% CAGR through 2020 [1]. Cloud is one of the fastest-growing deployment
environments for Cloudera customers, and Altus makes it easier than ever to run
data engineering workloads in the cloud.
Features and benefits of Altus
include:
● Managed service for elastic data pipelines - Cloudera
Altus is a PaaS that allows data engineers to easily and quickly provision
Apache Spark, Apache Hive, Hive on Spark, and MapReduce2 capacity on cloud-native
infrastructure. Altus presents intelligent default cluster settings and
environments that dramatically reduce cluster deployment times and operations,
automating processes like cluster provisioning, configuring, and termination.
● Workload orientation - Cloudera Altus centers on data pipelines rather than
clusters or infrastructure, so users can easily submit, clone, and
troubleshoot pipelines with minimal attention paid to the underlying
infrastructure.
● No data silos - The Altus Data Engineering service enables data engineers to
read from and write directly to cloud object storage, as does the rest of
Cloudera’s platform. This data is immediately available for use by other
Cloudera workloads without requiring data replication, ETL, or changes to file
formats. In doing so, users can more easily incorporate data engineering into
their data science, BI, and real-time database applications.
● Backward compatibility and platform portability - Altus
supports multiple versions of CDH, the most widely used open source platform in
the industry. Users can easily move workloads to and from the cloud without
needing to modify their applications. Because CDH is backward compatible across
minor releases, customers can harness the latest innovation from the Apache big
data open source community without fear of breaking their applications from
release to release.
● Built-in workload management - Altus uses workload management to automate and simplify
common operational tasks related to elastic data pipelines. Users can
troubleshoot failed jobs with or without the clusters or compute infrastructure
being present. In addition, Altus’ workload management flags significant
performance deviations and proposes a root cause analysis. In doing so,
customers can run their data pipelines with greater reliability and lower cost.