Case Studies
Automated provisioning, configuration and management of Hortonworks Data Platform (HDP) cluster
This project delivered automated, programmatic provisioning, configuration and management of a Hortonworks Data Platform (HDP) cluster.
Some of the problems included:
- The Company’s product required automated, programmatic provisioning, configuration and management of a Hortonworks Data Platform (HDP) cluster
- MapReduce, YARN and Oozie jobs in the HDP cluster could only be run from an edge node or via the GUI, not programmatically
- It was not known whether bootstrapping hosts in an HDP cluster required the GUI or could be done programmatically
Some of the solutions applied included:
- Researching and prototyping to understand how provisioning, configuration and management of an HDP cluster and its components could be automated
- Developing a proof of concept for bootstrapping hosts in an HDP cluster programmatically
- Implementing on-demand programmatic provisioning, configuration and management of an HDP cluster from a blueprint and integrating it into the Company’s product
- Implementing automated generation of a graph depicting dependencies between HDP stack components to facilitate planning of HDP cluster topologies
- Implementing programmatic running of MapReduce, YARN and Oozie jobs in the HDP cluster
- Implementing programmatic transfer of data into and out of the HDP cluster via the HDFS web service (HttpFS)
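The case study does not say which interfaces the implementation called, so the following is only a minimal sketch of how blueprint-based provisioning and HttpFS transfers might look from Java. It assumes the Ambari REST API (register a blueprint, then create a cluster from it) and the WebHDFS-compatible REST endpoints that HttpFS exposes. The class name, host names, credentials, blueprint and cluster names, and JSON bodies are all hypothetical placeholders, and the requests are built but never sent.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.net.http.HttpRequest.BodyPublishers;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Illustrative sketch: builds HTTP requests without sending them.
// All hosts, names and credentials below are hypothetical.
class HdpRestSketch {

    // Builds an authenticated POST against the Ambari REST API.
    // Ambari expects an X-Requested-By header on modifying calls.
    static HttpRequest ambariPost(String ambariBase, String path, String json,
                                  String user, String password) {
        String credentials = Base64.getEncoder().encodeToString(
                (user + ":" + password).getBytes(StandardCharsets.UTF_8));
        return HttpRequest.newBuilder(URI.create(ambariBase + "/api/v1" + path))
                .header("Authorization", "Basic " + credentials)
                .header("X-Requested-By", "ambari")
                .POST(BodyPublishers.ofString(json))
                .build();
    }

    // Step 1: register a blueprint (stack, host groups, components) under a name.
    static HttpRequest registerBlueprint(String ambariBase, String blueprintName,
                                         String blueprintJson, String user, String password) {
        return ambariPost(ambariBase, "/blueprints/" + blueprintName, blueprintJson, user, password);
    }

    // Step 2: create a cluster from a template that references the blueprint
    // and maps its host groups to concrete hosts.
    static HttpRequest createCluster(String ambariBase, String clusterName,
                                     String templateJson, String user, String password) {
        return ambariPost(ambariBase, "/clusters/" + clusterName, templateJson, user, password);
    }

    // Upload via HttpFS: PUT with op=CREATE; data=true and the octet-stream
    // content type tell HttpFS that the request body carries the file content.
    static HttpRequest httpFsUpload(String httpFsBase, String hdfsPath,
                                    byte[] data, String hdfsUser) {
        String query = "?op=CREATE&data=true&user.name="
                + URLEncoder.encode(hdfsUser, StandardCharsets.UTF_8);
        return HttpRequest.newBuilder(URI.create(httpFsBase + "/webhdfs/v1" + hdfsPath + query))
                .header("Content-Type", "application/octet-stream")
                .PUT(BodyPublishers.ofByteArray(data))
                .build();
    }

    // Download via HttpFS: GET with op=OPEN on the same path scheme.
    static HttpRequest httpFsDownload(String httpFsBase, String hdfsPath, String hdfsUser) {
        String query = "?op=OPEN&user.name="
                + URLEncoder.encode(hdfsUser, StandardCharsets.UTF_8);
        return HttpRequest.newBuilder(URI.create(httpFsBase + "/webhdfs/v1" + hdfsPath + query))
                .GET()
                .build();
    }

    public static void main(String[] args) {
        String ambari = "http://ambari.example.com:8080";  // hypothetical Ambari server
        String httpFs = "http://httpfs.example.com:14000"; // hypothetical HttpFS gateway
        HttpRequest[] requests = {
            registerBlueprint(ambari, "demo-blueprint", "{}", "admin", "secret"),
            createCluster(ambari, "demo-cluster", "{\"blueprint\":\"demo-blueprint\"}", "admin", "secret"),
            httpFsUpload(httpFs, "/tmp/input.csv", "a,b\n".getBytes(StandardCharsets.UTF_8), "hdfs"),
            httpFsDownload(httpFs, "/tmp/input.csv", "hdfs")
        };
        for (HttpRequest request : requests) {
            System.out.println(request.method() + " " + request.uri());
        }
    }
}
```

In a real integration each request would be passed to a `java.net.http.HttpClient`, and the blueprint and cluster template bodies would carry the actual stack version, host groups and host mappings rather than the empty placeholders used here.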
Technology stack
- Java
- Spring
- OSGi
- vSphere
- VMware VI (vSphere) Java API
- Hortonworks Data Platform (HDP), including:
- Ambari
- HDFS
- HttpFS
- MapReduce
- YARN
- ZooKeeper
- Spark
- Storm
- Pig
- Hive
- Kafka
- Flume
- Oozie
- NiFi
- Hue
- Zeppelin
- SAP HANA Vora
Industry
IT