Categories: Data, Kubernetes, Containerization, Automation, Machine Learning, Artificial Intelligence

By: Vidyasagar Machupalli, Technical Offering Manager & Polyglot Programmer

Apache Spark (Spark) is an open source data-processing engine for large data sets.
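To make that concrete, here is a minimal PySpark sketch (assuming only a local `pip install pyspark`; the app name and sample data are made up) that distributes a small aggregation the same way it would distribute one over a much larger data set:

```python
# Minimal PySpark sketch: aggregate a small, made-up data set.
# Assumes a local Spark installation (`pip install pyspark`).
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("spark-sketch").getOrCreate()

# Spark splits this work across executors; on a laptop it runs locally.
sales = spark.createDataFrame(
    [("alice", 120), ("bob", 75), ("alice", 40)],
    ["rep", "amount"],
)
sales.groupBy("rep").agg(F.sum("amount").alias("total")).show()

spark.stop()
```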

IBM Cloud Kubernetes Service is a managed offering that lets you create your own Kubernetes cluster of compute hosts to deploy and manage containerized apps on IBM Cloud.

The following occurs when you run your Python application on Spark against a Kubernetes cluster: spark-submit creates a Spark driver running within a Kubernetes pod; the driver creates executors, which also run within Kubernetes pods, connects to them, and executes the application code; and when the application completes, the executor pods terminate and are cleaned up, while the driver pod persists logs and remains in "completed" state until it is garbage collected or manually cleaned up.

In short, you need three things to complete this journey: an IBM Cloud Kubernetes Service cluster, the Apache Spark distribution (which provides spark-submit), and kubectl access to the cluster.

In this section, you will access the IBM Cloud Kubernetes Service cluster and create a custom serviceaccount and a clusterrolebinding, which grant the Spark driver the permissions it needs to create and manage executor pods.
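A minimal sketch of that setup with the official Kubernetes Python client follows; it assumes `pip install kubernetes`, a kubeconfig that already points at your cluster (for example, after `ibmcloud ks cluster config`), and the illustrative names `spark` and `spark-role` (the kubectl equivalents are `kubectl create serviceaccount spark` and `kubectl create clusterrolebinding spark-role --clusterrole=edit --serviceaccount=default:spark`):

```python
# Sketch: create the serviceaccount and clusterrolebinding for Spark.
# Assumes `pip install kubernetes` and a kubeconfig pointing at the cluster;
# the names "spark" and "spark-role" and the "default" namespace are illustrative.
from kubernetes import client, config

config.load_kube_config()

# Serviceaccount that the Spark driver pod will run as.
client.CoreV1Api().create_namespaced_service_account(
    namespace="default",
    body={"apiVersion": "v1", "kind": "ServiceAccount",
          "metadata": {"name": "spark"}},
)

# Bind the built-in "edit" ClusterRole to that serviceaccount so the
# driver can create, watch, and delete executor pods.
client.RbacAuthorizationV1Api().create_cluster_role_binding(
    body={
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "ClusterRoleBinding",
        "metadata": {"name": "spark-role"},
        "roleRef": {"apiGroup": "rbac.authorization.k8s.io",
                    "kind": "ClusterRole", "name": "edit"},
        "subjects": [{"kind": "ServiceAccount", "name": "spark",
                      "namespace": "default"}],
    },
)
```

The serviceaccount name is later handed to spark-submit through `--conf spark.kubernetes.authenticate.driver.serviceAccountName=spark`, which is what lets the driver pod create executor pods under that identity.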

You can extend PySpark applications like wordcount with Apache Spark Streaming to read and write batches of data to cloud services such as IBM Cloud Object Storage (COS), as sketched below.
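This sketch uses the DStream API and makes two assumptions: the text source is a local socket on port 9999, and the Stocator connector for COS is installed and configured with credentials, so the `cos://mybucket.myservice/...` URI (bucket and service names are placeholders) resolves:

```python
# Sketch: streaming wordcount that writes each micro-batch to COS.
# Assumes a text source on localhost:9999 and a configured Stocator
# connector; the cos:// bucket and service names are placeholders.
from pyspark import SparkContext
from pyspark.streaming import StreamingContext

sc = SparkContext(appName="StreamingWordCount")
ssc = StreamingContext(sc, batchDuration=10)  # one micro-batch every 10 seconds

lines = ssc.socketTextStream("localhost", 9999)
counts = (
    lines.flatMap(lambda line: line.split(" "))
         .map(lambda word: (word, 1))
         .reduceByKey(lambda a, b: a + b)
)

# Each batch is saved as a time-stamped set of text files under the prefix.
counts.saveAsTextFiles("cos://mybucket.myservice/wordcount")

ssc.start()
ssc.awaitTermination()
```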
