This page explains how to install the DC/OS Apache Spark service.
Prerequisites
- DC/OS and DC/OS CLI installed, with a minimum of three agent nodes, each with 8 GB of memory and 10 GB of disk space.
- Depending on your security mode, Spark requires service authentication for access to DC/OS. See Provisioning a service account for more information.

  | Security mode | Service account |
  |---------------|-----------------|
  | Disabled      | Not available   |
  | Permissive    | Optional        |
  | Strict        | Required        |
- Install the Spark package. This may take a few minutes. This step installs the Spark DC/OS service, Spark CLI, dispatcher, and, optionally, the history server. See the History Server section for information about how to install the history server.
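A minimal sketch of the install command, assuming the default package name spark and default options:

```bash
# Install the Spark service from the DC/OS package catalog.
# Pass --options=<your-config.json> to supply a custom configuration, if needed.
dcos package install spark
```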
Expected output:
- Run the sample SparkPi jar for DC/OS. You can view the example source here.
- Use the following command to run a Spark job which calculates the value of Pi.
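A sketch of the submit command, assuming <spark-examples-jar-url> stands in for the URL of the Spark examples jar linked above, with 30 as an arbitrary argument passed to SparkPi:

```bash
# Submit the Scala SparkPi example via the DC/OS Spark CLI.
# Replace <spark-examples-jar-url> with the URL of the examples jar.
dcos spark run --submit-args="--class org.apache.spark.examples.SparkPi <spark-examples-jar-url> 30"
```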
Expected output:
- View the standard output from your job:
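For example, using the task log command shown under Next steps, where <submissionId> is the ID reported when the job was submitted:

```bash
# Fetch the completed driver's output by its submission ID.
dcos task log --completed <submissionId>
```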
Expected output should contain:
- Run a Python SparkPi job. You can view the example source here.
- Use the following command to run a Python Spark job which calculates the value of Pi.
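A sketch of the submit command, assuming <pi.py-url> stands in for the URL of the Python Pi example script linked above; for a Python job the script URL is passed directly, with no --class flag:

```bash
# Submit the Python Pi example via the DC/OS Spark CLI.
dcos spark run --submit-args="<pi.py-url> 30"
```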
Expected output:
- View the standard output from your job:
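As before, the driver's output can be retrieved with the submission ID reported when the job was submitted:

```bash
dcos task log --completed <submissionId>
```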
Expected output should contain:
- Run an R job. You can view the example source here.
- Use the following command to run an R job.
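A sketch of the submit command, assuming <r-example-url> stands in for the URL of the R example script linked above:

```bash
# Submit the R example script via the DC/OS Spark CLI.
dcos spark run --submit-args="<r-example-url>"
```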
Expected output:
- Use the following command to view the standard output from your job.
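As with the previous jobs, substituting the submission ID reported for the R job:

```bash
dcos task log --completed <submissionId>
```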
Expected output:
Next steps
- To view the status of your job, run the dcos spark webui command, then visit the Spark cluster dispatcher UI at http://<dcos-url>/service/spark/.
- To view the logs, see the documentation for Mesosphere DC/OS monitoring.
- To view details about your Spark job, run the dcos task log --completed <submissionId> or dcos spark log --completed <submissionId> command.