
How To Correctly Set Python Version In Spark?

My Spark version is 2.4.0, and it has Python 2.7 and Python 3.7 installed; the default is Python 2.7. Now I want to submit a PySpark program that uses Python 3.7. I tried two ways, but both did not work.

Solution 1:

In my experience, setting the Spark location directly in the Python script is much easier; use findspark for this.

import findspark

spark_location = '/opt/spark-2.4.3/'  # set to your own Spark home
findspark.init(spark_home=spark_location)
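Note that findspark.init() on its own only makes the Spark installation importable; it does not by itself choose the interpreter. One way to pin Python 3 is to set the PYSPARK_PYTHON and PYSPARK_DRIVER_PYTHON environment variables before any SparkContext is created. A minimal sketch (the /usr/bin/python3 path is an assumption, adjust to your interpreter):

import os
import findspark

# Pin both the driver and the executors to Python 3 *before*
# any SparkContext exists; /usr/bin/python3 is an assumed path.
os.environ['PYSPARK_PYTHON'] = '/usr/bin/python3'
os.environ['PYSPARK_DRIVER_PYTHON'] = '/usr/bin/python3'

findspark.init(spark_home='/opt/spark-2.4.3/')  # set your own

from pyspark.sql import SparkSession
spark = SparkSession.builder.appName('py3-job').getOrCreate()

Setting these variables inside the script only works if they are assigned before the session is created, which is why they appear above findspark.init().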

Solution 2:

I encountered the same problem.

Configuring the environment variables at the beginning of the script did not work for me (Spark still did not execute the tasks with the right interpreter).

Without restarting the cluster, simply executing the command below worked for me. It appends export PYSPARK_PYTHON=/usr/bin/python3 to the end of spark-env.sh, so every subsequent job picks up Python 3.

sudo sed -i -e '$a\export PYSPARK_PYTHON=/usr/bin/python3' /etc/spark/conf/spark-env.sh
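Whichever route you take, it is worth verifying that both the driver and the executors actually picked up the new interpreter. A small check, assuming a working PySpark installation:

import sys
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName('version-check').getOrCreate()

# Python version on the driver
print('driver:  ', sys.version)

# Python version on an executor: run a one-element job and
# report sys.version from inside the task itself.
print('executor:', spark.sparkContext
      .parallelize([0], 1)
      .map(lambda _: sys.version)
      .first())

If the two versions disagree, PySpark will typically fail at task time with a "Python in worker has different version" error, so this check catches misconfiguration early.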
