When using PySpark on Windows, I got the following error:
Python in worker has different version 3.9 than that in driver 3.8, PySpark cannot run with different minor versions
File "E:\Anaconda3\envs\tf38\Lib\site-packages\pyspark\python\lib\pyspark.zip\pyspark\worker.py", line 473, in main
Exception: Python in worker has different version 3.9 than that in driver 3.8, PySpark cannot run with different minor versions. Please check environment variables PYSPARK_PYTHON and PYSPARK_DRIVER_PYTHON are correctly set.
Based on the error message, I set the following environment variable:
PYSPARK_PYTHON=E:\Anaconda3\envs\tf38\python.exe
This solved the problem.
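For reference, the same variable can also be set from inside the script, before the SparkSession is created. Below is a minimal sketch assuming local mode and the interpreter path shown above; adjust the path to your own environment.

```python
import os

# Point the workers at the same interpreter as the driver
# (path taken from this post; replace it with your own python.exe).
os.environ["PYSPARK_PYTHON"] = r"E:\Anaconda3\envs\tf38\python.exe"

from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .master("local[*]")
    .appName("pyspark-python-check")
    .getOrCreate()
)

# An RDD map forces Python workers to start; with a mismatched interpreter,
# this is where the "different minor versions" exception above would be raised.
print(spark.sparkContext.parallelize(range(5)).map(lambda x: x * 2).collect())

spark.stop()
```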
PS: According to the official documentation, if PYSPARK_DRIVER_PYTHON is set, it takes precedence for the driver. I did not set it here, so the program simply used PYSPARK_PYTHON.
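If you want to confirm which interpreter each side actually ends up using, a small check like the sketch below can help. It assumes a working local SparkSession; note that the worker-side check only succeeds once the versions already match, otherwise it fails with the same exception as above.

```python
import sys
from pyspark.sql import SparkSession

spark = SparkSession.builder.master("local[*]").appName("which-python").getOrCreate()

# Driver-side interpreter (PYSPARK_DRIVER_PYTHON if set, otherwise PYSPARK_PYTHON).
print("driver python :", sys.executable, sys.version_info[:3])

# Worker-side interpreter (PYSPARK_PYTHON); this tiny job launches a Python worker
# and reports back which executable and version it is running.
worker_info = (
    spark.sparkContext
    .parallelize([0], 1)
    .map(lambda _: (sys.executable, tuple(sys.version_info[:3])))
    .collect()[0]
)
print("worker python :", worker_info)

spark.stop()
```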
On Linux, this is typically set in ~/.bashrc, for example:
export PYSPARK_PYTHON=~/Anaconda3/envs/tf38/python
In addition, I found an article that explains PYSPARK_DRIVER_PYTHON and also covers how to configure it in PyCharm:
Demystify Pyspark_driver_Python Comfortably - Python Pool
End of post.