Aug-14-2023, 10:17 AM
Hello!
Below is my dev environment:
OS: Windows 11
Python: Anaconda3
Apache Spark: 3.4.1
IDE: Visual Studio Code 1.18.1

I am trying to integrate Apache Spark (PySpark) into VS Code, so I set the Python and Spark paths in the VS Code settings.json file:
"python.defaultInterpreterPath": "C:\\Anaconda3\\python.exe",
"python.condaPath": "C:\\Anaconda3\\Scripts\\conda.exe",
"terminal.integrated.env.windows": {
"PYTHONPATH": "C:/spark-3.4.1-bin-hadoop3/python;C:/spark-3.4.1-bin-hadoop3/python/pyspark;C:/spark-3.4.1-bin-hadoop3/python/lib/py4j-0.10.9.7-src.zip;C:/spark-3.4.1-bin-hadoop3/python/lib/pyspark.zip"
},
"python.autoComplete.extraPaths": [
"C:\\spark-3.4.1-bin-hadoop3\\python",
"C:\\spark-3.4.1-bin-hadoop3\\python\\pyspark",
"C:\\spark-3.4.1-bin-hadoop3\\python\\lib\\py4j-0.10.9.7-src.zip",
"C:\\spark-3.4.1-bin-hadoop3\\python\\lib\\pyspark.zip"
],
"python.analysis.extraPaths": [
"C:\\spark-3.4.1-bin-hadoop3\\python",
"C:\\spark-3.4.1-bin-hadoop3\\python\\pyspark",
"C:\\spark-3.4.1-bin-hadoop3\\python\\lib\\py4j-0.10.9.7-src.zip",
"C:\\spark-3.4.1-bin-hadoop3\\python\\lib\\pyspark.zip"
]

PySpark works without errors with this configuration. The problem appears when I import the regular Python pandas package:

import pandas as pd

This simple statement throws the error below:

Error:
import pandas as pd
File "C:\spark-3.4.1-bin-hadoop3\python\pyspark\pandas\__init__.py", line 29, in <module>
from pyspark.pandas.missing.general_functions import MissingPandasLikeGeneralFunctions
File "C:\spark-3.4.1-bin-hadoop3\python\pyspark\pandas\__init__.py", line 34, in <module>
require_minimum_pandas_version()
File "C:\spark-3.4.1-bin-hadoop3\python\pyspark\sql\pandas\utils.py", line 37, in require_minimum_pandas_version
if LooseVersion(pandas.__version__) < LooseVersion(minimum_pandas_version):
^^^^^^^^^^^^^^^^^^
AttributeError: partially initialized module 'pandas' has no attribute '__version__' (most likely due to a circular import)

As you can see, the pandas module being imported is not the regular Python pandas package but the pyspark.pandas module, which causes the error. I set the default Python interpreter to Anaconda3 in the first line of settings.json, but the error still occurs. Any reply would be appreciated.
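
In case it helps with diagnosis, here is a minimal sketch for listing which search-path entries could satisfy `import pandas`. The helper name `find_module_candidates` is hypothetical (it is not part of PySpark or pandas), and it is a simplification: it only looks at plain directories, so zip archives and namespace packages are ignored. Python uses the first matching entry, so the order of the result matters.

```python
import os
import sys


def find_module_candidates(module_name, search_paths=None):
    """Return the entries of search_paths that contain module_name, in order.

    Hypothetical helper for illustration only. An entry matches if it holds
    either a package directory (module_name/__init__.py) or a single-file
    module (module_name.py). Zip archives and namespace packages are skipped.
    """
    if search_paths is None:
        search_paths = sys.path
    hits = []
    for entry in search_paths:
        # An empty string in sys.path stands for the current directory.
        directory = entry or os.getcwd()
        if not os.path.isdir(directory):
            continue  # not a plain directory (for example a .zip archive)
        package_init = os.path.join(directory, module_name, "__init__.py")
        module_file = os.path.join(directory, module_name + ".py")
        if os.path.isfile(package_init) or os.path.isfile(module_file):
            hits.append(entry)
    return hits


if __name__ == "__main__":
    # The first entry printed is the one a plain "import pandas" would use.
    for entry in find_module_candidates("pandas"):
        print(entry)
```

If the first entry printed is the `python\pyspark` folder from the PYTHONPATH setting rather than the Anaconda site-packages folder, that would be consistent with the traceback above, where `pyspark\pandas\__init__.py` is loaded under the name `pandas`. This is my assumption about the cause, not something I have confirmed.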

![[Image: b08f5m.png]](https://imagizer.imageshack.com/v2/xq70/923/b08f5m.png)
![[Image: u4BrpA.png]](https://imagizer.imageshack.com/v2/xq70/924/u4BrpA.png)