Pyspark: How To Convert Column With Ljava.lang.object
I created a data frame in PySpark by reading data from HDFS like this: df = spark.read.parquet('path/to/parquet') I expect the data frame to have two columns of strings: +-----------
Solution 1:
Jaroslav,
I tried the following code with a sample parquet file from here, and I was able to get the desired output from the dataframe. Could you please check your code against the snippet below, and also try the sample file referred to above, to see whether there is any other issue:
from pyspark.sql import SparkSession
spark = SparkSession.builder.appName("Read a Parquet file").getOrCreate()
df = spark.read.parquet('E:\\...\\..\\userdata1.parquet')
df.show(10)
df.printSchema()
Replace the path with your HDFS location.
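If the schema printed by df.printSchema() is hard to scan, a small helper can list the columns that are not plain strings. This is only a sketch: the function name non_string_columns and the sample column names are made up for illustration, and it assumes the input has the shape of df.dtypes in PySpark, which is a list of (column name, type name) pairs.

```python
def non_string_columns(dtypes):
    """Return the (name, type) pairs whose Spark type name is not 'string'.

    `dtypes` is a list of (column_name, type_name) tuples, the shape
    returned by the `DataFrame.dtypes` property in PySpark.
    """
    return [(name, dtype) for name, dtype in dtypes if dtype != "string"]


# With a live DataFrame this would be called as: non_string_columns(df.dtypes)
# The list below is made-up sample input, not output from the parquet file above.
sample_dtypes = [("id", "string"), ("payload", "binary"), ("tags", "array<string>")]
unexpected = non_string_columns(sample_dtypes)
print(unexpected)
```

Any column reported here is stored as something other than a string in the parquet file, so it may be the one that shows up as a Java object instead of text.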
Dataframe output for your reference: