PySpark: How To Convert Column With Ljava.lang.Object

October 26, 2023

I created a data frame in PySpark by reading data from HDFS like this:

df = spark.read.parquet('path/to/parquet')

I expect the data frame to have two columns of strings: +-----------

Solution 1:

Jaroslav,

I tried the following code with a sample parquet file from here, and I am able to get the desired output from the dataframe. Can you please check your code against the snippet below, using the sample file referred to above, to see if there is any other issue:

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("Read a Parquet file").getOrCreate()
df = spark.read.parquet('E:\\...\\..\\userdata1.parquet')
df.show(10)        # print the first 10 rows
df.printSchema()   # print each column's name and type

Replace the path with your HDFS location.

Dataframe output for your reference:
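If the schema printed by df.printSchema() shows a column that is not a plain string, one possible follow-up is to find such columns and cast them. The sketch below is only an illustration under that assumption: the helper names and the example column names are hypothetical, and whether a cast to string gives a readable value depends on the actual column type and Spark version.

```python
def non_string_columns(dtypes):
    """Return names of columns whose Spark type is not plain 'string'.

    `dtypes` is a list of (name, type) pairs, the shape returned by df.dtypes.
    """
    return [name for name, dtype in dtypes if dtype != "string"]


def cast_columns_to_string(df, columns):
    """Cast each named column of a PySpark DataFrame to string (hypothetical helper)."""
    # Imported here so the helper above can be used without a Spark install.
    from pyspark.sql import functions as F

    for name in columns:
        df = df.withColumn(name, F.col(name).cast("string"))
    return df


# Example with a made-up schema; these column names are not from the question.
example_dtypes = [("first_name", "string"), ("tags", "array<string>")]
print(non_string_columns(example_dtypes))  # ['tags']
```

With a real data frame, this could be applied as df = cast_columns_to_string(df, non_string_columns(df.dtypes)), followed by df.printSchema() to confirm the new types.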
