
How to do it...
This section walks through the steps for converting the string values in the gender column of the dataframe to numeric values:
- Female --> 0
- Male --> 1
- Converting a column value inside a dataframe requires importing the functions module:
from pyspark.sql import functions
- Next, replace the gender column with a numeric value using the following script. Note that otherwise(1) assigns 1 to every value that is not 'Female', including missing values:
df = df.withColumn('gender', functions.when(df['gender'] == 'Female', 0).otherwise(1))
- Finally, reorder the columns so that gender is the last column in the dataframe using the following script:
df = df.select('height', 'weight', 'gender')