Author: 海螺里的秘密_471 | Source: Internet | 2022-12-05 18:42
1> hamza tuna..:
If you want to do it in Python:
nameList = [c for x in df.rdd.collect() for c in x['name']]
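To see what that nested comprehension produces without a Spark session, here is a minimal sketch. It assumes plain dicts can stand in for the `Row` objects returned by `df.rdd.collect()`; the sample rows and the names `rows`, `name_list`, and `chars_per_row` are made up for illustration.

```python
# Plain dicts stand in for the Row objects that df.rdd.collect() would return.
# The sample data is hypothetical, not taken from the original question.
rows = [{"id": 1, "name": "john"}, {"id": 2, "name": "amy"}]

# Same shape as the comprehension above: the outer loop walks the rows,
# the inner loop walks the characters of each name, giving one flat list.
name_list = [c for row in rows for c in row["name"]]

# Variant that keeps one list of characters per row instead of flattening.
chars_per_row = [list(row["name"]) for row in rows]

print(name_list)
print(chars_per_row)
```

Note that the flat version merges the characters of every row into a single list, so row boundaries are lost. Also, `collect()` brings all rows to the driver, so this approach is only reasonable for small DataFrames.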
Or, if you want to do it in Spark:
from pyspark.sql import functions as F
df.withColumn('name', F.split(F.col('name'), '')).show()
Result (note the trailing empty string that splitting on an empty pattern produces here):
+---+--------------+-----+----------+--------+
| id|          name|class|start_data|end_date|
+---+--------------+-----+----------+--------+
|  1|[j, o, h, n, ]|  xii|  20170909|20210909|
+---+--------------+-----+----------+--------+