I have PySpark in my local environment:
pyspark[sql]==3.5.0
And SPARK_REMOTE pointing to Databricks:
SPARK_REMOTE=sc://{dbx_workspace}:443/;x-databricks-cluster-id={dbx_cluster_id};token={pat}
If I call df.foreach(my_function), I get:
raise PySparkNotImplementedError(
pyspark.errors.exceptions.base.PySparkNotImplementedError: [NOT_IMPLEMENTED] foreach() is not implemented.
Other transformations and actions work fine, as far as I've seen.
A known fix is to use databricks-connect, but I would like to stay decoupled from Databricks as much as possible.
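One possible workaround, assuming mapInPandas is available over your Spark Connect session (it runs the function on the executors, unlike pulling rows to the client with toLocalIterator): wrap the per-row side effect in a mapInPandas function and trigger it with collect(). This is a sketch, not a drop-in replacement for foreach — the names my_function, foreach_batches, and the seen list are illustrative placeholders, and mapInPandas is a transformation, so it must be followed by an action.

```python
from typing import Iterator
import pandas as pd

seen = []  # stands in for whatever side effect my_function performs

def my_function(row: dict) -> None:
    # hypothetical per-row side effect; replace with your own logic
    seen.append(row)

def foreach_batches(batches: Iterator[pd.DataFrame]) -> Iterator[pd.DataFrame]:
    """Apply my_function to every row of every batch; usable with df.mapInPandas."""
    for batch in batches:
        for row in batch.to_dict("records"):
            my_function(row)
    # mapInPandas requires an output schema, so yield a tiny marker frame
    yield pd.DataFrame({"done": [True]})

# On a Spark Connect session this would run on the executors:
# df.mapInPandas(foreach_batches, schema="done boolean").collect()
```

Note that mapInPandas batches rows per partition, so my_function sees plain dicts rather than Row objects, and any state it mutates lives on the executor, not the client.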