apache spark - Databricks: Generate Multiple Excels for SQL Query - Stack Overflow

I am getting "OSError: Errno 95: Operation not supported" for the code below. I have openpyxl 3.1.5 installed on the cluster and have imported all required modules. I am sure this is something small, but I can't put my finger on why it is erroring out. Thanks for taking a look!

    df = spark.sql(f"""
        SELECT
            custId,
            locId
        FROM salesorders
        WHERE year = '2025'
        LIMIT 100
    """)
    df_region = spark.sql(f"""
        SELECT DISTINCT Id
        FROM sales.region
        WHERE isactive = 1 AND year = '2025'
    """)
    for row in df_region.rdd.collect():
        regId = row.__getitem__('Id')
        df_filtered = df.where(df.locId == regId)
        df_filtered.toPandas().to_excel(filePath.format(locId=regId), index=False, engine='openpyxl')

asked Mar 3 at 18:27 by libpekin1847
  • post full stack trace – Kashyap Commented Mar 3 at 21:18
  • Make sure you have write permission to the target directory. – Andrew Commented Mar 3 at 21:23

1 Answer

Got it figured out. Databricks File System (DBFS) does not support random writes, which are required for writing Excel files. The workaround is to write to local disk first and then copy the files to DBFS. I wrote the files to the cluster's local disk, then copied them to the storage account.
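For anyone hitting the same error, here is a minimal sketch of that workaround. The helper name `write_via_local_disk` is hypothetical (it is not part of pandas or Databricks), and it assumes the destination is reachable as an ordinary file path that accepts a plain sequential copy, such as a `/dbfs/...` mount.

```python
import shutil
import tempfile
from pathlib import Path


def write_via_local_disk(write_fn, dest_path):
    """Run write_fn(local_path) on local disk, then copy the file to dest_path.

    write_fn is any callable that writes a file to the path it is given
    (hypothetical helper, not a pandas or Databricks API).
    """
    dest = Path(dest_path)
    with tempfile.TemporaryDirectory() as tmp_dir:
        local_path = Path(tmp_dir) / dest.name
        # Random writes (seeks) happen here, on the driver's local disk.
        write_fn(str(local_path))
        # Assumption: the target directory can be created through the mount.
        dest.parent.mkdir(parents=True, exist_ok=True)
        # The finished file is then copied to the target in one pass.
        shutil.copyfile(local_path, dest)
    return str(dest)


# Usage inside the loop from the question
# (df_filtered, filePath and regId are the question's own names):
#
# pdf = df_filtered.toPandas()
# write_via_local_disk(
#     lambda p: pdf.to_excel(p, index=False, engine="openpyxl"),
#     filePath.format(locId=regId),
# )
```

If the plain file copy is also rejected in your workspace, `dbutils.fs.cp` from a `file:/` path to a `dbfs:/` path is another option for the copy step; I have not verified which of the two behaves better on every cluster type.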
