apache spark - Databricks Optimize - Zorder command is taking too long - Stack Overflow

I wrote a DataFrame into a Delta table (e.g., demo_table) using the overwrite mode, which involves dropping the table beforehand. After the write operation was successful, I executed the OPTIMIZE command on the table. However, the OPTIMIZE operation took nearly an hour to complete. How can I improve this process?

Note: The table is partitioned.

Command: OPTIMIZE schema.demo_table ZORDER BY (custom_id,sales_date)

Note: custom_id is a newly generated column, added when the final DataFrame is created. The final DataFrame has about 3 million records. The table is not wide, and the schema uses only basic data types (integer, string); there are no complex data types.

Observation: When I use an existing column in ZORDER BY instead, the command completes within 5 minutes.
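For reference, the steps described above could be reproduced roughly as follows. This is only a sketch under assumptions: `spark` is an existing SparkSession with Delta Lake available, `df` is the final DataFrame, and the helper names `build_optimize_sql` and `overwrite_and_optimize` and the `partition_cols` argument are hypothetical (the actual partition column is not stated in the question). The optional `where` argument uses the `OPTIMIZE ... WHERE` form, which restricts compaction to matching partitions.

```python
def build_optimize_sql(table, zorder_cols, where=None):
    """Build an OPTIMIZE ... ZORDER BY statement for a Delta table.

    `where` is optional and hypothetical here; if given, it should be a
    predicate on partition columns only.
    """
    if not zorder_cols:
        raise ValueError("at least one ZORDER BY column is required")
    sql = f"OPTIMIZE {table}"
    if where:
        sql += f" WHERE {where}"
    sql += f" ZORDER BY ({', '.join(zorder_cols)})"
    return sql


def overwrite_and_optimize(spark, df, table, partition_cols, zorder_cols):
    """Overwrite the Delta table with `df`, then run OPTIMIZE on it.

    Assumes `spark` is a SparkSession with Delta Lake configured and that
    `partition_cols` names the table's partition columns (not given in
    the question, so this is a placeholder).
    """
    (
        df.write.format("delta")
        .mode("overwrite")
        .partitionBy(*partition_cols)
        .saveAsTable(table)
    )
    return spark.sql(build_optimize_sql(table, zorder_cols))


if __name__ == "__main__":
    # Prints the same command as in the question (with a space after the comma).
    print(build_optimize_sql("schema.demo_table", ["custom_id", "sales_date"]))
```

Timing `overwrite_and_optimize` once with `["custom_id", "sales_date"]` and once with an existing column would reproduce the comparison in the observation above.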
