apache spark - Databricks OPTIMIZE - ZORDER command is taking too long - Stack Overflow

admin•2025-04-26 17:06:56•questions•3 views

I wrote a DataFrame to a Delta table (e.g., demo_table) in overwrite mode, dropping the table beforehand. After the write succeeded, I ran the OPTIMIZE command on the table, but it took nearly an hour to complete. How can I speed this up?

Note: The table is partitioned.

Command: OPTIMIZE schema.demo_table ZORDER BY (custom_id, sales_date)

Note: custom_id is a newly generated column, added when the final DataFrame is created. The final record count is 3 million. The table is not wide, and the schema contains only basic data types (integer, string); there are no complex data types.

Observation: When I use an existing column in ZORDER, the command finishes within 5 minutes.
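To narrow down which column makes the run slow, it may help to time a few variants of the command separately. The following is only a minimal sketch: `build_optimize_sql` is a hypothetical helper, not part of any Databricks API, and the `region = 'EU'` predicate is an assumption, since the question does not name the partition column. It relies on the documented `OPTIMIZE table [WHERE predicate] [ZORDER BY (cols)]` syntax, where the predicate may only reference partition columns.

```python
from typing import Optional, Sequence


def build_optimize_sql(
    table: str,
    zorder_cols: Sequence[str],
    partition_filter: Optional[str] = None,
) -> str:
    """Build an OPTIMIZE ... ZORDER BY statement as a string.

    Hypothetical helper for illustration; it only formats SQL text.
    partition_filter, if given, must reference partition columns only.
    """
    if not zorder_cols:
        raise ValueError("at least one Z-order column is required")
    sql = f"OPTIMIZE {table}"
    if partition_filter:
        sql += f" WHERE {partition_filter}"
    sql += f" ZORDER BY ({', '.join(zorder_cols)})"
    return sql


# The command from the question, unchanged.
full = build_optimize_sql("schema.demo_table", ["custom_id", "sales_date"])

# One column at a time, to see whether the generated custom_id is the slow one.
only_custom_id = build_optimize_sql("schema.demo_table", ["custom_id"])
only_sales_date = build_optimize_sql("schema.demo_table", ["sales_date"])

# Restricted to a single partition. "region" is an assumed partition column;
# replace it with the table's real one.
scoped = build_optimize_sql(
    "schema.demo_table", ["custom_id", "sales_date"], "region = 'EU'"
)

for stmt in (full, only_custom_id, only_sales_date, scoped):
    print(stmt)
```

In a Databricks notebook, each string could be passed to `spark.sql(...)` and timed on its own, which would show whether the slowdown follows `custom_id` specifically or the table as a whole.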

