I am very new to Postgresql, using PostgreSQL 14.1.
I have been tasked to partition a table that tracks events (Data Model attached below) on a non-key column. The main reasons being speed up retrieves and dropping old data using a drop partition instead of deletes. Volume is around 2-3 million records per AS_OF_DT in the table. retention period is not decided yet, but likely 1 quarter. Most likely partition key candidate is daily partition on AS_OF_DT given the retrieval patterns and data purging use-case. Second best candidate (or both together could be partition keys) is EVENT_TYPE_ID. AS_OF_DT from what I gathered, is the date portion of START_TMSTP
Coming from Oracle background, which lets me partition on a non-key column, Postgres throws an error "ERROR: unique constraint on partitioned table must include all partitioning columns DETAIL: PRIMARY KEY constraint on table "event_tracking_partition_test" lacks column "as_of_dt" which is part of the partition key.SQL state: 0A000"*
Event_id is the Primary Key which is a surrogate key , along with table having alternate key. I don't prefer sacrificing the data integrity by dropping the PK and AK. Moreover there are child tables referencing Event_id. Wondering if there are any workarounds like sub partitioning etc..?
Attaching the DDL used (Created a parallel table in dev). Please note that the partitions would be a daily partition, but I was trying to test the concept, so created yearly.
CREATE TABLE IF NOT EXISTS process_tracking_data.event_tracking_partition_test
(
event_id uuid NOT NULL,
event_num numeric,
event_type_id uuid NOT NULL,
entity_id character varying(255) COLLATE pg_catalog."default" NOT NULL,
entity_type character varying(100) COLLATE pg_catalog."default",
message_id character varying(500) COLLATE pg_catalog."default",
event_data character varying(500) COLLATE pg_catalog."default",
event_status character varying(30) COLLATE pg_catalog."default" NOT NULL,
as_of_dt date NOT NULL,
start_tmstp timestamp without time zone NOT NULL,
end_tmstp timestamp without time zone,
source_event_id character varying(250) COLLATE pg_catalog."default",
err_msg_cd character varying(50) COLLATE pg_catalog."default",
err_msg_desc character varying(2000) COLLATE pg_catalog."default",
created_by character varying(50) COLLATE pg_catalog."default" NOT NULL,
created_tmstp timestamp without time zone NOT NULL DEFAULT CURRENT_TIMESTAMP,
last_updated_by character varying(50) COLLATE pg_catalog."default" NOT NULL,
last_updated_tmstp timestamp without time zone NOT NULL DEFAULT CURRENT_TIMESTAMP,
CONSTRAINT pk_event_tracking_partition_test PRIMARY KEY (event_id),
CONSTRAINT uk_event_tracking_partition_test UNIQUE (event_type_id, entity_id, entity_type, start_tmstp)
)
PARTITION BY RANGE (as_of_dt);
;
CREATE TABLE process_tracking_data.event_tracking_partition_test_part1 PARTITION OF process_tracking_data.event_tracking_partition_test
FOR VALUES FROM ('2022-01-01') TO ('2022-12-31');
CREATE TABLE process_tracking_data.event_tracking_partition_test_part2 PARTITION OF process_tracking_data.event_tracking_partition_test
FOR VALUES FROM ('2023-01-01') TO ('2026-12-31');
ERROR: unique constraint on partitioned table must include all partitioning columns DETAIL: PRIMARY KEY constraint on table "event_tracking_partition_test" lacks column "as_of_dt" which is part of the partition key. SQL state: 0A000
I am very new to Postgresql, using PostgreSQL 14.1.
I have been tasked to partition a table that tracks events (Data Model attached below) on a non-key column. The main reasons being speed up retrieves and dropping old data using a drop partition instead of deletes. Volume is around 2-3 million records per AS_OF_DT in the table. retention period is not decided yet, but likely 1 quarter. Most likely partition key candidate is daily partition on AS_OF_DT given the retrieval patterns and data purging use-case. Second best candidate (or both together could be partition keys) is EVENT_TYPE_ID. AS_OF_DT from what I gathered, is the date portion of START_TMSTP
Coming from Oracle background, which lets me partition on a non-key column, Postgres throws an error "ERROR: unique constraint on partitioned table must include all partitioning columns DETAIL: PRIMARY KEY constraint on table "event_tracking_partition_test" lacks column "as_of_dt" which is part of the partition key.SQL state: 0A000"*
Event_id is the Primary Key which is a surrogate key , along with table having alternate key. I don't prefer sacrificing the data integrity by dropping the PK and AK. Moreover there are child tables referencing Event_id. Wondering if there are any workarounds like sub partitioning etc..?
Attaching the DDL used (Created a parallel table in dev). Please note that the partitions would be a daily partition, but I was trying to test the concept, so created yearly.
CREATE TABLE IF NOT EXISTS process_tracking_data.event_tracking_partition_test
(
event_id uuid NOT NULL,
event_num numeric,
event_type_id uuid NOT NULL,
entity_id character varying(255) COLLATE pg_catalog."default" NOT NULL,
entity_type character varying(100) COLLATE pg_catalog."default",
message_id character varying(500) COLLATE pg_catalog."default",
event_data character varying(500) COLLATE pg_catalog."default",
event_status character varying(30) COLLATE pg_catalog."default" NOT NULL,
as_of_dt date NOT NULL,
start_tmstp timestamp without time zone NOT NULL,
end_tmstp timestamp without time zone,
source_event_id character varying(250) COLLATE pg_catalog."default",
err_msg_cd character varying(50) COLLATE pg_catalog."default",
err_msg_desc character varying(2000) COLLATE pg_catalog."default",
created_by character varying(50) COLLATE pg_catalog."default" NOT NULL,
created_tmstp timestamp without time zone NOT NULL DEFAULT CURRENT_TIMESTAMP,
last_updated_by character varying(50) COLLATE pg_catalog."default" NOT NULL,
last_updated_tmstp timestamp without time zone NOT NULL DEFAULT CURRENT_TIMESTAMP,
CONSTRAINT pk_event_tracking_partition_test PRIMARY KEY (event_id),
CONSTRAINT uk_event_tracking_partition_test UNIQUE (event_type_id, entity_id, entity_type, start_tmstp)
)
PARTITION BY RANGE (as_of_dt);
;
CREATE TABLE process_tracking_data.event_tracking_partition_test_part1 PARTITION OF process_tracking_data.event_tracking_partition_test
FOR VALUES FROM ('2022-01-01') TO ('2022-12-31');
CREATE TABLE process_tracking_data.event_tracking_partition_test_part2 PARTITION OF process_tracking_data.event_tracking_partition_test
FOR VALUES FROM ('2023-01-01') TO ('2026-12-31');
ERROR: unique constraint on partitioned table must include all partitioning columns DETAIL: PRIMARY KEY constraint on table "event_tracking_partition_test" lacks column "as_of_dt" which is part of the partition key. SQL state: 0A000
Share Improve this question edited Mar 6 at 22:40 Sunil asked Mar 6 at 22:12 SunilSunil 515 bronze badges 6- Where did you read that removing a partition is faster as delete the rows? – nbk Commented Mar 6 at 22:17
- Overview section in the documentation postgresql./docs/current/… In oracle have seen drop partition much faster than deletes, so per documentation it seems its the same here. – Sunil Commented Mar 6 at 22:23
- 1 @nbk. Partition: Bulk loads and deletes can be accomplished by adding or removing partitions, if the usage pattern is accounted for in the partitioning design. Dropping an individual partition using DROP TABLE, or doing ALTER TABLE DETACH PARTITION, is far faster than a bulk operation. – Adrian Klaver Commented Mar 6 at 22:25
- You are going to need the actual statements you used to create the partitions to your question as text. – Adrian Klaver Commented Mar 6 at 22:29
- 1 Your primary key must be part of the partitioning key. You can also create a primary key for only the partitions, not the parent. – Frank Heikens Commented Mar 7 at 0:24
1 Answer
Reset to default 1If you insist on keeping the primary key on the table, you won't be able to partition on a non-key column. The reason is that PostgreSQL does not support global indexes on partitioned tables.
However, such a global primary key index would be harmful for your intended (good!) use case of deleting data by dropping partitions: the global index would have to be maintained, which would slow down dropping a partition.
I recommend that you create primary key constraints on the individual partitions instead of on the partitioned table. That will go a long way, and it won't conflict with your partitioning scheme. If you want to have the benefits of partitioning, you have to sacrifice some integrity constraints.
发布者:admin,转转请注明出处:http://www.yc00.com/questions/1744947185a4602692.html
评论列表(0条)