I have a case where I need to filter data using a view and a partition column.
Table:
CREATE TABLE `testing-442412.test_dataset.Product`(
Id INTEGER NOT NULL,
UserId INTEGER NOT NULL,
CompanyId INTEGER NOT NULL,
DateCreated DATETIME NOT NULL
) PARTITION BY DATE_TRUNC(DateCreated, MONTH) CLUSTER BY UserId, CompanyId;
View:
CREATE VIEW `testing-442412.test_dataset.ProductView` AS
SELECT
UserId,
CompanyId,
EXTRACT(YEAR FROM DateCreated) AS Year,
EXTRACT(MONTH FROM DateCreated) AS Month,
COUNT(DISTINCT Id) AS Total
FROM `testing-442412.test_dataset.Product`
GROUP BY
UserId,
CompanyId,
Year,
Month
I WANT TO FILTER MY AGGREGATED DATA USING QUERY LIKE THIS:
SELECT * FROM `testing-442412.test_dataset.ProductView` WHERE Year = 2024 AND Month = 1;
However, when I do this, I notice that partition pruning doesn't seem to be working, as many records are read in the execution details. How can I improve my view in this case? Does partition pruning even work with the EXTRACT function? Or do I need to use the DateCreated column directly without any manipulation, with a query like this:
SELECT FROM testing-442412.test_dataset.Product WHERE DateCreated BETWEEN DATETIME("2024-01-01") AND DATETIME("2024-01-30");
I don't really want to force my clients to use table Product directly, so I hope you will be able to help me.
I have a case where I need to filter data using a view and a partition column.
Table:
CREATE TABLE `testing-442412.test_dataset.Product`(
Id INTEGER NOT NULL,
UserId INTEGER NOT NULL,
CompanyId INTEGER NOT NULL,
DateCreated DATETIME NOT NULL
) PARTITION BY DATE_TRUNC(DateCreated, MONTH) CLUSTER BY UserId, CompanyId;
View:
CREATE VIEW `testing-442412.test_dataset.ProductView` AS
SELECT
UserId,
CompanyId,
EXTRACT(YEAR FROM DateCreated) AS Year,
EXTRACT(MONTH FROM DateCreated) AS Month,
COUNT(DISTINCT Id) AS Total
FROM `testing-442412.test_dataset.Product`
GROUP BY
UserId,
CompanyId,
Year,
Month
I WANT TO FILTER MY AGGREGATED DATA USING QUERY LIKE THIS:
SELECT * FROM `testing-442412.test_dataset.ProductView` WHERE Year = 2024 AND Month = 1;
However, when I do this, I notice that partition pruning doesn't seem to be working, as many records are read in the execution details. How can I improve my view in this case? Does partition pruning even work with the EXTRACT function? Or do I need to use the DateCreated column directly without any manipulation, with a query like this:
SELECT FROM testing-442412.test_dataset.Product WHERE DateCreated BETWEEN DATETIME("2024-01-01") AND DATETIME("2024-01-30");
I don't really want to force my clients to use table Product directly, so I hope you will be able to help me.
Share Improve this question edited Mar 10 at 15:43 M_G asked Mar 10 at 15:42 M_GM_G 112 bronze badges 1- Kudos on your first question on SO. It is high quality with plenti of info and sample code. – Yun Zhang Commented Mar 12 at 4:14
1 Answer
Reset to default 0(Unfortunately) BigQuery has relatively sensitive rules on what predicate can be used as a partition filter. Not every reasonable combination is now detected as a valid partition filter. I personally think your filter is reasonable and could be expected for "monthly" partition but it is indeed not supported.
Here's the best way of confirming by set the table option require_partition_filter to true, then when the table is not access with a "detected" partition filter, the query will just fail
CREATE OR REPLACE TABLE mydataset.temp (
Id INTEGER NOT NULL,
UserId INTEGER NOT NULL,
CompanyId INTEGER NOT NULL,
DateCreated DATETIME NOT NULL
)
PARTITION BY DATE_TRUNC(DateCreated, MONTH)
CLUSTER BY UserId, CompanyId
OPTIONS (require_partition_filter=true); -- <=== this one
The here's an very non-exhaustive list of what works and doesn't work:
select * from mydataset.temp
-- where EXTRACT(MONTH FROM DateCreated) = 1; # <=== doesn't work
-- where EXTRACT(DATE FROM DateCreated) = '2020-01-01'; # <=== works
-- where DATE(DateCreated) = '2020-01-01'; # <=== works
-- WHERE DateCreated BETWEEN DATETIME("2024-01-01") AND DATETIME("2024-01-30"); # <=== works
发布者:admin,转转请注明出处:http://www.yc00.com/questions/1744837339a4596343.html
评论列表(0条)