postgresql - how to optimize SQL query with multiple JSONB contains conditions?

i am working on a PostgreSQL database where i need to filter rows based on multiple conditions in a JSONB column. the column attributes stores data like this

{
"color": "red",
"size": "large",
"shape": "circle"
}

i want to find rows where, color is "red" and size is "large"

currently i use the @> operator like this

SELECT *
FROM products
WHERE attributes @> '{"color": "red"}'
  AND attributes @> '{"size": "large"}';

this works but i suspect it is not the most efficient way since i will be adding more conditions later. would creating a GIN index help with performance and if so how should i create it?

i am working on a PostgreSQL database where i need to filter rows based on multiple conditions in a JSONB column. the column attributes stores data like this

{
"color": "red",
"size": "large",
"shape": "circle"
}

i want to find rows where, color is "red" and size is "large"

currently i use the @> operator like this

SELECT *
FROM products
WHERE attributes @> '{"color": "red"}'
  AND attributes @> '{"size": "large"}';

this works but i suspect it is not the most efficient way since i will be adding more conditions later. would creating a GIN index help with performance and if so how should i create it?

Share Improve this question asked Nov 16, 2024 at 15:53 Dulashani Sithara 191 silver badge5 bronze badges

Both the default as well as jsonb_path_ops support the containment @> operator. Create the index, vacuum analyze the table and run some tests. How much of a boost you'll get depends various factors, like the table size, value size and complexity, other parts of the query etc. With jsonpath expressions you can squeeze all these conditions into a single @? operation. – Zegarek Commented Nov 16, 2024 at 18:24
1 Try with … WHERE attributes @> '{"color": "red", "size": "large"}'; instead – Bergi Commented Nov 16, 2024 at 19:51
@Bergi Good point. I added a test showing this does run faster, with and without an index. – Zegarek Commented Nov 16, 2024 at 21:10

Add a comment |

1 Answer 1

Sorted by: Reset to default 1

Both the default GIN operator class jsonb_ops as well as jsonb_path_ops support the containment @> operator. Create the index, vacuum analyze the table:

create index on products using gin(attributes);
vacuum analyze products;

Then run some test with explain analyze verbose or even proper benchmarks with pgbench.

How much of a boost you'll get depends various factors, like the table size, value size and complexity, other parts of the query etc. Tables in the lower thousands may not even use the index because scanning them whole might be so cheap that Postgres won't bother looking for shortcuts.

With jsonpath expressions you can squeeze all these conditions and more, into a single @? or @@ operation:
_{demo at db<>fiddle}

create index on products using gin(attributes jsonb_path_ops);
vacuum analyze products;

explain analyze verbose
SELECT *
FROM products
WHERE attributes @@ '$.color=="red" 
                     && 
                     $.size=="large"';

QUERY PLAN
Bitmap Heap Scan on public.products (cost=25.48..207.32 rows=707 width=105) (actual time=1.551..2.941 rows=621 loops=1)
Output: id, attributes
Recheck Cond: (products.attributes @@ '($."color" == "red" && $."size" == "large")'::jsonpath)
Heap Blocks: exact=168
-> Bitmap Index Scan on products_attributes_idx (cost=0.00..25.30 rows=707 width=0) (actual time=1.266..1.267 rows=621 loops=1)
Index Cond: (products.attributes @@ '($."color" == "red" && $."size" == "large")'::jsonpath)
Planning Time: 1.201 ms
Execution Time: 3.283 ms

If you don't need all the additional features of JSONPath, @Bergi's idea to merge the conditions can speed up your query a bit. In a test with some 300k somewhat randomised rows, it reliably faster with and without an index. Here's without:

variant	avg	min	max	sum	stddev	mode
bergi_merged	00:00:00.101224	00:00:00.088824	00:00:00.109947	00:00:04.048966	00:00:00.004158	00:00:00.088824
op_version	00:00:00.104604	00:00:00.093529	00:00:00.113067	00:00:04.184155	00:00:00.004641	00:00:00.093529
jsonpath	00:00:00.154695	00:00:00.140819	00:00:00.16828	00:00:06.187784	00:00:00.006629	00:00:00.140819

And here's with

variant	avg	min	max	sum	stddev	mode
bergi_merged	00:00:00.016154	00:00:00.013517	00:00:00.017935	00:00:00.646145	00:00:00.000994	00:00:00.016901
op_version	00:00:00.018822	00:00:00.013958	00:00:00.09185	00:00:00.752897	00:00:00.01193	00:00:00.013958
jsonpath	00:00:00.019326	00:00:00.016472	00:00:00.023866	00:00:00.773032	00:00:00.001564	00:00:00.016472

发布者：admin，转转请注明出处：http://www.yc00.com/questions/1745653675a4638415.html

postgresql - how to optimize SQL query with multiple JSONB contains conditions? - Stack Overflow

1 Answer 1

发表回复

评论列表（0条）

联系我们

400-800-8888

postgresql - how to optimize SQL query with multiple JSONB contains conditions? - Stack Overflow

1 Answer 1

相关推荐