sql - Is it possible to query the state of an updated table in postgresql within the same query? - Stack Overflow

I want to update a table in a WITH query (not a requirement, if it's achievable another way that&#

I want to update a table in a WITH query (not a requirement, if it's achievable another way that's fine too) and count rows of the updated table (including the updated row) for the subsequent query.

example:

WITH updated AS (
  UPDATE table1
  SET enabled = true
  WHERE id = 'efgh'
  RETURNING data
)
UPDATE table2 t
SET data = u.data, count = (
    SELECT COUNT(*)
    FROM table1
    WHERE enabled = true
)
FROM updated u
WHERE id = 'abcd'
RETURNING t.count

The goal is that count would be set to 1

I've tried several variations and am starting to think I will need an extra request to count the columns after the update. Is there a way to achieve what I am trying to do in a single SQL query?

I want to update a table in a WITH query (not a requirement, if it's achievable another way that's fine too) and count rows of the updated table (including the updated row) for the subsequent query.

example:

WITH updated AS (
  UPDATE table1
  SET enabled = true
  WHERE id = 'efgh'
  RETURNING data
)
UPDATE table2 t
SET data = u.data, count = (
    SELECT COUNT(*)
    FROM table1
    WHERE enabled = true
)
FROM updated u
WHERE id = 'abcd'
RETURNING t.count

The goal is that count would be set to 1

I've tried several variations and am starting to think I will need an extra request to count the columns after the update. Is there a way to achieve what I am trying to do in a single SQL query?

Share Improve this question asked Mar 23 at 20:01 Marcus RuddickMarcus Ruddick 10.4k9 gold badges33 silver badges51 bronze badges 2
  • The subselect in the set..count=(select.. subquery, as well as all other CTEs/subqueries, sees the snapshot from before the whole statement they're a part of. You don't want another request because that would introduce a window of time when something else might alter the state of the table concurrently, between your two requests. Move the select count(*).. to a CTE that resolves it against your update table1..returning.. set and use that instead. If you need help, add an MRE (create statements for objects involved, inserts of some sample/mock data, expected vs current result). – Zegarek Commented Mar 23 at 20:57
  • 2 You lost me at "The goal is that count would be set to 1". To achieve that goal, the whole approach seems nonsensical. Please elaborate on your actual requirements. Provide relevant table definition(s) as CREATE TABLE statement(s). And always declare your version of Postgres. Seems like you want to enforce a single "enabled" row per - what exactly? I suspect you really want the right set of constraints in the table definitions. Then your UPDATE gets trivially simple. – Erwin Brandstetter Commented Mar 24 at 3:47
Add a comment  | 

2 Answers 2

Reset to default 2

Your question shows that you already know that DML in CTE is not seen until out of the CTE, which is a base principle clearly specified in the doc (see "snapshot"); which implies there will be no exception to this (consistence preserving) rule.

I can see two ways to implement your need within the CTE (in addition to the decoupled ways you are considering, and which I'll mention in my third paragraph).

Things will be eased if you've got a primary key,
so in what follows let's suppose your id is one.

WHERE looking up in updated

If the test in the final count is strictly equivalent to the prior SET (which looks true in the simplified query you gave, with enabled = true first as a SET then in the WHERE),
you can of course count on your condition OR id IN (SELECT id FROM updated).

Or, in your case of a simple count (with count(a||b) == count(a) + count(b)), just add counts of just updated rows with past updated rows excluding those counted in the "just updated" ones:

WITH updated AS (
  UPDATE table1
  SET enabled = true
  WHERE id = 'efgh'
  RETURNING *
)
UPDATE table2 t
SET data = u.data, count = (
    SELECT COUNT(*)
    FROM table1
    WHERE id NOT IN (SELECT id FROM UPDATED)
    AND enabled = true
) + (SELECT COUNT(*) FROM updated)
FROM updated u
WHERE t.id = 'abcd'
RETURNING t.count;

(see it in a demo)

Tables masking

If the filter is more complex, you'll prefer working on a CTE that (incidentally…) has the same structure and data as table1, except they're updated.

WITH updated AS (
  UPDATE table1
  SET enabled = CASE WHEN id = 'efgh' THEN true ELSE enabled END
  -- ← No more WHERE here, we update the full table, but ↑ look above the CASE to preserve data where change is unwanted.
  RETURNING *
)
UPDATE table2 t
SET data = u.data, count = (
    SELECT COUNT(*)
    FROM updated -- ← Now every new data is seeable.
    WHERE enabled = true
)
FROM updated u
WHERE t.id = 'abcd'
RETURNING t.count;

(you could even start with an WITH table1 AS […] to be sure you never refer to the non-updated table1 in the rest of the CTE)

Of course this updates the full table1 (with triggers, reindexing, and so on), so a preferable way would combine the non-modified part of table1 with the rows of updated:

WITH updated AS (
  UPDATE table1
  SET enabled = true
  WHERE id = 'efgh'
  RETURNING *
),
-- Here we redefine table1 as the union of unmodified and modified rows:
table1 AS (
  SELECT * FROM table1 WHERE id NOT IN (SELECT id FROM UPDATED)
  UNION ALL
  SELECT * FROM updated
)
UPDATE table2 t
SET data = u.data, count = (
    SELECT COUNT(*)
    FROM table1 -- ← Now we can use the (updated) table1 transparently.
    WHERE enabled = true
)
FROM updated u -- ← Except here where we explicitely want to join on only the updated rows.
WHERE t.id = 'abcd'
RETURNING t.count;

(demoed in this fiddle)

After the CTE

But those "tables masking" solutions start by making full rows copies CTE (and even full table copies),
which may not be entirely compensated by PostgreSQL's clever optimizer (directly reading the source table instead of working on its materialized full copy).

I for one would opt for the "2 queries way", benefiting from all newly updated indexes and letting the optimizer use well-known execution paths;
even if it introduces a race condition, see the must-read Zegarek's comment, both enlightened and enlighting.

with updated as(
  update table1
  set enabled = true
  where id = 'efgh'
  returning data )
update table2 as t
set data=u.data
  , count= (select count(*)from table1  --those enabled before this query
            where enabled and id<>'efgh')
          +(select count(*)from updated)--adds those enabled by this query
from updated as u
where id = 'abcd'
returning t.count;
count
1

Each query in your statement sees snapshots of tables from before it started, meaning that if you want the count(*)from table1 where enabled to include those just enabled by the CTE in this statement, you need to source them from your updating CTE's returning clause, not directly from the table. demo at db<>fiddle

starting to think I will need an extra request to count the columns after the update

Splitting this into two requests would introduce a window of time when something else might alter the state of the table concurrently, between your two requests. That leads to inconsistent behaviour, costs you additional round trip time and offers zero benefits overall.

发布者:admin,转转请注明出处:http://www.yc00.com/questions/1744270862a4566093.html

相关推荐

发表回复

评论列表(0条)

  • 暂无评论

联系我们

400-800-8888

在线咨询: QQ交谈

邮件:admin@example.com

工作时间:周一至周五,9:30-18:30,节假日休息

关注微信