python - Keep rows where a field of a list[struct] column contains a message - Stack Overflow

Say I have the following data:

import duckdb
rel = duckdb.sql("""
    FROM VALUES
        ([{'a': 'foo', 'b': 'bta'}]),
        ([]),
        ([{'a': 'jun', 'b': 'jul'}, {'a':'nov', 'b': 'obt'}])
        df(my_col)
    SELECT *
""")

which looks like this:

┌──────────────────────────────────────────────┐
│                    my_col                    │
│        struct(a varchar, b varchar)[]        │
├──────────────────────────────────────────────┤
│ [{'a': foo, 'b': bta}]                       │
│ []                                           │
│ [{'a': jun, 'b': jul}, {'a': nov, 'b': obt}] │
└──────────────────────────────────────────────┘

I would like to keep every row where, for at least one element of 'my_col', field 'b' contains the substring 'bt'.

So, expected output:

┌──────────────────────────────────────────────┐
│                    my_col                    │
│        struct(a varchar, b varchar)[]        │
├──────────────────────────────────────────────┤
│ [{'a': foo, 'b': bta}]                       │
│ [{'a': jun, 'b': jul}, {'a': nov, 'b': obt}] │
└──────────────────────────────────────────────┘

How can I write a SQL query to do that?

asked Mar 3 at 14:03, edited Mar 3 at 14:15 by ignoring_gravity

1 Answer

Maybe build a list of booleans, one per struct, and reduce it with list_bool_or() (or list_sum() the bools)?

  • https://duckdb.org/docs/stable/sql/functions/list.html#list_-rewrite-functions

duckdb.sql("""
FROM VALUES
    ([{'a': 'foo', 'b': 'bta'}]),
    ([]),
    ([{'a': 'jun', 'b': 'jul'}, {'a':'nov', 'b': 'obt'}])
    df(my_col)
SELECT *
WHERE list_bool_or(['bt' in s.b for s in my_col])
""")
┌──────────────────────────────────────────────┐
│                    my_col                    │
│        struct(a varchar, b varchar)[]        │
├──────────────────────────────────────────────┤
│ [{'a': foo, 'b': bta}]                       │
│ [{'a': jun, 'b': jul}, {'a': nov, 'b': obt}] │
└──────────────────────────────────────────────┘

The list comprehension is the same as list_apply(my_col, s -> 'bt' in s.b)
