How to update an element of a Julia DataFrame in place - Stack Overflow

I am trying to select rows from a Julia DataFrame according to a query, and to update a single column.

So far I have tried two methods, using subset and filter. However, both appear to return a copy of the original DataFrame, so changing a value in the returned object does not change the original DataFrame's data.

A short example:

An example DataFrame has 4 columns. The first 3 columns are A, B and C; these are used to query the rows. The final column, enabled, is either true or false.

using DataFrames

df = DataFrame(A=String[], B=String[], C=String[], enabled=Bool[])
push!(df, Dict(:A=>"A", :B=>"B", :C=>"C", :enabled=>true))

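# one method, which does not work: subset returns a new DataFrame by default
selected_rows = subset(df, [:A, :B, :C] => ByRow((a, b, c) -> a == "A" && b == "B" && c == "C"))
selected_rows.enabled .= false  # modifies only the copy; df is unchanged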

# another method, which also does not work: filter also returns a new DataFrame
filtered_rows = filter([:A, :B, :C] => (a, b, c) -> a == "A" && b == "B" && c == "C", df)
filtered_rows.enabled .= false  # again, only the copy changes

I suspect no method similar to this will work. I tried to

  • filter the dataframe by some column values
  • modify the object returned by a filtering operation

However it is difficult to see how a filtering operation would return anything other than a copy of the original dataframe. I don't know of a way to get a "view" object out of a filtering operation.

Therefore, I suspect the way this has to be done would be to

  • find the index of rows matching a filter query
  • update the values of a column using the index

However I don't know how to do that either.

I have looked at the Julia DataFrames documentation and cannot find any functions in the API which seem like they might provide the behavior I am looking for.

asked Mar 12 at 15:04 by user2138149

  • Maybe df[df.A .== "A" .&& df.B .== "B" .&& df.C .== "C", "enabled"] .= false does what you want? – Andre Wildberg, Mar 12 at 18:16
  • @AndreWildberg Ah of course, thanks – user2138149, Mar 17 at 10:10

1 Answer

The Julia DataFrames.jl documentation has a section called getindex and view which spells out which methods return a copy of the selected columns or subset rows and which return a view into the data. In addition, the documentation for subset details the view keyword argument which might be of interest to you.
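To make the copy-versus-view distinction concrete, here is a minimal sketch. The two-row table and the variable names (mask, copied, viewed) are made up for illustration and are not from the question. It assumes a recent DataFrames.jl 1.x release.

```julia
using DataFrames

# Hypothetical two-row table, used only for illustration.
df = DataFrame(A=["A", "X"], enabled=[true, true])
mask = df.A .== "A"

# Indexing rows with a mask and `:` allocates a new DataFrame (a copy).
copied = df[mask, :]
copied[:, :enabled] .= false
@assert df.enabled == [true, true]     # original is untouched

# `view` returns a SubDataFrame that shares memory with `df`.
viewed = view(df, mask, :)
viewed[:, :enabled] .= false
@assert df.enabled == [false, true]    # original changed through the view
```

Writing to the copy leaves df alone, while the same write through the view reaches the parent table.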

You have several options for the kind of operation you are trying to do. Here are some examples:

using DataFrames

reset() = DataFrame(A=["A","X"], B=["B","X"], C=["C","X"], enabled=[true,false])

# using simple indexing and @view
df = reset()
selected_rows = @view df[(df.A .== "A") .& (df.B .== "B") .& (df.C .== "C"), :]

@assert all(selected_rows.enabled .== true)
selected_rows.enabled .= false
@assert all(selected_rows.enabled .== false)
@assert all(df.enabled .== false)  # original DataFrame modified

# using subset and view=true
df = reset()
selected_rows = subset(df, :A => ByRow(==("A")), :B => ByRow(==("B")), :C => ByRow(==("C")), view=true)

selected_rows.enabled .= false
@assert all(df.enabled .== false)  # original DataFrame modified
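
The index-based approach suspected in the question also works, and it is essentially what the comment under the question does in one line. The sketch below is one way to write it, with a made-up table and illustrative variable names (mask, idx): build a Boolean mask, turn it into row numbers with findall, and assign through those indices.

```julia
using DataFrames

# Hypothetical table: only the first row matches the query.
df = DataFrame(A=["A", "X"], B=["B", "X"], C=["C", "X"], enabled=[true, true])

# Step 1: find the indices of the rows matching the query.
mask = (df.A .== "A") .& (df.B .== "B") .& (df.C .== "C")
idx = findall(mask)            # Vector{Int} of matching row numbers

# Step 2: update the column in place using those indices.
df[idx, :enabled] .= false

@assert idx == [1]
@assert df.enabled == [false, true]   # only the matching row changed
```

Passing mask directly as the row selector (df[mask, :enabled] .= false) should behave the same; findall is only needed if the row numbers themselves are wanted.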
