python - dictionary of columns to compare in data frame

I have a data frame with two columns (NAME1, NAME). I would like to use a dictionary of column names and then loop over it, comparing whether the values in each pair of columns are the same and specifically showing the values that are not the same.

When I try the following, I get the error "ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()":

# create df
import pandas as pd

test2 = {'NAME1': ['Tom', 'nick', 'krish', 'jack'],
         'NAME': ['Tom', 'nick', 'Carl', 'Bob']}
dfx = pd.DataFrame(test2)

#create dictionary
thisdict = {
  "NAME1": "NAME"
}

#loop and display differences
for a, b in thisdict.items():
    if dfx[a] != dfx[b]:
        x = dfx[[a, b]]
        print(x)

asked Mar 4 at 13:57 by chemicaluser; edited Mar 4 at 14:50 by jsbueno

Can you share the output you're trying to get for this sample data? – Mureinik Commented Mar 4 at 14:07
Please refer to stackoverflow/questions/36921951/… – Salt Commented Mar 4 at 14:08
@Mureinik the output is a list or a df that shows the values 'krish', 'jack' and, beside them, 'Carl', 'Bob'... it's the values that do not match in the two columns – chemicaluser Commented Mar 4 at 14:10
The error occurs because comparing two Series with != produces a Series of boolean values, and `if` cannot reduce that to a single True/False. Instead, compare the Series element-wise (with != or the equivalent .ne() method) and then filter the DataFrame based on the result. Let me know if that makes sense; if not, I will post the code and output as an answer – Anant Arun Commented Mar 4 at 14:13

2 Answers

You need to compare the values row by row and keep only the rows where the values in the two columns are not equal. Try something like this:

# Loop and display differences
for a, b in thisdict.items():
    # Compare the columns row by row
    mismatches = dfx[dfx[a] != dfx[b]]
    if not mismatches.empty:
        print(f"Mismatches between '{a}' and '{b}':")
        print(mismatches[[a, b]])
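
To see why the original `if` raised, here is a minimal sketch using the sample data from the question. The names `mask`, `raised`, `has_mismatch` and `all_mismatch` are only illustrative. The comparison yields one boolean per row, so it has to be reduced explicitly (for example with `.any()` or `.all()`) before it can be used as a condition:

```python
import pandas as pd

dfx = pd.DataFrame({
    "NAME1": ["Tom", "nick", "krish", "jack"],
    "NAME": ["Tom", "nick", "Carl", "Bob"],
})

# Element-wise comparison: one boolean per row, not a single True/False
mask = dfx["NAME1"] != dfx["NAME"]

# `if mask:` asks pandas for a single truth value, which is ambiguous
try:
    if mask:
        pass
    raised = False
except ValueError:
    raised = True

# Reduce explicitly when a single yes/no answer is needed
has_mismatch = bool(mask.any())  # at least one row differs
all_mismatch = bool(mask.all())  # every row differs
```

In the loop above, `mismatches.empty` plays the same role as `not mask.any()`.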

This kind of filtering for all rows is what Pandas and other dataframe frameworks give you for free.

Comparing the two columns gives you a boolean Series, which in turn can be used as a mask to index the original dataframe and automatically select the rows that interest you:

for a, b in thisdict.items():
    diff = dfx[dfx[a] != dfx[b]]
    print(diff[[a, b]])
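
If the dictionary grows to several column pairs, the same mask idea could be wrapped in a small helper. This is only a sketch under assumptions: `find_mismatches` is a hypothetical name, not a pandas function, and it uses `.ne()`, which is the method form of `!=`:

```python
import pandas as pd


def find_mismatches(df, column_pairs):
    """Return {(left, right): rows where the two columns differ}."""
    result = {}
    for left, right in column_pairs.items():
        mask = df[left].ne(df[right])  # same as df[left] != df[right]
        # Boolean mask selects the rows; the list selects the two columns
        result[(left, right)] = df.loc[mask, [left, right]]
    return result


dfx = pd.DataFrame({
    "NAME1": ["Tom", "nick", "krish", "jack"],
    "NAME": ["Tom", "nick", "Carl", "Bob"],
})
thisdict = {"NAME1": "NAME"}

mismatches = find_mismatches(dfx, thisdict)
for (left, right), rows in mismatches.items():
    print(f"Mismatches between '{left}' and '{right}':")
    print(rows)
```

Note that with this kind of comparison a missing value (NaN) is never equal to anything, including another NaN, so rows where both columns are missing would also show up as mismatches.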
