Python Forum
Comparing two columns with same value but different font format
Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
Comparing two columns with same value but different font format
#1
Hi! I have a column of a dataframe with city names in capital letters and I wanted to compare it with the cities column of another dataframe with regular letters. It would be like a join. Join the two columns to compare the values ​​and search if any of them are missing. Below an example of the columns and values:

df1([CITY])
df1
CITY
WASHINGTON D.C.
NEW YORK
SAN DIEGO

df2([City])
df2
City
Washington D.C.
New York
San Diego
Reply
#2
just cast both strings to lower (or upper) then compare:

example:
>>> city1 = 'WASHINGTON D.C.'
>>> city2 = 'Washington D.C.'
>>> 
>>> # without changing case:
>>> city1 == city2
False
>>> 
>>> # with case change:
>>> city1.lower() == city2.lower()
True
>>> 
>>> # original variable names remain as they were:
>>> city1
'WASHINGTON D.C.'
>>> city2
'Washington D.C.'
>>>
doug2019 likes this post
Reply
#3
I got an error because I wrote index['A','B','C','D'] instead of using the correct parameter name syntax. To fix this, use: slice master

my_ser = pd.Series([1,2,3,4], index=['A','B','C','D'])
– index= is the correct declaration for pandas.Series
Reply
#4
Something like this perhaps?

import pandas as pd

data1 = {"Date": ["1/12/2002", "1/12/2002", "1/12/2002", "1/01/2003"],
        "City": ["New York", "Washington DC", "San Diego", "Kansas City"],
        "Abbreviation": ["NY", "WDC", "SD", "KC"],
        "Population": [2, 3, 4, 5]}

df1 = pd.DataFrame(data1)

data2 = {"Date": ["1/12/2002", "1/12/2002", "1/12/2002", "1/01/2003" ],
        "City": ["NEW YORK", "WASHINGTON DC", "SAN DIEGO", "Boston"],
        "Abbreviation": ["NY", "WDC", "SD", "B"],
        "Population": [2, 3, 4, 3]}

df2 = pd.DataFrame(data2)
If you ask: df1 == df2

Output:
Date City Abbreviation Population 0 True False True True 1 True False True True 2 True False True True 3 True False False False
Get all the data where the cities are the same apart from the upper- or lowercase spelling:

df3 = df1[df1['City'].str.lower() == df2['City'].str.lower()]
Have a look at df3:

Output:
Date City Abbreviation Population 0 1/12/2002 New York NY 2 1 1/12/2002 Washington DC WDC 3 2 1/12/2002 San Diego SD 4
Get all the rows where the df1['City'] != df2['City']

df4 = df1[df1['City'].str.lower() != df2['City'].str.lower()]
Have a look at df4

Output:
Date City Abbreviation Population 3 1/01/2003 Kansas City KC 5
Reply


Possibly Related Threads…
Thread Author Replies Views Last Post
  comparing columns arvin 5 2,809 Dec-14-2022, 01:54 PM
Last Post: paul18fr
  How to properly format rows and columns in excel data from parsed .txt blocks jh67 7 5,437 Dec-12-2022, 08:22 PM
Last Post: jh67
  Load external font and does not show font in the window ATARI_LIVE 16 15,322 Feb-05-2021, 10:36 PM
Last Post: EthanPayne
  Comparing columns of Matrix stored in .txt files JoelFooCJ 2 3,556 Dec-11-2019, 07:21 AM
Last Post: JoelFooCJ
  Add DSIG to TTF font using ttx font tool Adrian 1 5,050 Nov-11-2017, 12:05 PM
Last Post: Adrian

Forum Jump:

User Panel Messages

Announcements
Announcement #1 8/1/2020
Announcement #2 8/2/2020
Announcement #3 8/6/2020