Oct-31-2019, 12:15 PM
What I have is a large CSV file with 707,000 rows. Column 9 of 10, called 'individual-local-identifier', holds an id which looks like EG47213, EB53955 and so on. There are probably about 700 different ids. I need to separate out all the rows with a given id. By looking at the first few rows "manually", so to speak, I found that EG47213 ran from row 3844 for another 4127 rows.
I then tried
import pandas as pd
print("Hello, World!")
# Read the data into a variable pacto
url = "F:/Carrier Bag F/NAV Padget/Padget.csv"
pacto = pd.read_csv(url)
frutj = pacto[pacto['individual-local-identifier'] == "EG47213"]
kb = frutj.shape
print(kb)

To my surprise, this reported 8627 such rows, not 4127. When I searched by hand, I found the missing rows in two other, separate locations. (This took me a long time and made me mad.)

There must be a better way of going about this. I had hoped to find in the pandas documentation something like a FOR instruction and then some way of writing IF ... THEN escape or something similar.
Maybe I missed something.
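
For illustration, here is a minimal sketch of one way this might be done in pandas without searching by hand. It is not the only approach. The tiny frame below is made-up stand-in data; only the column name 'individual-local-identifier' and the two example ids come from my file. The names `counts`, `by_id`, `block` and `runs` are hypothetical. With the real data, `pacto` would come from `pd.read_csv` as in the code above.

```python
import pandas as pd

# Made-up stand-in for the real CSV; only the column name and the two ids
# are taken from the post, the rest is invented for the example.
pacto = pd.DataFrame({
    "individual-local-identifier": ["EG47213", "EG47213", "EB53955", "EG47213", "EB53955"],
    "value": [1, 2, 3, 4, 5],
})
col = "individual-local-identifier"

# How many rows each id has, wherever those rows sit in the file.
counts = pacto[col].value_counts()

# groupby plays the role of the FOR loop: it yields one sub-frame per id.
by_id = {ident: rows for ident, rows in pacto.groupby(col)}

# Locate the contiguous runs of each id: a new block starts whenever the id
# differs from the id on the previous row.
block = (pacto[col] != pacto[col].shift()).cumsum().rename("block")
runs = (
    pacto.assign(row=range(len(pacto)))          # 0-based row position
    .groupby([col, block])["row"]
    .agg(first_row="min", n_rows="size")         # start and length of each run
    .reset_index()
)

print(counts)
print(runs)
```

In this sketch, `by_id["EG47213"]` would hold every row for that id regardless of where the rows are located, and `runs` would list each separate stretch of rows per id, which is the part I had been hunting for by hand.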
