The problem with this ICS data, as I see it, is: how many columns do you have? I think the data sets may differ in length and in which categories they contain. There may be a fixed set of possible column names, but that does not mean every data set will use the same ones.
Also, you have lines like this, which should really be 2 columns (the TZID parameter and the date-time value):
Quote:
DTEND;TZID=America/Toronto:20170518T120000
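If you do want that line as separate columns, here is a minimal sketch of one way to split it. The helper name split_property is my own invention, and it assumes simple lines: every parameter has the form KEY=VALUE, and there are no quoted parameter values containing ':' or ';'.

```python
def split_property(line):
    """Split one ICS line into (name, params, value).

    Assumes simple lines: no quoted parameter values containing ':' or ';'.
    """
    # Everything before the first ':' is the name plus optional parameters.
    head, _, value = line.partition(':')
    # The name comes first; any parameters follow, separated by ';'.
    name, *param_parts = head.split(';')
    params = dict(p.split('=', 1) for p in param_parts)
    return name, params, value


print(split_property('DTEND;TZID=America/Toronto:20170518T120000'))
# ('DTEND', {'TZID': 'America/Toronto'}, '20170518T120000')
print(split_property('CREATED:20101219T022011Z'))
# ('CREATED', {}, '20101219T022011Z')
```

That gives you the property name, the parameters as a dict, and the value, so the time zone and the date-time can go into their own columns.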
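For sorting it doesn't really matter, though. It turns out you can sort by CREATED using the whole string, without recourse to datetime. The timestamps are fixed-width (YYYYMMDDTHHMMSSZ), so as long as they are all in this same UTC form, alphabetical order is also chronological order: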
Quote:
CREATED:20101219T022011Z
CREATED:20101219T022012Z
CREATED:20111219T021727Z
CREATED:20111219T021728Z
CREATED:20151219T021727Z
CREATED:20201219T022011Z
CREATED:20251219T021727Z
CREATED:20251219T021729Z
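If you don't trust the plain string sort, a quick check like this compares it with a real datetime sort. It is only a sketch: it uses the eight CREATED lines quoted above (shuffled), and the helper name created_as_datetime is made up for the example.

```python
from datetime import datetime

# The CREATED lines quoted above, deliberately out of order.
stamps = [
    'CREATED:20251219T021729Z',
    'CREATED:20101219T022012Z',
    'CREATED:20151219T021727Z',
    'CREATED:20111219T021728Z',
    'CREATED:20201219T022011Z',
    'CREATED:20101219T022011Z',
    'CREATED:20251219T021727Z',
    'CREATED:20111219T021727Z',
]


def created_as_datetime(line):
    """Parse the value after 'CREATED:' into a datetime object."""
    value = line.split(':', 1)[1]
    return datetime.strptime(value, '%Y%m%dT%H%M%SZ')


by_string = sorted(stamps)                             # plain string sort
by_datetime = sorted(stamps, key=created_as_datetime)  # real datetime sort

print(by_string == by_datetime)  # True
```

Both sorts give the same order, which is why the pandas code can sort on the raw strings.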
And pandas is always handy. First split the file into data sets, one per VEVENT, so that each data set can be given a name like set_0, set_1, etc.
import pandas as pd

ics_data_file = '/home/peterr/temp/ics_stuff/ics_data.ics'
savepath = '/home/peterr/temp/ics_stuff/ics_data_sorted.csv'

with open(ics_data_file) as infile:
    ics_data = infile.read()

temp = ics_data.rstrip()  # get rid of newline at the end
data = temp.split('\n')
# len(data)  # 62 is correct here
wanted = data[1:-1]  # don't want "BEGIN:VCALENDAR" or "END:VCALENDAR" here
# len(wanted)  # 60 is correct here

data_sets = []
satz = []  # the lines of the data set currently being collected
# get each data set: an "END:VEVENT" line closes the current one
for w in wanted:
    satz.append(w)
    if w == 'END:VEVENT':
        data_sets.append(satz)
        satz = []

Now make a datasets dictionary, which can easily be imported into pandas:
# keys are like set_0, set_1, ...
data_dict = {}
for i, data_set in enumerate(data_sets):
    # print(data_set)
    key = 'set_' + str(i)
    data_dict[key] = data_set

# orient='index' sets the keys as index
dict_df = pd.DataFrame.from_dict(data_dict, orient='index')
# print(dict_df.to_string())
# column 1 is the line right after BEGIN:VEVENT; this sorts by CREATED
# only if that line is the CREATED line in every data set
sorted_df = dict_df.sort_values(by=1)
# print(sorted_df.to_string())
sorted_df.to_csv(savepath, index=True)

Once you have your data in pandas, you can do all kinds of things!
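One of those things: if the data sets don't all have the same lines in the same positions, you could key each value by its property name instead of its position, so pandas lines the columns up for you. This is only a sketch under assumptions. The sample_sets list is made-up data in the same shape as data_sets, parameters like TZID are simply dropped, and a property that appears twice in one event would overwrite the earlier value.

```python
import pandas as pd

# Made-up sample in the same shape as data_sets: one list of lines per VEVENT.
sample_sets = [
    ['BEGIN:VEVENT', 'CREATED:20111219T021727Z', 'SUMMARY:Dentist',
     'DTEND;TZID=America/Toronto:20170518T120000', 'END:VEVENT'],
    ['BEGIN:VEVENT', 'CREATED:20101219T022011Z', 'SUMMARY:Lunch',
     'END:VEVENT'],
]

rows = []
for lines in sample_sets:
    row = {}
    for line in lines[1:-1]:       # skip BEGIN:VEVENT and END:VEVENT
        head, _, value = line.partition(':')
        name = head.split(';')[0]  # drop parameters such as TZID=...
        row[name] = value
    rows.append(row)

# One column per property name; properties an event lacks become NaN.
events_df = pd.DataFrame(rows).sort_values(by='CREATED')
print(events_df.to_string())
```

Now you can sort with by='CREATED' instead of by=1, and it no longer matters where the CREATED line sits inside each data set.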
Happy New Year of The Horse! (I'm a horse!)