Mar-25-2020, 04:18 PM
Hi all,
I'm trying to implement this example:
import pandas as pd
import io
df = pd.read_csv(io.StringIO('''transactionid;event;datetime;info
1;START;2017-04-01 00:00:00;
1;END;2017-04-01 00:00:02;foo1
2;START;2017-04-01 00:00:02;
3;START;2017-04-01 00:00:02;
2;END;2017-04-01 00:00:03;foo2
4;START;2017-04-01 00:00:03;
3;END;2017-04-01 00:00:03;foo3
4;END;2017-04-01 00:00:04;foo4'''), sep=';', parse_dates=['datetime'])
df.datetime = pd.to_datetime(df.datetime)
funcs = {
    'datetime': {
        'start_date': 'min',
        'end_date': 'max',
        'duration': lambda x: x.max() - x.min(),
    },
    'info': 'last'
}
df.groupby(by='transactionid')['datetime','info'].agg(funcs).reset_index()
print(df)

The expected output should be:
Output:
   transactionid           start_date             end_date  duration  info
0              1  2017-04-01 00:00:00  2017-04-01 00:00:02  00:00:02  foo1
1              2  2017-04-01 00:00:02  2017-04-01 00:00:03  00:00:01  foo2
2              3  2017-04-01 00:00:02  2017-04-01 00:00:03  00:00:01  foo3
3              4  2017-04-01 00:00:03  2017-04-01 00:00:04  00:00:01  foo4

Using Python 3.7, I'm getting the following error:
Error:
python.py:24: FutureWarning: Indexing with multiple keys (implicitly converted to a tuple of keys) will be deprecated, use a list instead.
  df.groupby(by='transactionid')['datetime','info'].agg(funcs).reset_index()
Traceback (most recent call last):
  File "python.py", line 24, in <module>
    df.groupby(by='transactionid')['datetime','info'].agg(funcs).reset_index()
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/core/groupby/generic.py", line 928, in aggregate
    result, how = self._aggregate(func, *args, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/core/base.py", line 342, in _aggregate
    raise SpecificationError("nested renamer is not supported")
pandas.core.base.SpecificationError: nested renamer is not supported

Any idea how to solve this issue?
Thanks a lot
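
For reference, one possible way around the error is sketched below. It assumes pandas 0.25 or newer, where named aggregation (keyword arguments of the form new_name=(column, function)) is available as a replacement for the nested dict. The names `csv_text` and `result` are only illustrative, and the duration is computed after the aggregation rather than with a lambda, which is a choice of this sketch and not something required by pandas. Note that the aggregated frame has to be assigned to a variable; printing `df` would only show the original rows.

```python
import io

import pandas as pd

# Same sample data as in the question (csv_text is an illustrative name).
csv_text = '''transactionid;event;datetime;info
1;START;2017-04-01 00:00:00;
1;END;2017-04-01 00:00:02;foo1
2;START;2017-04-01 00:00:02;
3;START;2017-04-01 00:00:02;
2;END;2017-04-01 00:00:03;foo2
4;START;2017-04-01 00:00:03;
3;END;2017-04-01 00:00:03;foo3
4;END;2017-04-01 00:00:04;foo4'''

df = pd.read_csv(io.StringIO(csv_text), sep=';', parse_dates=['datetime'])

# Named aggregation: each keyword is an output column,
# each value is a (source column, aggregation) pair.
result = (
    df.groupby('transactionid')
      .agg(
          start_date=('datetime', 'min'),
          end_date=('datetime', 'max'),
          info=('info', 'last'),
      )
      .reset_index()
)

# Derive the duration from the aggregated columns instead of using a lambda.
result['duration'] = result['end_date'] - result['start_date']

# Put the columns in the order shown in the expected output.
result = result[['transactionid', 'start_date', 'end_date', 'duration', 'info']]

print(result)
```

Selecting columns with a list, e.g. `[['datetime', 'info']]` instead of `['datetime','info']`, is what the FutureWarning asks for; the sketch sidesteps it by not selecting columns before `agg` at all.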
