Looking for direction
Albert-Jan Roskam
fomcl at yahoo.com
Thu May 14 13:08:21 EDT 2015
-----------------------------
On Thu, May 14, 2015 3:35 PM CEST Dennis Lee Bieber wrote:
>On Wed, 13 May 2015 16:24:30 -0700, 20/20 Lab <lab at pacbell.net> declaimed
>the following:
>
>>Now is where I have my problem:
>>
>>myList = [ [123, "XXX", "Item", "Qty", "Noise"],
>> [72976, "YYY", "Item", "Qty", "Noise"],
>> [123, "XXX", "ItemTypo", "Qty", "Noise"] ]
>>
>>Basically, I need to check for rows with duplicate account (row[0]) and
>>staff (row[1]) values, and if I find one, remove that row and add its Qty
>>to the original row. I really don't have a clue how to go about this. The
>>number of rows changes based on which run it is, so I couldn't even get
>>away with using hundreds of compare loops.
>>
>>If someone could point me to some documentation on the functions I would
>>need, or a tutorial it would be a great help.
>>
>
> This appears to be a matter of algorithm development -- there won't be
>a pre-made "function" for it. The closest would be the summing functions
>(control break http://en.wikipedia.org/wiki/Control_break ) of a report
>writer application.
>
> The short gist would be:
>
>    SORT the data by the account field
>    Initialize sum using first record
>    loop
>        read next record
>        if end of data
>            output sum record
>            exit
>        if record is same account as sum
>            add quantity to sum
>        else
>            output sum record
>            reset sum to the new record
>
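For what it's worth, the control-break outline above could be sketched in plain Python roughly as follows. This is only a sketch under a few assumptions: the sample data uses "Qty" as a placeholder, so the numeric quantities here are made up, the helper name merge_duplicates is hypothetical, and the key is (account, staff) as the original question asks rather than account alone.

```python
from operator import itemgetter


def merge_duplicates(rows):
    """Sort by (account, staff), then fold equal-key neighbours together."""
    key = itemgetter(0, 1)  # row[0] is the account, row[1] is the staff
    merged = []
    # sorted() is stable, so the first row seen for a key stays the "original".
    for row in sorted(rows, key=key):
        if merged and key(merged[-1]) == key(row):
            merged[-1][3] += row[3]    # same key: add Qty to the running sum row
        else:
            merged.append(list(row))   # key changed: start a new sum row (copy)
    return merged


# Hypothetical data: the quantities in column 3 are invented for illustration.
my_list = [
    [123, "XXX", "Item", 2, "Noise"],
    [72976, "YYY", "Item", 5, "Noise"],
    [123, "XXX", "ItemTypo", 3, "Noise"],
]
result = merge_duplicates(my_list)
```

Because each sum row is a copy, my_list itself is left untouched. The duplicate's other fields (such as "ItemTypo") are simply dropped in favour of the first row's values.
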
> Granted -- loading the data into an SQL capable database would make
>this simple...
>
> select account, sum(quantity) from table
> group by account
> order by account
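If a database feels like overkill, the sqlite3 module in the standard library can run that kind of query against an in-memory table. A rough sketch, assuming a hypothetical table named orders, invented column names, and made-up numeric quantities; note that a per-account SUM() needs a GROUP BY clause, and this version groups on staff as well:

```python
import sqlite3

# Hypothetical data: the quantities in column 3 are invented for illustration.
rows = [
    (123, "XXX", "Item", 2, "Noise"),
    (72976, "YYY", "Item", 5, "Noise"),
    (123, "XXX", "ItemTypo", 3, "Noise"),
]

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute(
    "CREATE TABLE orders "
    "(account INTEGER, staff TEXT, item TEXT, quantity INTEGER, noise TEXT)"
)
conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?, ?)", rows)

# One output row per (account, staff) pair, with the quantities summed.
totals = conn.execute(
    "SELECT account, staff, SUM(quantity) FROM orders "
    "GROUP BY account, staff ORDER BY account"
).fetchall()
conn.close()
```
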
You could also use pandas. Read the data into a DataFrame, create a groupby object, and use its sum() and first() methods.
http://pandas.pydata.org/pandas-docs/version/0.15.2/groupby.html
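Something along these lines might work. It is only a sketch: the column names and the numeric quantities are assumptions, since the sample data uses placeholders. first() keeps the non-key fields of the first row in each group, and sum() supplies the combined quantity.

```python
import pandas as pd

# Hypothetical data: the quantities in column 3 are invented for illustration.
rows = [
    [123, "XXX", "Item", 2, "Noise"],
    [72976, "YYY", "Item", 5, "Noise"],
    [123, "XXX", "ItemTypo", 3, "Noise"],
]
df = pd.DataFrame(rows, columns=["account", "staff", "item", "qty", "noise"])

# Group on the duplicate key; sort=False keeps groups in first-seen order.
grouped = df.groupby(["account", "staff"], sort=False)

result = grouped.first()                 # first row of each group
result["qty"] = grouped["qty"].sum()     # overwrite qty with the group total
result = result.reset_index()            # turn the key back into columns
```
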