Looking for direction

Albert-Jan Roskam fomcl at yahoo.com
Thu May 14 13:08:21 EDT 2015


-----------------------------
On Thu, May 14, 2015 3:35 PM CEST Dennis Lee Bieber wrote:

>On Wed, 13 May 2015 16:24:30 -0700, 20/20 Lab <lab at pacbell.net> declaimed
>the following:
>
>>Now is where I have my problem:
>>
>>myList = [ [123, "XXX", "Item", "Qty", "Noise"],
>>            [72976, "YYY", "Item", "Qty", "Noise"],
>>            [123, "XXX", "ItemTypo", "Qty", "Noise"]    ]
>>
>>Basically, I need to check for rows with duplicate accounts (row[0]) and
>>staff (row[1]), and if so, remove that row, and add its Qty to the
>>original row. I really don't have a clue how to go about this.  The
>>number of rows changes based on which run it is, so I couldn't even get
>>away with using hundreds of compare loops.
>>
>>If someone could point me to some documentation on the functions I would 
>>need, or a tutorial it would be a great help.
>>
>
>	This appears to be a matter of algorithm development -- there won't be
>a pre-made "function" for it. The closest would be the summing functions
>(control break http://en.wikipedia.org/wiki/Control_break ) of a report
>writer application.
>
>	The short gist would be:
>
>	SORT the data by the account field
>	Initialize sum using first record
>	loop
>		read next record
>		if end of data
>			output sum record
>			exit
>		if record is same account as sum
>			add quantity to sum
>		else
>			output sum record
>			reset sum to the new record
>
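
In Python the outline above might look roughly like the sketch below. It is only a sketch under a few assumptions: the quantity is a number stored in row[3] (the sample rows only show the placeholder "Qty"), the key is account plus staff, and itertools.groupby does the "same account as sum" comparison. The name merge_duplicates is made up for the example.

```python
from itertools import groupby
from operator import itemgetter


def merge_duplicates(rows, key=itemgetter(0, 1), qty_index=3):
    """Collapse rows that share the same key, summing their quantities."""
    merged = []
    # groupby() only groups adjacent rows, so sort on the same key first.
    # sorted() is stable, so the first row of each group is the original one.
    for _, group in groupby(sorted(rows, key=key), key=key):
        group = list(group)
        summed = list(group[0])  # copy, so the input rows are left untouched
        summed[qty_index] = sum(row[qty_index] for row in group)
        merged.append(summed)
    return merged


# Sample data with numeric quantities filled in for illustration
my_list = [
    [123, "XXX", "Item", 2, "Noise"],
    [72976, "YYY", "Item", 5, "Noise"],
    [123, "XXX", "ItemTypo", 3, "Noise"],
]
result = merge_duplicates(my_list)
```

The duplicate (123, "XXX") row disappears and its quantity ends up on the row that came first in the input; note that the output is ordered by key, not by original position.
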
>	Granted -- loading the data into an SQL capable database would make
>this simple...
>
>	select account, sum(quantity) from table
>	group by account
>	order by account
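
For what it's worth, the standard library's sqlite3 module is enough to try the SQL route without installing a database server. A rough sketch follows; the table name "orders", the column names and the numeric quantities are my own inventions for the example. The query needs a GROUP BY clause so that SUM() is computed per account/staff pair rather than over the whole table.

```python
import sqlite3

# Sample data with numeric quantities filled in for illustration
rows = [
    (123, "XXX", "Item", 2, "Noise"),
    (72976, "YYY", "Item", 5, "Noise"),
    (123, "XXX", "ItemTypo", 3, "Noise"),
]

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute(
    "CREATE TABLE orders (account INTEGER, staff TEXT, item TEXT, "
    "quantity INTEGER, noise TEXT)"
)
conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?, ?)", rows)

# One output row per (account, staff) pair, with the quantities summed
totals = conn.execute(
    "SELECT account, staff, SUM(quantity) FROM orders "
    "GROUP BY account, staff ORDER BY account"
).fetchall()
conn.close()
```
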

You could also use pandas: read the data into a DataFrame, create a groupby object on the account and staff columns, and use its sum() and first() methods.

http://pandas.pydata.org/pandas-docs/version/0.15.2/groupby.html
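
A minimal sketch of that idea, assuming pandas is installed. The column names and the numeric quantities are invented for the example, since the original rows only show placeholders.

```python
import pandas as pd

# Sample data with numeric quantities filled in for illustration
rows = [
    [123, "XXX", "Item", 2, "Noise"],
    [72976, "YYY", "Item", 5, "Noise"],
    [123, "XXX", "ItemTypo", 3, "Noise"],
]
df = pd.DataFrame(rows, columns=["account", "staff", "item", "qty", "noise"])

# sort=False keeps the groups in order of first appearance
grouped = df.groupby(["account", "staff"], sort=False)

merged = grouped.first()             # first row of each group
merged["qty"] = grouped["qty"].sum() # replace qty with the group total
merged = merged.reset_index()        # turn the group keys back into columns
```

first() keeps the values from the earliest row of each group, so the "ItemTypo" row is dropped and only its quantity survives in the total.
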




