Unable to delete duplicates in excel with Python

tysondogerz · Nov-07-2017, 07:13 AM

I am trying to delete duplicates, but the script just finishes with exit code 0 and does not delete any of them.

I have attempted this with openpyxl on an Excel file as well as with other methods (including csv, though that deleted too many rows).

The duplicates always occur in column F, and for each one I want to delete the entire row (columns B-I).

Any ideas?

import openpyxl
wb1 = openpyxl.load_workbook('C:/dwad/SWWA.xlsx')
ws1 = wb1.active # keep naming convention consistent

values = []
for i in range(2,ws1.max_row+1):
  if ws1.cell(row=i,column=1).value in values:
    #pass
  #else:
    values.append(ws1.cell(row=i,column=1).value)

for value in values:
  ws1.append([value])
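
A note on why the script above exits cleanly without changing anything: `values` starts empty, so the `if ... in values` test is never true and nothing is ever collected; the loop reads column 1 rather than column F; and the workbook is never saved. Below is a minimal sketch of one alternative, which copies the first row for each distinct column F value into a new workbook instead of deleting in place. The function names, the `header_rows=1` assumption and the output path in the usage note are illustrative and not from the original post, and `values_only=True` needs a reasonably recent openpyxl.

```python
def drop_duplicate_rows(rows, key_index):
    """Keep the first row seen for each distinct value at key_index."""
    seen = set()
    unique = []
    for row in rows:
        key = row[key_index]
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def dedupe_workbook(src_path, dst_path, key_index=5, header_rows=1):
    """Copy src_path to dst_path, dropping rows whose column F repeats.

    key_index=5 is column F (zero-based); header_rows=1 is an assumption.
    """
    import openpyxl  # imported here so the helper above works without it

    ws = openpyxl.load_workbook(src_path).active
    rows = list(ws.iter_rows(values_only=True))

    out_wb = openpyxl.Workbook()
    out_ws = out_wb.active
    for row in rows[:header_rows]:
        out_ws.append(row)  # header rows are copied unchanged
    for row in drop_duplicate_rows(rows[header_rows:], key_index):
        out_ws.append(row)
    out_wb.save(dst_path)
```

It could be called as `dedupe_workbook('C:/dwad/SWWA.xlsx', 'C:/dwad/SWWA_unique.xlsx')`, where the second path is a made-up name. Writing to a separate file leaves the original untouched if the result is not what was expected.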

CSV:

with open('1.csv','r') as in_file, open('2.csv','w') as out_file:
    seen = set() # set for fast O(1) amortized lookup
    for line in in_file:
        if line not in seen: 
            seen.add(line)
            out_file.write(line)
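
The loop above compares whole lines, so two rows count as duplicates only if every field matches. Here is a hedged sketch that keys on a single column instead, using the standard `csv` module. `dedupe_csv` and its parameters are hypothetical names; `key_index=5` assumes the CSV has the same layout as the sheet (column F), and the first row is assumed to be a header.

```python
import csv


def dedupe_csv(in_file, out_file, key_index=5, has_header=True):
    """Write rows from in_file to out_file, keeping the first row per key."""
    reader = csv.reader(in_file)
    writer = csv.writer(out_file)
    seen = set()
    for line_number, row in enumerate(reader):
        if has_header and line_number == 0:
            writer.writerow(row)  # pass the header through untouched
            continue
        if len(row) <= key_index:
            writer.writerow(row)  # too short to have a key: keep as-is
            continue
        key = row[key_index]
        if key not in seen:
            seen.add(key)
            writer.writerow(row)


# Possible usage with the file names from the post; newline='' lets the
# csv module control line endings itself:
#
# with open('1.csv', 'r', newline='') as src, \
#         open('2.csv', 'w', newline='') as dst:
#     dedupe_csv(src, dst)
```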

tysondogerz · (This post was last modified: Nov-07-2017, 10:19 AM by tysondogerz.)

I'm trying to set it up for a fixed range, as it's only getting one row so far. I will likely have to iterate.

import openpyxl

wb1 = openpyxl.load_workbook('C:/adw.xlsx')
ws = wb1.active
wb2 = openpyxl.Workbook()
ws2 = wb2.active

ws1 = wb1.active  # keep naming convention consistent

column = ['A:B']
values = []
for row in range(1, ws1.max_row):
    if ws1.cell(row=row, column=row).value in values:
        pass  # if already in list do nothing
    else:
        values.append(ws1.cell(row=row, column=row).value)

directory = 'C:/dwadwaddwad.csv'
with open(directory, 'a', newline='', encoding="utf-8") as outfile:
    for value in values:
        ws2.append([value])
        print(value)
wb2.save('C:/dwadwadaw.xlsx')

I don't think openpyxl supports this. It can be done with one line in VBA, though: https://msdn.microsoft.com/en-us/vba/exc...thod-excel
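
For reference, more recent openpyxl releases provide `Worksheet.delete_rows`, so in-place removal is possible there. This is a minimal sketch, assuming such a release, duplicates keyed on column F, and data starting in row 2; the helper names are hypothetical. Rows are deleted bottom-up so that earlier deletions do not shift the row numbers still to be processed. Note that `delete_rows` removes the whole row, not just columns B-I.

```python
def duplicate_row_numbers(values, first_row=2):
    """Return sheet row numbers of repeated values, highest row first."""
    seen = set()
    duplicates = []
    for offset, value in enumerate(values):
        if value in seen:
            duplicates.append(first_row + offset)
        else:
            seen.add(value)
    return sorted(duplicates, reverse=True)


def delete_duplicates_in_place(path, key_column=6, first_row=2):
    """Delete rows whose key_column value repeats, then save over path.

    key_column=6 is column F (one-based, as openpyxl counts columns).
    """
    import openpyxl  # needs a release that has Worksheet.delete_rows

    wb = openpyxl.load_workbook(path)
    ws = wb.active
    values = [
        ws.cell(row=r, column=key_column).value
        for r in range(first_row, ws.max_row + 1)
    ]
    for row_number in duplicate_row_numbers(values, first_row):
        ws.delete_rows(row_number)  # removes the entire row
    wb.save(path)
```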

tysondogerz · Nov-07-2017, 11:25 AM

Solved. It took about 400 lines of code, but it does the job if you iterate through the same file 12 times.
