Dynamically Naming Files

ovidius · Jan-29-2020, 01:41 PM

Hello Everybody

I wrote this code, which takes two files, strips every non-digit character from each line so that only the phone numbers remain, removes any duplicates, and then compares the files to find the numbers they have in common.
The code is this:

import re
import csv

filename_list=[]
file1 = input("Please input file1:   ")
filename_list.append(file1)
file2 = input("Please input file2:  ")
filename_list.append(file2)
duplicate_list=[]

def clean_file(filename):
	with open (filename,'r') as f:
		list1=f.readlines()
		for ch in list1:
			result=re.sub('[^0-9]','',ch)
			with open(('{}_clean.csv').format(filename),'a+') as cl:
				if len(result)<10:
					result=result.strip()
				else:
					cl.write(result + '\n')

def clean_duplicates(filename):
	lines_seen = set()
	with open(('{}_clean_dup.csv').format(filename),'w') as rf:
		duplicate_list.append(rf.name)
		for line in open(('{}_clean.csv').format(filename),'r'):
			if line not in lines_seen:
				rf.write(line)
				lines_seen.add(line)

def find_common():
	comp_file1 = open(duplicate_list[0], "r")
	comp_file2 = open(duplicate_list[1], "r")
	result = open("results.csv", "a")
	list1 = comp_file1.readlines()
	list2 = comp_file2.readlines()
	for i in list1:
		for j in list2:
			if i==j:
				result.write(i)

	comp_file1.close()
	comp_file2.close()
	result.close()

for filename in filename_list:
	clean_file(filename)
	clean_duplicates(filename)

find_common()

So the code works, but I have a slight problem: the output files get names like filename.csv_clean.csv and filename.csv_clean_dup.csv.

I tried .rsplit and .rpartition to drop the .csv extension from the initial filename, but it doesn't work.

Can anyone help?

**buran** · (This post was last modified: Jan-29-2020, 02:06 PM by buran.)

Using os module functions:

>>> import os
>>> os.path.split(r'c:\some_folder\some_file.csv')
('c:\\some_folder', 'some_file.csv')
>>> root, file = os.path.split(r'c:\some_folder\some_file.csv')
>>> os.path.join(root, 'new_file.csv')
'c:\\some_folder\\new_file.csv'
>>> root, ext = os.path.splitext(r'c:\some_folder\some_file.csv')
>>> root
'c:\\some_folder\\some_file'
>>> '_'.join((root, 'clean.csv'))
'c:\\some_folder\\some_file_clean.csv'

Or using the pathlib module:

>>> import pathlib
>>> p = pathlib.Path(r'c:\some_folder\some_file.csv')
>>> p.with_name('new_file.csv')
WindowsPath('c:/some_folder/new_file.csv')
>>> p.parent.joinpath(''.join((p.stem, '_clean.csv')))
WindowsPath('c:/some_folder/some_file_clean.csv')
>>>
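
As a rough sketch of how the pathlib approach above could be wrapped for reuse: the helper name derived_name and the example filename numbers.csv are assumptions made for illustration, not part of the original script.

```python
from pathlib import Path

def derived_name(filename, suffix):
    """Return filename with suffix appended to its stem and a '.csv' extension.

    Hypothetical helper: the name and signature are illustrative only.
    """
    p = Path(filename)
    # p.stem drops the last extension; with_name keeps the parent directory
    return str(p.with_name(p.stem + suffix + '.csv'))

print(derived_name('numbers.csv', '_clean'))      # numbers_clean.csv
print(derived_name('numbers.csv', '_clean_dup'))  # numbers_clean_dup.csv
```

Because with_name only swaps the final path component, this also leaves any directory part of the input untouched.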

ovidius · Jan-29-2020, 02:28 PM

Buran, thank you for the swift reply. My problem is that I cannot incorporate this into my code. As you can see in the code I posted, the .csv extension has to be dropped from the initial filename inside the with open statement:

 with open(('{}_clean.csv').format(filename),'a+') as cl:

Also, I need only the filename to change and not the path, because the code will run in the same directory as the two files. So what I need, if it's at all possible, is something that works with string formatting inside the .format(filename).

Again, thank you for your help and your time.

**buran** · Jan-29-2020, 02:44 PM

All of the examples I have provided would do. Obviously, the os.path functions will require fewer changes to your code.

(Jan-29-2020, 02:28 PM)ovidius Wrote: something that works with string formatting inside the .format(filename)

You can construct the new file name inside the open call using os.path functions, but for the sake of readability it's better to do it on a separate line. For example, replace line 26 (this also needs import os at the top of the script)

for line in open(('{}_clean.csv').format(filename),'r'):

with

new_name = ''.join((os.path.splitext(filename)[0], '_clean.csv'))
for line in open(new_name, 'r'):
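
To show how the same idea might cover both output names, here is a minimal sketch. The helper output_name is hypothetical, and clean_duplicates is rewritten here to take duplicate_list as a parameter, which differs from the original script where it is a module-level list.

```python
import os

def output_name(filename, tag):
    """Build '<base>_<tag>.csv' from filename (hypothetical helper)."""
    # splitext('numbers.csv') -> ('numbers', '.csv'); keep only the base part
    base = os.path.splitext(filename)[0]
    return '{}_{}.csv'.format(base, tag)

def clean_duplicates(filename, duplicate_list):
    """Copy <base>_clean.csv to <base>_clean_dup.csv, dropping repeated lines."""
    lines_seen = set()
    with open(output_name(filename, 'clean'), 'r') as src, \
         open(output_name(filename, 'clean_dup'), 'w') as rf:
        duplicate_list.append(rf.name)
        for line in src:
            if line not in lines_seen:
                rf.write(line)
                lines_seen.add(line)
```

If the one-line form is preferred, the same expression also fits directly inside the existing call: '{}_clean.csv'.format(os.path.splitext(filename)[0]).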
