Apr-27-2018, 11:18 AM
I've to export the database table to Excel (xlsx) format.
I came to know that Python is very fast in file processing.
I've installed python 3.6.1, XlsxWriter-1.0.2 and cx_Oracle-5.3-12c
But I saw the date field from the database was getting converted to the number. To resolve this, I made a check if a list value \ (cell) value is date time, if yes then formatted the value to date type which resolves my problem
But, I am now running into performance issue with this extra check any suggestion how to gain performance
Below python function is used to generate excel xlsx from the database cursor
PS: Quick background of how I am proessing:
I am opening a loop which is processing only 50,000 rows and then calls this python excel writer function:
I've close to 900 such table and data counts in these ranges from 1,00,000 to 3,20,00,00,000
I am newbie to python and only think of executing the same python code for separate code parallel 10 times.
Guidance would be much appreciated
I came to know that Python is very fast in file processing.
I've installed python 3.6.1, XlsxWriter-1.0.2 and cx_Oracle-5.3-12c
But I saw the date field from the database was getting converted to the number. To resolve this, I made a check if a list value \ (cell) value is date time, if yes then formatted the value to date type which resolves my problem
But, I am now running into performance issue with this extra check any suggestion how to gain performance
Below python function is used to generate excel xlsx from the database cursor
PS: Quick background of how I am proessing:
I am opening a loop which is processing only 50,000 rows and then calls this python excel writer function:
I've close to 900 such table and data counts in these ranges from 1,00,000 to 3,20,00,00,000
I am newbie to python and only think of executing the same python code for separate code parallel 10 times.
Guidance would be much appreciated
#Function to write excel from Oracle Cursor
def writeToExcel(cur_sor, targetDir, export_file_name):
Actual_Path = os.path.join(targetDir, export_file_name)
#Array to capture Date type columns
DateTimeColumns = []
print('\t\t\t writing: '+export_file_name+'\t\t\t\t'+str(datetime.datetime.now()))
workbook = xlsxwriter.Workbook(Actual_Path) # Create Excel Object for new workbook
worksheet = workbook.add_worksheet(sourceSYS) # Add a New Worksheet Name - scott_schema
row = 0
col = 0
for i in range(len(cur_sor.description)):
desc = cur_sor.description[i]
#Only Data Type column will be capture
if format(desc[1])== "<class 'cx_Oracle.TIMESTAMP'>":
DateTimeColumns.append(i)
bold = workbook.add_format({'bold': True})
date_format = workbook.add_format({'num_format': 'dd/mm/yy'})
worksheet.write(row, (col + i), format(desc[0]), bold) # Iterate for column headers
date_format = workbook.add_format({'num_format': 'dd/mm/yy'})
color_format = workbook.add_format()
color_format.set_font_color('red')
row = row + 1
#Loop for each row, return by database cursor
for tupple_row in cur_sor:
col = 0
#Loop for each column, for particular row open in level-up cursor
for list_item in tupple_row:
#If column position matches with datetype column position
if col in DateTimeColumns:
#Check if the cell value is date type, additional check to handle "None" (blank date cell)
if isinstance(list_item, datetime.date) or isinstance(list_item, datetime.datetime) \
or isinstance(list_item, datetime.time) or isinstance(list_item, datetime.timedelta):
#Format the date, inorder to save as date rather number
worksheet.write(row, col, list_item.strftime("%Y-%b-%d %H:%M:%S.%f"))
else:
worksheet.write(row, col, list_item)
col = col + 1
row = row + 1
workbook.close()
