Feb-13-2024, 06:45 PM
(This post was last modified: Feb-13-2024, 06:46 PM by MasterOfDestr.)
I need to create excel report (update data in existing file) based on variety source data (also excel files). All data will be .xlsx. In general steps that I'll do are:
1.Download settings information;
2.Create "starting" reports based on settings information;
3.Filter & Copy & paste data to created reports (in some of the cases data starts from as example B3:T10 - so not always A1);
4.Copy & paste (autofill) formulas.
5.Launch macro (if necessary I can try rewrite macro code in Python)
I've tried to use openpyxl and pandas but I find it as not stable solution. As example if I run code for 1st time it's working but in 2nd approach I receive errors like
-"BadZipFile: File is not a zip file" -> (if I upload starting excel that will be updated then issue disappears);
-"template = load_workbook("_MAT_TEMPLATE.xlsx") KeyError: "There is no item named '[Content_Types].xml' in the archive" -> again uploading file resolves the issue;
-ValueError: I/O operation on closed file.-> issue when I want to save excel after pasting pandas dataframe - not solved yet.
As I haven't work on such complex topic yet, could you advice what library I could use for working on such task? I've tired openpyxl and pandas but I feel that they are somehow unstable. I've learned about Spire.XLS but seems that this is paid solution. If you also know already existing project on github that I could use as "inspiration" I would be also grateful.
Best regards, Jakub
What I did so far is
1.Download settings information;
2.Create "starting" reports based on settings information;
3.Filter & Copy & paste data to created reports (in some of the cases data starts from as example B3:T10 - so not always A1);
4.Copy & paste (autofill) formulas.
5.Launch macro (if necessary I can try rewrite macro code in Python)
I've tried to use openpyxl and pandas but I find it as not stable solution. As example if I run code for 1st time it's working but in 2nd approach I receive errors like
-"BadZipFile: File is not a zip file" -> (if I upload starting excel that will be updated then issue disappears);
-"template = load_workbook("_MAT_TEMPLATE.xlsx") KeyError: "There is no item named '[Content_Types].xml' in the archive" -> again uploading file resolves the issue;
-ValueError: I/O operation on closed file.-> issue when I want to save excel after pasting pandas dataframe - not solved yet.
As I haven't work on such complex topic yet, could you advice what library I could use for working on such task? I've tired openpyxl and pandas but I feel that they are somehow unstable. I've learned about Spire.XLS but seems that this is paid solution. If you also know already existing project on github that I could use as "inspiration" I would be also grateful.
Best regards, Jakub
What I did so far is
from openpyxl.workbook import workbook
from openpyxl import load_workbook
import pandas as pd
import xlsxwriter
xlPY = load_workbook("FX_rates_PY.xlsx")
wsPY = xlPY["englisch"]
FXPY = wsPY["B7:u32"]
template = load_workbook("_MAT_TEMPLATE.xlsx")
templateFXPY = template["FX rates_PY"]
materials = template["Materials"]
dest_rangePY = templateFXPY["A1:T26"]
templateFXCY = template["FX rates_CY"]
dest_rangeCY = templateFXCY["A1:T26"]
for i, row in enumerate(FXPY):
for j, cell in enumerate(row):
dest_rangePY[i][j].value=cell.value
xlCY = load_workbook("FX_rates_CY.xlsx")
wsCY = xlCY["englisch"]
FXCY = wsCY["B7:U32"]
for i, row in enumerate(FXCY):
for j, cell in enumerate(row):
dest_rangeCY[i][j].value=cell.value
cover_sheet = template["Cover"]
cover_sheet["C66"] = "Y"
cover_sheet["C55"] = "Y"
Month = cover_sheet['c7'].value
CurrentYear = cover_sheet['c5'].value
Bot_input = template["BOT Input"]
data_values = []
for row in Bot_input:
data_values.append([cell.value for cell in row])
PlantDataClean = pd.DataFrame(data_values[1:], columns=data_values[0])
PlantDataClean = PlantDataClean[PlantDataClean["GRIP Plant"].notna()]
PlantDataClean['GRIP Plant'] = PlantDataClean['GRIP Plant'].astype(int)
print(PlantDataClean)
erp_value = PlantDataClean['ERP'].values[0]
SAP_Plant_value = PlantDataClean['SAP Plant'].values[0]
print(SAP_Plant_value)
from openpyxl.utils.dataframe import dataframe_to_rows
if erp_value == "BKP":
BKP = load_workbook("BKP_Material List.xlsx")
BKPsheet = BKP["Sheet1"]
data_values = []
for row in BKPsheet:
data_values.append([cell.value for cell in row])
BKPsheetPD = pd.DataFrame(data_values[1:], columns=data_values[0])
BKPsheetPDfilter = BKPsheetPD[BKPsheetPD["Plant"] ==
SAP_Plant_value]
BKPsheetPDfilter =BKPsheetPDfilter["Material"]
print(BKPsheetPDfilter)
BKPsheetPDfilter.to_excel('_MAT_TEMPLATE.xlsx',sheet_name="sheet1")
#,startcol=2,startrow=0,header=None)
else:
print("S4 not avaiable yet")
template.save("_MAT_TEMPLATE.xlsx")
