Mar-22-2024, 11:07 AM
(This post was last modified: Mar-23-2024, 08:07 AM by Gribouillis.)
Hello there,
First of all, I am not a Python expert; I learned it ad hoc. I have a script that scrapes the name, the sizes, and the price for each size of a product. I create an Excel file of links that the script reads; it opens each link, gets the data, and saves it in a new Excel file. The issue is that, when run on a server, the script is not able to click on the cookie popup that shows up and accept the cookies. When I run it on my PC, the script works fine.
Here is the code:
import time
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import StaleElementReferenceException
from bs4 import BeautifulSoup
from xlwt import Workbook
import pandas as pd
from selenium.webdriver.chrome.options import Options
# Set up Chrome options for headless browsing
chrome_options = Options()
chrome_options.add_argument("--headless")
chrome_options.add_argument("--no-sandbox") # Add this line if running as root user
# Initialize WebDriver
driver = webdriver.Chrome(options=chrome_options)
# Reading the file where the links of products are saved
df = pd.read_excel('OttyDuvets_Links.xlsx')
# Create a single Workbook and Worksheet
wb = Workbook()
ws = wb.add_sheet('Sheet1')
# Defining the Headers in Excel Worksheet
ws.write(0, 0, 'Name')
ws.write(0, 1, 'Size')
ws.write(0, 2, 'Price')
row = 1
# Iterate through all size options using a for loop
for i in range(len(df['Links'])):
    driver.get(df['Links'][i])
    time.sleep(5)
    # Wait for the "cc-btn-decision" class to appear and then click on it
    try:
        decision_btn = WebDriverWait(driver, 5).until(
            EC.presence_of_element_located((By.XPATH, '/html/body/div[2]/div/div[2]/a[3]'))
        )
        decision_btn.click()
    except Exception as e:
        print("Error clicking on 'cc-btn-decision':", str(e))
    time.sleep(3) # Adjust this delay as needed
    # Parsing the source code of the webpage
    soup = BeautifulSoup(driver.page_source, 'html.parser')
    # Name of the Product
    name_element = soup.find('h1')
    name = name_element.text.strip() if name_element else "Name not found"
    print(name)
    # Find the button that triggers the dropdown
    button = driver.find_element(By.XPATH, '/html[1]/body[1]/main[1]/section[1]/section[1]/div[1]/div[2]/product-info[1]/div[3]/variant-selects[1]/div[1]/div[1]/select[1]')
    button.click()
    # Add a delay to allow the dropdown to appear
    time.sleep(3)
    # Iterate through all size options using a for loop
    for j in range(len(driver.find_elements(By.XPATH, '/html[1]/body[1]/main[1]/section[1]/section[1]/div[1]/div[2]/product-info[1]/div[3]/variant-selects[1]/div[1]/div[1]/select[1]/option'))):
        # Find the dropdown container and size options each time
        dropdown_container = driver.find_element(By.CLASS_NAME, 'select')
        size_options = dropdown_container.find_elements(By.TAG_NAME, 'option')
        option = size_options[j]
        # Get the size from the option
        selected_size = option.text.strip()
        print(selected_size)
        # Scroll into view using JavaScript
        driver.execute_script("arguments[0].scrollIntoView();", option)
        time.sleep(1)
        # Click on the size option to select it
        try:
            option.click()
        except StaleElementReferenceException:
            # If the element is stale, re-find the dropdown and the option
            dropdown_container = driver.find_element(By.CLASS_NAME, 'select')
            size_options = dropdown_container.find_elements(By.TAG_NAME, 'option')
            option = size_options[j]
            option.click()
        time.sleep(3)
        # Find all the price elements for each selected size
        price_elements = driver.find_elements(By.XPATH,
            '/html[1]/body[1]/main[1]/section[1]/section[1]/div[1]/div[2]/product-info[1]/div[3]/div[1]/div[1]/div[1]/span[2]')
        # Iterate through all price elements for the selected size
        for price_element in price_elements:
            price = price_element.text.strip()
            print(price)
            # Saving the name, size, and price of the product in the worksheet
            ws.write(row, 0, name)
            ws.write(row, 1, selected_size)
            ws.write(row, 2, price)
            row += 1
        # Click on the dropdown again to show the rest of the options
        driver.find_element(By.XPATH, '/html[1]/body[1]/main[1]/section[1]/section[1]/div[1]/div[2]/product-info[1]/div[3]/variant-selects[1]/div[1]/div[1]/select[1]').click()
        time.sleep(3)
# Save the single Workbook with its name and type
wb.save('OttyDuvets_Details.xls')
# Close the browser
driver.quit()
This is the error log I get:
root@ip-xxx-xx-xx-xxx:/path/path/scrappingProjects/Otty# python3 scrappingOttyDuvets.py
Deluxe Microfibre Duvet
Traceback (most recent call last):
File "scrappingOttyDuvets.py", line 50, in <module>
button.click()
File "/usr/local/lib/python3.8/dist-packages/selenium/webdriver/remote/webelement.py", line 94, in click
self._execute(Command.CLICK_ELEMENT)
File "/usr/local/lib/python3.8/dist-packages/selenium/webdriver/remote/webelement.py", line 395, in _execute
return self._parent.execute(command, params)
File "/usr/local/lib/python3.8/dist-packages/selenium/webdriver/remote/webdriver.py", line 347, in execute
self.error_handler.check_response(response)
File "/usr/local/lib/python3.8/dist-packages/selenium/webdriver/remote/errorhandler.py", line 229, in check_response
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.ElementClickInterceptedException: Message: element click intercepted: Element <select id="Option-template--14439267893323__main-0" class="select__select" name="options[Size]" form="product-form-template--14439267893323__main" data-forced-variant="">...</select> is not clickable at point (600, 580). Other element would receive the click: <iframe referrerpolicy="origin" id="attentive_creative" title="Sign Up via Text for Offers" src="https://creatives.attn.tv/creatives-dynamic/multiPage/index.html" style="width: 100%; height: 100%; position: fixed; inset: 0px; opacity: 1; transition: opacity 500ms ease 0s; border: 0px; overflow: hidden; visibility: visible;"></iframe>
(Session info: chrome-headless-shell=122.0.6261.128)
Stacktrace:
#0 0x56443887bf33 <unknown>
#1 0x564438573ce6 <unknown>
#2 0x5644385c5a73 <unknown>
#3 0x5644385c398e <unknown>
#4 0x5644385c1367 <unknown>
#5 0x5644385c0732 <unknown>
#6 0x5644385b3d27 <unknown>
#7 0x5644385e0eb2 <unknown>
#8 0x5644385b36b8 <unknown>
#9 0x5644385e107e <unknown>
#10 0x5644385ff899 <unknown>
#11 0x5644385e0c53 <unknown>
#12 0x5644385b1db3 <unknown>
#13 0x5644385b277e <unknown>
#14 0x56443884186b <unknown>
#15 0x564438845885 <unknown>
#16 0x56443882f181 <unknown>
#17 0x564438846412 <unknown>
#18 0x56443881325f <unknown>
#19 0x56443886a528 <unknown>
#20 0x56443886a723 <unknown>
#21 0x56443887b0e4 <unknown>
#22 0x7f7d88b75609 start_thread
To me, it looks like it cannot click on the dropdown element because of the overlay of the cookie popup. Can someone help? I also have a few other scripts that work great on my PC but not on the server. The URL I am trying to scrape is: https://otty.com/products/luxury-microfi...uble-duvet
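For reference, here is a minimal sketch of one possible workaround. It assumes the blocking element is the iframe with id `attentive_creative` that the error message names as the element that "would receive the click". The helper name `remove_overlay` is hypothetical; it relies only on `driver.execute_script`, which Selenium WebDriver provides. It has not been tried against the site, so treat it as an idea to test rather than a confirmed fix.

```python
def remove_overlay(driver, element_id="attentive_creative"):
    """Remove a blocking overlay element from the page by its id.

    `driver` is any Selenium WebDriver instance. The default id is taken
    from the error log above (an assumption about what blocks the click).
    Returns True if an element was found and removed, False otherwise.
    """
    script = (
        "const el = document.getElementById(arguments[0]);"
        "if (el) { el.remove(); return true; }"
        "return false;"
    )
    # execute_script passes extra arguments to the page as arguments[0], ...
    return bool(driver.execute_script(script, element_id))
```

It could be called as `remove_overlay(driver)` after `driver.get(...)` and before `button.click()`. The popup may be injected some time after the page loads, so the call might need to be repeated just before each click.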
Gribouillis write Mar-23-2024, 08:07 AM:
Please post all code, output and errors (in its entirety) between their respective tags. Refer to BBCode help topic on how to post. Use the "Preview Post" button to make sure the code is presented as you expect before hitting the "Post Reply/Thread" button.
