Python Forum
Struggling to Scrape Structured Lottery Results from Dynamic Page
#1
I’ve been trying to scrape the results from a star49s website and I’m running into some unexpected problems that I can’t seem to debug properly.

Now the issue is that when I try to fetch the data using requests and BeautifulSoup, I can’t consistently get the numbers inside the result boxes. Sometimes the HTML looks fine in view-source, but when I request it programmatically, either the tags are missing or the response is incomplete. This makes me think that the website is possibly rendering results dynamically via JavaScript rather than being fully static HTML.

I also tried using requests_html and Selenium, but Selenium feels like overkill and way too slow for something that should be straightforward. Another confusing part is that when I open the Network tab, I can’t find a direct JSON API endpoint that contains just the draw numbers, even though the site must be pulling them from somewhere.

Here’s a sample code snippet:

import requests
from bs4 import BeautifulSoup

url = "https://star49s.com/results/lunchtime"
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"
}

response = requests.get(url, headers=headers)
soup = BeautifulSoup(response.text, "html.parser")

results = soup.find_all("div", class_="result-number")
for r in results:
    print(r.text.strip())
The problem is that div.result-number often comes back empty when I run this script, even though when I check the page manually in Chrome DevTools, I can clearly see all the numbers inside those divs.

So now I’m stuck wondering if the site is hiding the data behind some obfuscated JavaScript rendering, or if I’m just missing the right request headers / cookies to get the server to return the actual rendered content. I’ve even tried adding delays, changing headers, and simulating a real browser, but the behavior is inconsistent. Sometimes I’ll get partial results, sometimes none at all.

My main confusion is: how do I reliably extract the Lunchtime numbers from this page without relying on a headless browser (like Selenium or Playwright)? Ideally, I just want to pull them using requests in a clean way, but right now it feels like I’m missing something obvious.

Has anyone faced a similar situation with sites that half-render their data, or does anyone know if Star49s exposes an API endpoint for these results? Any tips on debugging these kinds of scraping roadblocks would be really helpful.
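One way to narrow this down is to classify the raw response before parsing it: are the target elements missing, present but empty, or populated? The sketch below is a minimal, standard-library check under stated assumptions. The helper name `diagnose_html` is hypothetical, the class name `result-number` is taken from the script above, and the regex assumes the target divs contain no nested divs.

```python
import re


def diagnose_html(html: str, css_class: str = "result-number") -> str:
    """Classify a raw HTML response by how the target divs appear in it.

    Hypothetical helper: assumes the target <div> elements are not nested.
    """
    pattern = re.compile(
        r'<div[^>]*class="[^"]*\b' + re.escape(css_class) + r'\b[^"]*"[^>]*>(.*?)</div>',
        re.DOTALL,
    )
    bodies = pattern.findall(html)
    if not bodies:
        # The elements are absent: the markup differs from what the browser
        # shows, or the page builds them on the client side.
        return "missing"
    if all(not body.strip() for body in bodies):
        # Only placeholders arrived: the values are probably filled in later
        # by JavaScript.
        return "empty"
    # The values are already in the static HTML, so requests should suffice.
    return "populated"


if __name__ == "__main__":
    print(diagnose_html('<div class="result-number">16</div>'))
    print(diagnose_html('<div class="result-number"></div>'))
    print(diagnose_html("<p>no results here</p>"))
```

Calling it on `response.text`, and logging `response.status_code` and `len(response.text)` alongside it over several runs, should show whether the inconsistency comes from the server's response or from the parsing step.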
#2
You can do it like this; div:nth-child(2) will be the next day's numbers, etc.
import requests
from bs4 import BeautifulSoup

url = "https://star49s.com/results/lunchtime"
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"
}

response = requests.get(url, headers=headers)
soup = BeautifulSoup(response.text, "html.parser")
numbers = soup.select_one("div:nth-child(1) > div > div.flex.justify-center > div")
>>> lotto_numbers, bonus = numbers.text.split('Bonus:')
>>> bonus
'28'
>>> # Slice the concatenated digits two at a time (assumes two-digit numbers)
>>> lotto_numbers = [int(lotto_numbers[i:i+2]) for i in range(0, len(lotto_numbers), 2)]
>>> lotto_numbers
[16, 19, 24, 29, 35, 41]
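The two-character slicing above silently produces wrong numbers if the text ever has an odd number of digits or lacks the 'Bonus:' marker. A slightly more defensive version of the same technique might look like the sketch below. The function name `parse_draw` is hypothetical, and it still assumes every main number is rendered with exactly two digits, which the sample output suggests but the site does not guarantee.

```python
def parse_draw(text: str, width: int = 2) -> tuple[list[int], int]:
    """Split text such as '161924293541Bonus:28' into (main numbers, bonus).

    Assumption: each main number occupies exactly `width` digits.
    """
    main_part, sep, bonus_part = text.partition("Bonus:")
    if not sep:
        raise ValueError("no 'Bonus:' marker found in text")
    # Drop any whitespace that the markup left between the numbers.
    digits = "".join(main_part.split())
    if not digits.isdigit() or len(digits) % width != 0:
        raise ValueError(f"unexpected main-number text: {main_part!r}")
    numbers = [int(digits[i:i + width]) for i in range(0, len(digits), width)]
    return numbers, int(bonus_part.strip())


if __name__ == "__main__":
    print(parse_draw("161924293541Bonus:28"))
```

With this, a layout change on the page raises a `ValueError` instead of yielding a plausible-looking but wrong list.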
#3
To scrape structured lottery results from a dynamic page:

Check Network Requests: Use your browser’s developer tools to see if the data is loaded from an API. If it is, scrape it directly via an API request using requests or axios.

Use Selenium/Playwright: If the data is rendered by JavaScript, use tools like Selenium or Playwright to simulate a browser and wait for the content to load before scraping.

Look for Structured Data: Check for JSON-LD, Microdata, or similar structured data in the page source.

Respect Robots.txt: Ensure the site allows scraping by reviewing its robots.txt.

These methods should help you get the data you need.
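Two of the checks above, the robots.txt rule and the JSON-LD lookup, can be sketched with the standard library alone. This is only an illustration: the helper names `allowed_by_robots` and `extract_json_ld` are hypothetical, both work on text that has already been downloaded, and whether star49s publishes any JSON-LD at all is not known from this thread.

```python
import json
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


def allowed_by_robots(robots_txt: str, user_agent: str, url: str) -> bool:
    """Check a URL against robots.txt text that has already been fetched."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


class JsonLdCollector(HTMLParser):
    """Collect the parsed contents of <script type="application/ld+json">."""

    def __init__(self):
        super().__init__()
        self._in_json_ld = False
        self._chunks = []
        self.items = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_json_ld = True
            self._chunks = []

    def handle_data(self, data):
        if self._in_json_ld:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_json_ld:
            self._in_json_ld = False
            try:
                self.items.append(json.loads("".join(self._chunks)))
            except json.JSONDecodeError:
                pass  # skip script blocks that do not hold valid JSON


def extract_json_ld(html: str) -> list:
    """Return every JSON-LD object found in the given HTML text."""
    collector = JsonLdCollector()
    collector.feed(html)
    return collector.items
```

In practice the robots.txt text and the page HTML would come from two `requests.get(...)` calls; keeping the helpers free of network access makes them easy to try against saved copies of both files.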
#4
(Aug-27-2025, 08:51 AM)Fliminio Wrote: without relying on a headless browser (like Selenium or Playwright)
Generally speaking, a headless browser is the way to go for any webpage that is built dynamically in the browser, which is pretty common nowadays for websites like this. You may also need to take into consideration that websites obfuscate their code and JavaScript actions to prevent scraping, or at least make it much harder.

But looking at the HTML in the browser, it very much looks like the results are posted as an SVG graphic on the page. So you may want to try extracting all SVG graphics from the page and then querying the XML to extract the results. Python has XML tools on board (https://docs.python.org/3/library/xml.html), but I think the more common, very popular third-party module is lxml.
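A rough sketch of that idea, using only the built-in XML tools: pull each inline `<svg>` fragment out of the HTML and read its `<text>` elements. The function name `svg_text_values` is hypothetical, and it is an assumption that the numbers live in `<text>` elements at all. If the site draws the digits as paths, this approach returns nothing.

```python
import re
import xml.etree.ElementTree as ET

SVG_NS = "{http://www.w3.org/2000/svg}"


def svg_text_values(html: str) -> list[str]:
    """Return the text of every <text> element inside inline <svg> blocks."""
    values = []
    # Pull each inline <svg>...</svg> fragment out of the page as a string.
    fragments = re.findall(r"<svg\b.*?</svg>", html, flags=re.DOTALL | re.IGNORECASE)
    for fragment in fragments:
        try:
            root = ET.fromstring(fragment)
        except ET.ParseError:
            continue  # SVG embedded in HTML is not always well-formed XML
        for element in root.iter():
            # Tags carry the namespace prefix when the fragment declares xmlns.
            if element.tag in ("text", SVG_NS + "text"):
                content = "".join(element.itertext()).strip()
                if content:
                    values.append(content)
    return values


if __name__ == "__main__":
    # Invented sample markup, not copied from the real page.
    sample = (
        '<div><svg xmlns="http://www.w3.org/2000/svg" width="40" height="40">'
        '<circle cx="20" cy="20" r="18"/><text x="20" y="25">16</text></svg></div>'
    )
    print(svg_text_values(sample))
```

lxml would be more forgiving of malformed markup; the standard-library parser is used here only to keep the example dependency-free.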

(Aug-27-2025, 08:51 AM)Fliminio Wrote: Ideally, I just want to pull them using requests in a clean way,
I guess the only way to use requests the "clean" way is if the provider of the web page offers an official API to be queried. It doesn't look like star49s is providing that, but you may want to double check with their support.

Regards, noisefloor

