Python Forum
[SOLVED] Header to query Amazon?
#1
Hello,

I'm trying to query Amazon, but the request fails even though I'm sending the same header information my browser sends, as reported by Whatismybrowser.

Does anyone know what I could try instead?

Thank you.

import requests
from bs4 import BeautifulSoup
import datetime
from datetime import datetime

HEADERS = '''
({
"ACCEPT":"text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7",
"ACCEPT-ENCODING":"gzip, deflate, br, zstd",
"ACCEPT-LANGUAGE":"en-US,en;q=0.9,fr;q=0.8,de;q=0.7",
"HOST":"www.amazon.com",
"REFERER":"https://www.amazon.com/",
"SEC-CH-PREFERS-COLOR-SCHEME":"light",
"SEC-CH-PREFERS-REDUCED-MOTION":"no-preference",
"SEC-CH-UA":""Not;A=Brand";v="99", "Google Chrome";v="139", "Chromium";v="139"",
"SEC-CH-UA-ARCH":""x86"",
"SEC-CH-UA-FULL-VERSION":""139.0.7258.155"",
"SEC-CH-UA-MOBILE":"?0",
"SEC-CH-UA-MODEL":"""",
"SEC-CH-UA-PLATFORM":""Windows"",
"SEC-CH-UA-PLATFORM-VERSION":""15.0.0"",
"SEC-FETCH-DEST":"document",
"SEC-FETCH-MODE":"navigate",
"SEC-FETCH-SITE":"cross-site",
"SEC-FETCH-USER":"?1",
"UPGRADE-INSECURE-REQUESTS":"1",
"USER-AGENT":"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/139.0.0.0 Safari/537.36",
"VIEWPORT-WIDTH":"1208"
})'''

URL = "https://www.amazon.com/s?k=123&i=stripbooks"
try:
	reqs = requests.get(URL, headers=HEADERS)
except:
	error = f"{datetime.now()} Failed downloading {URL}"
	print(error)
	exit()

soup = BeautifulSoup(reqs.text, 'lxml')
print(soup)
#2
Headers must be a dict; you're passing a string (the triple quotes turn the whole block into one string literal).
You're also copying a bunch of browser/HTTP2-only headers that don't help (and can hurt).
Try this:
import requests
from bs4 import BeautifulSoup
from pprint import pprint

URL = "https://www.amazon.com/s?k=123&i=stripbooks"

headers = {
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
        "AppleWebKit/537.36 (KHTML, like Gecko) "
        "Chrome/121.0.0.0 Safari/537.36"
    ),
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "en-US,en;q=0.9",
    # Don't set Accept-Encoding; let requests manage it
    "Referer": "https://www.amazon.com/",
}

try:
    with requests.Session() as s:
        s.headers.update(headers)
        r = s.get(URL, timeout=15)
        print("Status:", r.status_code, "URL:", r.url)
        r.raise_for_status()
except Exception as e:
    print(f"Failed downloading {URL}: {e}")
    raise

soup = BeautifulSoup(r.text, "lxml")
all_title = soup.select('a > h2 > span')
pprint(all_title)
Output:
Status: 200 URL: https://www.amazon.com/s?k=123&i=stripbooks
[<span>123 Count with Me: An Interactive Numbers Book With Tracks to Trace and Flaps to Flip! (Smart Kids Trace-and-flip)</span>,
 <span>123</span>,
 <span>123 ZOOM</span>,
 <span>123: Learn to Count with Songs and Rhymes</span>,
 <span>1-2-3 Magic: Gentle 3-Step Child &amp; Toddler Discipline for Calm, Effective, and Happy Parenting (Positive Parenting Guide for Raising Happy Kids)</span>,
 <span>ABC &amp; 123 Learning Songs: Interactive Children's Sound Book (11 Button Sound) (11 Button Sound Book)</span>,
 <span>123 Counting Sticker Book (My Little World)</span>,
 <span>123: Let's Count! (Learning &amp; Laughing)</span>,
 <span>123 Count with me (Early learning is fun!)</span>,
 <span>My First 123 (My First Board Books)</span>,
 <span>Museum 123</span>,
 <span>123 New York (Cool Counting Books)</span>,
 <span>ABC-123 Fingerspelling and Numbering in ASL (Student Workbook)</span>,
 <span>1-2-3 Magic: Effective Discipline for Children 2-12</span>,
 <span>1-2-3 Magic Parenting Book Set: The Original Gentle Parenting Program Beloved by Millions of Parents (Parenting Toddlers and School Age Kids)</span>,
 <span>123 Counting With Roger And Friends</span>]
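If you'd rather start from the headers your browser reports and trim them down, here is a minimal sketch of the two points above: check that the headers really are a dict, and keep only the few that matter. The `clean_headers` helper and the `KEEP` whitelist are hypothetical names made up for this example, not part of requests, and the whitelist simply mirrors the headers used in the code above.

```python
from collections.abc import Mapping

# Hypothetical whitelist: the handful of headers used in the code above.
KEEP = {"user-agent", "accept", "accept-language", "referer"}


def clean_headers(headers):
    """Return a plain dict containing only the whitelisted header names.

    Raises TypeError early if `headers` is not a mapping, for example a
    triple-quoted string that merely looks like a dict.
    """
    if not isinstance(headers, Mapping):
        raise TypeError(
            f"headers must be a dict-like mapping, got {type(headers).__name__}"
        )
    return {
        name.title(): value  # "USER-AGENT" -> "User-Agent"
        for name, value in headers.items()
        if name.lower() in KEEP
    }


# A few of the headers from the first post, this time as a real dict.
browser_headers = {
    "ACCEPT-LANGUAGE": "en-US,en;q=0.9",
    "SEC-CH-UA-MOBILE": "?0",
    "USER-AGENT": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "VIEWPORT-WIDTH": "1208",
}

cleaned = clean_headers(browser_headers)
print(cleaned)
```

The result can then be passed to `s.headers.update(cleaned)` as in the code above. Raising on a non-dict up front also makes the original mistake visible right away instead of letting a bare `except:` swallow it.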
#3
Thank you.