Message326592
Thank you for the quick reply. You are correct about the difficulties of using a universally accepted list. This is one example that generates errors on the server side. Just for the record.
#!/usr/bin/env python3
from urllib.request import Request, urlopen
from urllib.error import URLError

# process SSB data
url1 = 'https://raw.githubusercontent.com/mapnik/test-data/master/csv/points.csv'
url2 = 'https://gitlab.cncf.ci/kubernetes/kubernetes/raw/c69582dffba33e9f1c08ff2fc67924ea90f1448c/test/test_owners.csv'
url3 = 'http://data.ssb.no/api/klass/v1/classifications/131/changes?from=2016-01-01&to=9999-12-31'
headers1 = {'Accept': 'text/csv'}
headers2 = {'Akcept': 'text/csv'}
headers3 = {'Accept': 'tekst/cxv'}
headers4 = {'Accept': '1234'}
req = Request(url3, headers=headers4)
resp = urlopen(req)
content = resp.read().decode(resp.headers.get_content_charset())  # get the character encoding from the server response
print(content)
'''
req = Request(url3, headers=headers3)
urllib.error.HTTPError: HTTP Error 500: Internal Server Error
req = Request(url3, headers=headers4)
urllib.error.HTTPError: HTTP Error 406: Not Acceptable
'''
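For the record, here is a minimal sketch of how a client could report these server-side rejections instead of ending in a traceback. The helper name `fetch_text` and its `opener` parameter are hypothetical, introduced only for illustration, and the fallback to UTF-8 when the server sends no charset is an assumption rather than anything the server guarantees.

```python
from urllib.error import HTTPError
from urllib.request import Request, urlopen


def fetch_text(url, accept, opener=urlopen):
    """Return (status, text). On an HTTP error, return (error code, None).

    `opener` is a hypothetical hook so the function can be exercised
    without network access; by default it is urllib's urlopen.
    """
    req = Request(url, headers={"Accept": accept})
    try:
        with opener(req) as resp:
            # Assumption: fall back to UTF-8 if the server names no charset.
            charset = resp.headers.get_content_charset() or "utf-8"
            return resp.status, resp.read().decode(charset)
    except HTTPError as err:
        # The server rejected the request (e.g. 406 or 500 as shown above).
        return err.code, None
```

If the server still behaves as in the output above, `fetch_text(url3, '1234')` would be expected to return the 406 status with no body, but that depends entirely on the server configuration.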
On Tuesday, September 25, 2018, 8:38:26 AM GMT+2, Karthikeyan Singaravelan <report@bugs.python.org> wrote:
Karthikeyan Singaravelan <tir.karthi@gmail.com> added the comment:
Thanks for the report. I tried similar requests, and other tools like curl behave the same way: Akcept could be a custom header in some use cases, though it could be a typo in this context. As far as I can see from https://tools.ietf.org/html/rfc2616#section-14.1, there is no predefined set of media types that we need to validate against, and validation depends on the server configuration. It's hard for Python to maintain a list of acceptable MIME types for validation across releases. A list of registered MIME types that is updated periodically: https://www.iana.org/assignments/media-types/media-types.xhtml and the RFC for registration: https://tools.ietf.org/html/rfc6838
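To illustrate the difference between checking syntax and checking registration, here is a sketch of a purely syntactic check that a client could run on its own Accept value before sending it. This is not something urllib does. The function names are hypothetical, the token pattern is my reading of the `token` rule in RFC 7230, and parameters such as `;q=0.9` are simply ignored.

```python
import re

# Characters allowed in an HTTP "token" (assumed from RFC 7230, section 3.2.6).
_TOKEN = r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+"
_MEDIA_RANGE = re.compile(rf"{_TOKEN}/{_TOKEN}")


def looks_like_media_range(value):
    """Syntax-only check for type/subtype; parameters after ';' are ignored."""
    media_range = value.split(";", 1)[0].strip()
    return _MEDIA_RANGE.fullmatch(media_range) is not None


def accept_is_well_formed(header_value):
    """True if every comma-separated entry has the form type/subtype."""
    return all(looks_like_media_range(part) for part in header_value.split(","))
```

A check like this flags `1234` and `tekst`, but it accepts `tekst/cxv`, which is well formed and merely unregistered. Catching that case would need the IANA registry, which is the maintenance burden described above.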
Some sample requests from curl with invalid headers.
curl -X GET https://httpbin.org/get -H 'Authorization: Token bc23f14356c114a8ffa319773583426878b7b37f' -H 'Cache-Control: no-cache' -H 'Content-Type: application/json' -H 'Akcept: tekst/csv'
{
  "args": {},
  "headers": {
    "Accept": "*/*",
    "Akcept": "tekst/csv",
    "Authorization": "Token bc23f14356c114a8ffa319773583426878b7b37f",
    "Cache-Control": "no-cache",
    "Connection": "close",
    "Content-Type": "application/json",
    "Host": "httpbin.org",
    "User-Agent": "curl/7.37.1"
  },
  "origin": "182.73.135.26",
  "url": "https://httpbin.org/get"
}
curl -X GET https://httpbin.org/get -H 'Authorization: Token bc23f14356c114a8ffa319773583426878b7b37f' -H 'Cache-Control: no-cache' -H 'Content-Type: application/json' -H 'Accept: tekst'
{
  "args": {},
  "headers": {
    "Accept": "tekst",
    "Authorization": "Token bc23f14356c114a8ffa319773583426878b7b37f",
    "Cache-Control": "no-cache",
    "Connection": "close",
    "Content-Type": "application/json",
    "Host": "httpbin.org",
    "User-Agent": "curl/7.37.1"
  },
  "origin": "182.73.135.26",
  "url": "https://httpbin.org/get"
}
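The same pass-through can be observed on the Python side without any network traffic. This sketch only builds a `Request` object and inspects it. It relies on nothing beyond `urllib.request.Request` storing the headers it is given, and it reuses the httpbin URL and header values from the curl calls above.

```python
from urllib.request import Request

# Build the request only; nothing is sent over the network.
req = Request(
    "https://httpbin.org/get",
    headers={"Akcept": "tekst/csv", "Accept": "tekst"},
)

# urllib keeps both headers as given, without checking names or media types.
for name, value in req.header_items():
    print(f"{name}: {value}")
```

Neither the misspelled header name nor the unregistered media type raises an error at this stage, so any rejection has to come from the server.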
Feel free to chime in if I am missing something here, but I think it's hard for Python to maintain an up-to-date list, and adding a warning or error might break someone's code.
Thanks
----------
nosy: +xtreak
_______________________________________
Python tracker <report@bugs.python.org>
<https://bugs.python.org/issue34777>
_______________________________________
| Date | User | Action | Args |
| 2018-09-27 20:36:23 | tuxcell | set | recipients: + tuxcell, xtreak |
| 2018-09-27 20:36:23 | tuxcell | link | issue34777 messages |
| 2018-09-27 20:36:23 | tuxcell | create | |