I'm relatively new to Python and I'm trying to scrape some selling data from StubHub. An example of this data can be seen here:
https://sell.stubhub.com/sellapi/event/4236070/section/null/seatmapdata
You'll notice that if you try to visit that URL without logging into stubhub.com first, it won't work. You can sign in here:
https://myaccount.stubhub.com/login/Signin
Once I've signed in via my web browser and opened the URL I want to scrape in a new tab, this:
r = requests.get('https://sell.stubhub.com/sellapi/event/4236070/section/null/seatmapdata')
returns the scraped data successfully. However, once the browser session expires after ten minutes, I get this:
<FormErrors>
<FormField>User Auth Check</FormField>
<ErrorMessage>
Either is not active or the session might have expired. Please login again.
</ErrorMessage>
So I think I need to send the session ID in a cookie with each request to keep my authentication alive. The Requests library documentation is pretty terrible for someone who has never done this sort of thing before, so I was hoping you folks might be able to help. The example provided by Requests is:
s = requests.Session()
s.get('http://httpbin.org/cookies/set/sessioncookie/123456789')
r = s.get("http://httpbin.org/cookies")
print(r.text)
# '{"cookies": {"sessioncookie": "123456789"}}'
and I honestly can't make heads or tails of that. Any assistance would be appreciated!
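For what it's worth, here is a minimal sketch of what the Requests example seems to be getting at: a requests.Session keeps a cookie jar, so any cookie set by one response is sent back automatically on later requests made through the same Session. The form field names ("username", "password") and the idea that a plain POST to the sign-in URL is enough are assumptions on my part, not something taken from StubHub; the real login request would need to be checked in the browser's developer tools. The helper names are made up for illustration.

```python
import requests

# URLs taken from the question. Whether the sign-in page accepts a simple
# form POST like this is an assumption.
LOGIN_URL = "https://myaccount.stubhub.com/login/Signin"
DATA_URL = "https://sell.stubhub.com/sellapi/event/4236070/section/null/seatmapdata"


def make_logged_in_session(username, password):
    """Log in once and return the Session holding any cookies the site set."""
    session = requests.Session()
    # Hypothetical field names; the real form may use others or hidden fields.
    payload = {"username": username, "password": password}
    response = session.post(LOGIN_URL, data=payload)
    response.raise_for_status()
    return session


def fetch_seatmap(session):
    """Fetch the data through the same Session so its cookies go along."""
    response = session.get(DATA_URL)
    response.raise_for_status()
    return response.text


def cookie_header_for(session, url):
    """Return the Cookie header the Session would send to url (no network call)."""
    prepared = session.prepare_request(requests.Request("GET", url))
    return prepared.headers.get("Cookie")
```

Under those assumptions, usage would be something like s = make_logged_in_session("me@example.com", "secret") followed by fetch_seatmap(s). If the session expires, calling make_logged_in_session again would get fresh cookies. The cookie_header_for helper is only there to inspect what the Session would send, which is essentially what the httpbin example demonstrates.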