Sunday, July 10, 2016

Scraping with Python

I want to get the caption, the number of likes, and the number of comments for the 10 most recent images of a particular Instagram user. With the code below I am only able to get the caption of the latest one.

Code:

from selenium import webdriver
from bs4 import BeautifulSoup
import json, time, re
phantomjs_path = r'C:\Users\ravi.janjwadia\Desktop\phantomjs-2.1.1-windows\bin\phantomjs.exe'
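# Launch the headless PhantomJS browser and load the user's profile page
browser = webdriver.PhantomJS(phantomjs_path)
user = "barackobama"
browser.get('https://instagram.com/' + user)
time.sleep(0.5)  # give the page a moment to load
soup = BeautifulSoup(browser.page_source, 'html.parser')
# The profile data is embedded as JSON in the script tag that assigns window._sharedData
script_tag = soup.find('script', text=re.compile('window._sharedData'))
shared_data = script_tag.string.partition('=')[-1].strip(' ;')
result = json.loads(shared_data)
# Only index 0 of 'nodes' is read here, so only the latest post's caption is printed
print(result['entry_data']['ProfilePage'][0]['user']['media']['nodes'][0]['caption'])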

Result: LAST CALL: Enter for a chance to meet President Obama this summer before tonight's deadline. → Link in profile.
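
One possible direction, as a rough sketch and not a confirmed answer: the code above reads only nodes[0], so slicing the 'nodes' list might give the most recent posts, provided the page embeds at least 10 of them. The helper name recent_posts and the sample dictionary are made up for illustration, and the 'likes' / 'comments' keys with a nested 'count' are an assumption about the JSON layout that should be checked against the real data.

```python
def recent_posts(shared, limit=10):
    """Return (caption, likes, comments) for up to `limit` posts."""
    # Same path as in the code above; index 0 is the latest post
    nodes = shared['entry_data']['ProfilePage'][0]['user']['media']['nodes']
    posts = []
    for node in nodes[:limit]:
        caption = node.get('caption')  # None if a post has no caption
        # ASSUMPTION: counts are stored as {'count': N} under these keys
        likes = node.get('likes', {}).get('count')
        comments = node.get('comments', {}).get('count')
        posts.append((caption, likes, comments))
    return posts


# Made-up sample with the assumed shape, for illustration only
sample = {'entry_data': {'ProfilePage': [{'user': {'media': {'nodes': [
    {'caption': 'first', 'likes': {'count': 3}, 'comments': {'count': 1}},
    {'likes': {'count': 5}, 'comments': {'count': 0}},
]}}}]}}

for caption, likes, comments in recent_posts(sample):
    print(caption, likes, comments)
```

With the real page, this would be called as recent_posts(result), using the result dictionary parsed above.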
