Resultset Object Has No Attribute 'get'
Hi, I'm currently trying to scrape this https://www.sec.gov/ix?doc=/Archives/edgar/data/1090727/000109072720000003/form8-kq42019earningsr.htm SEC link with BeautifulSoup to get the
Solution 1:
Basically, the page is rendered dynamically via JavaScript after it loads, so you will not be able to parse those elements until the page has been rendered. The requests module does not execute JavaScript, so the HTML it returns does not contain the links you are looking for.
You can use Selenium to render the page first. Alternatively, you can use HTMLSession from the requests_html module to render it on the fly.
from selenium import webdriver
from selenium.webdriver.firefox.options import Options
from bs4 import BeautifulSoup
import re
from time import sleep
options = Options()
options.add_argument('--headless')  # run Firefox without opening a window
driver = webdriver.Firefox(options=options)
driver.get("https://www.sec.gov/ix?doc=/Archives/edgar/data/1090727/000109072720000003/form8-kq42019earningsr.htm")
sleep(1)  # give the page's JavaScript a moment to render
# Parse the rendered HTML, not the raw response
soup = BeautifulSoup(driver.page_source, 'html.parser')
# find_all() returns a list of tags, so call .get() on each tag in turn
for item in soup.find_all("a", style=re.compile("^text")):
    print(item.get("href"))
driver.quit()
Output:
https://www.sec.gov/Archives/edgar/data/1090727/000109072720000003/exhibit991-q42019earni.htm
https://www.sec.gov/Archives/edgar/data/1090727/000109072720000003/exhibit992-q42019finan.htm
However, if you want just the first URL:
url = soup.find("a", style=re.compile("^text")).get("href")
print(url)
Output:
https://www.sec.gov/Archives/edgar/data/1090727/000109072720000003/exhibit991-q42019earni.htm
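One thing to watch with find(): it returns None when nothing matches, so chaining .get() directly onto it raises an AttributeError on pages where the link is missing. A minimal sketch of a guard, assuming a made-up HTML snippet and a hypothetical helper named first_href (neither comes from the SEC page or the answer above):

```python
import re
from bs4 import BeautifulSoup

# Hypothetical markup for illustration; the real page's inline styles may differ.
html = '<a style="text-decoration:underline" href="https://example.com/a.htm">Exhibit</a>'
soup = BeautifulSoup(html, "html.parser")


def first_href(soup, pattern):
    """Return the href of the first <a> whose style matches pattern, else None."""
    tag = soup.find("a", style=re.compile(pattern))
    # find() gives back a single Tag or None, so check before calling .get()
    return tag.get("href") if tag is not None else None


print(first_href(soup, "^text"))   # a style starting with "text" exists
print(first_href(soup, "^color"))  # no such style, so the helper returns None
```

Returning None here is just one choice; raising a clearer error would work equally well if a missing link should stop the script.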
Solution 2:
Your issue is that soup3.find_all() returns a ResultSet, which is a list of results, and you are calling the .get() method on that list. .get() only works on a single item from it.
Try something like iterating through them and printing each one:
pressting = soup3.find_all("a", string="UPS")
for i in pressting:
    print(i.get('href'))
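To see the difference without touching the network, here is a small sketch run against an inline HTML snippet. The markup and the example.com URLs are invented for illustration; only find_all(), the string= filter, and .get() are actual BeautifulSoup calls.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page.
html = """
<p><a href="https://example.com/exhibit-1.htm">UPS</a></p>
<p><a href="https://example.com/exhibit-2.htm">UPS</a></p>
"""
soup = BeautifulSoup(html, "html.parser")

# find_all() returns a ResultSet, which behaves like a list of Tag objects.
links = soup.find_all("a", string="UPS")

try:
    links.get("href")  # wrong: .get() belongs to a single Tag, not the list
    error = None
except AttributeError as exc:
    error = str(exc)  # the same error as in the question title

# Right: call .get() on each Tag individually.
hrefs = [a.get("href") for a in links]
print(hrefs)
```

If only one link is expected, indexing (links[0]) or using find() instead of find_all() avoids the loop altogether.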