python - How do I disguise my web scraper to read adobe pages? - Stack Overflow

I tried to use the following User-Agent disguise: "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/122.0.0.0 Safari/537.36". I also tried downloading the HTML with Selenium and scanning it afterwards:

    from selenium import webdriver
    from bs4 import BeautifulSoup

    # ... inside my app check:
    elif app == "Acrobat Reader":
        options = webdriver.ChromeOptions()
        options.add_argument("--headless")
        options.add_argument("user-agent=Mozilla/5.0")

        driver = webdriver.Chrome(options=options)
        driver.get(url)

        html = driver.page_source

        # temporarily caches the html locally via PowerShell
        with open("C:/PowerShellShit/website.html", "w", encoding="utf-8") as f:
            f.write(html)

        driver.quit()

        soup = BeautifulSoup(html, "html.parser")
        table = soup.find("table")
        version = "Unbekannt"

        if table is not None:  # guard: page may not have loaded a table at all
            rows = table.find_all("tr")
            if len(rows) > 1:
                version = rows[1].find_all("td")[1].text.strip()
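For what it's worth, the table-extraction step can be isolated and tested without hitting the network at all. Below is a minimal sketch; the sample HTML and the `extract_version` helper are made up for illustration and do not come from the real Adobe page:

```python
from bs4 import BeautifulSoup

# Made-up stand-in for the downloaded page; the real Adobe table may differ.
SAMPLE_HTML = """
<table>
  <tr><th>Track</th><th>Version</th></tr>
  <tr><td>Continuous</td><td>24.001.20615</td></tr>
</table>
"""

def extract_version(html):
    """Return the second cell of the first data row, or 'Unbekannt' if absent."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table")
    if table is None:  # timeout/error page: no table was ever rendered
        return "Unbekannt"
    rows = table.find_all("tr")
    if len(rows) < 2:
        return "Unbekannt"
    cells = rows[1].find_all("td")
    return cells[1].text.strip() if len(cells) > 1 else "Unbekannt"

print(extract_version(SAMPLE_HTML))           # 24.001.20615
print(extract_version("<p>no table here</p>"))  # Unbekannt
```

Testing the parser on canned HTML like this separates "the request timed out" from "the parsing is wrong", which are two different failures.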

I always received the following error: HTTPSConnectionPool(host='helpx.adobe', port=443): Read timed out. (read timeout=20)

Do you guys have any ideas?
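Since the error is a read timeout rather than a 403, the disguise may not even be the problem. In case it helps, this is the kind of plain `requests` attempt I'd compare against, with a browser-like header set and a read timeout more generous than the 20 s in the error; the URL below is a placeholder, not the actual page:

```python
import requests

# Placeholder URL -- substitute the actual Adobe help page being scraped.
URL = "https://example.com/"

HEADERS = {
    "User-Agent": ("Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
                   "AppleWebKit/537.36 (KHTML, like Gecko) "
                   "Chrome/122.0.0.0 Safari/537.36"),
    "Accept": "text/html,application/xhtml+xml",
    "Accept-Language": "en-US,en;q=0.9",
}

def fetch(url, read_timeout=60):
    # (connect timeout, read timeout): a slow server trips the *read* timeout,
    # so raising only that one is usually what's needed here.
    resp = requests.get(url, headers=HEADERS, timeout=(10, read_timeout))
    resp.raise_for_status()
    return resp.text
```

If this also times out, the server is likely throttling or blocking the client at the network level rather than rejecting the User-Agent.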
