I'm using Scrapy and Playwright to scrape booking. To do this, I need to click a button and get the AJAX response, but when I run my code it returns this error:
TypeError: Page.locator() missing 1 required positional argument: 'selector'
This is my code:
import scrapy
from scrapy.crawler import CrawlerProcess
from scrapy.selector import Selector
from scrapy.http import HtmlResponse
from scrapy_playwright.page import PageMethod
from playwright.async_api import Page, expect


class BookingSpider(scrapy.Spider):
    name = 'booking'
    start_urls = ["https://www.booking/hotel/it/hotelnordroma.en-gb.html?aid=304142&checkin=2025-04-15&checkout=2025-04-16#map_opened-map_trigger_header_pin"]

    def start_requests(self):
        yield scrapy.Request(self.start_urls[0], meta={
            "playwright": True,
            "playwright_include_page": True,
            "playwright_page_methods": [
                PageMethod("wait_for_selector", ".e1793b8db2")
            ]
        })

    def parse(self, response):
        Page.locator("xpath=//*[@class='e1793b8db2'][1]").click()
        with open("copy.txt", "w", encoding="utf-8") as file:
            file.write((response.text))


process = CrawlerProcess()
process.crawl(BookingSpider)
process.start()
Error message:
  File "c:\Users\mojsa\AppData\Local\Programs\Python\Python313\Lib\site-packages\twisted\internet\defer.py", line 1088, in _runCallbacks
    current.result = callback( # type: ignore[misc]
                     ~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^
        current.result, *args, **kwargs
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "c:\Users\mojsa\AppData\Local\Programs\Python\Python313\Lib\site-packages\scrapy\spiders\__init__.py", line 86, in _parse
    return self.parse(response, **kwargs)
           ~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^
  File "J:\SeSa\booking\booking\spiders\booking.py", line 27, in parse
    Page.locator("xpath=//*[@class='e1793b8db2'][1]").click()
    ~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: Page.locator() missing 1 required positional argument: 'selector'
asked Feb 21 at 17:46 by Mojsa, edited Feb 25 at 12:17
1 Answer
Issues:

Incorrect start_urls usage in start_requests
start_urls is a class attribute, so inside start_requests it must be referenced as self.start_urls.

Incorrect use of Page.locator
Page is the imported Playwright class, not a page instance, so Page.locator(...) cannot work in parse. You need to get the page object from the meta field of the response, response.meta["playwright_page"], and call locator on that object.

Incorrect indentation for CrawlerProcess
process = CrawlerProcess() and the lines that follow it should not be inside the class.

Missing imports
You need to import scrapy, CrawlerProcess (from scrapy.crawler), and PageMethod (from scrapy_playwright.page).
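The point about Page.locator is the one that matches the TypeError in the question. A minimal sketch of the fix follows, assuming the callback is declared as async def parse and that scrapy-playwright has stored the live page under response.meta["playwright_page"], which is what "playwright_include_page": True in the request meta asks for. The helper name click_and_get_html is hypothetical, and the selector is the one from the question.

```python
async def click_and_get_html(page, selector):
    """Click the first element matching `selector` and return the page HTML.

    `page` is the Page *instance* taken from
    response.meta["playwright_page"], not the imported Page class.
    """
    try:
        # locator() is synchronous and returns a Locator; click() must be awaited.
        await page.locator(selector).first.click()
        # content() returns the current DOM serialized as HTML.
        return await page.content()
    finally:
        # A page obtained through playwright_include_page has to be closed
        # by the spider, otherwise it stays open in the browser.
        await page.close()


# Intended use inside the spider (sketch, not executed here):
#
#     async def parse(self, response):
#         page = response.meta["playwright_page"]
#         html = await click_and_get_html(page, ".e1793b8db2")
#         with open("copy.txt", "w", encoding="utf-8") as file:
#             file.write(html)
```

The content loaded by the AJAX call may not be in the DOM immediately after the click, so in practice a wait for the element it renders would probably be needed before reading the HTML.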
Page.locator() needs two arguments but you use only one. Maybe it needs response as second (or first) argument. OR maybe you should use response.locator() instead of Page.locator()? – furas Commented Feb 21 at 19:53
page = response.meta["playwright_page"] like in question python - Scrapy and Scrapy-playwright scrape first comment of every page instead of every comment for every page - Stack Overflow. And maybe later use this instance page instead of class name Page – furas Commented Feb 21 at 19:59
Page comes from. – lmtaq Commented Feb 21 at 22:36
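The comments point at the likely root cause: Page is a class and locator is an instance method, so calling it on the class binds the selector string to self and leaves the selector parameter unfilled. The sketch below reproduces that with a small stand-in class; FakePage is hypothetical and only mimics the call shape, it is not the Playwright class.

```python
class FakePage:
    """Hypothetical stand-in for a Playwright page; only mimics the call shape."""

    def locator(self, selector):
        return f"locator({selector!r})"


# Calling the method on the class: the string is taken as `self`,
# so Python reports that `selector` is missing.
try:
    FakePage.locator(".e1793b8db2")
except TypeError as exc:
    error_message = str(exc)

# Calling it on an instance works, because `self` is supplied automatically.
page = FakePage()
result = page.locator(".e1793b8db2")
```

This is why using the instance from response.meta["playwright_page"] instead of the class name Page removes the error.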