How can I wait for a JavaScript function to finish evaluating when scraping a website using Puppeteer (Node.js)? - Stack Overflow

I have a website where I want to manipulate certain user inputs, generate a report by clicking a button, and download the resulting report by clicking another button.

I recorded the user behaviour (manipulation) with Chrome DevTools Recorder, exported the Puppeteer script and adapted it a bit. Everything works fine until the report is generated (some JavaScript function is evaluated in the background). Is there a way to wait for this evaluation to complete before the script tries to click something that is not yet available?

Here is the code up to the point where I get the error (I redacted the URL for now):

        console.log('Navigating to the target page...');
        await page.goto('', { waitUntil: 'networkidle2' });

        console.log('Clicking language selector...');
        await page.waitForSelector('div.active-lang', { visible: true });
        await page.click('div.active-lang');

        console.log('Switching to English...');
        await Promise.all([
            page.waitForNavigation({ waitUntil: 'networkidle2' }),
            page.click('li.en-gb > a')
        ]);

        console.log('Opening report...');
        await Promise.all([
            page.waitForNavigation({ waitUntil: 'networkidle2' }),
            page.click('#treeMenu\\:0_4_1_3 a')
        ]);

        console.log('Selecting year...');
        await page.waitForSelector('#input > table:nth-of-type(1) span', { visible: true });
        await page.click('#input > table:nth-of-type(1) span');
        await page.waitForSelector('#yearBeg_0', { visible: true });
        await page.click('#yearBeg_0');

        console.log('Generating report...');
        await page.waitForSelector('td.action_c2 span', { visible: true });
        await page.click('td.action_c2 span', { timeout: 30000, waitUntil: 'networkidle2' });

        console.log('Exporting file...');
        await page.waitForSelector('td.action_c1 span.ui-button-text', { visible: true });
        await page.click('td.action_c1 span.ui-button-text', { timeout: 60000, waitUntil: 'networkidle2' });
        await page.screenshot({ path: 'debug.png_exporting' });

        console.log('Hovering menu...');
        // Hover over the parent menu to reveal XLS option
        await page.waitForSelector('li.ui-state-hover', { visible: true, timeout: 30000, waitUntil: 'networkidle2' });
        await page.hover('li.ui-state-hover > a', { timeout: 30000, waitUntil: 'networkidle2' });  // Hover to reveal the submenu
        await page.screenshot({ path: 'debug.png_hovering' });

        
        console.log('Selecting XLS format...');
        try {
        
            // Wait for the XLS option and click it by text content
            const xlsMenuItem = await page.waitForFunction(() => {
                const menuItems = Array.from(document.querySelectorAll('li.ui-state-hover span.ui-menuitem-text'));
                return menuItems.find(item => item.textContent.trim() === 'XLS');
            }, { timeout: 30000 });
        
            if (xlsMenuItem) {
                console.log('Clicking on XLS menu item...');
                await xlsMenuItem.click();
            } else {
                throw new Error('XLS menu item not found.');
            }
            console.log('Successfully clicked on XLS format.');
        } catch (error) {
            console.error('Failed to select XLS format:', error);
            throw error;
        }

As you can see from the screenshot provided, the evaluation is still not complete:

And this is how it should look after evaluation:

This is the output from Node after running the script:

Clicking language selector...
Switching to English...
Opening report...
Selecting year...
Generating report...
Exporting file...
Hovering menu...
An error occurred: TimeoutError: Waiting for selector `li.ui-state-hover` failed: Waiting failed: 30000ms exceeded
    at new WaitTask (/home/zenz/node_modules/puppeteer/node_modules/puppeteer-core/lib/cjs/puppeteer/common/WaitTask.js:50:34)
    at IsolatedWorld.waitForFunction (/home/zenz/node_modules/puppeteer/node_modules/puppeteer-core/lib/cjs/puppeteer/api/Realm.js:25:26)
    at CSSQueryHandler.waitFor (/home/zenz/node_modules/puppeteer/node_modules/puppeteer-core/lib/cjs/puppeteer/common/QueryHandler.js:172:95)
    at async CdpFrame.waitForSelector (/home/zenz/node_modules/puppeteer/node_modules/puppeteer-core/lib/cjs/puppeteer/api/Frame.js:522:21)
    at async CdpPage.waitForSelector (/home/zenz/node_modules/puppeteer/node_modules/puppeteer-core/lib/cjs/puppeteer/api/Page.js:1304:20)
    at async /home/zenz/ShinyApps/AID/puppeteer/tasks/MD_FDI_IIP_orig.js:98:9
Browser closed.

I tried with varying timeouts, but this doesn't seem to change anything.

asked Nov 18, 2024 at 10:45 by David, edited Nov 18, 2024 at 12:00
  • li.ui-state-hover sounds like a class that pops up when you hover something, but there is no hovering happening in your code. In your log the error appears after "Hovering menu...", which is not part of the code you provided. Are you sure you provided the correct code? – Tim Hansen Commented Nov 18, 2024 at 10:51
  • @TimHansen I added the lines you are referring to. They come after the point where the code breaks, which is why I didn't include them in the first place. – David Commented Nov 18, 2024 at 12:00

1 Answer

Instead of depending on timeouts, you need to look for DOM changes. For example, when certain data elements are rendered on the screens you mention, your script should wait for those DOM elements to be created at run time.
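
A minimal sketch of that idea, reusing the selectors from the question where possible. The function name `generateAndExportXls` and the keys `reportReady` and `exportMenuParent` are made up for illustration and are not taken from the real page: `reportReady` should match an element that only exists once the report has been rendered, and `exportMenuParent` should match the parent menu entry before it is hovered. As the comment under the question suggests, `li.ui-state-hover` is probably only added while the pointer is over the item, so waiting for it before hovering would likely time out.

```javascript
// Sketch only: every selector passed in `sel` has to be checked against the
// real page. `reportReady` and `exportMenuParent` are hypothetical names.
async function generateAndExportXls(page, sel, timeout = 60000) {
  // Start report generation.
  await page.waitForSelector(sel.generateButton, { visible: true, timeout });
  await page.click(sel.generateButton);

  // Wait for the DOM change that signals "report rendered" instead of
  // relying on a fixed delay.
  await page.waitForSelector(sel.reportReady, { visible: true, timeout });

  // Open the export menu.
  await page.waitForSelector(sel.exportButton, { visible: true, timeout });
  await page.click(sel.exportButton);

  // Hover the parent entry through a selector that matches *before* hovering.
  const parent = await page.waitForSelector(sel.exportMenuParent, {
    visible: true,
    timeout,
  });
  await parent.hover();

  // Poll inside the page until a menu entry with the text "XLS" exists.
  const xlsHandle = await page.waitForFunction(
    (itemSelector, label) =>
      Array.from(document.querySelectorAll(itemSelector)).find(
        (el) => el.textContent.trim() === label
      ),
    { timeout },
    sel.menuItemText,
    'XLS'
  );

  const xlsItem = xlsHandle.asElement();
  if (!xlsItem) {
    throw new Error('XLS menu item not found.');
  }
  await xlsItem.click();
}

// Possible call (the two commented selectors are placeholders):
// await generateAndExportXls(page, {
//   generateButton: 'td.action_c2 span',
//   reportReady: '#report table tbody tr',      // hypothetical
//   exportButton: 'td.action_c1 span.ui-button-text',
//   exportMenuParent: 'ul.ui-menu-list > li',   // hypothetical
//   menuItemText: 'span.ui-menuitem-text',
// });
```

The `timeout` and `waitUntil` options passed to `page.click()` and `page.hover()` in the question are not among the options those calls document, so raising them probably has no effect. The waiting happens in `waitForSelector` and `waitForFunction`, which is why they carry the timeout here.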
