javascript - Reading dynamic web page content in java - Stack Overflow

I need help reading the contents of a web page. Currently I am using the following code to read the contents:

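// Needs java.io.BufferedReader and java.io.InputStreamReader; page is a java.net.URL
StringBuilder content = new StringBuilder();
try (BufferedReader in = new BufferedReader(new InputStreamReader(page.openStream()))) {
    String inputLine;
    while ((inputLine = in.readLine()) != null) {
        content.append(inputLine); // readLine() strips the line break
    }
}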

However, this approach has a problem: some JSP pages use Ajax to update, at arbitrary times, the content of an element that has a particular CSS class. The JavaScript looks something like this (just to give an idea):

if (request.readyState === 4 && request.status === 200) {
    var type = request.getResponseHeader("Content-Type");
    $('.update').empty();
    $('.update').append(request.responseText); // fill the element(s) with class "update"
}

As a result, when the page is read through my Java code above, I just get

<div class="update"></div>

although on screen this element has a value. However, if I save the page first (by clicking "Save As" in Firefox), the values that jQuery appended to the element are also present in the saved file. Is there a way to read or obtain these values the way Firefox does when it saves the page? I want to read the contents of the entire web page, with the Ajax values present, into a string.

I have read that this is difficult because the JavaScript is executed by the browser, so I wanted to know whether Firefox has any APIs that might help. Any suggestions would be appreciated.

asked Apr 9, 2012 at 14:10 by Rajeshwar
  • You're going to have to render the web page, not just read it with a StreamReader. Search for "Web rendering in Java" to see if you find something you can work with. – Gilbert Le Blanc, Apr 9, 2012 at 14:17
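
To illustrate the point of the comment, here is a minimal, self-contained sketch (not part of the original post). It assumes a throwaway local server with two hypothetical paths: /page serves static markup with an empty div, and /fragment stands in for the URL the page's script would request. Reading /page with a stream returns only what the server sent, because no script ever runs. The sketch then requests /fragment directly, which is a different technique from rendering the page and only works when the Ajax endpoint is known and reachable.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.URL;
import java.nio.charset.StandardCharsets;

class StaticFetchDemo {

    // Same approach as the question: read whatever bytes the server sends. No script runs.
    static String read(URL url) throws IOException {
        StringBuilder content = new StringBuilder();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                content.append(line).append('\n');
            }
        }
        return content.toString();
    }

    // Register a fixed response body for one path on the throwaway server.
    static void serve(HttpServer server, String path, String body) {
        byte[] bytes = body.getBytes(StandardCharsets.UTF_8);
        server.createContext(path, exchange -> {
            exchange.sendResponseHeaders(200, bytes.length);
            exchange.getResponseBody().write(bytes);
            exchange.close();
        });
    }

    // Returns { static page markup, fragment fetched directly from the endpoint }.
    static String[] demo() {
        try {
            HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
            // Hypothetical page: the div stays empty until a script fills it in a browser.
            serve(server, "/page", "<html><body><div class=\"update\"></div></body></html>");
            // Hypothetical endpoint that the page's script would request.
            serve(server, "/fragment", "<span>42</span>");
            server.start();
            try {
                String base = "http://127.0.0.1:" + server.getAddress().getPort();
                String page = read(URI.create(base + "/page").toURL());
                String fragment = read(URI.create(base + "/fragment").toURL());
                return new String[] { page, fragment };
            } finally {
                server.stop(0);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String[] result = demo();
        System.out.println("Static page: " + result[0]);
        System.out.println("Fragment:    " + result[1]);
    }
}
```

If the endpoint is not known, or the page builds its content through several scripts, fetching the fragment directly is not enough and the page has to be rendered by something that executes JavaScript, as the comment says.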

2 Answers

You may find the following project useful:

  • HtmlUnit

Here is also a very informative blog post from Data Big Bang.

Also check out PhantomJS. In the same way that Crowbar is a headless Mozilla browser, PhantomJS is a headless WebKit browser. WebKit is the engine used by Safari and, at the time of this answer, Google Chrome.
