I need to find out if something has changed in a website using an RSS feed. My solution was to constantly download the entire RSS file, get the entries.length and compare it with the last known entries.length. I find it to be a very inelegant solution. Can anyone suggest a different approach?
Details:
• My application is an html file which uses javascript. It should be small enough to function as a desktop gadget or a browser extension.
• Currently, it downloads the RSS file every thirty seconds just to get the length.
• It can download from any website with an RSS feed.
Comments and suggestions are appreciated, thanks in advance~ ^^
asked Dec 31, 2009 at 7:30 by GaiusSensei
- Simply checking the number of entries isn't enough, as some feeds only show a certain number of entries. Example: A feed might show 10 entries, once the 11th is added, the 1st is removed, so the total is still 10. – Peter Di Cecco Commented Dec 31, 2009 at 7:35
- o_O; crap. Didn't think of that... Follow up question: Is it possible to create an app that checks if something has changed in a website using an RSS Feed? – GaiusSensei Commented Dec 31, 2009 at 7:41
3 Answers
Many RSS feeds use the <lastBuildDate> element, which is a child of <channel>, to indicate when they were last updated. There's also a <pubDate> element, child of <item>, that gives the publication date of each entry and so serves a similar purpose. If you plan on reading Atom feeds, they have the <updated> element.
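As a rough illustration of the date-based check, here is a minimal sketch in JavaScript. The helper names `latestFeedDate` and `hasFeedChanged` are made up for this example, and the regular expression only looks at the first occurrence of each element and ignores CDATA sections; in a browser, parsing the feed with DOMParser would probably be more robust.

```javascript
// Sketch only: latestFeedDate and hasFeedChanged are hypothetical helpers.
// Returns the first usable timestamp found in the raw feed text, or null.
function latestFeedDate(xmlText) {
  // Checked in order: RSS channel date, Atom date, then RSS item date.
  const tags = ["lastBuildDate", "updated", "pubDate"];
  for (const tag of tags) {
    const pattern = new RegExp("<" + tag + "[^>]*>([^<]+)</" + tag + ">");
    const match = xmlText.match(pattern);
    if (match) {
      const date = new Date(match[1].trim());
      if (!isNaN(date.getTime())) return date;
    }
  }
  return null; // the feed exposes none of these elements
}

// Compares the feed's timestamp with the one remembered from the last poll.
function hasFeedChanged(xmlText, lastKnownDate) {
  const current = latestFeedDate(xmlText);
  if (current === null) return true; // cannot tell, so assume it changed
  return lastKnownDate === null || current > lastKnownDate;
}
```

Note that this still downloads the whole feed; it only replaces the entry count with a more reliable signal.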
There are HTTP headers that can be used to determine if a resource has changed. Learn how to use the following headers to make your application more efficient.
HTTP Request Headers
• If-Modified-Since
• If-None-Match
HTTP Response Headers
• Last-Modified
• ETag
The basic strategy is to store the above-mentioned response headers that are returned on the first request and then send the values you stored in the HTTP request headers in future requests. If the HTTP resource has not been changed, you'll get back an HTTP 304 Not Modified response and the resource will not even be downloaded. So this results in a very lightweight check for updates. If the resource has changed, you'll get back an HTTP 200 OK response and the resource will be downloaded in the usual way.
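The strategy described above could look roughly like the following sketch. `checkFeed` and its `validators` object are hypothetical names chosen for this example; `fetchImpl` defaults to the standard `fetch()` and is a parameter only so the function can be exercised without a network. It assumes the server actually sends ETag or Last-Modified; a server that sends neither will simply answer 200 every time. For cross-origin feeds in a browser, the server's CORS configuration may also need to allow these request headers.

```javascript
// Sketch only: checkFeed is a hypothetical helper, not part of any library.
// `validators` holds the ETag / Last-Modified values saved from the last 200.
async function checkFeed(url, validators = {}, fetchImpl = fetch) {
  const headers = {};
  if (validators.etag) headers["If-None-Match"] = validators.etag;
  if (validators.lastModified) headers["If-Modified-Since"] = validators.lastModified;

  const response = await fetchImpl(url, { headers });

  if (response.status === 304) {
    // Not modified: no body was sent, so keep the stored validators.
    return { changed: false, validators, body: null };
  }
  if (!response.ok) {
    throw new Error("Feed request failed with status " + response.status);
  }
  // 200 OK: remember the new validators for the next poll.
  return {
    changed: true,
    validators: {
      etag: response.headers.get("ETag"),
      lastModified: response.headers.get("Last-Modified"),
    },
    body: await response.text(),
  };
}
```

A polling loop would call `checkFeed(url, saved)` on each tick, then replace `saved` with the returned `validators` and parse `body` only when `changed` is true.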
You should be keeping track of the GUIDs (or article IDs) to see if you've seen an article before.
You should also see if your source supports conditional GETs. A conditional GET lets you check whether anything has changed without needing to download the whole file. You can quickly check with this tool to see if your source supports conditional GETs. (I wish everyone did.)
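Tracking identifiers might be sketched like this. `findNewItems` and the item shape (`guid`, `link`, `title`) are assumptions made for the example; the items would come from whatever feed parser the application already uses.

```javascript
// Sketch only: findNewItems and the { guid, link, title } shape are assumed.
// Returns the items not seen before and records their ids in seenIds (a Set).
function findNewItems(items, seenIds) {
  const fresh = [];
  for (const item of items) {
    const id = item.guid || item.link; // fall back to the link when no GUID exists
    if (id && !seenIds.has(id)) {
      seenIds.add(id);
      fresh.push(item);
    }
  }
  return fresh;
}
```

Unlike comparing entries.length, this still notices a new article when the feed keeps a fixed number of entries and drops the oldest one.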