python - using requests to login to a website that has javascript login form - Stack Overflow

Let me preface by saying I have very little programming experience. I've learned a bunch in the last few days trying to write this program. I am running Python 2.7 on Windows 7 using PyCharm, requests, Beautiful Soup, and lxml.

I am trying to scrape data from a website that relies heavily on JavaScript. I have two options:

1) The data I need is populated through JavaScript and does not necessarily require a login. However, I have not been able to figure out how to get at this data. I've monitored the headers live with the Live HTTP Headers Chrome plugin and I think I've found the JavaScript that does it, but it's beyond my means to figure it out. It's a long bit of code; I'll post it if anyone is interested in taking a look.

or

2) On one of the main pages I found a series of ID numbers which I can use to generate URLs for each of the individual items I am analyzing. The problem is that I have to be logged in to see these individual item pages. My code is as follows:

from requests.adapters import HTTPAdapter
from requests.packages.urllib3.poolmanager import PoolManager
from BeautifulSoup import BeautifulSoup
import ssl

# Request a date from user
UDate = "06/22/2015"  # raw_input('Enter a date mm/dd/yyyy\n')

# Open TLSv1 Adapter (Whatever that means)
class MyAdapter(HTTPAdapter):
    def init_poolmanager(self, connections, maxsize, block=False):
        self.poolmanager = PoolManager(num_pools=connections,
                                       maxsize=maxsize,
                                       block=block,
                                       ssl_version=ssl.PROTOCOL_TLSv1)

# Begin a requests session. Every get from here on out will use TLSv1 Protocol
import requests

payload = {
    'LogName': 'xxxxxxxx',
    'LogPass': 'xxxxxxxx'
}

s = requests.Session()
s.mount('https://xxxx.xxx', MyAdapter())

# Login with post and Request source code from main page.
log = s.post('LoginURL', data=payload)
print log.text

result = s.get(url)
soup = BeautifulSoup(result.content)
print soup

Neither the post nor the get shows me a logged-in website. The login form IDs from the HTML source code look like this:

<div id="DivLogForm">
        <label for="BadText"><div id="BadText" class="BadText" style="display:none" tabindex="-2">User Name or Password is Invalid</div></label>

        <div class="LogLabel">
            <label for="LogName" > User Name&nbsp;&nbsp;</label><input tabindex="0" id="LogName" class="LogInput" value="" />
        </div>
        <div  class="LogLabel">
            <label for="LogPass" >User Password&nbsp;&nbsp;</label><input  tabindex="0"id="LogPass" type="password" class="LogInput" value="" />
        </div>

So I'm passing LogName and LogPass with the post.

There is also a logform.js with this bit of code:

$("#LogButton").click(function()
        {   //$('#divLogForm').hide();
            //$('#divLoading').show();  

           var uName = $("#LogName").val();
           var uPass = $("#LogPass").val();
           var url = "/index.cfm";
           $.post(url, {ZACTION:'AJAX',ZMETHOD:'LOGIN',func:'LOGIN',USERNAME:uName, USERPASS:uPass}, 
                  function(data){if (data.isOk =="YES"){location.href="/index.cfm";}
                                  else {$('.BadText').show(); $('#BadText').focus();};
                                 },"json");
        });

The LoginURL in my code is taken from the var url in this script. I have tried using USERNAME & USERPASS and I have tried uName and uPass with my post, but these didn't work either.

Not sure how to move forward here. Any help is greatly appreciated.

asked Jun 19, 2015 at 20:00 by Gustavo Costa

1 Answer

The last bit of JavaScript you posted gives a clue as to why your login POST request isn't working.

According to the JavaScript, you should be sending a dictionary that looks like the following with your login POST:

{
    'ZACTION': 'AJAX',
    'ZMETHOD': 'LOGIN',
    'func': 'LOGIN',
    'USERNAME': '<enter username>',
    'USERPASS': '<enter password>'
}, 
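
As a rough sketch of how that payload could be sent with requests (untested against the actual site): the helper names `build_login_payload` and `login` and the `base_url` argument are made up for illustration, and the check on `isOk` simply mirrors what the callback in logform.js does. Pass in the same `requests.Session()` you already created, so that any cookies set by the login carry over to your later `s.get(...)` calls.

```python
def build_login_payload(username, password):
    """Return the form fields that logform.js sends with its $.post call."""
    return {
        'ZACTION': 'AJAX',
        'ZMETHOD': 'LOGIN',
        'func': 'LOGIN',
        'USERNAME': username,
        'USERPASS': password,
    }


def login(session, base_url, username, password):
    """POST the login fields to /index.cfm and report whether the site accepted them.

    `session` is expected to be a requests.Session (anything with a
    compatible .post() works). `base_url` is a hypothetical parameter
    for the site root, e.g. the address you mounted the adapter on.
    """
    login_url = base_url.rstrip('/') + '/index.cfm'  # path taken from `var url` in logform.js
    response = session.post(login_url, data=build_login_payload(username, password))
    response.raise_for_status()  # fail loudly on HTTP errors
    # The script treats the reply as JSON and checks data.isOk == "YES".
    return response.json().get('isOk') == 'YES'


# Possible usage, assuming `s` is the session from the question:
#
#     if login(s, 'https://xxxx.xxx', 'my_user', 'my_pass'):
#         result = s.get(item_url)  # item_url: one of the per-item pages
```

If `login` returns False, printing `response.text` inside it is a quick way to see what the server actually sent back.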
