Using the Python Requests module to log into a site

As a monitoring engineer, it's my responsibility to ensure that the services I monitor are up and working. Most of the time this just means graphing some SNMP OIDs, or writing a simple script to check that a service is still alive.

This time, I needed to check the following:

      Customer site is up (returns a 200 status code)
      LDAP auth is working
      Clients can log in
      Certain pages exist


Most of the above is fairly easy. Checking that the site is up, for instance, is simple: connect to the site, read the response code, and alert if it is not 200. The same goes for checking that certain pages exist. But what about testing the LDAP auth and that clients can log in? I decided the easiest way to test everything in one check was to write a script that actually logs into the site using the same mechanism as a client, i.e. it fills out a form and posts it to the server.
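The simple checks could look roughly like the sketch below. It is only an illustration: failing_pages and its timeout default are names made up for this example, and session is assumed to be a requests.Session() (anything with a Requests-style get() method will do).

```python
def failing_pages(session, urls, timeout=10):
    """Return (url, status) pairs for every page that did not answer 200.

    `session` is assumed to be a requests.Session() or anything else with a
    Requests-style get() method. A status of None means the request itself
    failed before any response came back.
    """
    failures = []
    for url in urls:
        try:
            status = session.get(url, timeout=timeout).status_code
        except OSError:
            # requests.RequestException derives from IOError/OSError, so this
            # covers refused connections, DNS failures and timeouts.
            status = None
        if status != 200:
            failures.append((url, status))
    return failures
```

Calling failing_pages(requests.Session(), [...]) with the pages you care about gives back an empty list when everything answered 200, so the alert condition is simply "the list is not empty".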

For this I decided to use the trusty Requests module (python-requests.org), as it makes dealing with sessions nice and easy. The result is the following script.


import logging
import requests
import re
url = "https://relf.co/login"

s = requests.session()
r = s.get(url,verify = False)

matchme = 'meta content="(.*)" name="csrf-token" /'

csrf = re.search(matchme,str(r.text))
payload = {
'user[email]' : 'monitoring@relf.co',
'user[password]' : 'apasswordgoeshere',
'authenticity_token' : csrf.group(1),
'_portal_session' : r.cookies["_portal_session"]
}
r = s.post(url,data=payload,verify = False)

So how does this code work? First up we have the usual imports: logging (not actually used, naughty steve), requests (the module that will do most of the work) and re, which is the Python regex implementation.

First we set the URL we will be attempting to log in against (this is set to relf.co but won't actually work :D).
Then we declare s as a requests.session object. This allows us to keep a single session for the life of the script; otherwise every request we sent via the Requests module would act like a separate session.
Then we declare r as the result of a requests.session.get against the URL we will be attempting to log in to. Passing verify = False tells Requests not to verify the site's TLS certificate. We do the get for a couple of reasons: one, to grab the session cookie, which is stored in the requests.session cookie jar for later use, and two, to pull the CSRF token, which can then be posted back to the site to show we are genuine and not attempting a cross-site request forgery (CSRF) attack.

The line that starts with “matchme” is the regular expression we will be using to get the CSRF token, and we do that by defining “csrf” as the result of re.search(matchme,str(r.text)). The token itself is the first capture group, csrf.group(1).
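As a hedged variation on that step, the sketch below wraps the search in a function and checks the result before using it. extract_csrf_token, CSRF_PATTERN and the sample HTML are made up for illustration. The pattern keeps the same attribute order as the one above (content first, then name), which is an assumption about how the page is rendered.

```python
import re

# Same idea as the pattern in the post, but [^"]* cannot run past the closing
# quote into a neighbouring tag the way a greedy (.*) can.
CSRF_PATTERN = re.compile(r'meta content="([^"]*)" name="csrf-token"')


def extract_csrf_token(html):
    """Return the CSRF token from the page, or raise if it is not there."""
    match = CSRF_PATTERN.search(html)
    if match is None:
        # Without this check, match.group(1) fails with
        # "'NoneType' object has no attribute 'group'".
        raise ValueError("no csrf-token meta tag found in the page")
    return match.group(1)


# Made-up sample with two meta tags on one line.
sample = (
    '<meta content="authenticity_token" name="csrf-param" />'
    '<meta content="abc123==" name="csrf-token" />'
)
print(extract_csrf_token(sample))  # prints abc123==
```

If the function raises, the page did not contain the tag in the form the pattern expects, so it is worth printing r.text and comparing it with what the browser shows.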

Next we build the payload which we are going to post to the form. Here you will need to break out a copy of Chrome's developer tools or Firebug and inspect the form you wish to submit against. You will need to look for the name attribute of the fields you would normally fill in. On the form I wrote this against, the fields I needed to fill out were:

      user[email]
      user[password]
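If you would rather not click through the browser tools, the field names can also be listed from the page source. The sketch below uses the standard library's html.parser. InputNameCollector, input_names and the sample form are invented for this example and are not taken from the real site.

```python
from html.parser import HTMLParser


class InputNameCollector(HTMLParser):
    """Collect the name attribute of every <input> tag in a page."""

    def __init__(self):
        super().__init__()
        self.names = []

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <input ... /> also end up here.
        if tag == "input":
            name = dict(attrs).get("name")
            if name:
                self.names.append(name)


def input_names(html):
    """Return the input field names in the order they appear in the HTML."""
    parser = InputNameCollector()
    parser.feed(html)
    parser.close()
    return parser.names


# Made-up login form, loosely shaped like the one described in the post.
sample_form = """
<form action="/login" method="post">
  <input type="hidden" name="authenticity_token" value="abc123==" />
  <input type="text" name="user[email]" />
  <input type="password" name="user[password]" />
  <input type="submit" value="Log in" />
</form>
"""
print(input_names(sample_form))
# prints ['authenticity_token', 'user[email]', 'user[password]']
```

Against a live site you would feed it the text of the first GET response instead of sample_form. Hidden inputs show up too, which is a handy reminder of anything extra the form expects.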

We also need to send the cookie we received on the first GET, and the CSRF token.
So we use this code:

payload = {
'user[email]' : 'monitoring@relf.co',
'user[password]' : 'apasswordgoeshere',
'authenticity_token' : csrf.group(1),
'_portal_session' : r.cookies["_portal_session"]
}

We build the payload by specifying key-value pairs: the key is on the left and the value on the right, separated by a colon. As you can see above, we have the user[email] and user[password] key-value pairs, plus the authenticity_token, and we send back the original cookie.
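To make it clearer what that dict turns into, here is a small sketch that builds the same payload in a function and form-encodes it with the standard library. build_payload is a hypothetical helper and the token and cookie values are placeholders. The urlencode call only approximates what ends up in the request body when the dict is passed as data=.

```python
from urllib.parse import urlencode


def build_payload(email, password, csrf_token, session_cookie):
    """Build the form data for the login POST as a plain dict."""
    return {
        "user[email]": email,
        "user[password]": password,
        "authenticity_token": csrf_token,
        "_portal_session": session_cookie,
    }


# Placeholder values: the token and cookie would come from the first GET.
payload = build_payload(
    "monitoring@relf.co", "apasswordgoeshere", "abc123==", "cookievalue"
)

# Square brackets, "@" and "=" are percent-encoded in the form body.
print(urlencode(payload))
```

Keeping the payload in a function also makes it easy to reuse the same check with a different monitoring account.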

Now that we have the payload, we need to post it back to the server, which we do with

r = s.post(url,data=payload,verify = False)

If all goes well, printing r.text should now show the landing page you get after a successful login.
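For a monitoring check you will probably want something stricter than reading the output by eye. A failed login often still comes back as a 200 with the login form shown again, so the sketch below also looks for a piece of text that only appears once you are logged in. login_succeeded and the "Sign out" marker are assumptions for illustration, so pick a string that really is on your landing page.

```python
def login_succeeded(status_code, body, marker):
    """True when the POST returned 200 and the page contains `marker`."""
    return status_code == 200 and marker in body


# Made-up responses: a landing page and a re-rendered login form.
print(login_succeeded(200, "<a href='/logout'>Sign out</a>", "Sign out"))
print(login_succeeded(200, "Invalid email or password", "Sign out"))
```

With the script above this would be called as login_succeeded(r.status_code, r.text, "Sign out").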

Let me know how it goes, and if you have any questions, please feel free to ask.


14 comments


  1. dustin

    you. rock! solid month bursting my brain over this, and even though i came to your page after learning from 100’s of failures, YOUR clarity and example was the breakthrough for me. I finally authenticated and accessed the pages I needed moments ago. Ecstatic i can sleep easy tonight.

    thank you, thank you, thank you!

  2. Hi Dustin,

    Glad to have been of service. I really need to start blogging again, just haven’t really had time.

    Glad this helped though.

  3. terminalcommand

    I just stumbled upon your website. After days trying to login to a website, I had installed firebug and noticed the csrf token. I was searching for it and it was right there in front of my eyes in the HTMLcode.

    Nice post BTW.

  4. Sasha

    Hi! I’m really hoping your code can solve my login problem because nothing else has so far, but I’m VERY new to python and to web scraping. So, when I copied your code to my script and then ran it I got the following error:
    Non-ASCII character ‘\xe2’ in file newtest.py on line 19, but no encoding declared; see http://www.python.org/peps/pep-0263.html for details
    Line 19 is the matchme regex expression.
    Could you please spell it out to the noob and tell me exactly what I need to put at the top of my file? I tried a few of the examples from the url listed in the error message but none of them worked.
    I really appreciate it!

  5. Xargonus

    Hi,
    I have been trying to create an account using python, but CSRF seems to be an impassable problem for me. Every time I make a request, the CSRF token changes. The token I get from using get(url) is not the same the website wants from me when i try to post(url).
    Also I am not getting the csrf in the cookies, but extracting it from the request.text. There is no csrf in the cookies. The site is battle.net by the way. Do you have any idea how to help me ? Or am i doing something wrong ?

  6. sachin

    Traceback (most recent call last):
    File "new2.py", line 82, in
    'authenticity_token' : csrf.group(1),
    AttributeError: 'NoneType' object has no attribute 'group'

    I am getting this error. Any ideas how can this be fixed?
    I am totally new at this

  7. Ilya Topper

    I have been looking for a few days on how to login account page. This blog helped me a lot. Thank you very much!

  8. John

    Help.

    I’m trying to login to http://www.investors.com/ using the same design but I’m stuck. I think the login process on this website is a little more complicated but I was wondering if you could help me.

    Thanks,
    John
