How to handle HTTP 403 Forbidden Error in Python


Solve HTTP Error 403: Forbidden in Python

Sometimes when you try to access a web service from a Python script, you get an HTTP Error 403: Forbidden response, even though the service works normally in a web browser. The HTTP 403 forbidden error doesn't necessarily mean that authentication details are missing. Some web services only allow web browsers or certain specific clients to access them, and deny requests coming from third-party clients, which in our case is the Python program.







Hi Everyone,

As you all know, Python can be used to automate a lot of web-related tasks. But you might have come across situations where you are blocked from consuming a web service when you access it with a Python script (or any other programming language), even though the service is accessible normally from any web browser. Whenever you visit a website in a smarter way, that is, the programming way, the website owners sometimes don't like it. They don't want automated programs on their website; they want only real users. So if they sense that you're not a real user, they will block you and you will get this error: "HTTP Error 403: Forbidden".


Hey, cheer up guys! You can often solve the HTTP 403 forbidden error with a small modification to your code; it's not a big deal. The problem can usually be resolved by imitating a web browser request, so that the web service treats the Python program as if it were a web browser. I will show you exactly how to get rid of this HTTP 403 forbidden error with an example.



Image_downloader.py


This Python script from my previous post can be used to download any image from the web using its URL. Although it works pretty well in almost every case, some websites won't allow the program to access their content, and it fails with the HTTP 403 forbidden error.

Let's have a look at the code.



response = urllib.request.urlopen(image_url)

When you try to access the data at the user-provided URL with the above line of code, some websites will detect that a program is making the request and will kick you out immediately.
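To see how that refusal shows up in code, here is a minimal sketch. `urlopen` raises `urllib.error.HTTPError` for error responses such as 403, and the status code is available as `err.code`. The helper name `try_fetch` is made up for illustration; it is not part of the original script.

```python
import urllib.error
import urllib.request


def try_fetch(url):
    """Return (status_code, body_bytes); body is None if the server refused.

    try_fetch is a hypothetical helper, not part of Image_downloader.py.
    """
    try:
        with urllib.request.urlopen(url) as response:
            return response.status, response.read()
    except urllib.error.HTTPError as err:
        # urlopen raises HTTPError for 4xx/5xx responses;
        # err.code holds the numeric status (403 for Forbidden).
        return err.code, None
```

Called against a site that blocks scripts, `try_fetch(image_url)` would return `(403, None)` instead of ending the program with a traceback.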



Luckily for us, it is actually fairly easy to fool basic systems by imitating a web browser request.


This is the modified script:



Image_downloader_modified.py


Have a look at it, and I will explain exactly what the newly added block of code is doing.



headers = {}

This creates an empty dictionary and assigns it to the variable 'headers'. Headers are the extra data your client sends along with every request when you visit a website, and they contain information about you: the browser you are using, your operating system, and more. (The server can see your IP address too, but that comes from the network connection rather than from the headers.)



headers['User-Agent'] = 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.106 Safari/537.36 OPR/38.0.2220.41'  

Within your headers there is a piece of data called the User-Agent. The User-Agent basically identifies the type of client or browser you are using. When you access a website with a Python script, urllib announces itself as Python-urllib/ followed by the Python version (major and minor only); in my case, with Python 3.6.7, that is Python-urllib/3.6. Basically it's like saying "Hello, I'm Python, give me access to your data". So almost the instant you visit a website with Python, the website can tell that you are a program, and it is very easy for them to shut you down.
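If you are curious what your own Python installation announces, this small sketch prints the default header that `urllib` attaches to every request. It needs no network access; it only inspects the opener that `urllib.request.build_opener()` creates.

```python
import urllib.request

# A freshly built opener carries the default headers urllib adds to requests.
opener = urllib.request.build_opener()
default_headers = dict(opener.addheaders)

# Prints something like "Python-urllib/3.6", depending on your Python version.
print(default_headers["User-agent"])
```

That string is exactly what a website can match on to tell a script apart from a browser.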

So here we specify the User-Agent that the program should use when making a request to the website. It's a fairly long string that you can easily find on Google.

Just paste it in your code. That replaces the default User-Agent, so we are no longer announcing ourselves as Python.



req = urllib.request.Request(image_url, headers=headers)

This builds a request object for the provided image URL. Nothing is sent yet, but when the request is made it will carry the headers we defined in the script instead of the default User-Agent.
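You can check what the request object holds before anything goes over the network. In this sketch the URL is a placeholder I picked for illustration, not one from the original script. One small gotcha: `Request` stores header names capitalized as "User-agent", so that is the spelling to use with `get_header`.

```python
import urllib.request

# Placeholder URL, assumed for illustration only.
image_url = "https://example.com/picture.jpg"

headers = {}
headers["User-Agent"] = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/51.0.2704.106 Safari/537.36 OPR/38.0.2220.41"
)

# Building the Request does not contact the server yet.
req = urllib.request.Request(image_url, headers=headers)

print(req.full_url)
# Header names are stored via str.capitalize(), hence "User-agent".
print(req.get_header("User-agent"))
```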



resp = urllib.request.urlopen(req)

This opens the URL with the specified headers and returns the response. The rest of the script can then read the image data from it and save it to your hard disk.
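Putting the three lines together, the whole idea can be sketched as one small function. This is only a sketch and not the exact contents of Image_downloader_modified.py: the function name `download_image`, the constant `BROWSER_USER_AGENT` and the `file_name` parameter are my assumptions for illustration.

```python
import urllib.request

# The browser-like User-Agent string used earlier in this post.
BROWSER_USER_AGENT = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/51.0.2704.106 Safari/537.36 OPR/38.0.2220.41"
)


def download_image(image_url, file_name):
    """Fetch image_url with a browser-like User-Agent and save it to file_name.

    Hypothetical helper: the name and parameters are assumptions,
    not taken from the original script.
    """
    headers = {"User-Agent": BROWSER_USER_AGENT}
    req = urllib.request.Request(image_url, headers=headers)
    # Open the URL, read the raw bytes and write them to disk in binary mode.
    with urllib.request.urlopen(req) as resp, open(file_name, "wb") as out:
        out.write(resp.read())
    return file_name
```

You would call it as `download_image(image_url, "picture.jpg")`. If the website still refuses the request, `urlopen` raises `urllib.error.HTTPError` just as before.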



That's it, guys. Now our code will work fine even if the website only allows web browsers or some specific clients, as long as it tells them apart by the User-Agent. We will be able to access such a website with this Python script, and it should no longer return the HTTP 403 forbidden error.


Python HTTP 403 Forbidden Error Solved
Screenshot Attached


Hope this helps you.

Thank you and have a bright future ahead.









Reviewed by Cyril Tom Mathew on April 24, 2019
