ZetCode

Python Requests

last modified May 9, 2026

In this article, we explore practical techniques for working with the Python Requests module, one of the most widely used libraries for interacting with web resources. Step by step, we demonstrate how to retrieve data from remote servers, submit form data, upload JSON payloads, stream large responses efficiently, and establish secure HTTPS connections. Each concept is illustrated with clear, real-world examples.

To make the examples concrete, we interact with several types of back-end services: a public online API, an Nginx server configured for local testing, Python's built-in HTTP server, and a small Flask application. This variety shows how Requests behaves across different environments and how to adapt your code to each scenario.

The Hypertext Transfer Protocol (HTTP) is the underlying application protocol that enables communication across the Web. It defines how clients and servers exchange messages, how resources are addressed, and how data is transferred in a reliable, extensible way. Because HTTP is the foundation of nearly all modern web services, understanding how to work with it programmatically is essential for tasks such as automation, data collection, testing, and system integration.

Python requests

Requests is a simple and elegant Python HTTP library. It provides methods for accessing Web resources via HTTP. The library abstracts away the low-level details of working with sockets, headers, and query strings, giving the programmer a clean and intuitive API. With a few lines of code, we can send GET and POST requests, handle cookies, upload files, work with JSON data, or communicate with RESTful services. Requests focuses on readability and convenience, making everyday HTTP tasks straightforward and pleasant to write.

Library version

The first program prints the version of the Requests library.

version.py
#!/usr/bin/python

import requests

print(requests.__version__)
print(requests.__copyright__)

The program prints the version and copyright of Requests.

$ ./version.py
2.33.1
Copyright Kenneth Reitz

Reading a web page

The get function issues a GET request and returns a Response object containing the status code, headers, and body of the server's reply.

read_webpage.py
#!/usr/bin/python

import requests

url = "https://example.com"
resp = requests.get(url)
print(resp.text)

The text attribute exposes the body decoded as Unicode; content gives the same data as raw bytes.

Resource management

When requests.get is called with the default stream=False, the body is read in full before the call returns, and the temporary session that the module-level function creates is closed together with its connections. For a short script like this one, nothing more is needed. The picture changes with stream=True or with a long-lived Session: there, a response that is never closed or fully read keeps its connection checked out of the pool, and under high request volume that can exhaust the connection pool. Wrapping the call in a with block guards against this in every case.

read_webpage_with.py
#!/usr/bin/python

import requests

with requests.get("https://example.com") as resp:
    resp.raise_for_status()
    print(resp.text)

The with block calls resp.close() automatically on exit — whether the body was read successfully or an exception was raised — returning the underlying TCP socket to the connection pool immediately. The raise_for_status call is added here as a matter of habit: any production code that reads a page should verify it actually received a valid response before trying to use the content.

Stripping HTML tags

The following program gets a small web page and strips its HTML tags.

strip_tags.py
#!/usr/bin/python

import requests
import re

url = "https://example.com"

with requests.get(url) as resp:
    resp.raise_for_status()
    content = resp.text

    stripped = re.sub('<[^<]+?>', '', content)
    print(stripped)

The script strips the HTML tags of the https://example.com web page.

stripped = re.sub('<[^<]+?>', '', content)

A simple regular expression is used to strip the HTML tags. For more complex HTML documents, consider using a library like Beautiful Soup instead of regular expressions for more robust parsing.

Getting status

The Response object contains a server's response to an HTTP request. Its status_code attribute holds the HTTP status code of the response, such as 200 or 404.

get_status.py
#!/usr/bin/python

import requests

url = "https://example.com"
with requests.get(url) as resp:
    print(resp.status_code)

url = "https://example.com/news"
with requests.get(url) as resp:
    print(resp.status_code)

We perform two HTTP requests with the get method and check for the returned status.

$ ./get_status.py
200
404

200 is the standard response for successful HTTP requests, and 404 indicates that the requested resource could not be found.

The Response object

Every request method — get, post, put, and the rest — returns a Response object. It contains everything the server sent back: the status code, headers, body, and metadata about the exchange itself. The table below covers the attributes and methods you will reach for most often.

Attribute / method | Type | Description
status_code | int | The HTTP status code returned by the server, such as 200, 301, or 404. Use raise_for_status() to turn error codes into exceptions rather than checking this value manually.
ok | bool | True when status_code is less than 400, False otherwise. Convenient for a quick success check, but raise_for_status() is safer in production code because it forces the caller to handle the failure explicitly rather than risk silently ignoring it.
headers | CaseInsensitiveDict | The response headers as a dictionary-like object. Key lookup is case-insensitive, so resp.headers['content-type'] and resp.headers['Content-Type'] are equivalent. Use .get(key, default) to avoid a KeyError when a header may be absent.
text | str | The response body decoded as a Unicode string. The encoding is taken from the Content-Type header; if none can be determined from the headers, it is guessed from the body by charset_normalizer (or chardet, when that is installed). Suitable for HTML, JSON, XML, and any other text-based content.
content | bytes | The raw, undecoded response body as a byte string. Use this for binary responses such as images, PDFs, or ZIP files, and when passing the body to a parser that expects bytes (e.g. lxml.html.fromstring).
json() | Any | Decodes the response body as JSON and returns the corresponding Python object, typically a dict or list. Raises requests.exceptions.JSONDecodeError, a subclass of ValueError, if the body is not valid JSON, regardless of the Content-Type header. Similar to json.loads(resp.text), but failures are reported with the library's own exception type.
encoding | str | The encoding used to decode the body when accessing text. Inferred from the Content-Type header by default. Can be set explicitly before accessing text if the server returns an incorrect or missing charset declaration: resp.encoding = 'utf-8'.
url | str | The final URL of the response after all redirects have been followed. Useful when the original URL was a shortlink or a redirect that resolves to a canonical address.
history | list[Response] | A list of Response objects for any redirects that occurred before reaching the final response, ordered oldest to newest. Empty when no redirects took place. Each entry has its own status_code and headers, making it possible to inspect the full redirect chain.
cookies | RequestsCookieJar | Cookies set by the server in this response, exposed as a dictionary-like object. Can be passed directly to the cookies parameter of a subsequent request to send them back, or merged into a Session for automatic handling across all future requests.
elapsed | timedelta | The time between sending the first byte of the request and finishing parsing the response headers. Does not include the time spent reading the body. Useful for basic performance measurement and for logging slow requests. Access the value in seconds with resp.elapsed.total_seconds().
request | PreparedRequest | The PreparedRequest object that was sent to produce this response. Useful for debugging: resp.request.headers shows the exact headers that left the client, and resp.request.body shows the serialised request body.
raise_for_status() | None | Raises requests.exceptions.HTTPError if the status code indicates a client error (4xx) or server error (5xx). Does nothing for successful responses (2xx) and redirects (3xx). The exception carries the original Response object in its response attribute, so the status code and body are still accessible inside the handler.

The following example exercises several of these attributes against httpbin.org, which returns a JSON description of the request it received — making it straightforward to verify what the client actually sent.

response_object.py
#!/usr/bin/python

import requests

url = "https://httpbin.org/get"

try:
    with requests.get(url, params={"lang": "python"}, timeout=(5, 10)) as resp:
        resp.raise_for_status()

        print(f"Status code:  {resp.status_code}")
        print(f"OK:           {resp.ok}")
        print(f"Final URL:    {resp.url}")
        print(f"Redirects:    {len(resp.history)}")
        print(f"Elapsed:      {resp.elapsed.total_seconds():.3f} s")
        print(f"Encoding:     {resp.encoding}")
        print(f"Content-Type: {resp.headers.get('content-type', 'n/a')}")
        print(f"Sent headers: {dict(resp.request.headers)}")
        print()
        print("Body (JSON):")
        data = resp.json()
        for key, value in data.items():
            print(f"  {key}: {value}")

except requests.exceptions.RequestException as e:
    print(f"Request failed: {e}")

The program sends a GET request to httpbin.org/get with a query parameter lang=python. The resp.json() method parses the JSON response body and returns a Python dictionary. The output shows the status code, final URL, headers sent by the client, and the JSON body returned by the server, which includes the query parameters and other request details.

$ ./response_object.py
Status code:  200
OK:           True
Final URL:    https://httpbin.org/get?lang=python
Redirects:    0
Elapsed:      0.621 s
Encoding:     utf-8
Content-Type: application/json
Sent headers: {'User-Agent': 'python-requests/2.33.1', 'Accept-Encoding': 
  'gzip, deflate, zstd', 'Accept': '*/*', 'Connection': 'keep-alive'}

Body (JSON):
  args: {'lang': 'python'}
  ...

resp.request.headers shows the headers the library added automatically (User-Agent, Accept-Encoding, Accept, and Connection) without any explicit configuration. These defaults are merged into the request while it is being prepared, before it leaves the client, and can be overridden by passing a headers dictionary to the request method.
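
As a minimal sketch of how the defaults and an explicit headers dictionary combine, the snippet below prepares a request without sending it. It assumes a fresh Session, whose prepare_request method merges the session defaults with the headers we supply; the my-script/1.0 value and the X-Request-Source header are made-up names used only for illustration.

```python
import requests

# Hypothetical header values, chosen only for this illustration.
custom = {"User-Agent": "my-script/1.0", "X-Request-Source": "tutorial"}

session = requests.Session()
req = requests.Request("GET", "https://httpbin.org/get", headers=custom)

# prepare_request merges the session defaults with our headers;
# nothing is sent over the network here.
prepared = session.prepare_request(req)

for name, value in prepared.headers.items():
    print(f"{name}: {value}")
```

The custom User-Agent replaces the default one, while defaults that were not overridden, such as Accept-Encoding, remain in place.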

Python requests raise_for_status

The raise_for_status method inspects the HTTP response code and raises an HTTPError exception for any response that signals a client error (4xx) or server error (5xx). For successful responses (2xx) it does nothing, making it a concise way to treat bad status codes as exceptions without manually checking resp.status_code after every request.

While resp.status_code simply exposes the raw integer returned by the server, raise_for_status adds a decision on top of it — turning error codes into exceptions so they cannot be silently ignored. Using status_code directly requires an explicit check such as if resp.status_code == 200 or if resp.ok after every request, and it is easy to forget or handle inconsistently across a codebase. raise_for_status centralises that logic in one call: successful responses pass through untouched, while anything in the 4xx or 5xx range immediately raises an HTTPError that must be handled — or will propagate up the call stack, making the failure visible rather than hidden behind a status integer that nobody checked.

raise_for_status.py
#!/usr/bin/python

import requests

urls = [
    "https://example.com",               # 200 OK — succeeds
    "https://httpbin.org/status/404",    # 404 Not Found — client error
    "https://httpbin.org/status/500",    # 500 Internal Server Error — server error
]

for url in urls:

    print(f"GET {url}")

    try:
        resp = requests.get(url)
        resp.raise_for_status()
        print(f"  Status code: {resp.status_code} — OK\n")
    except requests.exceptions.HTTPError as e:
        print(f"  HTTP error: {e}\n")

The httpbin.org/status/{code} endpoint returns whatever HTTP status code is embedded in the URL, making it ideal for testing error-handling paths without needing a broken server. The 404 triggers a client-error branch and the 500 a server-error branch — both surfaced as an HTTPError with the status code and reason phrase included in the message.

$ ./raise_for_status.py
GET https://example.com
  Status code: 200 — OK

GET https://httpbin.org/status/404
  HTTP error: 404 Client Error: NOT FOUND for url: https://httpbin.org/status/404

GET https://httpbin.org/status/500
  HTTP error: 500 Server Error: INTERNAL SERVER ERROR for url: https://httpbin.org/status/500

HTTPError is a subclass of requests.exceptions.RequestException, the base class for every exception the library raises. If you need to distinguish between client and server errors, inspect e.response.status_code inside the handler — values in the 400–499 range indicate a problem with the request itself, while 500–599 point to a fault on the server side.
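
The distinction between client and server errors could be sketched as follows. To keep the example runnable without a network connection, it builds Response objects by hand, which is an assumption made purely for the demonstration; real code would receive them from requests.get. The classify and fake_response helpers are hypothetical names.

```python
import requests


def classify(error: requests.exceptions.HTTPError) -> str:
    """Return 'client' for 4xx, 'server' for 5xx, 'other' otherwise."""
    status = error.response.status_code
    if 400 <= status < 500:
        return "client"
    if 500 <= status < 600:
        return "server"
    return "other"


def fake_response(status: int, reason: str) -> requests.Response:
    # Hand-built Response, used only so the sketch runs offline.
    resp = requests.Response()
    resp.status_code = status
    resp.reason = reason
    resp.url = "https://example.com/demo"
    return resp


results = []
for status, reason in [(404, "Not Found"), (503, "Service Unavailable")]:
    try:
        fake_response(status, reason).raise_for_status()
    except requests.exceptions.HTTPError as e:
        # e.response is the Response that triggered the error
        results.append((status, classify(e)))

print(results)
```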

RequestException

Every exception the requests library raises is a subclass of requests.exceptions.RequestException, the root of its exception hierarchy. Catching it in a single except clause is therefore sufficient to handle any transport-level failure — but knowing the subclasses lets you respond to each failure mode appropriately rather than treating all errors the same way.

The hierarchy below shows the most commonly encountered exceptions and their inheritance relationships.

RequestException
├── ConnectionError
│   ├── ProxyError
│   └── SSLError
├── HTTPError              # raised by resp.raise_for_status()
├── URLRequired
├── TooManyRedirects
├── Timeout
│   ├── ConnectTimeout
│   └── ReadTimeout
└── InvalidURL

The most specific exceptions sit at the leaves of the tree, and the most general at the root. When different failure modes are handled separately, the specific exceptions must therefore be caught before the more general ones; otherwise the base class catches every error and makes the specific branches unreachable. Note that ConnectTimeout inherits from both ConnectionError and Timeout, so a handler for either one matches it.

exceptions.py
#!/usr/bin/python

import requests

URLS = [
    "https://httpbin.org/status/404",   # HTTPError
    "https://httpbin.org/delay/10",     # ReadTimeout
    "https://httpbin.org/get",          # success
    "https://invalid.invalid",          # ConnectionError
    "http://",                          # InvalidURL
]

for url in URLS:
    print(f"GET {url}")
    try:
        with requests.get(url, timeout=(5, 3)) as resp:
            resp.raise_for_status()
            print(f"  OK — status {resp.status_code}\n")

    except requests.exceptions.ConnectTimeout:
        print("  ConnectTimeout — server did not accept the connection in time\n")
    except requests.exceptions.ReadTimeout:
        print("  ReadTimeout — server connected but stalled mid-response\n")
    except requests.exceptions.HTTPError as e:
        print(f"  HTTPError — {e}\n")
    except requests.exceptions.SSLError as e:
        print(f"  SSLError — certificate or handshake failure: {e}\n")
    except requests.exceptions.ProxyError as e:
        print(f"  ProxyError — could not connect through proxy: {e}\n")
    except requests.exceptions.ConnectionError as e:
        print(f"  ConnectionError — DNS failure or refused connection: {e}\n")
    except requests.exceptions.TooManyRedirects:
        print("  TooManyRedirects — redirect loop detected\n")
    except requests.exceptions.InvalidURL as e:
        print(f"  InvalidURL — malformed URL: {e}\n")
    except requests.exceptions.RequestException as e:
        print(f"  Unexpected error: {e}\n")

The exceptions are ordered from most specific to most general, which is how Python resolves except clauses — top to bottom, first match wins. Because ConnectTimeout and ReadTimeout are both subclasses of Timeout, and Timeout itself is a subclass of RequestException, placing the base class first would swallow every subclass and make the specific branches unreachable.

$ ./exceptions.py
GET https://httpbin.org/status/404
  HTTPError — 404 Client Error: NOT FOUND for url: https://httpbin.org/status/404

GET https://httpbin.org/delay/10
  ReadTimeout — server connected but stalled mid-response

GET https://httpbin.org/get
  OK — status 200

GET https://invalid.invalid
  ConnectionError — DNS failure or refused connection: ...

GET http://
  InvalidURL — malformed URL: ...
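
The inheritance relationships and the effect of handler order can also be checked without any network traffic. The following is a small sketch; the first_match helper is a hypothetical function that only imitates how except clauses are resolved.

```python
from requests import exceptions as exc

# Each pair is (subclass, parent), as described in the hierarchy.
pairs = [
    (exc.ConnectTimeout, exc.Timeout),
    (exc.ReadTimeout, exc.Timeout),
    (exc.SSLError, exc.ConnectionError),
    (exc.ProxyError, exc.ConnectionError),
    (exc.HTTPError, exc.RequestException),
    (exc.InvalidURL, exc.RequestException),
]

for child, parent in pairs:
    print(f"{child.__name__} -> {parent.__name__}: {issubclass(child, parent)}")


def first_match(error, handlers):
    """Imitate except-clause resolution: the first matching class wins."""
    for cls in handlers:
        if isinstance(error, cls):
            return cls.__name__
    return None


error = exc.ReadTimeout("simulated")

# Specific handler listed first: it is the one that matches.
specific_first = first_match(error, [exc.ReadTimeout, exc.Timeout, exc.RequestException])

# Base class listed first: it swallows the error before ReadTimeout is tried.
general_first = first_match(error, [exc.RequestException, exc.ReadTimeout])

print(specific_first, general_first)
```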

When to catch the base class

In code where the reaction to any failure is the same — log the error, return a default value, show a generic message to the user — catching RequestException alone keeps the handler concise:

def fetch(url):
    try:
        with requests.get(url, timeout=(5, 10)) as resp:
            resp.raise_for_status()
            return resp.json()
    except requests.exceptions.RequestException as e:
        print(f"Could not fetch {url}: {e}")
        return None

This pattern is appropriate for helper functions whose callers do not care about the specific cause of failure — only whether a result was returned. When the caller needs to distinguish between a server being down and a server returning a bad response, the specific subclasses should be caught and re-raised or handled individually instead.

HTTPError and raise_for_status

HTTPError is the one exception in the hierarchy that is never raised automatically — it only fires when you explicitly call resp.raise_for_status(). A 404 or 500 response does not raise anything on its own; without that call, resp.status_code simply holds the error code and execution continues normally. This is by design — some applications treat 404 as a valid negative result rather than a failure — but it means that omitting raise_for_status in code that expects a successful response can lead to subtle bugs where an error body is silently processed as real data.

HTTP request methods

An HTTP request is a message sent from a client to a server asking it to perform a specific action. The action is identified by a method — the most common being GET (retrieve a resource), POST (submit data), PUT (replace a resource), and HEAD (retrieve headers only, without a body). The requests library exposes each method as a dedicated function: requests.get, requests.post, requests.put, requests.head, and so on. All of them are thin wrappers around the lower-level requests.request, which accepts the method name as an explicit string argument when none of the shortcuts fit.

head_request.py
#!/usr/bin/python

import requests

url = "https://example.com"

with requests.head(url, timeout=(5, 10)) as resp:
    resp.raise_for_status()
    print("Server:        ", resp.headers.get("server", "n/a"))
    print("Last modified: ", resp.headers.get("last-modified", "n/a"))
    print("Content type:  ", resp.headers.get("content-type", "n/a"))

A HEAD request asks the server to return only the response headers, omitting the body entirely. This makes it useful for checking metadata — content type, last-modified date, server software — without paying the cost of transferring the full document. The headers are accessed through the resp.headers dictionary, which is case-insensitive. Using .get with a fallback of "n/a" avoids a KeyError when a header is absent, since not every server includes all three fields.

$ ./head_request.py
Server:         cloudflare
Last modified:  Wed, 06 May 2026 14:17:14 GMT
Content type:   text/html

Python requests get method

The get method issues a GET request to the server. The GET method requests a representation of the specified resource.

httpbin.org is a freely available HTTP request and response service.

get_request.py
#!/usr/bin/python

import requests

url = "https://httpbin.org/get?name=Peter"

with requests.get(url) as resp:
    resp.raise_for_status()
    print(resp.text)

The script sends a variable with a value to the httpbin.org server. The variable is specified directly in the URL.

$ ./get_request.py
{
  "args": {
    "name": "Peter"
  }, 
  "headers": {
    "Accept": "*/*", 
    "Accept-Encoding": "gzip, deflate, zstd", 
    "Host": "httpbin.org", 
    "User-Agent": "python-requests/2.33.1", 
    "X-Amzn-Trace-Id": "Root=1-69ff0896-272a70fa29b81c2556b6fffb"
  }, 
  ...
  "url": "https://httpbin.org/get?name=Peter"
}

get_request2.py
#!/usr/bin/python

import requests

payload = {'name': 'Peter', 'age': 23}
url = "https://httpbin.org/get"

with requests.get(url, params=payload) as resp:
    resp.raise_for_status()
    print(resp.url)
    print(resp.text)

The get method takes a params parameter where we can specify the query parameters.

payload = {'name': 'Peter', 'age': 23}

The data is sent in a Python dictionary.

with requests.get(url, params=payload) as resp:

We send a GET request to the httpbin.org site and pass the data, which is specified in the params parameter.

print(resp.url)
print(resp.text)

We print the URL and the response content to the console.
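
To see how the params dictionary ends up in the URL without contacting the server, a request can be prepared but not sent. This is a minimal sketch; the tag parameter in the second request is a made-up example of a list value.

```python
import requests

payload = {"name": "Peter", "age": 23}

# Prepare the request without sending it, to inspect the final URL.
prepared = requests.Request("GET", "https://httpbin.org/get", params=payload).prepare()
print(prepared.url)

# A list value is expanded into one key=value pair per item.
multi = requests.Request(
    "GET", "https://httpbin.org/get", params={"tag": ["a", "b"]}
).prepare()
print(multi.url)
```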

Python requests timeout parameter

The timeout parameter defines how long (in seconds) the client waits for a server response before raising an exception. Without it, a request can hang indefinitely if the remote server is slow or unresponsive, which makes it essential for any production code.

The timeout applies to two distinct phases: the connection phase (establishing the TCP handshake) and the read phase (waiting for the server to send data). You can control both independently by passing a tuple (connect, read), or set a single value that applies to each phase separately.

timeout_request.py
#!/usr/bin/python

import requests

TIMEOUT = (3, 5)

urls = [
    "https://example.com",  # responds immediately
    "https://httpbin.org/delay/10",  # deliberately delays 10 s — will time out
]

for url in urls:
    print(f"GET {url}")
    try:
        with requests.get(url, timeout=TIMEOUT) as resp:
            resp.raise_for_status()
            print(f"  Status code: {resp.status_code}\n")
    except requests.exceptions.ConnectTimeout:
        print("  Connection timed out — server not reachable\n")
    except requests.exceptions.ReadTimeout:
        print("  Read timed out — server connected but did not respond in time\n")
    except requests.exceptions.Timeout:
        print("  Request timed out\n")

The first URL responds normally within the allowed window. The second points to httpbin.org/delay/10, which deliberately waits 10 seconds before replying — well beyond the 5-second read timeout — so a ReadTimeout is raised instead of receiving a response.

TIMEOUT = (3, 5)

The timeout is set to 3 seconds for the connection phase and 5 seconds for the read phase.

$ ./timeout_request.py
GET https://example.com
  Status code: 200

GET https://httpbin.org/delay/10
  Read timed out — server connected but did not respond in time

Both ConnectTimeout and ReadTimeout are subclasses of requests.exceptions.Timeout, so catching the base class alone is sufficient when you do not need to distinguish between the two failure modes. Splitting them — as shown above — lets you react differently: a ReadTimeout against a live server may be worth retrying, while a ConnectTimeout against an unreachable host usually is not.

Python requests redirection

Redirection is a process of forwarding one URL to a different URL. The HTTP response status code 301 Moved Permanently is used for permanent URL redirection; 302 Found for a temporary redirection.

redirect.py
#!/usr/bin/python

import requests

url = "https://httpbin.org/redirect-to?url=/"

with requests.get(url) as resp:
    resp.raise_for_status()

    print(resp.status_code)
    print(resp.history)
    print(resp.url)

In the example, we issue a GET request to the https://httpbin.org/redirect-to page. This page redirects to another page; redirect responses are stored in the history attribute of the response.

$ ./redirect.py
200
[<Response [302]>]
https://httpbin.org/

A GET request to https://httpbin.org/redirect-to was 302 redirected to https://httpbin.org.

In the second example, we do not follow a redirect.

redirect2.py
#!/usr/bin/python

import requests

url = "https://httpbin.org/redirect-to?url=/"

with requests.get(url, allow_redirects=False) as resp:
    resp.raise_for_status()

    print(resp.status_code)
    print(resp.url)

The allow_redirects parameter specifies whether the redirect is followed; the redirects are followed by default.

$ ./redirect2.py
302
https://httpbin.org/redirect-to?url=/
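
When redirects are not followed automatically, the target is available in the Location header and can be followed by hand. The sketch below assumes a throwaway local server started in a background thread, so that it runs without internet access; the /old and /new paths and the RedirectHandler class are hypothetical.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urljoin

import requests


class RedirectHandler(BaseHTTPRequestHandler):
    """Throwaway local server: /old redirects to /new."""

    def do_GET(self):
        if self.path == "/old":
            self.send_response(302)
            self.send_header("Location", "/new")
            self.send_header("Content-Length", "0")
            self.end_headers()
        else:
            body = b"new page"
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


server = HTTPServer(("127.0.0.1", 0), RedirectHandler)  # port 0: pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()
base = f"http://127.0.0.1:{server.server_port}"

with requests.get(f"{base}/old", allow_redirects=False, timeout=5) as resp:
    status = resp.status_code
    is_redirect = resp.is_redirect
    # The Location header may be relative; resolve it against the request URL.
    target = urljoin(resp.url, resp.headers["Location"])

# Follow the redirect manually with a second request.
with requests.get(target, timeout=5) as resp:
    final_text = resp.text

server.shutdown()
server.server_close()

print(status, is_redirect, target)
print(final_text)
```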

Redirect with nginx

In the next example, we show how to set up a page redirect in the nginx server.

location = /oldpage.html {

        return 301 /newpage.html;
}

Add these lines to the nginx configuration file, which is located at /etc/nginx/sites-available/default on Debian.

$ sudo service nginx restart

After the file has been edited, we must restart nginx to apply the changes.

oldpage.html
<!DOCTYPE html>
<html>
<head>
<title>Old page</title>
</head>
<body>
<p>
This is old page
</p>
</body>
</html>

This is the oldpage.html file located in the nginx document root.

newpage.html
<!DOCTYPE html>
<html>
<head>
<title>New page</title>
</head>
<body>
<p>
This is a new page
</p>
</body>
</html>

This is the newpage.html file.

redirect3.py
#!/usr/bin/python

import requests

url = "http://localhost/oldpage.html"

with requests.get(url) as resp:
    resp.raise_for_status()

    print(resp.status_code)
    print(resp.history)
    print(resp.url)

    print(resp.text)

This script accesses the old page and follows the redirect. As we already mentioned, Requests follows redirects by default.

$ ./redirect3.py
200
[<Response [301]>]
http://localhost/newpage.html
<!DOCTYPE html>
<html>
<head>
<title>New page</title>
</head>
<body>
<p>
This is a new page
</p>
</body>
</html>
$ sudo tail -2 /var/log/nginx/access.log
127.0.0.1 - - [21/Jul/2019:07:41:27 -0400] "GET /oldpage.html HTTP/1.1" 301 184
"-" "python-requests/2.4.3 CPython/3.4.2 Linux/3.16.0-4-amd64"
127.0.0.1 - - [21/Jul/2019:07:41:27 -0400] "GET /newpage.html HTTP/1.1" 200 109
"-" "python-requests/2.4.3 CPython/3.4.2 Linux/3.16.0-4-amd64"

As we can see from the access.log file, the request was redirected to a new file name. The communication consisted of two GET requests.

User agent

A user agent is a short identification string that an HTTP client sends to a server in the User-Agent header. Every browser, crawler, and script has one. It tells the server who is making the request and often what capabilities that client has. Web servers use this information for logging, analytics, content negotiation, or to apply special handling for specific clients.

When we write our own Python HTTP client, we can choose any user-agent name we like. This makes it easy to distinguish our requests from browsers, automated tools, or other scripts in server logs. It also helps when debugging, because the server can immediately see that the request came from our custom client rather than from Chrome, Firefox, or a bot.

http_server.py
#!/usr/bin/python

from http.server import BaseHTTPRequestHandler, HTTPServer


class MyHandler(BaseHTTPRequestHandler):

    def do_GET(self):

        message = "Hello there"

        self.send_response(200)

        if self.path == "/agent":

            message = self.headers["user-agent"]

        self.send_header("Content-type", "text/html")
        self.end_headers()

        self.wfile.write(bytes(message, "utf8"))

        return


if __name__ == "__main__":

    with HTTPServer(("127.0.0.1", 8081), MyHandler) as server:
        print("Listening on http://127.0.0.1:8081")
        server.serve_forever()

We have a simple Python HTTP server.

if self.path == "/agent":

    message = self.headers["user-agent"]

If the path is /agent, we return the user agent that the client sent.

user_agent.py
#!/usr/bin/python

import requests

headers = {'user-agent': 'Python script'}

url = "http://localhost:8081/agent"

with requests.get(url, headers=headers) as resp:
    resp.raise_for_status()

    print(resp.text)

This script creates a simple GET request to our Python HTTP server. To add HTTP headers to a request, we pass in a dictionary to the headers parameter.

headers = {'user-agent': 'Python script'}

The header values are placed in a Python dictionary.

with requests.get(url, headers=headers) as resp:

The values are passed to the headers parameter.

$ ./http_server.py
Listening on http://127.0.0.1:8081

First, we start the server.

$ ./user_agent.py
Python script

Then we run the script. The server responded with the name of the agent that we have sent with the request.

Python requests post value

The post function sends a POST request to the given URL. The data parameter supplies the key/value pairs, which are form-encoded into the request body.

post_value.py
#!/usr/bin/python

import requests

data = {'name': 'Peter'}
url = "https://httpbin.org/post"

with requests.post(url, data=data) as resp:
    resp.raise_for_status()
    
    print(resp.text)

The script sends a request with a name key set to the value Peter. The POST request is issued with the post method.

$ ./post_value.py
{
  "args": {},
  "data": "",
  "files": {},
  "form": {
    "name": "Peter"
  },
  "headers": {
    "Accept": "*/*",
    "Accept-Encoding": "gzip, deflate",
    "Content-Length": "10",
    "Content-Type": "application/x-www-form-urlencoded",
    "Host": "httpbin.org",
    "User-Agent": "python-requests/2.21.0"
  },
  "json": null,
  ...
}

This is the output of the post_value.py script.
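
The Content-Type and Content-Length values in the output come from the way the data dictionary is encoded. As a small sketch, the request can be prepared locally, without sending it, to inspect the body that would go over the wire.

```python
import requests

data = {"name": "Peter"}

# Build the request locally; prepare() encodes the form data but sends nothing.
prepared = requests.Request("POST", "https://httpbin.org/post", data=data).prepare()

print(prepared.headers["Content-Type"])
print(prepared.headers["Content-Length"])
print(prepared.body)
```

The body name=Peter is ten characters long, which matches the Content-Length of 10 reported by httpbin.org.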

Python requests upload image

In the following example, we are going to upload an image. We create a web application with Flask.

app.py
#!/usr/bin/python

import os
from flask import Flask, request
from werkzeug.utils import secure_filename

app = Flask(__name__)

@app.route("/")
def home():
    return 'This is home page'

@app.route("/upload", methods=['POST'])
def handleFileUpload():

    msg = 'failed to upload image'

    if 'image' in request.files:

        photo = request.files['image']

        # Sanitise the client-supplied name so it cannot point outside the target directory
        filename = secure_filename(photo.filename or '')

        if filename != '':

            photo.save(os.path.join('.', filename))
            msg = 'image uploaded successfully'

    return msg

if __name__ == '__main__':
    app.run()

This is a simple application with two endpoints. The /upload endpoint checks whether the request contains an image and saves it to the current directory.

upload_file.py
#!/usr/bin/python

import requests

url = 'http://localhost:5000/upload'
image_file = 'data/sid.jpg'

with open(image_file, 'rb') as f:

    files = {'image': f}

    with requests.post(url, files=files) as resp:

        resp.raise_for_status()
        print(resp.text)

We send the image to the Flask application. The file is specified in the files parameter of the post function.
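
The files parameter also accepts a tuple of file name, file object, and content type, which gives control over the metadata sent with the upload. The sketch below prepares such a request without sending it; the in-memory bytes are an assumed stand-in for a real image, so no Flask server or file on disk is needed.

```python
import io

import requests

# In-memory stand-in for an image file; the bytes are not a real JPEG.
fake_image = io.BytesIO(b"\xff\xd8\xff fake image bytes")

# Tuple form: (file name, file object, content type)
files = {"image": ("sid.jpg", fake_image, "image/jpeg")}

# prepare() builds the multipart body locally; nothing is sent.
prepared = requests.Request(
    "POST", "http://localhost:5000/upload", files=files
).prepare()

content_type = prepared.headers["Content-Type"]
print(content_type)
print(len(prepared.body), "bytes in the multipart body")
```

The Content-Type header carries a generated boundary string, which is why it should not be set by hand when uploading files.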

JSON

JSON is a lightweight text format for exchanging structured data. It maps directly onto Python dictionaries and lists, making it the natural choice for HTTP APIs: the requests library can both decode incoming JSON and serialise outgoing data automatically, without any manual calls to the json module.

Reading JSON from a server

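resp.json() decodes the response body as JSON and returns a Python object: a dictionary, list, or scalar, depending on what the server sent. A server that returns JSON normally announces it with Content-Type: application/json, but the method attempts to decode the body regardless of that header. It is similar to calling json.loads(resp.text), but it also detects the encoding of the body when none is declared and, in Requests 2.27 and later, raises requests.exceptions.JSONDecodeError if decoding fails.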

http_server_json.py
#!/usr/bin/python

import json
from http.server import BaseHTTPRequestHandler, HTTPServer


class Handler(BaseHTTPRequestHandler):

    def do_GET(self):
        payload = json.dumps({"name": "Jane", "age": 17}).encode()

        self.send_response(200)
        self.send_header("Content-Type", "application/json; charset=utf-8")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


if __name__ == "__main__":
    with HTTPServer(("127.0.0.1", 8000), Handler) as server:
        print("Listening on http://127.0.0.1:8000")
        server.serve_forever()

This is a simple HTTP server that returns a JSON response to any GET request. The Content-Type header is set to application/json to indicate that the response body is JSON. The body itself is a JSON-encoded dictionary containing a name and age.

read_json.py
#!/usr/bin/python

import requests

url = "http://127.0.0.1:8000"

try:
    with requests.get(url, timeout=(5, 10)) as resp:
        resp.raise_for_status()
        data = resp.json()
        print(f"Name: {data['name']}, age: {data['age']}")
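except requests.exceptions.JSONDecodeError as e:
    print(f"Failed to decode JSON: {e}")
except requests.exceptions.RequestException as e:
    print(f"Request failed: {e}")

resp.json() raises requests.exceptions.JSONDecodeError if the response body is not valid JSON, for example when a proxy returns an HTML error page with a 200 status. In Requests 2.27 and later this exception is a subclass of both RequestException and ValueError, so its handler must come before the RequestException handler; otherwise the broader clause would catch it first. Handling it separately keeps network failures and malformed responses as distinct error cases.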

$ ./read_json.py
Name: Jane, age: 17
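The malformed-response case can be reproduced without a server. The sketch below uses only the standard json module; parse_body is a hypothetical helper, and the HTML string stands in for an error page returned by a proxy.

```python
import json


def parse_body(text):
    """Return the decoded JSON value, or None when text is not valid JSON."""
    try:
        return json.loads(text)
    except ValueError as e:  # json.JSONDecodeError is a subclass of ValueError
        print(f"Failed to decode JSON: {e}")
        return None


good = parse_body('{"name": "Jane", "age": 17}')
bad = parse_body("<html><body>502 Bad Gateway</body></html>")

print(good)
print(bad)
```

Note that a body containing the JSON literal null also decodes to None, so a real application may prefer to let the exception propagate rather than return a sentinel value.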

Sending JSON to a server

Passing a dictionary to the json parameter of requests.post serialises it automatically and sets the Content-Type header to application/json. This is the preferred approach over manually calling json.dumps and setting the header by hand.

http_server_json2.py
#!/usr/bin/python

import json
from http.server import BaseHTTPRequestHandler, HTTPServer


class Handler(BaseHTTPRequestHandler):

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))

        try:
            data = json.loads(self.rfile.read(length))
        except json.JSONDecodeError:
            self.send_response(400)
            self.send_header("Content-Type", "text/plain; charset=utf-8")
            self.end_headers()
            self.wfile.write(b"400 Bad Request: invalid JSON\n")
            return

        lines = [f"{key}: {value}" for key, value in data.items()]
        payload = "\n".join(lines).encode()

        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


if __name__ == "__main__":
    with HTTPServer(("127.0.0.1", 8000), Handler) as server:
        print("Listening on http://127.0.0.1:8000")
        server.serve_forever()

This server accepts POST requests with a JSON body, decodes it, and returns a plain-text summary of the fields. If the body is not valid JSON, it responds with 400 Bad Request without attempting to process the data further.

send_json.py
#!/usr/bin/python

import requests

url = "http://127.0.0.1:8000"
data = {"name": "Jane", "age": 17}

try:
    with requests.post(url, json=data, timeout=(5, 10)) as resp:
        resp.raise_for_status()
        print(resp.text)
except requests.exceptions.RequestException as e:
    print(f"Request failed: {e}")

The server reads Content-Length to know exactly how many bytes to consume from the request body, then attempts to parse them as JSON. If parsing fails it returns 400 Bad Request immediately, before touching the data, which prevents a malformed payload from propagating further into the handler. The valid path iterates over the decoded dictionary and returns a plain-text summary — one key: value line per field.

$ ./send_json.py
name: Jane
age: 17
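The difference between the json parameter and the manual approach can be inspected by preparing both requests without sending them. This is a small sketch for comparison; the URL is the local test address used above and nothing is transmitted.

```python
import json
import requests

url = "http://127.0.0.1:8000"
data = {"name": "Jane", "age": 17}

# Automatic: Requests serialises the dictionary and sets the header.
auto = requests.Request("POST", url, json=data).prepare()

# Manual: serialise and set the header by hand.
manual = requests.Request(
    "POST", url,
    data=json.dumps(data),
    headers={"Content-Type": "application/json"},
).prepare()

print(auto.headers["Content-Type"])
print(json.loads(auto.body) == json.loads(manual.body))
```

Both prepared requests carry the same payload; the json parameter simply removes two steps that are easy to forget.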

Working with cookies

A cookie is a small piece of data the server asks the browser (or any HTTP client) to store and send back with every subsequent request to the same origin. Cookies are the standard mechanism for maintaining state across otherwise stateless HTTP connections — session tokens, user preferences, and tracking identifiers are all commonly stored this way. The client sends cookies in the Cookie request header; the server sets them via the Set-Cookie response header.

The following examples use a minimal http.server server running locally alongside a requests client, so the full exchange is visible at both ends without depending on an external service.

| Header / attribute | Type | Description |
|---|---|---|
| Cookie | request | Carries all cookies the client holds for the current origin, serialised as name=value pairs separated by "; ". Set automatically by the browser; in requests, populated via the cookies parameter or a Session object. |
| Set-Cookie | response | Instructs the client to store a cookie. One header per cookie; repeated as many times as needed. The value contains the cookie name and value followed by optional attributes separated by "; ". For example: sessionid=abc123; Path=/; HttpOnly; Max-Age=3600. |
| Expires | attribute | Sets an absolute expiry date in RFC 1123 format (e.g. Thu, 01 Jan 2026 00:00:00 GMT). When the date is reached the cookie is deleted. Omitting both Expires and Max-Age creates a session cookie that is discarded when the browser closes. Superseded by Max-Age in modern clients when both are present. |
| Max-Age | attribute | Sets a relative lifetime in seconds from the moment the cookie is received. Max-Age=3600 expires the cookie after one hour. A value of 0 or negative deletes the cookie immediately, which is the standard way to invalidate a cookie on logout. Takes precedence over Expires in all modern browsers. |
| Domain | attribute | Specifies which hosts may receive the cookie. Domain=example.com includes all subdomains (api.example.com, www.example.com, etc.). Omitting the attribute restricts the cookie to the exact host that set it, excluding subdomains. |
| Path | attribute | Limits the cookie to URLs whose path begins with the given value. Path=/admin sends the cookie only on requests to /admin and its sub-paths. Path=/ sends it on every request to the domain, which is the most common setting. |
| Secure | attribute | A flag (no value) that prevents the cookie from being sent over plain HTTP. The browser only includes it in requests made over HTTPS, protecting it from interception on unencrypted connections. Should always be set on session and authentication cookies in production. |
| HttpOnly | attribute | A flag that makes the cookie inaccessible to JavaScript: document.cookie will not include it. This limits the blast radius of an XSS attack: even if an attacker injects a script, they cannot read session tokens marked HttpOnly. |
| SameSite | attribute | Controls whether the cookie is sent on cross-site requests, mitigating CSRF attacks. Three values: Strict, never sent on cross-site requests; Lax, sent on top-level navigations (e.g. clicking a link) but not on embedded requests such as images or iframes (the default in Chromium-based browsers since 2020); None, always sent, but requires Secure to be set. |

The table covers both HTTP headers (Cookie and Set-Cookie) and all standard Set-Cookie attributes. The type column distinguishes between what the client sends, what the server sends, and what travels as part of a Set-Cookie value rather than as a standalone header.
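To make the attributes concrete, the following sketch assembles a Set-Cookie value with the standard library's http.cookies module. The cookie name and value are illustrative, and the samesite key assumes Python 3.8 or later.

```python
from http.cookies import SimpleCookie

cookie = SimpleCookie()
cookie["sessionid"] = "abc123"

# Attributes are set as keys on the individual cookie (a Morsel object).
cookie["sessionid"]["path"] = "/"
cookie["sessionid"]["max-age"] = 3600
cookie["sessionid"]["httponly"] = True
cookie["sessionid"]["secure"] = True
cookie["sessionid"]["samesite"] = "Lax"

# OutputString() returns the header value without the "Set-Cookie:" prefix.
header_value = cookie["sessionid"].OutputString()
print(header_value)
```

The resulting string can be passed directly to send_header("Set-Cookie", header_value) in an http.server handler. The order in which the attributes appear is chosen by the module and carries no meaning.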

Sending cookies from the client

The cookies parameter of requests.get accepts a plain dictionary. The library serialises it into a Cookie header automatically before sending the request.

cookie_server.py
#!/usr/bin/python

from http.server import BaseHTTPRequestHandler, HTTPServer


class Handler(BaseHTTPRequestHandler):

    def do_GET(self):
        cookies = self.headers.get("Cookie", "(no cookies)")
        body = f"Received cookies: {cookies}\n".encode()

        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


if __name__ == "__main__":
    with HTTPServer(("127.0.0.1", 8000), Handler) as server:
        print("Listening on http://127.0.0.1:8000")
        server.serve_forever()

This is a simple HTTP server that reads the Cookie header from incoming GET requests and echoes it back in the response body. If no cookies are sent, it responds with a message indicating that as well.

cookie_request.py
#!/usr/bin/python

import requests

url = "http://127.0.0.1:8000"
cookies = {"user": "jane", "theme": "dark"}

try:
    with requests.get(url, cookies=cookies, timeout=(5, 10)) as resp:
        resp.raise_for_status()
        print(resp.text)
except requests.exceptions.RequestException as e:
    print(f"Request failed: {e}")

The Cookie header is a single semicolon-separated string regardless of how many cookies are sent. The requests library handles the serialisation, so the caller only needs to provide a dictionary. The server reads the header as-is and echoes it back in the response body.

$ ./cookie_request.py
Received cookies: user=jane; theme=dark
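A server that needs the individual cookie values rather than the raw header can split it with the standard library. The following is a minimal sketch; parse_cookie_header is a hypothetical helper, not part of the server shown above.

```python
from http.cookies import SimpleCookie


def parse_cookie_header(header):
    """Split a Cookie request header into a plain name-to-value dictionary."""
    jar = SimpleCookie()
    jar.load(header)
    return {name: morsel.value for name, morsel in jar.items()}


cookies = parse_cookie_header("user=jane; theme=dark")
print(cookies)   # {'user': 'jane', 'theme': 'dark'}
```

Inside the handler, the same function could be applied to self.headers.get("Cookie", "").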

Receiving cookies set by the server

When the server wants to create a cookie in the client, it includes a Set-Cookie header in its response. The requests library parses these automatically and exposes them through resp.cookies, a RequestsCookieJar that behaves like a dictionary.

cookie_set_server.py
#!/usr/bin/python

import time
from http.server import BaseHTTPRequestHandler, HTTPServer
from email.utils import formatdate


class Handler(BaseHTTPRequestHandler):

    def do_GET(self):
        incoming = self.headers.get("Cookie")

        if incoming:
            body = f"Client sent cookies: {incoming}\n".encode()
            self.send_response(200)
            self.send_header("Content-Type", "text/plain; charset=utf-8")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            ts = int(time.time())
            expires = formatdate(ts + 3600, usegmt=True)
            max_age = 3600

            cookies = [
                f"sessionid={ts}; Path=/; HttpOnly; Max-Age={max_age}; Expires={expires}",
                f"theme=dark; Path=/; Max-Age={max_age}; Expires={expires}",
                f"user=jane; Path=/; Max-Age={max_age}; Expires={expires}",
            ]

            body = b"Cookies set send them back on the next request.\n"
            self.send_response(200)
            self.send_header("Content-Type", "text/plain; charset=utf-8")
            self.send_header("Content-Length", str(len(body)))
            for cookie in cookies:
                self.send_header("Set-Cookie", cookie)
            self.end_headers()
            self.wfile.write(body)


if __name__ == "__main__":
    with HTTPServer(("127.0.0.1", 8000), Handler) as server:
        print("Listening on http://127.0.0.1:8000")
        server.serve_forever()

The server distinguishes the two requests by checking whether a Cookie header is present. On the first visit it responds with three Set-Cookie headers, each carrying a different cookie along with Max-Age and Expires attributes that tell a real browser how long to keep them. On the second visit it simply echoes the cookies back, confirming what the client returned.

Max-Age and Expires serve the same purpose — limiting cookie lifetime — but Max-Age takes precedence in all modern browsers when both are present. Expires is included here for compatibility with older clients. The HttpOnly flag on the session cookie instructs the browser not to expose it to JavaScript, which reduces the risk of it being stolen via an XSS attack. The Path=/ attribute means the cookie is sent on every request to the server, not only those under a specific sub-path.

cookie_set_request.py
#!/usr/bin/python

import requests

url = "http://127.0.0.1:8000"

try:
    # First request — server sets cookies.
    with requests.get(url, timeout=(5, 10)) as resp:
        resp.raise_for_status()
        print("First response:", resp.text)
        print("Cookies received:")
        for name, value in resp.cookies.items():
            print(f"  {name} = {value}")

    # Second request — send the cookies back.
    with requests.get(url, cookies=resp.cookies, timeout=(5, 10)) as resp2:
        resp2.raise_for_status()
        print("\nSecond response:", resp2.text)

except requests.exceptions.RequestException as e:
    print(f"Request failed: {e}")

This script performs two GET requests to the server. The first one receives the cookies set by the server and prints them to the console. The second request sends the received cookies back to the server, which responds with a message confirming the cookies it got from the client.

$ ./cookie_set_request.py
First response: Cookies set send them back on the next request.

Cookies received:
  sessionid = 1778336578
  theme = dark
  user = jane

Second response: Client sent cookies: sessionid=1778336578; theme=dark; user=jane
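The script above passes resp.cookies to the second request by hand. A Session object keeps its own cookie jar and attaches the stored cookies to every request it prepares. The sketch below demonstrates this offline: the cookies are set manually as a stand-in for values a server would deliver through Set-Cookie, and the request is prepared but never sent.

```python
import requests

url = "http://127.0.0.1:8000"

with requests.Session() as session:
    # Stand-in for cookies a server would normally set via Set-Cookie.
    session.cookies.set("theme", "dark")
    session.cookies.set("user", "jane")

    # prepare_request merges the session's cookie jar into the request
    # without sending anything over the network.
    prepared = session.prepare_request(requests.Request("GET", url))
    cookie_header = prepared.headers["Cookie"]

print(cookie_header)
```

In a real exchange, calling session.get(url) twice against the server above would have the same effect: the first response fills the jar, and the second request returns the cookies automatically.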

Retrieving definitions from a dictionary

The following example scrapes word definitions from dictionary.com by sending a GET request and parsing the returned HTML with the lxml library. Because the site's markup can change without notice, the script tries several XPath expressions in priority order, falling back to broader selectors when the preferred ones yield nothing.

get_term.py
#!/usr/bin/python

import sys
import textwrap
import requests
from lxml import html


HEADERS = {"User-Agent": "Mozilla/5.0 (X11; Linux x86_64)"}
BASE_URL = "https://www.dictionary.com/browse/"
WRAP_WIDTH = 80  # matches the standard 80-column terminal width

# Tried in order; the first that yields text wins.
XPATHS = [
    "//span[contains(@class,'one-click-content')]//text()",
    "//*[contains(@data-testid,'definition')]//text()",
    "//section[contains(@class,'definitions') or contains(@class,'css-pnw38j')]//text()",
    "//main//p//text()",
    "//meta[@name='description']/@content",
]


def fetch_html(term):
    """Return the parsed HTML tree for *term*, or raise on HTTP/network error."""
    with requests.get(BASE_URL + term, headers=HEADERS, timeout=(5, 10)) as resp:
        resp.raise_for_status()
        return html.fromstring(resp.content)


def extract_definitions(root):
    """Try each XPath in XPATHS and return the first non-empty result set."""
    for xpath in XPATHS:
        texts = []
        for node in root.xpath(xpath):
            text = node.strip() if isinstance(node, str) else node.text_content().strip()
            if text:
                texts.append(text)
        if texts:
            return texts

    # Last resort: first 40 non-empty text nodes anywhere inside main.
    return [t.strip() for t in root.xpath("//main//text()") if t.strip()][:40]


def deduplicate(texts):
    """Split on newlines, drop fragments shorter than 4 chars, remove duplicates."""
    seen = set()
    out = []
    for block in texts:
        for part in (s.strip() for s in block.split("\n") if s.strip()):
            if len(part) > 3 and part not in seen:
                seen.add(part)
                out.append(part)
    return out


def main():
    term = sys.argv[1] if len(sys.argv) > 1 else "dog"

    try:
        root = fetch_html(term)
    except requests.exceptions.ConnectTimeout:
        sys.exit("Connection timed out — check your network and try again.")
    except requests.exceptions.ReadTimeout:
        sys.exit("Server took too long to respond — try again later.")
    except requests.exceptions.HTTPError as e:
        sys.exit(f"HTTP error: {e}")
    except requests.exceptions.ConnectionError as e:
        sys.exit(f"Network error: {e}")
    except requests.exceptions.RequestException as e:
        sys.exit(f"Unexpected request error: {e}")

    lines = deduplicate(extract_definitions(root))

    if not lines:
        sys.exit("No definition found — the site structure may have changed.")

    for line in lines:
        print(textwrap.fill(line, width=WRAP_WIDTH))


if __name__ == "__main__":
    main()

The script is structured around three focused functions. fetch_html owns the network layer: it sends the GET request with a browser-like User-Agent header (without one, the site may reject the request as coming from a bot), calls raise_for_status to surface HTTP errors immediately, and returns a parsed lxml tree. Crucially, it passes resp.content, the raw bytes, to html.fromstring rather than resp.text. resp.text is decoded with the encoding Requests infers from the HTTP headers, which can differ from the charset declared in the document itself; passing the bytes leaves encoding detection to the parser, which can read that declaration.
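The risk of decoding a page with the wrong character set can be shown without any network access. The sketch below assumes a UTF-8 page that is mistakenly decoded as ISO-8859-1, a fallback that an HTTP client may apply when a text response declares no charset in its headers; the markup is made up for illustration.

```python
# A UTF-8 encoded page that declares its own charset in the markup.
raw = '<meta charset="utf-8"><p>café</p>'.encode("utf-8")

# Decoding with the wrong codec does not fail; it silently corrupts the text.
wrong = raw.decode("iso-8859-1")
right = raw.decode("utf-8")

print(wrong)
print(right)
```

Handing the undecoded bytes to the parser avoids making this choice prematurely.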

extract_definitions works through XPATHS in priority order, returning as soon as one expression yields results. Moving the XPath list to a module-level constant makes it easy to add or reorder selectors without touching any logic. When all named selectors fail — for example after a site redesign — it falls back to the first 40 text nodes inside <main>, giving enough context to identify what changed and update the XPath list accordingly.

deduplicate splits each extracted block on newlines, discards fragments of three characters or fewer (punctuation, stray labels), and filters out any text already seen. This removes the navigation items, labels, and repeated UI text that broad XPath expressions inevitably pick up alongside the definitions.

Exception handling lives in main, where the appropriate response to each failure — a message and a non-zero exit code via sys.exit — is known. The inner functions stay clean and reusable as a result.

Note: Web scraping is inherently fragile. Sites restructure their markup without warning, and dictionary.com in particular is a React application whose server-rendered HTML can vary by region, A/B test bucket, or deploy. If the script stops returning definitions, inspect the live page source and update XPATHS accordingly.

$ ./get_term.py ephemeral
lasting a very short time;
short-lived
transitory
The poem celebrates the ephemeral joys of
childhood.
...

Python requests streaming

By default, requests downloads the entire response body into memory before returning it to your code. For large files this can exhaust available RAM and cause unnecessary delays before any processing begins. Setting stream=True changes this behaviour: only the status line and headers are read up front, and the body is fetched as you consume it, either in chunks with iter_content or line by line with iter_lines.

This makes streaming the correct approach whenever the response body is large (file downloads, database exports), potentially unbounded (live event feeds, log streams), or when you want to start processing early parts before the transfer is complete.
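As a small illustration of processing data while it arrives, the sketch below computes a SHA-256 checksum incrementally over an iterable of byte chunks. The function name and the sample chunks are hypothetical; with Requests, the iterable could be the result of iter_content on a streamed response.

```python
import hashlib


def sha256_of_chunks(chunks):
    """Return (hex digest, total size) for an iterable of byte chunks."""
    digest = hashlib.sha256()
    total = 0
    for chunk in chunks:
        digest.update(chunk)   # only one chunk is held in memory at a time
        total += len(chunk)
    return digest.hexdigest(), total


# Simulated chunks standing in for a streamed response body.
fake_chunks = [b"first chunk, ", b"second chunk, ", b"third chunk"]
checksum, size = sha256_of_chunks(fake_chunks)

print(checksum)
print(size, "bytes")
```

Because the digest is updated chunk by chunk, memory use stays constant regardless of the size of the download.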

streaming.py
#!/usr/bin/python

from pathlib import Path
from urllib.parse import urlsplit
import requests


def download(url, dest=None, chunk_size=65_536, timeout=(5, 30)):
    """Stream *url* to disk, returning the Path of the saved file.

    Args:
        url:        Remote URL to fetch.
        dest:       Local file path. Inferred from the URL when omitted.
        chunk_size: Bytes per write cycle (default 64 KiB).
        timeout:    (connect, read) timeout in seconds.

    Raises:
        requests.HTTPError:       Non-2xx response.
        requests.Timeout:         Connection or read deadline exceeded.
        requests.ConnectionError: DNS failure or refused connection.
        requests.RequestException: Any other transport-level failure.
        OSError:                  Local filesystem write failure.
    """
    dest = Path(dest) if dest else Path(Path(urlsplit(url).path).name or "download.bin")

    with requests.get(url, stream=True, timeout=timeout) as r:
        r.raise_for_status()
        with dest.open("wb") as f:
            for chunk in r.iter_content(chunk_size=chunk_size):
                f.write(chunk)

    return dest


url = "https://docs.oracle.com/javase/specs/jls/se25/jls25.pdf"

try:
    path = download(url, "java25spec.pdf")
    print(f"Saved to {path} ({path.stat().st_size:,} bytes)")
except requests.exceptions.ConnectTimeout:
    print("Connection timed out — server did not accept the connection in time")
except requests.exceptions.ReadTimeout:
    print("Read timed out — server stalled mid-transfer")
except requests.exceptions.HTTPError as e:
    print(f"HTTP error: {e}")
except requests.exceptions.ConnectionError as e:
    print(f"Network error: {e}")
except requests.exceptions.RequestException as e:
    print(f"Unexpected request error: {e}")
except OSError as e:
    print(f"Could not write file: {e}")

The download function deliberately does not catch exceptions itself — it has no way of knowing whether the caller wants to retry, log, alert, or abort. Instead, every failure mode is documented in the docstring and handled at the call site, where that context exists.

The except clauses are ordered from most specific to most general. ConnectTimeout and ReadTimeout are caught before the base RequestException so each can produce a distinct, actionable message: a connect timeout typically means the host is unreachable and retrying immediately is pointless, while a read timeout mid-transfer may be worth retrying. ConnectTimeout must also precede ConnectionError, of which it is a subclass. OSError is kept separate because it is a local filesystem failure, entirely unrelated to the network layer, and may warrant a different response such as checking disk space.

$ ./streaming.py
Saved to java25spec.pdf (5,331,364 bytes)
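Since a read timeout may be worth retrying, a caller could wrap the download in a small retry loop. The following is a sketch under stated assumptions: retry and flaky are hypothetical names, and the built-in TimeoutError stands in for requests.exceptions.ReadTimeout so that the example runs without a network.

```python
import time


def retry(func, retry_on, attempts=3, delay=0.0):
    """Call func(); retry on the given exception type(s) up to attempts times."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == attempts:
                raise                      # out of attempts: let the caller decide
            time.sleep(delay * attempt)    # simple linear back-off


# Simulated operation that stalls twice before succeeding.
calls = {"count": 0}


def flaky():
    calls["count"] += 1
    if calls["count"] < 3:
        raise TimeoutError("simulated stall")
    return "done"


result = retry(flaky, retry_on=TimeoutError, attempts=3)
print(result, "after", calls["count"], "attempts")   # done after 3 attempts
```

With the download function above, the call might look like retry(lambda: download(url), retry_on=requests.exceptions.ReadTimeout, delay=2). Exceptions that are not listed in retry_on propagate immediately, which keeps the decision about unrecoverable errors at the call site.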

Python requests credentials

The auth parameter provides basic HTTP authentication; it takes a tuple of a user name and a password to be used for a realm. A security realm is a mechanism used for protecting web application resources.

$ sudo apt-get install apache2-utils
$ sudo htpasswd -c /etc/nginx/.htpasswd user7
New password:
Re-type new password:
Adding password for user user7

We use the htpasswd tool to create a user name and a password for basic HTTP authentication.

location /secure {

        auth_basic "Restricted Area";
        auth_basic_user_file /etc/nginx/.htpasswd;
}

Inside the Nginx configuration file /etc/nginx/sites-available/default, we create a secured page. The name of the realm is "Restricted Area".

index.html
<!DOCTYPE html>
<html lang="en">
<head>
<title>Secure page</title>
</head>

<body>

<p>
This is a secure page.
</p>

</body>

</html>

Inside the /usr/share/nginx/html/secure directory, we have this HTML file.

credentials.py
#!/usr/bin/python

import requests

user = 'user7'
passwd = '7user'

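with requests.get("http://localhost/secure/", auth=(user, passwd),
                  timeout=(5, 10)) as resp:
    # Fail loudly on 401 Unauthorized instead of printing the error page
    resp.raise_for_status()
    print(resp.text)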

The script connects to the secure webpage; it provides the user name and the password necessary to access the page.

$ ./credentials.py
<!DOCTYPE html>
<html lang="en">
<head>
<title>Secure page</title>
</head>

<body>

<p>
This is a secure page.
</p>

</body>

</html>

With the right credentials, the credentials.py script returns the secured page.
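Under the hood, basic authentication sends the credentials in an Authorization header as a Base64-encoded user:password pair. The sketch below reconstructs that value with the standard library; basic_auth_header is a hypothetical helper shown for illustration, not the function Requests uses internally.

```python
import base64


def basic_auth_header(user, password):
    """Return the Authorization header value for HTTP Basic authentication."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


header = basic_auth_header("user7", "7user")
print(header)

# Base64 is an encoding, not encryption: anyone can reverse it.
print(base64.b64decode(header.split()[1]))
```

Because the credentials are trivially recoverable, basic authentication should only be used over HTTPS outside of local testing.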

Source

Python requests documentation

In this article we have worked with the Python Requests module. The Requests library is a powerful and user-friendly HTTP client for Python. It allows you to send HTTP requests with ease, making it a popular choice for developers when working with web APIs and handling HTTP interactions in their applications.

Author

My name is Jan Bodnar, and I am a passionate programmer with extensive programming experience. I have been writing programming articles since 2007. To date, I have authored over 1,400 articles and 8 e-books. I possess more than ten years of experience in teaching programming.
