
Dependency Auditing — pip-audit and CVE Scanning

Mental Model

Think of your application's dependencies as the bricks and mortar supplied by various vendors for your building. Even if your architectural design is sound, if a vendor provides faulty bricks (vulnerable libraries), your entire structure is compromised. Dependency auditing is like having an inspector check every batch of incoming materials for known defects.

Rule: Always integrate automated dependency auditing with tools like pip-audit into your CI/CD pipeline to continuously scan for and mitigate known vulnerabilities in third-party packages.

The Setup

A Python microservice has been in production for several years. Over time, its requirements.txt (or pyproject.toml) has grown to include dozens of dependencies, many of which haven't been updated in a long time. The team needs to assess the security posture of these external packages without manually checking each one.

What Does This Print?

Broken code
Python
# requirements.txt
# This file simulates a project with potentially vulnerable dependencies
# These versions are intentionally chosen to be old and known to have CVEs
django==2.2.0          # Known vulnerabilities in older versions
cryptography==2.8      # Known vulnerabilities in older versions
requests==2.25.0       # Some older versions have moderate vulns
pyyaml==5.3.0          # Older versions of PyYAML had arbitrary code execution vulns (e.g. CVE-2020-14343)

# main.py (The application code itself might be secure, but its dependencies are not)
import django
import cryptography
import requests
import yaml

def get_django_version():
    return django.get_version()

def load_user_config(file_path):
    with open(file_path, 'r') as f:
        # yaml.Loader can construct arbitrary Python objects, so loading untrusted
        # input this way is unsafe in any PyYAML version; old releases add known CVEs on top.
        return yaml.load(f, Loader=yaml.Loader)

def make_external_request(url):
    return requests.get(url).status_code

if __name__ == '__main__':
    print(f"Django version: {get_django_version()}")
    print(f"Cryptography version: {cryptography.__version__}")
    print(f"Requests version: {requests.__version__}")
    print(f"PyYAML version: {yaml.__version__}")
    print("\nNote: This application *itself* might not directly expose the vuln,")
    print("but the underlying libraries are outdated and known to be vulnerable.")
    print("Run 'pip install -r requirements.txt' then 'pip-audit' to see findings.")
If 'requirements.txt' is installed and then scanned using 'pip-audit', which of these dependencies do you predict will be flagged for known Common Vulnerabilities and Exposures (CVEs)?

The Output

What actually happens
$ pip-audit
Found 3 vulnerabilities in your dependencies:

Package       Version  ID                Description
------------  -------  ----------------  -----------------------------------------------------------
Django        2.2.0    PYSEC-2022-XXXXX  Django <2.2.28 fixed template-related denial of service.
cryptography  2.8      PYSEC-2020-XXXXX  cryptography <=2.8 had potential side-channel attacks.
pyyaml        5.3.0    PYSEC-2020-XXXXX  PyYAML before 5.3.1 allows execution of arbitrary code...

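Running pip install -r requirements.txt followed by pip-audit reveals multiple known vulnerabilities in the listed dependencies. The output shown is illustrative: the advisory IDs are placeholders, and the exact findings depend on the state of the vulnerability database when you run the scan. A real run typically reports several advisories per package, and requests 2.25.0 is likely to be flagged as well (for example, CVE-2023-32681 was fixed in requests 2.31.0). Versions of Django, cryptography, and PyYAML this old are almost certain to be reported.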
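For scripting, a machine-readable report is easier to work with than the table. The following is a minimal sketch that assumes the JSON layout produced by pip-audit --format json (a top-level "dependencies" list whose entries carry "name", "version", and "vulns"); check the layout against your installed pip-audit version. The sample report and the function name flagged_packages are hypothetical, and the advisory ID is the same placeholder used in the output above.

```python
import json

# Hypothetical sample, shaped like `pip-audit --format json` output.
# The advisory ID is a placeholder, not a real identifier.
SAMPLE_REPORT = """
{
  "dependencies": [
    {"name": "django", "version": "2.2.0",
     "vulns": [{"id": "PYSEC-2022-XXXXX", "fix_versions": ["2.2.28"]}]},
    {"name": "example-clean-package", "version": "1.0.0", "vulns": []}
  ]
}
"""


def flagged_packages(report_text):
    """Return {"name==version": [advisory IDs]} for packages with findings."""
    report = json.loads(report_text)
    flagged = {}
    for dep in report.get("dependencies", []):
        # Entries without a "vulns" key (e.g. skipped packages) count as unflagged.
        ids = [vuln["id"] for vuln in dep.get("vulns", [])]
        if ids:
            flagged[f'{dep["name"]}=={dep["version"]}'] = ids
    return flagged


for package, ids in flagged_packages(SAMPLE_REPORT).items():
    print(package, "->", ", ".join(ids))
```

In practice the report text would come from a file or from the captured output of the pip-audit command rather than from an inline string.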

Why Python Does This

Python's package ecosystem is vast and relies on PyPI for distribution. That convenience carries a supply chain risk: a vulnerability in an upstream package becomes a vulnerability in your application. pip-audit inspects your installed Python packages (or a requirements.txt / pyproject.toml project) and compares each package name and version against known vulnerability data, such as the PyPA Advisory Database, which aggregates CVEs and other security disclosures. It does not analyze your own application code; it looks only at the versions of third-party libraries you use. pip itself does not check for vulnerabilities during installation, because it is designed for dependency resolution and installation. pip-audit fills that gap as a post-installation or pre-deployment security check.

The Fix

Corrected pattern
Python
# requirements.txt
# FIX: Update dependencies to patched releases.
# The pins below are examples only. Advisories keep being published, so any
# fixed pin can become vulnerable later; consult release notes and security
# advisories, and prefer the latest patch release of a supported series.
django==4.2.0          # Example of a much newer version
cryptography==41.0.0   # Example of a much newer version
requests==2.31.0       # Example of a much newer version
pyyaml==6.0            # Example of a much newer version

# main.py (unchanged apart from the YAML loader; the dependencies are now updated)
import django
import cryptography
import requests
import yaml

def get_django_version():
    return django.get_version()

def load_user_config(file_path):
    with open(file_path, 'r') as f:
        # FIX: Explicitly use yaml.safe_load for untrusted input, regardless of library version,
        # as a defense-in-depth measure.
        return yaml.safe_load(f)

def make_external_request(url):
    return requests.get(url).status_code

if __name__ == '__main__':
    print(f"Django version: {get_django_version()}")
    print(f"Cryptography version: {cryptography.__version__}")
    print(f"Requests version: {requests.__version__}")
    print(f"PyYAML version: {yaml.__version__}")
    print("\nDependencies are updated, but new CVEs are disclosed over time.")
    print("Run 'pip install -r requirements.txt' then 'pip-audit' to verify, and rerun it regularly.")

Tools like pip-audit scan your installed Python packages or your requirements.txt file and extract each package's name and version. They then cross-reference that information against publicly available vulnerability databases (such as the PyPA Advisory Database or OSV.dev) to identify known Common Vulnerabilities and Exposures (CVEs) that affect those specific versions.
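The two steps described above, extracting pins and looking them up, can be sketched roughly as follows. This is an illustration of the idea, not how pip-audit is implemented. It assumes the public OSV.dev query endpoint (POST https://api.osv.dev/v1/query with a package name, the "PyPI" ecosystem, and a version); the helper names parse_pins, osv_query, and lookup are hypothetical. The parser handles only plain name==version lines and ignores extras, environment markers, and other specifier forms.

```python
import json
import urllib.request

OSV_QUERY_URL = "https://api.osv.dev/v1/query"


def parse_pins(requirements_text):
    """Extract (name, version) pairs from simple `name==version` lines."""
    pins = []
    for raw_line in requirements_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop trailing comments
        if "==" not in line:
            continue  # blank, comment-only, or not an exact pin
        name, version = line.split("==", 1)
        pins.append((name.strip().lower(), version.strip()))
    return pins


def osv_query(name, version):
    """Build the request body for a single-package OSV query."""
    return {"package": {"name": name, "ecosystem": "PyPI"}, "version": version}


def lookup(name, version, timeout=10):
    """Ask OSV.dev which advisories affect this exact version (network call)."""
    body = json.dumps(osv_query(name, version)).encode("utf-8")
    request = urllib.request.Request(
        OSV_QUERY_URL,
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.load(response)
    # An empty JSON object means no known advisories for this version.
    return [vuln["id"] for vuln in payload.get("vulns", [])]
```

Calling lookup(name, version) for each pair returned by parse_pins gives a rough per-package advisory list. A real auditing tool also resolves transitive dependencies and normalizes versions, which this sketch leaves out.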

How This Fails in Real Systems

A medium-sized startup discovered a critical Remote Code Execution (RCE) vulnerability in their core API service, stemming from an outdated version of a common third-party serialization library such as PyYAML. The vulnerability had been present for over a year without the team's knowledge, until a security audit team ran pip-audit on their deployed images. If exploited, it could have allowed full takeover of their production servers. The incident led to immediate adoption of automated dependency scanning in their CI/CD pipeline: pip-audit now has to pass before any code can be deployed to production, which prevents regressions.

Key Takeaway

Always integrate automated dependency auditing with tools like pip-audit into your CI/CD pipeline to continuously scan for and mitigate known vulnerabilities in third-party packages.
Common mistake: Developers treat their initial dependency versions as permanent fixtures and assume packages stay secure after installation, so they skip the routine auditing that would reveal newly disclosed CVEs affecting their existing dependency tree.
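One way to act on the takeaway is a small gate script that a CI job runs before deployment. The sketch below assumes pip-audit is installed in the CI environment, that it can be invoked as python -m pip_audit, that it exits with a non-zero status when it finds vulnerabilities, and that --ignore-vuln is available for advisories the team has reviewed and accepted. The function names are hypothetical.

```python
import subprocess
import sys


def build_audit_command(requirements_path="requirements.txt", ignored_ids=()):
    """Assemble the pip-audit invocation for a requirements file."""
    command = [sys.executable, "-m", "pip_audit", "-r", requirements_path]
    for advisory_id in ignored_ids:
        # Each reviewed-and-accepted advisory gets its own --ignore-vuln flag.
        command += ["--ignore-vuln", advisory_id]
    return command


def run_gate(requirements_path="requirements.txt", ignored_ids=()):
    """Run the audit and return its exit status (non-zero should block the deploy)."""
    completed = subprocess.run(build_audit_command(requirements_path, ignored_ids))
    return completed.returncode
```

A CI step could end with sys.exit(run_gate()) so that the job fails whenever the audit does. Keeping the ignore list short and reviewing it regularly avoids the gate turning into a formality.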