Hunting Typo Squatters

Posted May 15, 2023 by Adam Cole ‐ 4 min read

The Threat

In early May 2023, Checkmarx published a report detailing a subtle form of typosquatting in Node Package Manager (NPM). Typosquatting is a practice where cybercriminals copy legitimate web domains or package names using subtle misspellings, homoglyphs, or other misdirections. In the report, Checkmarx shows how attackers can mimic legitimate NPM packages by merely changing the capital letters in their titles to lower case. This opened a door for organizations to inadvertently download malicious packages, posing significant threats to enterprise security.

The NPM registry patched this vulnerability recently, but organizations need to be aware of any malicious packages that they may have downloaded prior to the change. As Yehuda Gelb, a security researcher at Checkmarx, highlights, “it’s essential to verify the names of the packages that you’re installing and have robust security systems in place”.

Hunting Process

To investigate the impact of this bug on our own repositories, we used GitHub’s new Export SBOM function, which allows anyone with read access to a GitHub repository to generate an NTIA-compliant Software Bill of Materials (SBOM) with a single click. The resulting JSON file records project dependencies and metadata, such as versions and licenses, in the industry-standard SPDX format. This new self-service capability is part of GitHub’s supply chain security solution and is free for all cloud repositories on GitHub.
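To give a sense of what the exported file contains, here is a minimal sketch that lists the name, version, and license of each package in an SPDX JSON document. The field names name, versionInfo, and licenseConcluded come from the SPDX 2.x JSON format; the summarize_packages helper and the sample data are hypothetical, and the handling of a top-level "sbom" wrapper is an assumption about how the file was obtained.

```python
def summarize_packages(sbom_doc):
    """Return (name, version, license) tuples from an SPDX-style JSON document."""
    # Assumption: the document may be wrapped in a top-level "sbom" key,
    # as in GitHub's REST API response, so accept both shapes.
    doc = sbom_doc.get("sbom", sbom_doc)
    return [
        (
            pkg.get("name", ""),
            pkg.get("versionInfo", ""),
            pkg.get("licenseConcluded", "NOASSERTION"),
        )
        for pkg in doc.get("packages", [])
    ]


# Hypothetical, heavily trimmed SBOM used only for illustration.
sample_sbom = {
    "spdxVersion": "SPDX-2.3",
    "packages": [
        {"name": "ExamplePkg", "versionInfo": "1.2.3", "licenseConcluded": "MIT"},
        {"name": "other-pkg", "versionInfo": "0.4.0"},
    ],
}

for name, version, license_id in summarize_packages(sample_sbom):
    print(f"{name} {version} ({license_id})")
```

Real exports carry many more fields per package; the sketch only reads the three mentioned above and falls back to NOASSERTION when no license is recorded.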

Our goal was to determine whether any of our repositories depended on packages with capital letters in their names and, if so, whether an all-lowercase equivalent existed on NPM, which would mean we could have been exposed to the bug. The hunting process was twofold:

  • First, we wrote a script to fetch the SBOM for each repository through the Export SBOM function provided by GitHub’s API. Here’s a generalized view of how it looked in code:
import json

import requests

# GitHub token for authentication (placeholder; load it from a secret store in practice)
headers = {
    "Authorization": "Bearer YOUR_GITHUB_TOKEN",
    "Accept": "application/vnd.github+json",
}

OWNER = "your-org"  # assumption: replace with your user or organization name

def fetch_sbom(repo):
    # REST endpoint behind the Export SBOM function
    url = f"https://api.github.com/repos/{OWNER}/{repo}/dependency-graph/sbom"

    response = requests.get(url, headers=headers, timeout=30)

    if response.status_code == 200:
        return response.json()
    print(f"Failed to generate SBOM for {repo}: HTTP {response.status_code}")
    return None

# Loop over your repos and fetch SBOMs
repos = ["repo1", "repo2", "repo3"]  # replace with your repos
for repo in repos:
    sbom = fetch_sbom(repo)
    if sbom is None:
        continue  # skip repos whose SBOM could not be fetched
    with open(f"{repo}_sbom.json", "w") as f:
        json.dump(sbom, f)
  • Secondly, we parsed these SBOM files to identify dependencies with capital letters in their names. For each dependency, we converted the name to lowercase, formulated the corresponding NPM package path, and probed if any such packages were present on NPM. Here’s an illustrative code snippet:
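import json
import os
from urllib.parse import quote

import requests

def check_NPM_package(package):
    # NPM registry API endpoint; scoped names need the "/" percent-encoded
    url = f"https://registry.npmjs.org/{quote(package, safe='@')}"

    # No auth headers: the GitHub token must never be sent to the NPM registry
    response = requests.get(url, timeout=30)

    return response.status_code == 200

# Loop over the SBOMs
for repo in repos:
    path = f"{repo}_sbom.json"
    if not os.path.exists(path):
        continue  # no SBOM was saved for this repo
    with open(path, "r") as f:
        sbom = json.load(f) or {}

    # The API response wraps the SPDX document in a top-level "sbom" key
    document = sbom.get("sbom", sbom)

    # Loop over the dependencies
    for package in document.get("packages", []):
        # Names may carry an ecosystem prefix such as "npm:" (assumption; inspect your own SBOMs)
        name = package["name"].removeprefix("npm:")

        # Only names containing capital letters are of interest
        lowercase_package = name.lower()
        if name == lowercase_package:
            continue

        # Check if the lowercase equivalent exists on NPM
        if check_NPM_package(lowercase_package):
            print(f"Package {lowercase_package} exists on NPM")
        else:
            print(f"Package {lowercase_package} not found on NPM")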

This code will print out whether each lowercased package exists on NPM or not.

Please note that these code snippets are simplified for the sake of clarity and brevity. You will need to adapt them to your own environment and requirements.
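The parsing step can also be split into small functions that are testable without network access. The sketch below is one possible arrangement, not the exact script we ran: npm_name, mixed_case_twins, and find_squatted are hypothetical helpers, the purl handling assumes SPDX external references of type purl with a pkg:npm/ locator, and the exists argument stands in for a registry lookup such as check_NPM_package.

```python
from urllib.parse import unquote


def npm_name(pkg):
    """Best-effort NPM package name for an SPDX package entry, else None."""
    for ref in pkg.get("externalRefs", []):
        locator = ref.get("referenceLocator", "")
        if ref.get("referenceType") == "purl" and locator.startswith("pkg:npm/"):
            # Drop the scheme, then any qualifiers ("?...") or subpath ("#...").
            rest = locator[len("pkg:npm/"):].split("?")[0].split("#")[0]
            # Drop a trailing "@version"; a leading "@" belongs to a scope.
            if "@" in rest[1:]:
                rest = rest.rsplit("@", 1)[0]
            return unquote(rest)
    # Assumption: some exports prefix the name with the ecosystem ("npm:").
    name = pkg.get("name", "")
    if name.startswith("npm:"):
        return name[len("npm:"):]
    return None


def mixed_case_twins(names):
    """Map each name containing capital letters to its all-lowercase form."""
    return {name: name.lower() for name in names if name != name.lower()}


def find_squatted(names, exists):
    """Return (original, twin) pairs whose lowercase twin passes exists()."""
    twins = mixed_case_twins(names)
    return sorted((orig, twin) for orig, twin in twins.items() if exists(twin))


# Hypothetical entries, for illustration only.
packages = [
    {"name": "npm:ExamplePkg"},
    {"name": "widget", "externalRefs": [
        {"referenceType": "purl", "referenceLocator": "pkg:npm/%40Scope/Widget@2.0.0"},
    ]},
    {"name": "pip:SomeLib"},  # not an NPM package, so it is skipped
]
names = [n for n in map(npm_name, packages) if n]
print(find_squatted(names, exists=lambda twin: twin == "examplepkg"))
```

Passing the lookup in as a function keeps the comparison logic separate from the HTTP call, so the same code can run against a stub in tests and against the real registry in production.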

Upon running this process for all our repositories, we discovered that while some packages with capital letters did indeed have lowercase typosquatters, we had not been tricked into using any of these squatted packages. This finding confirmed that we were secure from this particular vulnerability.
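The last step of that reasoning, checking whether any lowercase equivalent actually appears among our own dependencies, could be expressed as a simple set comparison. This is a sketch under assumptions rather than a record of our tooling: twins_in_use is a hypothetical helper, and both inputs are assumed to be plain lists of package names collected in the earlier steps.

```python
def twins_in_use(dependency_names, existing_twins):
    """Return the lowercase twins that also appear among the dependency names."""
    declared = set(dependency_names)
    return sorted(twin for twin in existing_twins if twin in declared)


# Hypothetical data: one twin exists on the registry, but we never depend on it.
dependencies = ["ExamplePkg", "other-pkg"]
registry_twins = ["examplepkg"]

hits = twins_in_use(dependencies, registry_twins)
print("Affected" if hits else "Not affected", hits)
```

An empty result corresponds to the outcome described above: squatted names exist, but none of them is installed.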

Summary

The recently discovered NPM typosquatting vulnerability underscores the importance of vigilant package management practices in the open-source ecosystem. Even though NPM has patched this flaw, the potential for similar attack vectors remains, making it crucial for organizations to adopt robust security measures.

Our investigation, supported by GitHub’s new SBOM tooling, not only helped us verify we were not impacted but also let us take a proactive stance against possible future vulnerabilities. As we move forward, we remain committed to maintaining a secure open-source environment for our projects, encouraging awareness, and sharing our learnings with the wider community. We hope our journey can inspire others to investigate and secure their own codebases using good processes and continual improvement.

Additional References

A Software Bill of Materials (SBOM), much like a physical bill of materials for manufacturing, is a formal, comprehensive compilation of components that make up a software application. It lists all elements of a software product, including libraries, modules, and other dependencies, detailing their specific versions, licenses, and other relevant metadata.

The NTIA (National Telecommunications and Information Administration), a part of the U.S. Department of Commerce, has led the way in establishing guidelines for SBOMs as part of its multi-stakeholder process on software component transparency. An NTIA-compliant SBOM adheres to these guidelines, providing a standardized format for tracking and detailing software components.