How to extract USDA Rural Development lender data from rd.usda.gov using the Minexa API

The USDA Rural Development lender directory at rd.usda.gov/resources/lenders lists hundreds of approved mortgage lenders across the United States, each with a company name and a direct website link. For developers building fintech tools, mortgage research pipelines, or lender qualification workflows, that data is genuinely useful. The problem is that it sits in an HTML table with no public API and no bulk export option.

This guide walks through how to extract that lender data programmatically using the Minexa API, a developer-facing extraction interface that lets you train a scraper once via the Minexa Chrome extension and then call it at scale through a standard HTTP endpoint.

What the data looks like

The lender directory page renders a multi-column list of company names paired with external website links. Each record in the extracted output corresponds to one row of that layout, so a single record can hold more than one lender. Here is a sample of what the API returns:

[
  {
    "company_name": "1st National City Mortgage",
    "company_name_2": "50 50 Mortgage Inc",
    "website_url": "https://www.imslending.com/",
    "website_url_2": "https://www.i3lending.com/"
  },
  {
    "company_name": "Academy Mortgage Corporation",
    "company_name_2": "Ace Mortgage",
    "website_url": "https://www.academymortgage.com",
    "website_url_2": "https://www.acemortgagela.com"
  },
  {
    "company_name": "Acopia, LLC",
    "company_name_2": "Advance Mortgage And Investment CO",
    "website_url": "https://www.acopiahomeloans.com",
    "website_url_2": "https://www.amic.co"
  }
]

Each record carries two company names and two website URLs, reflecting the two-column layout of the source page. Field names are consistent across every record.
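
Because each record holds two lenders, most downstream uses will want one lender per row. The sketch below shows one way to flatten the output. The helper name flatten_lenders is illustrative, and it assumes that company_name pairs with website_url and company_name_2 pairs with website_url_2, which the field names suggest but the source does not state. It also assumes a missing or empty cell should simply be skipped.

```python
def flatten_lenders(rows):
    """Turn two-column records into one dict per lender.

    Assumption: "company_name" pairs with "website_url" and
    "company_name_2" pairs with "website_url_2".
    """
    lenders = []
    for row in rows:
        # "" and "_2" mirror the field-name suffixes seen in the sample output.
        for suffix in ("", "_2"):
            name = (row.get(f"company_name{suffix}") or "").strip()
            url = (row.get(f"website_url{suffix}") or "").strip()
            if not name:
                # Skip an empty cell, e.g. the last row of an odd-length list.
                continue
            lenders.append({"company_name": name, "website_url": url or None})
    return lenders
```

Calling flatten_lenders on the three sample records above would yield six lender entries, each with a single company_name and website_url.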

Step 1: Train the scraper in the Chrome extension

Before making any API call, you need a scraper_id. This comes from a one-time training session in the Minexa Chrome extension. Here is what that process looks like on the rd.usda.gov lender page.

Open the target page and launch the extension

Confirm the data container and field detection

Access the API request details and complete configuration

After completing configuration, Minexa assigns a stable scraper_id to this page structure. You use that ID in every subsequent API call. No retraining is needed unless the page layout changes significantly.

Step 2: Call the Minexa API

Once you have a scraper_id, extraction is a single POST request to https://api.minexa.ai/data. Here is a Python example. Substitute your own API key and scraper_id before running it:

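import requests

# Endpoint and credentials; replace YOUR_API_KEY with your own key.
url = "https://api.minexa.ai/data"
headers = {
    "Content-Type": "application/json",
    "x-api-key": "YOUR_API_KEY"
}
payload = {
    "scraper_id": 6271,  # ID assigned after training in the Chrome extension
    "columns": {"top_n": "top_40"},
    "urls": ["https://www.rd.usda.gov/resources/lenders"],
    "scraping_params": {
        "js_render": False,  # static HTML page, no JavaScript rendering needed
        "proxy_provider": False
    }
}

# A timeout keeps the script from hanging if the server is slow to respond.
response = requests.post(url, json=payload, headers=headers, timeout=60)
response.raise_for_status()  # fail loudly on 4xx/5xx instead of parsing an error body
print(response.json())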

The columns parameter with top_40 tells the API to return the 40 most relevant fields detected on the page. You can also pass named fields explicitly if you want to target specific columns. The scraping_params block controls rendering and proxy behavior. For a government HTML page like this one, JavaScript rendering is not required, which keeps credit consumption at the baseline rate of one credit per page.

Handling scale: URL batching and pipelines

If your use case involves pulling lender data from multiple state-specific or program-specific pages on rd.usda.gov, pass all target URLs in the urls array of a single API request. The Minexa API supports batching up to 50,000 URLs per request, so large-scale pulls are handled in one call rather than many sequential ones.
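
As a rough sketch of how batching might look in practice, the helper below builds request payloads from a list of URLs. It reuses the payload shape from the example in Step 2 and the 50,000-URL limit mentioned above. The function name build_batch_payloads, the order-preserving de-duplication, and the splitting of oversized lists into several payloads are assumptions added for illustration, not documented Minexa behavior.

```python
MAX_URLS_PER_REQUEST = 50_000  # per-request limit stated in this guide


def build_batch_payloads(scraper_id, urls, batch_size=MAX_URLS_PER_REQUEST):
    """Return a list of request payloads, each holding at most batch_size URLs."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    # dict.fromkeys drops duplicate URLs while keeping the original order.
    unique_urls = list(dict.fromkeys(urls))
    payloads = []
    for start in range(0, len(unique_urls), batch_size):
        payloads.append({
            "scraper_id": scraper_id,
            "columns": {"top_n": "top_40"},
            "urls": unique_urls[start:start + batch_size],
            "scraping_params": {"js_render": False, "proxy_provider": False},
        })
    return payloads
```

Each payload in the returned list can then be sent with requests.post exactly as in the single-page example. For fewer than 50,000 URLs the list contains one payload, so the pull is still a single call.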

For recurring extraction needs, set up your own cron job to trigger the API on whatever schedule fits your pipeline. The API does not manage scheduling on its own when called programmatically, so the timing logic lives in your infrastructure.
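
One possible shape for the cron-driven side is sketched below. It only covers what happens after a response arrives: writing each run's records to a timestamped file so successive runs never overwrite each other. The crontab line, script path, file naming scheme, and helper names are all hypothetical placeholders.

```python
import json
from datetime import datetime, timezone
from pathlib import Path

# Hypothetical crontab entry (schedule and paths are placeholders):
#   0 6 * * 1  /usr/bin/python3 /opt/pipeline/pull_lenders.py
# This would run the script every Monday at 06:00 server time.


def snapshot_path(out_dir, now):
    """Build a per-run file path such as lenders_20240108T060000Z.json."""
    stamp = now.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    return Path(out_dir) / f"lenders_{stamp}.json"


def save_snapshot(records, out_dir, now=None):
    """Write one run's records to a timestamped JSON file and return its path."""
    now = now or datetime.now(timezone.utc)
    path = snapshot_path(out_dir, now)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(records, indent=2), encoding="utf-8")
    return path
```

In a scheduled script, the value passed as records would be whatever response.json() returns from the API call in Step 2.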

Reusing the scraper

The scraper trained on the rd.usda.gov lender page will work on any page that shares the same structural layout. If USDA updates the page content but keeps the same HTML structure, the same scraper_id continues to work without any changes on your end. If the layout changes significantly, retraining takes the same amount of time as the initial setup.

The content_hash field in each API response lets you detect whether the page content has changed between runs, which is useful for building change-detection logic into your pipeline without storing full snapshots.
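
A minimal sketch of that change-detection idea follows. It stores only the last seen hash in a small JSON state file and compares it on the next run. The exact location of content_hash in the response body is not shown in this guide, so the code assumes it is a top-level key. Adjust the lookup to the real response shape described in the API documentation. The function name and state-file format are illustrative.

```python
import json
from pathlib import Path


def has_content_changed(response_body, state_file):
    """Compare the response's content_hash with the one saved by the previous run.

    Returns True on the first run or when the hash differs, and updates the
    state file either way.
    """
    # ASSUMPTION: content_hash is a top-level key of the parsed response.
    new_hash = response_body.get("content_hash")
    if new_hash is None:
        raise KeyError("content_hash not found; adjust the lookup to the real response shape")

    state_path = Path(state_file)
    old_hash = None
    if state_path.exists():
        old_hash = json.loads(state_path.read_text(encoding="utf-8")).get("content_hash")

    # Persist the latest hash so the next run has something to compare against.
    state_path.write_text(json.dumps({"content_hash": new_hash}), encoding="utf-8")
    return new_hash != old_hash
```

A pipeline could call this right after each API response and skip the downstream processing step whenever it returns False.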

Read the full API documentation at minexa.stoplight.io/docs/minexa to explore all available parameters and response fields.
