How to scrape business intelligence data from the Innovation Funding Service using the Minexa API

Government funding portals publish structured competition data in plain HTML, yet collecting it at scale means clicking through pages manually or writing brittle scraping scripts that break whenever the layout shifts. The Apply for Innovation Funding service, part of the UK government's service.gov.uk platform, lists open and upcoming Innovate UK competitions with details that matter to grant consultants, R&D teams, and business intelligence analysts. This guide shows how to extract that data programmatically using the Minexa API.

What the Apply for Innovation Funding service contains

The competition search page at apply-for-innovation-funding.service.gov.uk/competition/search lists all active and upcoming Innovate UK funding competitions. Each listing includes a competition title, funding type, open and close dates, competition status, and a direct link to the full competition brief. For anyone building a funding intelligence feed, tracking grant cycles, or monitoring which sectors Innovate UK is prioritising, this is a reliable primary source.

The challenge is that the data only exists as rendered HTML. There is no public API. Collecting it manually is slow, and keeping it current requires repeating that work on a schedule. The Minexa API solves both problems: train the scraper once in the browser extension, then call the API from your own code whenever you need fresh data.

Watch the full walkthrough first

The video below covers the complete workflow from opening the Minexa extension on the competition search page through to the final API configuration and data export. Watching it before reading through the steps gives useful context for each screenshot.

Phase one: training the scraper in the browser extension

The Minexa developer workflow has two distinct phases. The first phase happens once in the Chrome extension and produces a stable scraper ID. The second phase uses that ID in every API call you make afterward. You never repeat the training unless the page layout changes significantly.

Start by opening the Minexa extension on the Minexa home page, then navigate to the competition search page.

Once the competition search page has loaded, the extension detects the list of results on the page automatically. You do not need to point at individual fields or write any selectors.

The extension popup confirms the detected page. Click the button to confirm you are on the right page and continue to the next step.

The extension then shows the pagination options it detected for the competition search page. Note that when you move to the API phase, pagination is not handled automatically. You will need to write a JS code scenario that defines what gets clicked to advance through pages. This is different from the Chrome extension workflow, where pagination is fully automated.

After confirming pagination, you choose whether to scrape the list only or the list plus linked detail pages. For most funding intelligence use cases, the list page contains enough structured data. Select your preference and continue.

Next, choose between simple and advanced scraping mode. Simple mode covers standard list extraction. Advanced mode lets you define custom click sequences or interaction workflows before extraction begins.

The extension highlights the full data container automatically. You confirm the selection and click to create the scraper. Minexa identifies all repeating data points within the container without requiring you to specify them individually.

After creating the scraper, all extracted data points become visible with navigation controls to review each column. This is where you can confirm that the fields you need, such as competition title, status, and dates, have been detected correctly.

Click the API request option to see the generated JSON and Python code samples. This screen also shows the scraper ID, which is the key value you will use in every API call going forward. Copy it and keep it.

Phase two: calling the Minexa API

Once the scraper is trained and you have the scraper ID, all subsequent extractions happen through the API. The endpoint is https://api.minexa.ai/data. Send a POST request with a JSON body containing your scraper ID, the target URLs, and a columns parameter that specifies which fields to return, and authenticate with your API key in the x-api-key header.

Here is a working Python example for extracting competition listings from the Apply for Innovation Funding search page:

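import requests

# Minexa data endpoint and authentication header
url = "https://api.minexa.ai/data"
headers = {
    "Content-Type": "application/json",
    "x-api-key": "YOUR_API_KEY"  # replace with your own key
}

# scraper_id comes from the extension's API request screen
payload = {
    "scraper_id": 6241,
    "urls": [
        "https://apply-for-innovation-funding.service.gov.uk/competition/search"
    ],
    "columns": "top_40"
}

# A timeout stops the script from hanging if the service is slow to respond
response = requests.post(url, json=payload, headers=headers, timeout=60)
response.raise_for_status()  # raise on HTTP 4xx/5xx instead of printing an error body
print(response.json())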

If you need to collect data across multiple competition pages, pass additional URLs in the same request rather than making separate calls. For pagination, write a JS code scenario that defines the next-page interaction so the API knows what to execute between pages.
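The multi-URL request described above can be sketched as a small payload builder. This is a minimal sketch, assuming the request body keeps the same three keys used in the earlier example (scraper_id, urls, columns). The helper name build_payload and the ?page=2 URL are hypothetical and only there for illustration; the pagination scenario is left out because its request format is not covered in this guide.

```python
def build_payload(scraper_id, urls, columns="top_40"):
    """Build one request body that covers several target URLs.

    Assumes the body uses the scraper_id / urls / columns keys
    from the single-URL example; nothing else is added.
    """
    # Trim whitespace, drop empty entries, and remove duplicates while keeping order
    cleaned = (u.strip() for u in urls)
    unique_urls = list(dict.fromkeys(u for u in cleaned if u))
    if not unique_urls:
        raise ValueError("at least one target URL is required")
    return {"scraper_id": scraper_id, "urls": unique_urls, "columns": columns}


SEARCH_URL = "https://apply-for-innovation-funding.service.gov.uk/competition/search"

# The "?page=2" variant is a hypothetical second URL, not a confirmed address
payload = build_payload(6241, [SEARCH_URL, SEARCH_URL + "?page=2", SEARCH_URL])
print(payload["urls"])
```

The resulting dictionary can be passed as json=payload to requests.post in the same way as the single-URL example.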

What the extracted data looks like

Below is a sample of what the API returns for competition listings on the Apply for Innovation Funding search page. Fields shown here use cleaned names with prefixes removed for readability.

[
  {
    "competition_title": "Sustainable Innovation Fund: Round 4",
    "competition_status": "Open",
    "competition_type": "Grant",
    "open_date": "12 May 2026",
    "close_date": "25 Jun 2026",
    "competition_link": "/competition/1847/overview"
  },
  {
    "competition_title": "Driving the Electric Revolution: Industrialisation",
    "competition_status": "Open",
    "competition_type": "Grant",
    "open_date": "03 Apr 2026",
    "close_date": "18 Jun 2026",
    "competition_link": "/competition/1791/overview"
  },
  {
    "competition_title": "Future Economy: Net Zero Built Environment",
    "competition_status": "Upcoming",
    "competition_type": "Grant",
    "open_date": "01 Jul 2026",
    "close_date": "10 Sep 2026",
    "competition_link": "/competition/1903/overview"
  }
]

The competition_status field distinguishes open competitions from upcoming ones, which is useful for filtering in downstream pipelines. The competition_link field provides the relative path to each competition brief, which you can resolve to a full URL and pass back into the API to extract detail page content if needed. The close_date field is particularly valuable for deadline tracking and alerting workflows.
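The three uses described above (filtering on status, resolving the relative link, and tracking deadlines) can be sketched with the standard library alone. This is an illustrative sketch, assuming the response is a list of flat records with the cleaned field names shown in the sample and dates formatted like "25 Jun 2026". The function names, the 30-day window, and the fixed reference date are hypothetical choices.

```python
from datetime import date, datetime
from urllib.parse import urljoin

BASE_URL = "https://apply-for-innovation-funding.service.gov.uk"

SAMPLE = [
    {"competition_title": "Sustainable Innovation Fund: Round 4",
     "competition_status": "Open", "close_date": "25 Jun 2026",
     "competition_link": "/competition/1847/overview"},
    {"competition_title": "Driving the Electric Revolution: Industrialisation",
     "competition_status": "Open", "close_date": "18 Jun 2026",
     "competition_link": "/competition/1791/overview"},
    {"competition_title": "Future Economy: Net Zero Built Environment",
     "competition_status": "Upcoming", "close_date": "10 Sep 2026",
     "competition_link": "/competition/1903/overview"},
]


def enrich(record, today):
    """Add an absolute URL and the number of days until the deadline."""
    # %b matches abbreviated English month names under the default C locale
    close = datetime.strptime(record["close_date"], "%d %b %Y").date()
    return {
        **record,
        "competition_url": urljoin(BASE_URL, record["competition_link"]),
        "days_left": (close - today).days,
    }


def closing_soon(records, today, within_days=30):
    """Return open competitions closing within the window, nearest deadline first."""
    open_now = [enrich(r, today) for r in records
                if r["competition_status"] == "Open"]
    due = [r for r in open_now if 0 <= r["days_left"] <= within_days]
    return sorted(due, key=lambda r: r["days_left"])


# A fixed date keeps the example reproducible; use date.today() in a real job
upcoming_deadlines = closing_soon(SAMPLE, today=date(2026, 6, 1))
for item in upcoming_deadlines:
    print(item["days_left"], item["competition_url"])
```

Passing the reference date in as an argument, rather than calling date.today() inside the function, keeps the deadline logic easy to test.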

Running the job and exporting results

After configuration, the scraping job appears in your job list with a run button. Once triggered, results populate in a structured table. You can export to Excel or JSON directly from the interface, or handle the API response programmatically in your own pipeline.
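For the programmatic route, one option is to flatten the response into CSV yourself. The sketch below assumes the response is a list of flat dictionaries like the sample shown earlier; records_to_csv is a hypothetical helper name, and the two records are shortened for illustration.

```python
import csv
import io


def records_to_csv(records):
    """Serialise a list of flat dicts to CSV text, tolerating missing fields."""
    # Collect column names in first-seen order across all records
    fieldnames = []
    for record in records:
        for key in record:
            if key not in fieldnames:
                fieldnames.append(key)

    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames,
                            restval="", lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)  # fields absent from a record are written as empty cells
    return buffer.getvalue()


csv_text = records_to_csv([
    {"competition_title": "Sustainable Innovation Fund: Round 4",
     "competition_status": "Open"},
    {"competition_title": "Future Economy: Net Zero Built Environment",
     "close_date": "10 Sep 2026"},
])
print(csv_text)
```

Building the header from every record, not just the first, avoids errors when some listings lack a field.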

If you want to keep the dataset current, set up a cron job on your own infrastructure and call the API on whatever schedule fits your use case. Pass the target URLs directly in each request. The scraper ID stays the same across every run, so there is no additional setup required after the initial training session.
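A scheduled run is more useful if it also reports what changed since the last run. The sketch below is one possible approach, not part of Minexa itself: it merges each fresh batch into a local JSON snapshot keyed on competition_link and returns the records not seen before. The helper name update_snapshot, the file name, and the crontab paths in the comment are all hypothetical.

```python
import json
import tempfile
from pathlib import Path

# Example crontab entry (hypothetical paths), running daily at 07:00:
# 0 7 * * * /usr/bin/python3 /opt/funding/refresh.py


def update_snapshot(path, fresh_records, key="competition_link"):
    """Merge fresh records into the JSON snapshot at path; return newly seen ones."""
    path = Path(path)
    known = {}
    if path.exists():
        stored = json.loads(path.read_text(encoding="utf-8"))
        known = {record[key]: record for record in stored}

    new_records = [r for r in fresh_records if r[key] not in known]
    # Fresh data overwrites older copies so status and date changes are kept
    known.update({r[key]: r for r in fresh_records})
    path.write_text(json.dumps(list(known.values()), indent=2), encoding="utf-8")
    return new_records


# Demonstration in a temporary directory; a real job would use a fixed path
with tempfile.TemporaryDirectory() as tmp:
    snapshot = Path(tmp) / "competitions.json"
    first = update_snapshot(snapshot, [
        {"competition_link": "/competition/1847/overview", "competition_status": "Open"},
    ])
    second = update_snapshot(snapshot, [
        {"competition_link": "/competition/1847/overview", "competition_status": "Open"},
        {"competition_link": "/competition/1903/overview", "competition_status": "Upcoming"},
    ])
    stored_count = len(json.loads(snapshot.read_text(encoding="utf-8")))

print(len(first), len(second), stored_count)
```

In a real job, fresh_records would be the parsed API response from the scheduled call, and the returned list could feed an alert for newly listed competitions.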

For a related example of extracting structured data from another government-adjacent source, see the post on how to scrape non-profits and NGOs data from Grants.gov using the Minexa API.
