How to scrape scientific and research data from UCSF Clinical Trials

Collecting clinical trial data by hand means opening dozens of pages, copying fields one by one, and hoping nothing changes between sessions. There is a faster way.

The UCSF Clinical Trials browse directory lists active studies across conditions, phases, and study types. It is a structured, publicly accessible source that researchers, analysts, and health data teams regularly need in bulk. The problem is that the site was not built for export. Getting the data out in a usable format requires either a lot of manual effort or a tool that handles the extraction automatically.

This walkthrough shows how to do it with Minexa.ai, a Chrome extension that detects page structure automatically and exports clean, structured data to Excel, Google Sheets, or JSON without any code.

Watch the full tutorial first

The video below covers the complete extraction process from start to finish. It is worth watching before going through the steps.

What the data looks like once extracted

Here is a sample of what Minexa.ai pulls from the UCSF Clinical Trials browse page. Each row corresponds to one trial listing.

[
  {
    "title": "Improving Glycemic Control in Adults With Type 2 Diabetes",
    "study_type": "Interventional",
    "conditions": "Type 2 Diabetes Mellitus",
    "phase": "Phase 3",
    "listing_link": "https://clinicaltrials.ucsf.edu/trial/NCT0512XXXX"
  },
  {
    "title": "Cognitive Outcomes After Cardiac Surgery in Older Adults",
    "study_type": "Observational",
    "conditions": "Postoperative Cognitive Dysfunction",
    "phase": "N/A",
    "listing_link": "https://clinicaltrials.ucsf.edu/trial/NCT0489XXXX"
  },
  {
    "title": "Early Intervention for Pediatric Asthma in Urban Settings",
    "study_type": "Interventional",
    "conditions": "Asthma, Pediatric",
    "phase": "Phase 2",
    "listing_link": "https://clinicaltrials.ucsf.edu/trial/NCT0501XXXX"
  }
]

Each record gives you the trial title, study type, conditions being studied, trial phase, and a direct link to the full listing page if you need deeper detail.
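If you export to JSON, a few lines of Python are enough to get a first overview of the records. The sketch below is not part of Minexa.ai; it assumes the export keeps the field names shown in the sample above, and the function name summarize is made up for illustration. The sample string reuses the three records from this post, including their elided trial IDs.

```python
import json
from collections import Counter

# The three sample records from this post, with trial IDs elided as in the source.
SAMPLE_EXPORT = """
[
  {"title": "Improving Glycemic Control in Adults With Type 2 Diabetes",
   "study_type": "Interventional",
   "conditions": "Type 2 Diabetes Mellitus",
   "phase": "Phase 3",
   "listing_link": "https://clinicaltrials.ucsf.edu/trial/NCT0512XXXX"},
  {"title": "Cognitive Outcomes After Cardiac Surgery in Older Adults",
   "study_type": "Observational",
   "conditions": "Postoperative Cognitive Dysfunction",
   "phase": "N/A",
   "listing_link": "https://clinicaltrials.ucsf.edu/trial/NCT0489XXXX"},
  {"title": "Early Intervention for Pediatric Asthma in Urban Settings",
   "study_type": "Interventional",
   "conditions": "Asthma, Pediatric",
   "phase": "Phase 2",
   "listing_link": "https://clinicaltrials.ucsf.edu/trial/NCT0501XXXX"}
]
"""


def summarize(raw_json: str, field: str) -> Counter:
    """Count how many trial records share each value of `field`.

    Records that lack the field are counted under the label "missing".
    """
    records = json.loads(raw_json)
    return Counter(record.get(field, "missing") for record in records)


if __name__ == "__main__":
    print(summarize(SAMPLE_EXPORT, "study_type"))
    print(summarize(SAMPLE_EXPORT, "phase"))
```

For a real export, read the file contents into a string and pass that in place of SAMPLE_EXPORT.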

Step-by-step: how the extraction works

Open Minexa.ai from your Chrome toolbar. The home screen confirms the extension is active and ready.

Navigate to clinicaltrials.ucsf.edu/browse/. The page loads the full trial directory. Minexa.ai detects the page automatically.

The extension popup appears with a confirmation prompt. Click 'I'm on the right page' to proceed.

Minexa.ai then shows the pagination it detected, including whether a next page button is present. Confirm and click Continue.

You will be asked whether to scrape just the list, or the list plus the detail page for each trial. For most research use cases, the list alone covers the core fields. If you need full trial descriptions, choose the list-and-details option.

Select simple scraping mode to proceed with the default automatic detection. Advanced mode is available if you need custom click workflows.

Minexa.ai highlights the full data container on the page automatically. No manual field selection is needed.

After creating the scraper, all extracted data points appear in a preview panel. Use the navigation arrows to review every column before running the full job.

The summary screen lets you connect Google Sheets or set up a recurring schedule so the job runs automatically at whatever interval fits your research workflow.

Once the job runs, your data appears in a structured table. Export to Excel or JSON directly from this screen.

Why this matters for research data workflows

Clinical trial directories update regularly. New studies open, phases change, and conditions shift. A one-time export captures a snapshot. A scheduled Minexa.ai job captures the changes over time, giving you a running record without any manual effort after the initial setup.
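To turn a series of scheduled exports into a record of what actually changed, you can compare two snapshots outside the extension. The following Python sketch is one way to do that under stated assumptions: each snapshot is a list of records with the field names from the sample above, and listing_link is treated as a stable, unique key for a trial. The function names and the example.org URLs in the usage are hypothetical.

```python
# Fields compared between snapshots; listing_link is used as the key instead.
TRACKED_FIELDS = ("title", "study_type", "conditions", "phase")


def index_by_link(records: list[dict]) -> dict[str, dict]:
    """Map each record's listing_link to the record itself."""
    return {record["listing_link"]: record for record in records}


def diff_snapshots(old: list[dict], new: list[dict]) -> dict:
    """Report trials added, removed, or changed between two exports."""
    old_idx, new_idx = index_by_link(old), index_by_link(new)

    added = [link for link in new_idx if link not in old_idx]
    removed = [link for link in old_idx if link not in new_idx]

    changed = []
    for link in sorted(old_idx.keys() & new_idx.keys()):
        # Collect (old_value, new_value) for every tracked field that differs.
        fields = {
            name: (old_idx[link].get(name), new_idx[link].get(name))
            for name in TRACKED_FIELDS
            if old_idx[link].get(name) != new_idx[link].get(name)
        }
        if fields:
            changed.append((link, fields))

    return {"added": added, "removed": removed, "changed": changed}


if __name__ == "__main__":
    # Hypothetical records, for illustration only.
    last_week = [
        {"title": "Trial A", "study_type": "Interventional",
         "conditions": "Asthma", "phase": "Phase 2",
         "listing_link": "https://example.org/trial/1"},
    ]
    this_week = [
        {"title": "Trial A", "study_type": "Interventional",
         "conditions": "Asthma", "phase": "Phase 3",
         "listing_link": "https://example.org/trial/1"},
    ]
    print(diff_snapshots(last_week, this_week))
```

Keying on the link rather than the title means a renamed trial shows up as a change instead of as one removal plus one addition.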

A scraper trained once on the UCSF browse page can be reused on every future run. It needs no reconfiguration or maintenance unless the site structure changes significantly.
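A change in site structure often shows up in the export as missing rows or as a column that is suddenly empty. The Python sketch below is a simple check you could run on each exported batch to catch that early. It is an illustration rather than a Minexa.ai feature: the field names come from the sample above, while check_run and its min_rows parameter are hypothetical.

```python
EXPECTED_FIELDS = ("title", "study_type", "conditions", "phase", "listing_link")


def check_run(records: list[dict], min_rows: int = 1) -> list[str]:
    """Return a list of problems found in one export; an empty list means none."""
    problems = []

    if len(records) < min_rows:
        problems.append(f"expected at least {min_rows} rows, got {len(records)}")

    for field in EXPECTED_FIELDS:
        # Treat a missing key, None, or whitespace-only text as empty.
        empty = sum(1 for r in records if not str(r.get(field) or "").strip())
        if records and empty == len(records):
            problems.append(f"field '{field}' is empty in every row")

    return problems


if __name__ == "__main__":
    # Hypothetical record with no phase value, for illustration only.
    batch = [{"title": "Trial A", "study_type": "Interventional",
              "conditions": "Asthma", "phase": "",
              "listing_link": "https://example.org/trial/1"}]
    for problem in check_run(batch):
        print(problem)
```

The check only flags a field when it is empty in every row, since individual trials can legitimately lack a value.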

For a related walkthrough on extracting public records data from government sources, see: How to scrape non-profits and NGOs data from Grants.gov using the Minexa API.
