How to scrape HR and workforce data from Cutshort using the Minexa API

Cutshort lists hundreds of recruitment and HR companies, each with funding status, headcount band, founding year, and a full company description. Collecting that data manually, page by page, does not scale. This guide shows how to train a scraper once with the Minexa Chrome extension and then call the Minexa API to extract the same fields programmatically.

Watch the full walkthrough

The video above covers every step from opening Minexa to running the extraction. The screenshots below follow the same sequence.

Training the scraper on Cutshort

Open the Minexa home page, then navigate to cutshort.io/companies/recruitment.

Once the page loads, open the extension and confirm you are on the correct page.

The extension detects pagination options. Because you will handle pagination yourself in the API request, through your own JS code scenario, simply review what is detected and click Continue.

Select the scraping mode. For a company listing page like this, choose the single list option.

Minexa highlights the full data container automatically. Confirm the selection and click Create Scraper.

All extracted columns appear for review. Use the next/prev navigation to inspect each field before proceeding.

Click API Request to get the pre-generated Python code and your scraper_id.

What the extracted data looks like

Each company record returns fields like these:

[
  {
    "company_name": "Randstad India",
    "business_category": "Products & Services",
    "company_status": "Profitable",
    "company_size_range": "200-500",
    "employee_count": "50",
    "establishment_year": "1960",
    "website_url": "https://randstad.in",
    "company_profile_link": "/company/randstad-india-AarTarkO"
  },
  {
    "company_name": "Incruiter",
    "business_category": "Products & Services",
    "company_status": "Bootstrapped",
    "company_size_range": "20-100",
    "employee_count": "39",
    "establishment_year": "2018",
    "website_url": "https://incruiter.com"
  }
]

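The company_status field captures signals like Profitable or Bootstrapped, which is useful for segmenting outreach by funding stage. The company_size_range field holds the headcount band as a string such as 200-500. Every value is returned as a string, and a field can be absent from a record: the second record above has no company_profile_link.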
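Because the values arrive as strings and some fields can be missing, a small normalization step helps before any segmentation. The following is a minimal sketch rather than part of the Minexa tooling: the helper names (parse_size_range, normalize, group_by_status) are hypothetical, and it assumes that company_profile_link is relative to https://cutshort.io.

```python
from collections import defaultdict
from urllib.parse import urljoin

# Assumption: relative profile links resolve against the Cutshort site root.
BASE_URL = "https://cutshort.io"

# The two sample records shown above.
records = [
    {"company_name": "Randstad India", "business_category": "Products & Services",
     "company_status": "Profitable", "company_size_range": "200-500",
     "employee_count": "50", "establishment_year": "1960",
     "website_url": "https://randstad.in",
     "company_profile_link": "/company/randstad-india-AarTarkO"},
    {"company_name": "Incruiter", "business_category": "Products & Services",
     "company_status": "Bootstrapped", "company_size_range": "20-100",
     "employee_count": "39", "establishment_year": "2018",
     "website_url": "https://incruiter.com"},
]


def parse_size_range(value):
    """Turn a band like '200-500' into (200, 500); return None if it does not parse."""
    try:
        low, high = value.split("-")
        return int(low), int(high)
    except (AttributeError, ValueError):
        return None


def normalize(record):
    """Return a copy of the record with typed fields and an absolute profile URL."""
    out = dict(record)
    out["size_band"] = parse_size_range(record.get("company_size_range"))
    year = record.get("establishment_year")
    out["establishment_year"] = int(year) if year and year.isdigit() else None
    link = record.get("company_profile_link")
    out["company_profile_url"] = urljoin(BASE_URL, link) if link else None
    return out


def group_by_status(rows):
    """Map each company_status value to the list of company names carrying it."""
    groups = defaultdict(list)
    for row in rows:
        groups[row.get("company_status", "Unknown")].append(row["company_name"])
    return dict(groups)


normalized = [normalize(r) for r in records]
by_status = group_by_status(normalized)
print(by_status)
```

Keeping the original string fields alongside the parsed ones makes it easy to spot records where a band or year failed to parse, since those come back as None instead of raising an error.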

Calling the Minexa API

Once the scraper is created, use its scraper_id in your API request. Pagination across Cutshort pages is not automatic: handle it either by passing each page URL explicitly in urls or by supplying a js_code scenario.

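data = {
    "batches": [
        {
            # scraper_id comes from the API Request step in the extension
            "scraper_id": 6231,
            "columns": ["top_30"],
            "urls": ["https://cutshort.io/companies/recruitment"],
            "scraping": {
                "js_render": True,
                "timeout": 30,
                "proxy": "verified",
                "retry": 3,
            },
        }
    ],
    # number of parallel threads; keep within your plan limit
    "threads": 4,
}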

Set threads based on your plan limit to control parallel throughput. The scraper trained here works on any structurally identical Cutshort company listing page.
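When several listing pages share one scraper, it can be convenient to build the request body from a list of URLs rather than editing the dictionary by hand. The sketch below is an illustration under assumptions: build_payload is a hypothetical helper, the second URL is a placeholder and not a confirmed Cutshort path, and the body only mirrors the shape of the sample payload above. The endpoint and authentication are not shown here; take them from the pre-generated code in the API Request step.

```python
import json


def build_payload(scraper_id, urls, threads=4):
    """Build a request body shaped like the sample payload, with one batch for all URLs."""
    # dict.fromkeys drops duplicate URLs while preserving their order.
    unique_urls = list(dict.fromkeys(urls))
    return {
        "batches": [
            {
                "scraper_id": scraper_id,
                "columns": ["top_30"],
                "urls": unique_urls,
                "scraping": {
                    "js_render": True,
                    "timeout": 30,
                    "proxy": "verified",
                    "retry": 3,
                },
            }
        ],
        "threads": threads,
    }


page_urls = [
    "https://cutshort.io/companies/recruitment",
    # Placeholder for another structurally identical listing page (hypothetical path).
    "https://cutshort.io/companies/another-category",
    "https://cutshort.io/companies/recruitment",  # duplicate, removed by build_payload
]

payload = build_payload(6231, page_urls, threads=4)
print(json.dumps(payload, indent=2))
```

Building the body in one function keeps the scraping options in a single place, so a change to timeout or retry applies to every URL in the batch.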

For a similar API-based jobs extraction tutorial, see how to scrape jobs data from FlexJobs using the Minexa API.
