How to scrape SEC filings data from J.Jill using the Minexa API

Every SEC filing J.Jill submits is publicly visible on its investor relations page. Reading that page is easy; extracting it at scale is not. Form types, filing dates, PDF links, XBRL zips, conversion format arrays, and detail page URLs all sit inside a structured HTML table that changes with every new submission. Collecting that data manually is slow, and parsing it with custom selectors breaks whenever the page markup changes.

This guide shows how to use the Minexa API to extract structured filings data from the J.Jill SEC filings page programmatically, with a scraper trained once and reused on demand.

What the extracted data looks like

Each filing row is returned as one record that combines flat fields with a nested array. Here is a sample of two records:

[
  {
    "date": "June 10, 2026",
    "form_type": "8-K",
    "document_type": "Current report filing",
    "status_code": "ORIG",
    "file_format": "XBRL_HTML",
    "cloudfront_link": "//d18rn0p25nwr6d.cloudfront.net/...pdf",
    "cloudfront_link_2": "//d18rn0p25nwr6d.cloudfront.net/...html",
    "cloudfront_link_3": "//d18rn0p25nwr6d.cloudfront.net/...zip",
    "document_link": "//d18rn0p25nwr6d.cloudfront.net/...xls",
    "conversion_type": [{"value":"ORIG"},{"value":"CONVPDF"},{"value":"XBRL"}],
    "sec_filings_link": "/...FilingId=19526456"
  },
  {
    "date": "June 5, 2026",
    "form_type": "4",
    "document_type": "Statement of Changes in Beneficial Ownership",
    "status_code": "ORIG",
    "file_format": "XLS",
    "cloudfront_link": "//d18rn0p25nwr6d.cloudfront.net/...pdf",
    "sec_filings_link": "/...FilingId=19518355"
  }
]

The conversion_type field is a nested array of objects. Each object carries a value key naming a format variant available for that filing, such as ORIG, CONVPDF, or XBRL in the sample above; labels such as XBRL_HTML and XLS appear in the file_format field. The display_style field, which is not included in the sample above, captures whether a row is rendered as visible or hidden in the DOM, which is useful for filtering grouped amendment rows. The sec_filings_link field provides a relative path to the individual filing detail page, enabling downstream enrichment per record.
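Because the link fields are protocol-relative or site-relative and conversion_type is nested, a small post-processing step is usually needed before the records are stored. The following is a minimal sketch, not part of the Minexa output or SDK. It assumes the base URL is https://investors.jjill.com/, that the filing identifier is carried as a FilingId query parameter, and that dates follow the "June 10, 2026" pattern; the normalize_record helper and the sample paths are hypothetical.

```python
from datetime import datetime
from urllib.parse import parse_qs, urljoin, urlparse

# Assumed base URL; substitute the page the records were scraped from.
BASE_URL = "https://investors.jjill.com/"

LINK_FIELDS = (
    "cloudfront_link", "cloudfront_link_2", "cloudfront_link_3",
    "document_link", "sec_filings_link",
)


def normalize_record(record, base_url=BASE_URL):
    """Return a copy with absolute URLs, flat conversion types, and an ISO date."""
    out = dict(record)
    for field in LINK_FIELDS:
        if out.get(field):
            # urljoin resolves both "//host/path" and "/path" against the base.
            out[field] = urljoin(base_url, out[field])
    # Flatten [{"value": "ORIG"}, ...] into ["ORIG", ...]; a missing field becomes [].
    out["conversion_type"] = [item["value"] for item in record.get("conversion_type", [])]
    # Assumption: the detail link carries the ID as a FilingId query parameter.
    query = parse_qs(urlparse(out.get("sec_filings_link", "")).query)
    out["filing_id"] = query.get("FilingId", [None])[0]
    if record.get("date"):
        # Assumption: English month names, e.g. "June 10, 2026".
        out["date_iso"] = datetime.strptime(record["date"], "%B %d, %Y").date().isoformat()
    return out


# Hypothetical record; the hosts and paths are made up for illustration.
sample = {
    "date": "June 10, 2026",
    "form_type": "8-K",
    "cloudfront_link": "//cdn.example.com/filing.pdf",
    "conversion_type": [{"value": "ORIG"}, {"value": "CONVPDF"}],
    "sec_filings_link": "/filings/detail.aspx?FilingId=123",
}
clean = normalize_record(sample)
print(clean["cloudfront_link"], clean["conversion_type"], clean["filing_id"])
```

Keeping the normalization in one function means records without optional fields, like the second sample record, pass through without special handling.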

Training the scraper

Navigate to the J.Jill SEC filings page and open the Minexa Chrome extension.

Click the 'I'm on the right page' button to confirm the target URL, then review the pagination options detected by the extension and click Continue.

Select the list scraping mode, then choose Simple on the scenario screen. The extension highlights the full filings container automatically.

After confirming the container, all data columns are discovered and labelled automatically. Use the prev/next navigation to review each extracted field.

Click 'API Request' to view the pre-generated Python code with your scraper ID already filled in.

Calling the API at scale

Once the scraper is created, pass your scraper ID and target URLs into the request body. The endpoint is always https://api.minexa.ai/data/.

data = {
  "batches": [{
    "scraper_id": 4088,  # the ID pre-filled by the extension's 'API Request' view
    "columns": ["top_30"],
    # one or more structurally identical pages
    "urls": ["https://investors.jjill.com/.../SEC-Filings/default.aspx"],
    "scraping": {
      "js_render": True,
      "timeout": 30,
      "proxy": "verified",
      "retry": 3
    }
  }],
  "threads": 4
}
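The pre-generated code from the extension remains the authoritative way to send this body. As a rough sketch of what the call might look like using only the Python standard library, the example below assumes the endpoint accepts a JSON POST and that authentication is supplied through request headers; neither is confirmed here, and the header name and key shown are placeholders to replace with whatever the pre-generated code uses. The build_request and send helpers are hypothetical.

```python
import json
import urllib.request

API_URL = "https://api.minexa.ai/data/"


def build_request(payload, headers):
    """Build (but do not send) a JSON request. Assumption: the API expects POST."""
    body = json.dumps(payload).encode("utf-8")
    all_headers = {"Content-Type": "application/json", **headers}
    return urllib.request.Request(API_URL, data=body, headers=all_headers, method="POST")


def send(payload, headers, timeout=60):
    """Send the request and decode the response, assuming it is JSON."""
    request = build_request(payload, headers)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Same shape as the request body above; the URL keeps the article's elision.
payload = {
    "batches": [{
        "scraper_id": 4088,
        "columns": ["top_30"],
        "urls": ["https://investors.jjill.com/.../SEC-Filings/default.aspx"],
        "scraping": {"js_render": True, "timeout": 30, "proxy": "verified", "retry": 3},
    }],
    "threads": 4,
}

# Placeholder header: copy the real authentication header from the pre-generated code.
request = build_request(payload, {"X-Placeholder-Auth": "YOUR_KEY"})
print(request.get_method(), request.full_url)
# records = send(payload, {...})  # uncomment once real credentials are in place
```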

The same scraper ID works across all structurally identical pages. Up to 50,000 URLs can be submitted in a single batch request. For recurring extraction, set up your own cron job and call the API on your preferred schedule.
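For URL lists longer than the limit, the list can be split into several request bodies before sending. This is a minimal sketch, assuming the 50,000 figure applies to the URLs in one request; the chunk_urls and build_payloads helpers are hypothetical, and the example.com URLs are placeholders.

```python
def chunk_urls(urls, max_per_request=50_000):
    """Yield consecutive slices of at most max_per_request URLs."""
    for start in range(0, len(urls), max_per_request):
        yield urls[start:start + max_per_request]


def build_payloads(scraper_id, urls, scraping, threads=4, max_per_request=50_000):
    """Build one request body per chunk, reusing the same scraper ID and settings."""
    return [
        {
            "batches": [{
                "scraper_id": scraper_id,
                "columns": ["top_30"],
                "urls": chunk,
                "scraping": scraping,
            }],
            "threads": threads,
        }
        for chunk in chunk_urls(urls, max_per_request)
    ]


# Small limit used only to make the chunking visible; URLs are placeholders.
scraping = {"js_render": True, "timeout": 30, "proxy": "verified", "retry": 3}
urls = [f"https://example.com/filings/page{i}" for i in range(5)]
payloads = build_payloads(4088, urls, scraping, max_per_request=2)
print([len(p["batches"][0]["urls"]) for p in payloads])
```

A cron job can then loop over the payloads and submit each one on the chosen schedule.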

Get started at minexa.ai or read the full API documentation to configure scraping parameters for your pipeline.
