How to scrape developer and API discussion data from Zotero Forums

Forum threads hold more structured data than they appear to. Every comment has an author, a timestamp, a unique ID, and sometimes edit history. If you need to collect that data at scale, copying it manually is not realistic. This guide shows how to extract it cleanly from Zotero Forums using the Minexa.ai Chrome extension.

What data is available on a Zotero Forums thread

The target page is a discussion thread on forums.zotero.org. Each comment in the thread exposes a consistent set of fields that Minexa captures automatically:

  • comment_text: the full body of the comment

  • comment_id: a unique identifier per comment (e.g. Comment_217335)

  • comment_datetime: the ISO 8601 timestamp of when the comment was posted

  • user_name: the display name of the commenter

  • profile_link: the relative URL to the commenter's profile

  • edit_info: populated when a comment was edited after posting

  • date_time: a nested object capturing the human-readable date, title attribute, and datetime attribute from the time element

That last field is worth noting. Minexa captures data hidden inside HTML attributes, not just visible text. So you get the ISO timestamp from the datetime attribute alongside the human-readable label, all in one structured record.
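To make the attribute capture concrete, here is a minimal Python sketch of the general idea, using only the standard library. It is not Minexa's implementation. The markup in SAMPLE_HTML is a hypothetical time element written for illustration, and the names TimeElementParser and extract_time_fields are invented for this example.

```python
from html.parser import HTMLParser


class TimeElementParser(HTMLParser):
    """Collects the attributes and visible text of each <time> element."""

    def __init__(self):
        super().__init__()
        self.records = []
        self._current = None  # the <time> element being read, if any

    def handle_starttag(self, tag, attrs):
        if tag == "time":
            attr_map = dict(attrs)
            self._current = {
                "datetime": attr_map.get("datetime", ""),
                "title": attr_map.get("title", ""),
                "text": "",
            }

    def handle_data(self, data):
        # Only keep text that sits inside an open <time> element.
        if self._current is not None:
            self._current["text"] += data

    def handle_endtag(self, tag):
        if tag == "time" and self._current is not None:
            self._current["text"] = self._current["text"].strip()
            self.records.append(self._current)
            self._current = None


def extract_time_fields(html):
    """Return one dict per <time> element found in the given HTML string."""
    parser = TimeElementParser()
    parser.feed(html)
    parser.close()
    return parser.records


# Hypothetical markup, not copied from forums.zotero.org.
SAMPLE_HTML = (
    '<time title="February 26, 2015 7:08AM" '
    'datetime="2015-02-26T07:08:22+00:00">February 26, 2015</time>'
)

time_fields = extract_time_fields(SAMPLE_HTML)
```

Each returned dict holds the machine-readable datetime attribute, the title attribute, and the visible label side by side, which is the shape the nested date_time field describes.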

How to set up the scraper

Open the Zotero Forums thread you want to extract. Once the page loads, open the Minexa.ai extension.

Confirm you are on the right page. Minexa then checks for pagination and shows you what it detected.

Choose whether to scrape the comment list only, or follow each comment's link to its detail page. For most forum research use cases, the list is sufficient.

Minexa highlights the comment container automatically. You confirm it, and the scraper is created. All data points are surfaced and labelled without any manual field selection.

Sample output

Here are two records from a real extraction run on the Zotero Forums thread:

[
  {
    "comment_id": "Comment_217335",
    "comment_datetime": "2015-02-26T07:08:22+00:00",
    "user_name": "aurimas",
    "profile_link": "/profile/275915/aurimas",
    "comment_text": "You cannot search across multiple libraries.",
    "edit_info": ""
  },
  {
    "comment_id": "Comment_270455",
    "comment_datetime": "2017-02-19T10:59:15+00:00",
    "user_name": "schibones",
    "profile_link": "/profile/1076249/schibones",
    "comment_text": "Would it be possible to leave the search term active as you click through libraries?",
    "edit_info": ""
  }
]
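Records in this shape are easy to post-process. The sketch below is one possible way to do it in Python: it parses the ISO 8601 timestamp, turns the relative profile_link into an absolute URL, and flags edited comments. The https scheme in BASE_URL is an assumption, and normalise_record, posted_at, profile_url, and was_edited are names introduced here for illustration, not part of the Minexa output.

```python
import json
from datetime import datetime
from urllib.parse import urljoin

# Assumption: the forum is served over https at this host.
BASE_URL = "https://forums.zotero.org"


def normalise_record(record, base_url=BASE_URL):
    """Return a copy of one exported record with three derived fields added."""
    return {
        **record,
        # Timezone-aware datetime parsed from the ISO 8601 string.
        "posted_at": datetime.fromisoformat(record["comment_datetime"]),
        # Absolute URL built from the relative profile link.
        "profile_url": urljoin(base_url, record["profile_link"]),
        # An empty edit_info means the comment was never edited.
        "was_edited": bool(record.get("edit_info")),
    }


# The two sample records shown above, as a JSON string.
RAW = """
[
  {"comment_id": "Comment_217335",
   "comment_datetime": "2015-02-26T07:08:22+00:00",
   "user_name": "aurimas",
   "profile_link": "/profile/275915/aurimas",
   "comment_text": "You cannot search across multiple libraries.",
   "edit_info": ""},
  {"comment_id": "Comment_270455",
   "comment_datetime": "2017-02-19T10:59:15+00:00",
   "user_name": "schibones",
   "profile_link": "/profile/1076249/schibones",
   "comment_text": "Would it be possible to leave the search term active as you click through libraries?",
   "edit_info": ""}
]
"""

records = [normalise_record(r) for r in json.loads(RAW)]
```

Keeping the original fields untouched and adding derived ones alongside them means the exported data can still be compared against a fresh run.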

Exporting and reusing the scraper

Once the job finishes, export your data to Excel, JSON, or send it directly to Google Sheets.
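If you export JSON and later need a flat file for a tool that only reads CSV, the conversion can be done with the Python standard library. This is a rough sketch under the assumption that every record carries the six flat fields from the sample output. The function name records_to_csv and the FIELDS column order are choices made for this example.

```python
import csv
import io

# Column order for the CSV, matching the flat fields in the sample output.
FIELDS = [
    "comment_id",
    "comment_datetime",
    "user_name",
    "profile_link",
    "comment_text",
    "edit_info",
]


def records_to_csv(records, fields=FIELDS):
    """Serialise a list of record dicts to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=fields,
        extrasaction="ignore",  # skip keys that are not in `fields`
        lineterminator="\n",
    )
    writer.writeheader()
    for record in records:
        # Missing fields become empty cells instead of raising an error.
        writer.writerow({name: record.get(name, "") for name in fields})
    return buffer.getvalue()


# One record from the sample output, used as a smoke test.
example_records = [
    {
        "comment_id": "Comment_217335",
        "comment_datetime": "2015-02-26T07:08:22+00:00",
        "user_name": "aurimas",
        "profile_link": "/profile/275915/aurimas",
        "comment_text": "You cannot search across multiple libraries.",
        "edit_info": "",
    }
]

csv_text = records_to_csv(example_records)
```

The csv module quotes any comment text that contains commas, quotes, or line breaks, so long forum posts survive the round trip.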

The scraper you trained on this thread works on any structurally similar Zotero Forums discussion page. No retraining needed. Run it again on a different thread and extraction starts immediately.
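When you run the scraper more than once, the exports may overlap. Below is a small Python sketch for combining several exported lists, assuming comment_id stays unique across runs as described earlier. The function merge_runs is a name made up for this example, and the records are trimmed to two fields to keep it short.

```python
from datetime import datetime


def merge_runs(*runs):
    """Combine exported record lists, keeping the first record per comment_id.

    The result is sorted by posting time, oldest first.
    """
    merged = {}
    for run in runs:
        for record in run:
            # setdefault keeps the first record seen for each comment_id.
            merged.setdefault(record["comment_id"], record)
    return sorted(
        merged.values(),
        key=lambda r: datetime.fromisoformat(r["comment_datetime"]),
    )


# Trimmed versions of the two sample records, with one overlap between runs.
first_run = [
    {"comment_id": "Comment_270455",
     "comment_datetime": "2017-02-19T10:59:15+00:00"},
]
second_run = [
    {"comment_id": "Comment_217335",
     "comment_datetime": "2015-02-26T07:08:22+00:00"},
    {"comment_id": "Comment_270455",
     "comment_datetime": "2017-02-19T10:59:15+00:00"},
]

combined = merge_runs(first_run, second_run)
```

Sorting on the parsed timestamp instead of the raw string keeps the order correct even if two exports use different UTC offsets.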
