Privacy Policy

Last reviewed on April 24, 2026

This page describes how GROBID Tools (the "site", reachable at grobid.org) handles information when someone visits the site or uses the in-browser PDF extractor. The goal is to explain things plainly, not to bury practices in legalese.

The short version

Information processed in your browser only

When you choose a PDF on the homepage, the file is loaded into your browser's memory using the standard FileReader and ArrayBuffer APIs. PDF.js parses the page content, and a set of regular-expression-based heuristics extracts titles, authors, identifiers, citations, and similar fields. Export files (JSON, plain text, Markdown, DOCX, BibTeX) are constructed in memory and offered as a browser download.

None of that data crosses the network. There is no server-side processing endpoint to send it to. If you leave the page or close the tab, the file and everything derived from it are discarded along with the tab's memory.
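For readers who want a concrete picture of what "regular-expression based heuristics" running in the browser can look like, here is a minimal sketch in TypeScript. It is not the site's actual source code. The function names, the ExtractedFields shape, and both patterns are hypothetical and chosen only for illustration. The point it shows is that this kind of extraction is plain string processing on text already held in memory, and that nothing in it makes a network request.

```typescript
// Illustrative sketch only: all names and patterns here are assumptions,
// not the extractor's real implementation.
interface ExtractedFields {
  dois: string[];
  arxivIds: string[];
}

// Simplified patterns; real-world heuristics would need to be broader.
const DOI_PATTERN = /\b10\.\d{4,9}\/[^\s"<>]+/g;
const ARXIV_PATTERN = /\barXiv:\d{4}\.\d{4,5}(?:v\d+)?/gi;

// Remove duplicates while keeping first-seen order.
function unique(values: string[]): string[] {
  return values.filter((value, index, all) => all.indexOf(value) === index);
}

// Scan text that a PDF parser has already produced, entirely in memory.
function extractFields(pageText: string): ExtractedFields {
  const rawDois: string[] = pageText.match(DOI_PATTERN) || [];
  const rawArxiv: string[] = pageText.match(ARXIV_PATTERN) || [];
  return {
    // Strip sentence punctuation that trails a DOI in running text.
    dois: unique(rawDois.map((doi) => doi.replace(/[.,;)]+$/, ""))),
    // Drop the "arXiv:" prefix and keep the identifier itself.
    arxivIds: unique(rawArxiv.map((id) => id.slice("arXiv:".length))),
  };
}

// Build the JSON export as a string; no request is made at any point.
function toJsonExport(fields: ExtractedFields): string {
  return JSON.stringify(fields, null, 2);
}
```

In a browser, a string like the one returned by toJsonExport could be wrapped in a Blob and offered as a download through an object URL, which is one common way to produce a file without involving a server.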

Information collected by the site

The site collects the kind of basic data that nearly every web property collects:

Cookies and similar technologies

Cookies are small text files that a website (or a third party loaded by the website) stores in your browser. The categories used here are described in the cookies policy. In summary:

Google AdSense disclosure

Third-party vendors, including Google, use cookies to serve ads based on a user's prior visits to this website or other websites. Google's use of advertising cookies enables it and its partners to serve ads to you based on your visit to this site and/or other sites on the Internet.

You may opt out of personalised advertising by visiting Google Ads Settings. You can also opt out of a third-party vendor's use of cookies for personalised advertising by visiting www.aboutads.info or, in Europe, www.youronlinechoices.eu. Google's own advertising and privacy practices are described at policies.google.com/technologies/ads.

Third-party services in use

Each of those vendors handles data under its own privacy policy.

Your rights under GDPR and similar laws

If you are in the European Economic Area, the United Kingdom, or another region with comparable privacy law, you generally have the right to:

Because the PDF extractor runs entirely client-side, the site itself does not hold a database of file contents that could be the subject of a typical access request. There is nothing on the server tied to your file.

Your rights under CCPA

If you are a California resident, you have the right to know what categories of personal information are collected, the right to request deletion, and the right to opt out of the "sale" or "sharing" of personal information as those terms are defined under California law. The site does not sell personal information for money. The use of advertising cookies described above may, however, qualify as "sharing" under California's broader definition, and you may opt out using the AdSense controls linked above and your browser's privacy controls.

Children

The site is not directed at children under the age of 13 (or the equivalent minimum age in your jurisdiction). The site does not knowingly collect personal information from children. If you believe a child has provided personal information here, please contact us so the relevant data can be removed.

Data retention

Server logs are retained for a short rolling window for diagnostic purposes. Analytics data is retained according to the Google Analytics retention setting in effect for this property. The site does not retain anything related to the contents of files processed by the extractor, because nothing related to those files reaches the server.

International transfers

The hosting and analytics providers used here operate globally. Data may be processed in countries other than the one you are in, including in the United States. Where transfers are subject to GDPR or comparable rules, the providers rely on the legal mechanisms (such as Standard Contractual Clauses) that they describe in their own policies.

Changes to this policy

This policy may be updated from time to time, for example when a new vendor is added or when an existing vendor changes its practices. The "Last reviewed" date at the top of the page reflects the most recent substantive update. Material changes will also be called out on the homepage where appropriate.

Contact

Privacy questions and data-related requests can be sent to [email protected] with "Privacy" in the subject line. Please describe the request as specifically as possible so it can be handled correctly.