Guides

Last reviewed on April 24, 2026

The guides below explain how PDF metadata is structured, how a heuristic extractor pulls it out, and how to make the resulting records useful in a real workflow. They are written for people who handle a lot of academic and technical PDFs and want to understand what the tools are actually doing, and where they fall short.

Start here

Beyond the basics

Reference material

For format-specific notes on BibTeX, RIS, CSL-JSON, Markdown, and DOCX, see the formats reference.