A high-quality tool for convert PDF to Markdown and JSON
MinerU is an open-source, high-quality document extraction toolkit focused on converting PDFs (and other document formats) into structured Markdown and JSON. It leverages OCR and layout analysis to preserve semantic structure and metadata, ideal for research and data science workflows.
This Jupyter notebook extension allows you to save your notebook as a PDF. To make it easier to reproduce the contents of the PDF at a later date the original notebook is attached to the PDF. Unfortunately not all PDF viewers know how to deal with attachments. PDF viewers known to support downloading of file attachments are: Acrobat Reader, pdf.js and evince. The pdftk CLI program can also extract attached files from a PDF. Preview for OSX does not know how to display/give you access to...