allenai/olmocr

Toolkit for linearizing PDFs for LLM datasets/training

18,256 stars · 1,503 forks · Python

Star Growth on Trending

07-01: 18,256 stars (#7)

Trending Appearances (1)