PubTables-1M: Towards comprehensive table extraction from unstructured documents
Paper
• 2110.00061
• Published
• 3
Optimized Table Tokenization for Table Structure Recognition
Paper
• 2305.03393
• Published
• 1
Qwen3-VL Technical Report
Paper
• 2511.21631
• Published
• 157
PaddleOCR-VL: Boosting Multilingual Document Parsing via a 0.9B Ultra-Compact Vision-Language Model
Paper
• 2510.14528
• Published
• 118
PaddlePaddle/PaddleOCR-VL
Image-Text-to-Text
• Updated
• 9.96k
• 1.55k
DeepSeek-OCR: Contexts Optical Compression
Paper
• 2510.18234
• Published
• 93
Image-Text-to-Text
• Updated
• 3.21M
• 3.17k
HunyuanOCR Technical Report
Paper
• 2511.19575
• Published
• 22
Image-Text-to-Text
• Updated
• 896k
• 554
DocReward: A Document Reward Model for Structuring and Stylizing
Paper
• 2510.11391
• Published
• 27
SynthDoc: Bilingual Documents Synthesis for Visual Document Understanding
Paper
• 2408.14764
• Published
OmniLayout: Enabling Coarse-to-Fine Learning with LLMs for Universal Document Layout Generation
Paper
• 2510.26213
• Published
• 10
MonkeyOCR v1.5 Technical Report: Unlocking Robust Document Parsing for Complex Patterns
Paper
• 2511.10390
• Published
Structured Document Translation via Format Reinforcement Learning
Paper
• 2512.05100
• Published
• 2
DeepSeek-OCR 2: Visual Causal Flow
Paper
• 2601.20552
• Published
• 63
OCRVerse: Towards Holistic OCR in End-to-End Vision-Language Models
Paper
• 2601.21639
• Published
• 50
PaddleOCR-VL-1.5: Towards a Multi-Task 0.9B VLM for Robust In-the-Wild Document Parsing
Paper
• 2601.21957
• Published
• 19
MemOCR: Layout-Aware Visual Memory for Efficient Long-Horizon Reasoning
Paper
• 2601.21468
• Published
• 25