Document Analysis and Recognition – ICDAR 2025 Workshops

Wuhan, China, September 20–21, 2025, Proceedings, Part I

Lianwen Jin, Richard Zanibbi, Veronique Eglin (Editors)

Book | Softcover

XV, 396 pages

2025
Springer International Publishing (Publisher)
978-3-032-09367-7 (ISBN)

Not yet published; due for release on 15 December 2025

The two-volume set LNCS 16225 + 16226 constitutes the proceedings of International Workshops co-located with the 19th International Conference on Document Analysis and Recognition, ICDAR 2025, held in Wuhan, China, during September 2025.

The 46 full papers included in these proceedings were carefully reviewed and selected from a total of 74 submissions. The contributions stem from the following workshops:

Part I: The Fifth ICDAR International Workshop on Machine Learning (WML 2025); ICDAR 2025 Workshop on Multi-Modal Mathematical Reasoning in Documents (M3RD 2025);

Part II: The 16th IAPR International Workshop on Graphics Recognition (GREC 2025); ICDAR 2025 Workshop on Visual Text Generation and Text Image Processing (VT-TIP 2025); ICDAR 2025 Workshop on Documents Analysis of Low-resource Languages (DALL 2025)

.- The Fifth ICDAR International Workshop on Machine Learning (WML 2025)
.- PBa-LLM: Privacy- and Bias-aware NLP using Named-Entity Recognition (NER).
.- Benchmarking Graph Neural Networks for Document Layout Analysis in Public Affairs.
.- Improving Handwritten Text Recognition via 3D Attention and Multi-Scale Training.
.- Masked Self-Supervised Pre-Training for Text Recognition Transformers on Large-Scale Datasets.
.- Text Prompt to Image Generation for Classification of Similar and Non-Similar Scene Images to Improve Text Spotting Performance.
.- Enhancing Document VQA Models via Retrieval-Augmented Generation.
.- A New Multimodal Cross-Domain Network for Classification of Challenging Scene Images.
.- TextBite: A Historical Czech Document Dataset for Logical Page Segmentation.
.- Few-Part-Shot Font Generation.
.- Non-Linear Audio-Visual Storytelling from Scanned Comics: A Character-Centric Approach.
.- Automatic Text Box Placement for Supporting Typographic Design.
.- Visual Document Matching for Zero-Shot Document Classification.
.- Evaluating Popular Scene Text Detection and Recognition Methods on Tombstones.
.- Deep learning for defect detection in answer document image.
.- ResNet-TPP: A Parallel PHOC-PHOS Framework for Zero-Shot Handwritten Word Recognition in Low-Resource Scripts.
.- Interpret, prune and distill Donut: towards lightweight VLMs for VQA on documents.
.- Link prediction Graph Neural Networks for structure recognition of Handwritten Mathematical Expressions.
.- Rule-Based Reinforcement Learning for Document Image Classification with Vision Language Models.
.- ICDAR 2025 Workshop on Multi-Modal Mathematical Reasoning in Documents (M3RD 2025).
.- Boosting Handwritten Mathematical Expression Recognition through Contextual Reasoning with Vision Large Language Models (vLLMs).
.- S

Publication date	24 November 2025
Series	Lecture Notes in Computer Science
Additional info	XV, 396 p. 148 illus., 130 illus. in color.
Place of publication	Cham
Language	English
Dimensions	155 x 235 mm
Subject area	Computer Science ► Graphics / Design ► Digital Image Processing
Keywords	Artificial Intelligence • benchmark dataset • Deep learning • Document Analysis and Recognition • Document Analysis Systems • Document Image Processing • document understanding • Graphics, Diagram, and Math Recognition • Graphics Recognition • handwriting analysis and recognition • Handwriting Recognition • Historical document • historical document analysis • low-resource language processing • machine learning • multimedia document analysis • multi-modal mathematical reasoning • Neural networks • NLP for Document Understanding • Optical Character Recognition • Scene Text Detection and Recognition • visual text generation
ISBN-10	3-032-09367-8 / 3032093678
ISBN-13	978-3-032-09367-7 / 9783032093677
Condition	New