# document-ai

Here are 13 public repositories matching this topic...

microsoft / unilm

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

Updated Feb 27, 2023
Python

clovaai / donut

Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022

nlp ocr computer-vision document-ai multimodal-pre-trained-model eccv-2022

Updated Feb 13, 2023
Python

tstanislawek / awesome-document-understanding

A curated list of resources on the topic of Document Understanding (DU)

Updated Jan 18, 2023

Unstructured-IO / unstructured

Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.

Updated Mar 1, 2023
HTML

deepdoctection / deepdoctection

A Repo For Document AI

python nlp ocr tensorflow pytorch document-parser document-layout-analysis table-recognition table-detection document-understanding publaynet layoutlm document-ai document-image-analysis pubtabnet

Updated Feb 28, 2023
Python

jpWang / LiLT

Official PyTorch implementation of LiLT: A Simple yet Effective Language-Independent Layout Transformer for Structured Document Understanding (ACL 2022)

nlp information-extraction document-analysis document-understanding multilingual-models document-ai multimodal-pre-trained-model

Updated Oct 31, 2022
Python

clovaai / webvicob

Official Implementation of Web-based Visual Corpus Builder (Webvicob)

nlp ocr document-ai

Updated Feb 21, 2023
Python

ZeningLin / ViBERTgrid-PyTorch

An unofficial PyTorch implementation of "Lin et al. ViBERTgrid: A Jointly Trained Multi-Modal 2D Document Representation for Key Information Extraction from Documents. ICDAR, 2021"

information-extraction document-analysis key-information-extraction document-ai visual-information-extraction

Updated Feb 4, 2023
Python

doc-analysis / ReadingBank

ReadingBank: A Benchmark Dataset for Reading Order Detection

nlp natural-language-processing ocr document-understanding document-ai document-intelligence

Updated Aug 29, 2021

nttmdlab-nlp / SlideVQA

SlideVQA: A Dataset for Document Visual Question Answering on Multiple Images (AAAI2023)

nlp ocr computer-vision document-ai aaai2023

Updated Jan 12, 2023
Python

Unstructured-IO / community

Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.

open-source community machine-learning deep-learning nlp-parsing data-pipeline ocr-python document-ai preprocessing-data document-parsing

Updated Feb 24, 2023

whn09 / table_structure_recognition

Table detection and table structure recognition using YOLOv5

ocr table table-detection table-structure-recognition yolov5 document-ai

Updated Oct 9, 2022
Python

bhadreshpsavani / SmartOCR-with-LayoutLM

Exploring LayoutLM for Smart OCR Capabilities

layoutlm document-ai document-inteligence

Updated Apr 1, 2021
