


ParthaPRay/Docling_Colab


Docling_Colab

This repo contains a Google Colab notebook that uses Docling to extract data such as text, images, and tables from documents.


Docling

https://github.com/DS4SD/docling

https://ds4sd.github.io/docling/examples/


The Colab notebook shows how to use Docling to extract content from popular document formats (PDF, DOCX, PPTX, XLSX, images, HTML, AsciiDoc and Markdown) and export it to HTML, Markdown and JSON (with embedded and referenced images).
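As a rough illustration of the convert-and-export flow described above, here is a minimal sketch. It is not taken from the notebook: the helper names `export_paths` and `convert_and_export`, the `output` directory, and the file name `report.pdf` are hypothetical, and it assumes a recent Docling release installed with `pip install docling`, where `DocumentConverter.convert()` returns a result whose `document` offers `export_to_markdown()`, `export_to_html()` and `export_to_dict()`. Image handling (embedded versus referenced) is not covered here.

```python
import json
from pathlib import Path


def export_paths(source, out_dir="output"):
    """Build the output file paths (hypothetical naming scheme) for one source file."""
    stem = Path(source).stem
    out = Path(out_dir)
    return {
        "markdown": out / f"{stem}.md",
        "html": out / f"{stem}.html",
        "json": out / f"{stem}.json",
    }


def convert_and_export(source, out_dir="output"):
    """Convert one document with Docling and write Markdown, HTML and JSON copies."""
    # Imported lazily so the path helper above works without Docling installed.
    from docling.document_converter import DocumentConverter

    doc = DocumentConverter().convert(source).document
    paths = export_paths(source, out_dir)
    Path(out_dir).mkdir(parents=True, exist_ok=True)

    paths["markdown"].write_text(doc.export_to_markdown(), encoding="utf-8")
    paths["html"].write_text(doc.export_to_html(), encoding="utf-8")
    paths["json"].write_text(
        json.dumps(doc.export_to_dict(), indent=2), encoding="utf-8"
    )
    return paths
```

In a Colab cell, calling `convert_and_export("report.pdf")` on an uploaded file would write the three exports under `output/` and return their paths.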

It also shows hybrid chunking using transformers, embeddings, and a vector database.
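The chunk, embed and search steps might look roughly like the sketch below. Everything here is an assumption rather than the notebook's actual code: the function names are hypothetical, the embedding model `sentence-transformers/all-MiniLM-L6-v2` is just a common choice, and a plain in-memory cosine-similarity search stands in for a real vector database, since the README does not name one. It assumes `docling` and `sentence-transformers` are installed.

```python
import math


def chunk_document(source):
    """Convert `source` with Docling and split it with HybridChunker."""
    # Lazy imports: the search helpers below do not need Docling.
    from docling.chunking import HybridChunker
    from docling.document_converter import DocumentConverter

    doc = DocumentConverter().convert(source).document
    chunker = HybridChunker()  # default tokenizer; an assumption for this sketch
    return [chunk.text for chunk in chunker.chunk(dl_doc=doc)]


def embed(texts, model_name="sentence-transformers/all-MiniLM-L6-v2"):
    """Embed texts with a sentence-transformers model (assumed model choice)."""
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_name)
    return [vec.tolist() for vec in model.encode(texts)]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query_vec, vectors, texts, k=3):
    """Return the k texts whose vectors are most similar to query_vec.

    A stand-in for a vector database query, kept in memory for illustration.
    """
    ranked = sorted(
        zip(texts, vectors),
        key=lambda pair: cosine(query_vec, pair[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]
```

A typical run would call `texts = chunk_document("report.pdf")`, then `vectors = embed(texts)`, and finally `search(embed(["What is the main result?"])[0], vectors, texts)`. A real vector database would replace `search` and add persistence and indexing.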
