Docling's DocumentConverter supports several input formats, such as PDF, HTML, and .docx, and converts those files into a DoclingDocument object. Each supported format is enumerated as an InputFormat instance.
The DocumentConverter is used in many integrations. For instance, in LlamaIndex the DoclingReader can be used with SimpleDirectoryReader to convert PDF files.
Because the conversion can be computationally costly, users may want to persist the converted documents as .json files and reuse them later in other data processing pipelines.
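The convert-once, reuse-later idea can be sketched with the standard library alone. This is only an illustration: `convert_with_cache`, the `convert` callable, and the `<name>.json` cache layout are hypothetical names, not part of Docling. With Docling, `convert` would presumably wrap a DocumentConverter call and export the resulting DoclingDocument to a dict.

```python
import json
from pathlib import Path
from typing import Callable


def convert_with_cache(
    source: Path,
    convert: Callable[[Path], dict],
    cache_dir: Path,
) -> dict:
    """Return the converted document for `source`, reusing a cached JSON export if present.

    `convert` is a hypothetical stand-in for the expensive conversion step;
    it must return a JSON-serializable dict.
    """
    cache_dir.mkdir(parents=True, exist_ok=True)
    cached = cache_dir / (source.name + ".json")

    # Cheap path: the document was converted before, so just read the export.
    if cached.exists():
        return json.loads(cached.read_text(encoding="utf-8"))

    # Expensive path: convert once, then persist the result for later runs.
    doc = convert(source)
    cached.write_text(json.dumps(doc), encoding="utf-8")
    return doc
```

The second call for the same file never reaches `convert`, which is the saving the request is after. The sketch does not invalidate the cache when the source file changes.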
Request
Create a new conversion backend that simply reads the content of a JSON file that contains a DoclingDocument export.
In this way, the pattern described above (DoclingReader used with SimpleDirectoryReader) could be reused for JSON files.
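A rough, standard-library-only sketch of what such a backend might do is shown below. It is not Docling's backend interface: the class name `JsonDocumentBackend`, its `is_valid` and `convert` methods, and the check on a top-level `schema_name` field are assumptions made for illustration.

```python
import json
from io import BytesIO
from pathlib import Path
from typing import Union


class JsonDocumentBackend:
    """Hypothetical backend that loads an already-converted document from its JSON export."""

    # Assumption: the export carries a top-level "schema_name" field with this value.
    EXPECTED_SCHEMA = "DoclingDocument"

    def __init__(self, path_or_stream: Union[Path, BytesIO]):
        if isinstance(path_or_stream, BytesIO):
            raw = path_or_stream.getvalue()
        else:
            raw = Path(path_or_stream).read_bytes()
        try:
            self._data = json.loads(raw)
        except (json.JSONDecodeError, UnicodeDecodeError):
            # Not parseable as JSON: remember that so is_valid() can report it.
            self._data = None

    def is_valid(self) -> bool:
        """True if the payload parsed as a JSON object with the expected schema marker."""
        return (
            isinstance(self._data, dict)
            and self._data.get("schema_name") == self.EXPECTED_SCHEMA
        )

    def convert(self) -> dict:
        """Return the parsed payload; no layout analysis or OCR is involved."""
        if not self.is_valid():
            raise ValueError("input is not a DoclingDocument JSON export")
        return self._data
```

In Docling itself, the last step would presumably validate the payload into a DoclingDocument object rather than return a plain dict, and the class would need to be registered for a new InputFormat entry.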
Alternatives
- [...] DoclingDocument files to integration frameworks (e.g., create another DoclingReader for JSON in LlamaIndex)
- [...] docling-core