The objective of this project is to provide the software tools that facilitate the administration and management of a web application for the cataloguing of incunabula.
Incunabula are texts printed mainly during the second half of the 15th century; they are a key cultural element of a revolutionary period in the history and evolution of the book and of printing. Apart from the typical CRUD operations of a catalogue and the ability to annotate the main metadata properties of incunabula documents, this catalogue can automatically identify the typeface used in the printed text by applying the Proctor-Haebler method, which classifies typefaces based on the height of the lines and the shape of special letters. Identifying the typeface allows the incunabulum to be assigned to the printing office that created that typeface, and its printing to be located in space and time.
The software uses machine learning techniques to segment pages into lines and characters and to identify special characters and their shapes (see Lacasta et al., 2022).
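The matching step of the Proctor-Haebler method can be illustrated with a minimal sketch. It assumes that line height is expressed as the height of 20 lines in millimetres and that the special letter is a capital M with a shape label; the `Typeface` class, its field names, the tolerance, and the catalogue entries are all hypothetical and are not taken from this repository.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Typeface:
    """Hypothetical catalogue entry; field names are illustrative only."""
    name: str
    printer: str
    height_20_lines_mm: float
    m_shape: str  # label describing the shape of the capital M


def height_of_20_lines_mm(baselines_px, dpi):
    """Estimate the height of 20 lines (mm) from baseline y-positions (pixels)."""
    if len(baselines_px) < 2:
        raise ValueError("need at least two baselines")
    ys = sorted(baselines_px)
    # Average distance between consecutive baselines, in pixels.
    mean_pitch_px = (ys[-1] - ys[0]) / (len(ys) - 1)
    # 25.4 mm per inch converts pixels at the given dpi to millimetres.
    return 20 * mean_pitch_px * 25.4 / dpi


def candidates(catalogue, height_mm, m_shape, tolerance_mm=1.0):
    """Return typefaces with the same M shape and a similar 20-line height,
    closest height first. The tolerance is an assumed value."""
    matches = [
        t for t in catalogue
        if t.m_shape == m_shape
        and abs(t.height_20_lines_mm - height_mm) <= tolerance_mm
    ]
    return sorted(matches, key=lambda t: abs(t.height_20_lines_mm - height_mm))
```

For example, baselines spaced 50 pixels apart on a 254 dpi scan correspond to a 5 mm line pitch, so 20 lines measure 100 mm; `candidates` would then return only the catalogue entries whose M shape matches and whose recorded height lies within the tolerance of that value.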
The architecture of the project presented is shown in the following figure:
According to the components of this architecture, the repository is organized in four different parts:
- fontIdentificationTool: performs the complete typeface recognition process, taking a page as input and returning the identified font.
- IncunabulaBackend: the backend that serves expert and normal users.
- incunabula-app: the web application of the catalogue website.
- incunabulaAdmin: the web application for administrators.
This Open Source project has the following software requirements:
To deploy the website that provides access to the catalogue of incunabula, deploy the components in this order:
- IncunabulaBackend
- incunabula-app
- fontIdentificationTool
The deployment details for each component are available in its subfolder.
In addition, if you want to deploy and run the admin tools, see the incunabulaAdmin folder.
The pretrained data used by fontIdentificationTool can be accessed through the following URL:
Download the folder and place it in the following directory: fontIdentificationTool/data/output/
Next, run the following command to create the Docker containers and start the application:
docker compose up --build
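The two steps above can be combined into a small pre-flight script. This is only a sketch: it assumes it is run from the repository root, that the compose file lives there, and that a non-empty fontIdentificationTool/data/output/ directory is enough to indicate that the pretrained data is in place. The function names are hypothetical.

```python
import subprocess
import sys
from pathlib import Path

# Directory named in the instructions above, relative to the repository root.
DATA_DIR = Path("fontIdentificationTool/data/output")


def pretrained_data_present(data_dir):
    """Return True if data_dir exists and contains at least one entry."""
    return data_dir.is_dir() and any(data_dir.iterdir())


def main():
    """Check for the pretrained data, then build and start the containers."""
    if not pretrained_data_present(DATA_DIR):
        print(f"Pretrained data not found in {DATA_DIR}/", file=sys.stderr)
        return 1
    # Same command as above, run only once the data is in place.
    return subprocess.run(["docker", "compose", "up", "--build"]).returncode
```

Calling `sys.exit(main())` at the end of the script runs the check and, if it passes, starts the application with the same Docker command.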
At this link you will see how the catalogue can be deployed as a web application.
In addition, you can watch the following video:
- J. Lacasta, J. Nogueras-Iso, F.J. Zarazaga-Soria, M.J. Pedraza-Gracia (2022). Tracing the origins of incunabula through the automatic identification of fonts in digitised documents. Multimedia Tools and Applications, 81:40977–40991