Google Season of Docs 2021

Émilie Pagé-Perron edited this page Mar 25, 2021 · 21 revisions

About CDLI

CDLI is an international digital library of ancient artifacts inscribed with cuneiform writing. The mission of CDLI is to collect, preserve, and make available images, text, and metadata of all artifacts inscribed with the cuneiform script. It is the sole project with this mission, and we estimate that our 335,000 catalog entries cover some two-thirds of all sources in collections around the world. Our data are publicly available, and our audiences comprise primarily scholars and students, with growing numbers of informal learners.

At the heart of CDLI is a group of developers, language scientists, machine learning engineers, and cuneiform specialists who develop software infrastructure to process and analyze curated data. To this end, we are actively wrapping up two projects: CDLI Framework Update, and Machine Translation and Automated Analysis of Cuneiform Languages. As part of these projects, we have been building a natural language processing platform that empowers specialists in ancient languages to undertake automated annotation and translation of Sumerian language texts, thus enabling the data-driven study of the languages, culture, history, economy, and politics of ancient Near Eastern civilizations. As part of this platform, we are focusing on data standardization using Linked Open Data to foster best practices in data exchange and integration with other digital humanities and computational philology projects.

Our tools are available as standalone software, but at the core of our services to the community is a web platform that we are actively developing; we hope to phase out our current web platform in the next 12-18 months.

Project: Documenting features, user & editor guides.

CDLI has a large codebase, version controlled on GitHub and GitLab, with more than 100 contributing members. CDLI operates in several technical areas, including full-stack development, machine translation and machine learning, and databases. As the codebase grows, we want to document its fundamental features and write guides for users and editors. This project involves documenting the latest features of the CDLI framework, restructuring the current documents, and writing guides for users and editors. The existing documents can be found at

Project’s Scope

  • Audit current documentation, and organize existing and future documentation
  • Consolidate existing user guide documentation under the new documentation organization
  • Write up remaining user guides for the CDLI framework (website) core features and additional features
  • Write up the editor guide for the management of CDLI data on the framework (website)

Project’s Metrics

To track the project's metrics, we will collect feedback from users and editors on how well the guides helped them understand and use the project's features. We will also assess the documents by their quantity and reach; the documentation is expected to cover the project's features well.

Project’s Budget

| Budget Item | Amount | Running Total | Notes |
|---|---|---|---|
| Technical writer’s stipend | 6000 USD | 6000 USD | A technical writer to document, review, and publish the needed project documentation |
| Mentors’ stipends | 1500 USD | 7500 USD | 3 volunteer mentors at 500 USD each |
| Organization merchandise | 200 USD | 7700 USD | Designing, printing, and shipping CDLI t-shirts and stickers |

Additional Information

  • No coding is required, but the writer is expected to have a basic understanding of cataloging and NLP principles.
  • We use a wiki and GitHub Pages; knowledge of Git is required.


Developers guide

Some of the Framework features have well-organized, useful READMEs. We can probably keep those as is, but we should centralize access to all of them so that it is easy to find the appropriate documentation for each CDLI framework feature.

Framework Features

Morphological annotator
Minio server and fatcrossing feature

Framework core

Bibliographic data management

Administrators and Editors guide

Contributing and managing catalogue data

Inscriptions: Transliterations, transcriptions and translations

Textual annotations


User guide

Code base


Installation guide