diff --git a/doc/ingest/api/index.rst b/doc/ingest/api/index.rst
index 75b56ddd4..0da70a073 100644
--- a/doc/ingest/api/index.rst
+++ b/doc/ingest/api/index.rst
@@ -1,20 +1,97 @@

.. note::

    The information in this guide corresponds to version **35** of the Qserv REST API. Keep in mind
    that each implementation of the API has a specific version. The version number changes whenever
    the implementation or the API is modified in a way that might affect users.
    This document will be kept up to date to reflect the latest version of the API.

#####################################
The Ingest Workflow Developer's Guide
#####################################

.. toctree::
    :maxdepth: 4

    reference/index

Introduction
============

This document presents the API that Qserv provides for constructing data ingest applications (also referred to
in this document as *ingest workflows*). The API is designed to offer a high-performance and reliable mechanism for
ingesting large quantities of data where the performance or reliability of the ingests is at stake.
The document is intended as a practical guide for developers who are building such applications.
It provides a high-level overview of the API, its main components, and the typical workflows that can be built using the API.

At a very high level, the Qserv Ingest system comprises:

- The REST server integrated into the Master Replication Controller. The server provides a collection
  of services for managing metadata and states of the new catalogs to be ingested. The server also coordinates
  its own operations with Qserv itself and the Qserv Replication System to prevent interference with those
  and to minimize failures during catalog ingest activities.
- The Data Ingest services that run at each Qserv worker alongside the Replication System's worker services.
  The role of these services is to actually ingest the client's data into the corresponding MySQL tables.
  The services also perform additional (albeit minimal) preprocessing and data transformation where or when needed
  before ingesting the input data into MySQL. Each worker server also includes its own REST server for processing
  the "by reference" ingest requests as well as various metadata requests in the scope of the workers.

Implementation-wise, the Ingest System heavily relies on services and functions of the Replication System, including
the Replication System's Controller Framework, various services (including the Configuration service), and the worker-side
server infrastructure of the Replication System.

Client workflows interact with the system's services via open interfaces (based on the HTTP protocol, REST services,
the JSON data format, etc.) and use ready-to-use tools to fulfill their goal of ingesting catalogs.

Here is a brief summary of the Ingest System's features:

- It introduces well-defined states and semantics into the ingest process. A process of ingesting a new catalog
  now has to go through a sequence of specific steps, maintaining a progressive state of the catalog within Qserv
  while it's being ingested. The state transitions and the corresponding enforcements made by the system
  ensure that the catalog is in a consistent state during each step of the process.
  Altogether, this model increases the robustness of the process, and it also makes it more efficient.

- To facilitate and implement the above-mentioned state transitions, the new system introduces a distributed
  checkpointing mechanism named *super-transactions*. These transactions allow for incremental updates of
  the overall state while making it possible to safely roll back to a prior consistent state should any problem occur
  during data loading within such transactions.

- At its very foundation, the system has been designed for constructing high-performance and parallel ingest
  workflows without compromising the consistency of the ingested catalogs.

- For the actual data loading, the system offers a few options, including pushing data into Qserv directly
  via a proprietary binary protocol, HTTP streaming, or ingesting contributions "by reference". In the latter
  case, the input data will be pulled by the worker services from the remote locations specified by the ingest
  workflows. The presently supported sources include object stores (via the HTTP/HTTPS protocols) and
  locally mounted distributed filesystems (via the POSIX protocol). See Ingesting files directly from
  workers for further details.

- The data loading services also collect various information on the ongoing status of the ingest activities,
  abnormal conditions that may occur while reading, interpreting, or loading the data into Qserv, as well
  as the metadata for the data that is loaded. The information is retained within the persistent
  state of the system for monitoring and debugging purposes. Feedback is provided to the workflows
  on various aspects of the ingest activities. The feedback is useful for the workflows to adjust their
  behavior and to ensure the quality of the data being ingested.

  - For further info on this subject, see the sections Error reporting (TODO) and Using MySQL warnings (TODO).
    In addition, the API provides REST services for obtaining metadata on the state of catalogs, tables, distributed
    transactions, contribution requests, the progress of the requested operations, etc.

**What the Ingest System won't do**:

- As per its current implementation (which may change in the future), it will not automatically partition
  input files. This task is expected to be the responsibility of the ingest workflows.

- It will not (with the very small exception of adding an extra leading column ``qserv_trans_id`` required by
  the implementation of the new system) pre-process the input ``CSV`` files sent to the Ingest Data Servers
  for loading into tables. It's up to the workflows to sanitize the input data and to make them ready to be
  ingested into Qserv.

More documents on the requirements and the low-level technical details of the implementation (unless needed
for the purposes of this document's goals) can be found elsewhere.

It's recommended to read the document sequentially. Most ideas presented in the document are introduced in the section
An example of a simple workflow (TODO: add a link to the section). The section is followed by a few more sections
covering advanced topics (TODO: add a link to the section). The API Reference section at the very end of the document
should be used to find complete descriptions of the REST services and tools mentioned in the document.
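To give a first taste of the API before diving into the reference sections: every interaction is a plain HTTP
request that returns a JSON object. The sketch below asks the Master Replication Controller which version of
the API it implements (the ``/meta/version`` service is documented in the API Reference section). The host and
port ``localhost:25081`` are placeholders for a particular deployment:

.. code-block:: bash

    # Ask the Replication Controller which version of the API it implements.
    # Replace localhost:25081 with the actual host and port of your deployment.
    curl -s -X GET "http://localhost:25081/meta/version"

    # A successful response is a JSON object that includes the version number:
    # {"kind":"replication-controller", ..., "version":35, "success":1, ...}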
diff --git a/doc/ingest/api/reference/index.rst b/doc/ingest/api/reference/index.rst
new file mode 100644
index 000000000..70e1cb5c8
--- /dev/null
+++ b/doc/ingest/api/reference/index.rst
@@ -0,0 +1,10 @@

####################
Ingest API Reference
####################

.. toctree::
    :maxdepth: 4

    rest/index
    tools
diff --git a/doc/ingest/api/reference/rest/db-table-management.rst b/doc/ingest/api/reference/rest/db-table-management.rst
new file mode 100644
index 000000000..7da453c1f
--- /dev/null
+++ b/doc/ingest/api/reference/rest/db-table-management.rst
@@ -0,0 +1,685 @@
Database and table management
=============================

.. _ingest-db-table-management-config:

Finding existing databases and database families
------------------------------------------------

The following service pulls all configuration information of the Replication/Ingest System, including info
on the known database families, databases, and tables:

.. list-table::
    :widths: 15 85
    :header-rows: 1

    * - method
      - resource
    * - ``GET``
      - ``/replication/config``

Upon successful (see :ref:`Ingest error reporting`) completion of the request, the service will return an object
that has the following schema (of which only the database and database family-related fields are shown):

.. code-block:: json

    {
        "config": {
            "database_families" : [
                {
                    "overlap" : 0.01667,
                    "min_replication_level" : 3,
                    "num_sub_stripes" : 3,
                    "name" : "production",
                    "num_stripes" : 340
                }
            ],
            "databases" : [
                {
                    "database" : "dp01_dc2_catalogs_02",
                    "create_time" : 0,
                    "is_published" : 1,
                    "publish_time" : 1662688661000,
                    "family_name" : "production",
                    "tables" : [
                        {
                            "ang_sep" : 0,
                            "is_director" : 1,
                            "latitude_key" : "coord_dec",
                            "create_time" : 1662774817703,
                            "unique_primary_key" : 1,
                            "flag" : "",
                            "name" : "Source",
                            "director_database_name" : "",
                            "is_ref_match" : 0,
                            "is_partitioned" : 1,
                            "longitude_key" : "coord_ra",
                            "database" : "dp02_dc2_catalogs",
                            "director_table" : "",
                            "director_key2" : "",
                            "director_database_name2" : "",
                            "director_key" : "sourceId",
                            "director_table2" : "",
                            "director_table_name2" : "",
                            "is_published" : 1,
                            "director_table_name" : "",
                            "publish_time" : 1663033002753,
                            "columns" : [
                                {
                                    "name" : "qserv_trans_id",
                                    "type" : "INT NOT NULL"
                                },
                                {
                                    "type" : "BIGINT NOT NULL",
                                    "name" : "sourceId"
                                },
                                {
                                    "type" : "DOUBLE NOT NULL",
                                    "name" : "coord_ra"
                                },
                                {
                                    "type" : "DOUBLE NOT NULL",
                                    "name" : "coord_dec"
                                }
                            ]
                        }
                    ]
                }
            ]
        }
    }

**Notes**:

- The sample object was truncated for brevity. The actual number of families, databases, tables, and columns was
  much higher in the real response.
- The number of attributes varies depending on a particular table type. The example above shows
  attributes for the table ``Source``. This table is *partitioned* and is a *director* (all *director*-type tables
  are partitioned in Qserv).
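Below is a hedged sketch of pulling this configuration with ``curl``. The host and port are deployment-specific
placeholders, and the optional ``jq`` filter (an external JSON-processing utility, not part of Qserv) merely
extracts the names of the known databases from the response:

.. code-block:: bash

    # Pull the full configuration and list the names of the registered databases.
    # Replace localhost:25081 with the address of the Master Replication Controller.
    curl -s -X GET "http://localhost:25081/replication/config" \
        | jq -r '.config.databases[].database'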
.. _ingest-db-table-management-register-db:

Registering databases
---------------------

Each database has to be registered in Qserv before one can create tables and ingest data. The following
service of the Replication Controller allows registering a database:

.. list-table::
    :widths: 15 85
    :header-rows: 1

    * - method
      - resource
    * - ``POST``
      - ``/ingest/database``

The service requires a JSON object of the following schema:

.. code-block::

    {
        "database" : <string>,
        "num_stripes" : <number>,
        "num_sub_stripes" : <number>,
        "overlap" : <number>,
        "auto_build_secondary_index" : <number>,
        "local_load_secondary_index" : <number>,
        "auth_key" : <string>
    }

Where:

.. list-table::
    :widths: 10 90
    :header-rows: 1

    * - attr
      - description

    * - ``database``
      - [ *string* ]

        **Required**: The name of the database to be created.

    * - ``num_stripes``
      - [ *number* ]

        **Required**: The number of stripes that was used when partitioning data of
        all tables to be ingested in the scope of the database.

    * - ``num_sub_stripes``
      - [ *number* ]

        **Required**: The number of sub-stripes that was used when partitioning data of
        all tables to be ingested in the scope of the database.

    * - ``overlap``
      - [ *number* ]

        **Required**: The overlap between the stripes.

    * - ``auto_build_secondary_index``
      - [ *number* ]

        **Optional**: The flag that specifies the desired mode for building the *director* (formerly known as the *secondary*)
        indexes of the director tables of the catalog.

        **Default value**: ``1``, which results in an attempt to build the index automatically at transaction
        commit time. If a value of ``0`` is passed into the service the index won't be built, and it will be up to a workflow
        to trigger the index building as a separate "post-ingest" action using the corresponding service:

        - director index building service (**TODO**: add a link to the service).

        **Note**: Catalogs in Qserv may have more than one director table. This option applies to all such tables.

    * - ``auth_key``
      - [ *string* ]

        **Required**: The authentication key that is required to register the database. The key is used to prevent
        unauthorized access to the service.

.. warning::

    - The service will return an error if a database with the same name already exists in the system.
    - Values of the attributes ``num_stripes``, ``num_sub_stripes`` and ``overlap`` are expected to match
      the corresponding partitioning parameters used when partitioning all partitioned tables of the new database.
      Note that the current implementation of the Qserv Ingest system will not validate contributions to the partitioned
      tables to enforce this requirement. Only the structural correctness will be checked. It's up to a workflow
      to ensure the data ingested into the tables are correct.

If the operation is successfully finished (see :ref:`Ingest error reporting`) a JSON object returned by the service
will have the following attribute:

.. code-block::

    {
        "database": {
            ...
        }
    }

The object containing the database configuration information has the same schema as was explained earlier in section:

- :ref:`ingest-db-table-management-config`
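Here is a minimal sketch of registering a database with ``curl``. The database name ``test101``, the host and port,
and the authentication key are illustrative placeholders; the partitioning parameters are borrowed from the family
example shown earlier:

.. code-block:: bash

    # Register a new (hypothetical) database "test101".
    curl -X POST "http://localhost:25081/ingest/database" \
        -H "Content-Type: application/json" \
        -d '{"database":"test101",
             "num_stripes":340,
             "num_sub_stripes":3,
             "overlap":0.01667,
             "auth_key":"CHANGE-ME"}'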
.. _ingest-db-table-management-register-table:

Registering tables
------------------

All tables, regardless of whether they are *partitioned* or *regular* (fully replicated on all worker nodes), have to be
registered using the following Replication Controller's service:

.. list-table::
    :widths: 15 85
    :header-rows: 1

    * - method
      - resource
    * - ``POST``
      - ``/ingest/table``

The service requires a JSON object describing the table to be sent with the request. The schema of the object for
the **partitioned** tables is presented below:

.. code-block::

    {
        "database" : <string>,
        "table" : <string>,
        "is_partitioned" : <number>,
        "schema" : [
            { "name" : <string>,
              "type" : <string>
            },
            ...
        ],
        "director_table" : <string>,
        "director_key" : <string>,
        "director_table2" : <string>,
        "director_key2" : <string>,
        "latitude_key" : <string>,
        "longitude_key" : <string>,
        "flag" : <string>,
        "ang_sep" : <double>,
        "unique_primary_key" : <number>,
        "auth_key" : <string>
    }

A description of the *regular* tables has fewer attributes (those specific to the *partitioned*
tables are omitted):

.. code-block::

    {
        "database" : <string>,
        "table" : <string>,
        "is_partitioned" : <number>,
        "schema": [
            {
                "name" : <string>,
                "type" : <string>
            },
            ...
        ],
        "auth_key" : <string>
    }

Where the attributes are:

.. list-table::
    :widths: 10 10 80
    :header-rows: 1

    * - attr
      - table
      - description

    * - ``database``
      - *any*
      - [ *string* ]

        **Required**: The name of the existing database.

    * - ``table``
      - *any*
      - [ *string* ]

        **Required**: The name of a table to be created.

    * - ``is_partitioned``
      - *any*
      - [ *number* ]

        **Required**: The type of table. Allowed values:

        - ``1`` for partitioned tables (including any subtypes)
        - ``0`` for the regular tables.

    * - ``schema``
      - *any*
      - [ *array* ]

        **Required**: A definition of the table schema, where each entry of the array is an object with the following attributes:

        - ``name``: The name of the column.
        - ``type``: The type of the column. The type must adhere to the MySQL requirements for column types.

    * - ``director_table``
      - *partitioned*
      - [ *string* ]

        **Required**: The name of the corresponding first (or left) *director* table. The name is required to be non-empty for
        the *dependent* tables and it has to be empty for the *director* tables. This is the only way to differentiate between
        the two types of *partitioned* tables.

        **Note**: The *ref-match* tables are considered *dependent* tables since they have columns pointing
        to the corresponding *director* tables. See attributes: ``director_key``, ``director_table2``, and ``director_key2``.

    * - ``director_key``
      - *partitioned*
      - [ *string* ]

        **Required**: The name of a column in a *partitioned* table. The role of the column depends on the subtype of
        the table:

        - *director*: the primary key of the table
        - *dependent*: the foreign key pointing to the corresponding column of the *director* table

    * - ``director_table2``
      - *partitioned*
      - [ *string* ]

        **Required**: The name of the corresponding second (or right) *director* table. A non-empty value
        is required for the *ref-match* tables, and it has to be empty for the *director* and *dependent* tables.

        **Note**: The very presence of this attribute in the input configuration implies an intent to register
        a *ref-match* table. In this case, non-empty values of the attributes ``director_key2``, ``flag`` and ``ang_sep``
        will be required in order to succeed with the registration.

    * - ``director_key2``
      - *ref-match*
      - [ *string* ]

        **Required**: The name of a column that is associated (as a *foreign key*) with the corresponding column of the second
        *director* table. A value of this attribute must not be empty when registering the *ref-match* tables. It will be
        ignored for other table types. See the description of the attribute ``director_table2``.

    * - ``latitude_key``
      - *partitioned*
      - [ *string* ]

        The name of a column in a *partitioned* table that represents latitude:

        - **Required** for the *director* tables
        - **Optional** for the *dependent* tables

    * - ``longitude_key``
      - *partitioned*
      - [ *string* ]

        The name of a column in a *partitioned* table that represents longitude:

        - **Required** for the *director* tables
        - **Optional** for the *dependent* tables

    * - ``flag``
      - *ref-match*
      - [ *string* ]

        **Required**: The name of the special column that is required to be present in the *ref-match* tables.
        Values of the column are populated by the tool ``sph-partition-matches`` when partitioning the input files
        of the *ref-match* tables. The data type of this column is usually:

        .. code-block:: sql

            INT UNSIGNED

    * - ``ang_sep``
      - *ref-match*
      - [ *double* ]

        **Required**: The value of the angular separation for the matched objects that is used by Qserv to process queries which
        involve the *ref-match* tables. The value is in radians.

    * - ``unique_primary_key``
      - *director*
      - [ *number* ]

        **Optional**: The optional flag allows dropping the uniqueness requirement for the *director* keys of the table. The
        parameter is meant to be used for testing new table products, or for the *director* tables that won't have any
        dependents (child tables). Allowed values:

        - ``0``: The primary key is not unique.
        - ``1``: The primary key is unique.

    * - ``auth_key``
      - *any*
      - [ *string* ]

        **Required**: The authentication key that is required to register the table. The key is used to prevent
        unauthorized access to the service.

.. warning::

    - The table schema does not include definitions of indexes. Those are managed separately after the table is published.
      The index management interface is documented in a dedicated document:

      - **TODO**: Managing indexes of MySQL tables at Qserv workers.

    - The service will return an error if a table with the same name already exists in the system, or
      if the database didn't exist at the time when the request was delivered to the service.

    - The service will return an error if the table schema is not correct. The schema will be checked for correctness.

.. note:: Requirements for the table schema:

    - Variable-length columns are not allowed in Qserv for the *director* and *ref-match* tables. All columns of these
      tables must have fixed lengths. The variable-length types are: ``VARCHAR``, ``VARBINARY``, ``BLOB``, ``TEXT``,
      ``GEOMETRY`` and ``JSON``.

    - The *partitioned* tables are required to have the parameters ``director_key``, ``latitude_key`` and ``longitude_key``.
    - The *director* tables are required to have non-empty column names in the parameters ``director_key``, ``latitude_key`` and ``longitude_key``.
    - The *dependent* tables are required to have a non-empty column name specified in the parameter ``director_key``.
    - The *dependent* tables are allowed to have empty values in the parameters ``latitude_key`` and ``longitude_key``.

    - For tables where the attributes ``latitude_key`` and ``longitude_key`` are provided (either because they are required
      or because they are optional), the values must be either both non-empty or both empty. An attempt to specify only one
      of the attributes, or to provide a non-empty value in one attribute while leaving the other one empty, will result
      in an error.

    - All columns mentioned in the attributes ``director_key``, ``director_key2``, ``flag``, ``latitude_key`` and ``longitude_key``
      must be present in the table schema.

    - Do not use quotes around the names or type specifications.

    - Do not start column names with the reserved prefix ``qserv``. This prefix is reserved for the Qserv-specific columns.

An example of the schema definition for the table ``Source``:

.. code-block:: json

    [
        {
            "name" : "sourceId",
            "type" : "BIGINT NOT NULL"
        },
        {
            "name" : "coord_ra",
            "type" : "DOUBLE NOT NULL"
        },
        {
            "name" : "coord_dec",
            "type" : "DOUBLE NOT NULL"
        }
    ]

If the operation is successfully finished (see :ref:`Ingest error reporting`) a JSON object returned by the service
will have the following attribute:

.. code-block::

    {
        "database": {
            ...
        }
    }

The object will contain the updated database configuration information, which will also include the new table.
The object will have the same schema as was explained earlier in section:

- :ref:`ingest-db-table-management-config`

**Notes on the table names**:

- Generally, the names of the tables must adhere to the MySQL requirements for identifiers
  as explained in:

  - https://dev.mysql.com/doc/refman/8.0/en/identifier-qualifiers.html

- The names of identifiers (including tables) in Qserv are case-insensitive. This is not a general requirement
  in MySQL, where the case sensitivity of identifiers is configurable one way or another. This requirement
  is enforced by the configuration of MySQL in Qserv.

- The length of the name should not exceed 64 characters as per:

  - https://dev.mysql.com/doc/refman/8.0/en/identifier-length.html

- The names should **not** start with the prefix ``qserv``. This prefix is reserved for the Qserv-specific tables.
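The following sketch registers a *director* table using the schema example above. All names, the host and port,
and the key are illustrative placeholders. Note that the schema doesn't include the ``qserv_trans_id`` column,
which is added by the system:

.. code-block:: bash

    # Register a (hypothetical) director table "Source" in the database "test101".
    # An empty "director_table" marks the table as a director.
    curl -X POST "http://localhost:25081/ingest/table" \
        -H "Content-Type: application/json" \
        -d '{"database":"test101",
             "table":"Source",
             "is_partitioned":1,
             "schema":[{"name":"sourceId","type":"BIGINT NOT NULL"},
                       {"name":"coord_ra","type":"DOUBLE NOT NULL"},
                       {"name":"coord_dec","type":"DOUBLE NOT NULL"}],
             "director_table":"",
             "director_key":"sourceId",
             "latitude_key":"coord_dec",
             "longitude_key":"coord_ra",
             "auth_key":"CHANGE-ME"}'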
.. _ingest-db-table-management-publish-db:

Publishing databases
--------------------

Databases are published (made visible to Qserv users) by calling this service:

.. list-table::
    :widths: 15 85
    :header-rows: 1

    * - method
      - resource
    * - ``PUT``
      - ``/ingest/database/<database-name>``

The name of the database is provided as the parameter ``<database-name>`` of the resource. There are a few optional
parameters to be sent in the JSON body of the request:

.. code-block::

    {
        "consolidate_secondary_index" : <number>,
        "row_counters_deploy_at_qserv" : <number>,
        "auth_key" : <string>
    }

Where:

.. list-table::
    :widths: 10 90
    :header-rows: 1

    * - attr
      - description

    * - ``consolidate_secondary_index``
      - [ *number* ]

        **Optional**: The parameter that controls the final format of all the *director* index tables of the database.
        Normally, the *director* indexes are MySQL-partitioned tables. If the value of this optional parameter is
        not ``0`` then the Ingest System will consolidate the MySQL partitions and turn the tables into the monolithic form.

        **Default value**: ``0``, which means that no consolidation will be done.

        .. warning::

            Depending on the scale of the catalog (sizes of the affected tables), this operation may be quite lengthy
            (up to many hours). Besides, based on the experience to date with using the MySQL-partitioned director
            indexes, the impact of the partitions on the index's performance is rather negligible. So, it's safe to
            ignore this option in all but very special cases that are not discussed in this document.

        One can find more info on the MySQL partitioning at:

        - https://dev.mysql.com/doc/refman/8.0/en/partitioning.html

    * - ``row_counters_deploy_at_qserv``
      - [ *number* ]

        **Optional**: This option allows scanning and deploying the row counters as explained at:

        - (**TODO** link) Managing statistics for the row counters optimizations (the expanded explanation of counters,
          their use and the management techniques)
        - (**TODO** link) Collecting row counters and deploying them at Qserv (the REST service description in
          the API Reference section)

        To trigger this operation the ingest workflow should provide a value other than ``0``. In this case the row counters
        collection service will be invoked with the following combination of parameters:

        .. list-table::
            :widths: 60 40
            :header-rows: 1

            * - attr
              - value
            * - ``overlap_selector``
              - ``CHUNK_AND_OVERLAP``
            * - ``force_rescan``
              - ``1``
            * - ``row_counters_state_update_policy``
              - ``ENABLED``
            * - ``row_counters_deploy_at_qserv``
              - ``1``

        **Default value**: ``0``, which means that the row counters won't be deployed.

    * - ``auth_key``
      - [ *string* ]

        **Required**: The authentication key that is required to publish the database. The key is used to prevent
        unauthorized access to the service.

.. warning::

    The row counters deployment is a very resource-consuming operation. It may take a long time to complete
    depending on the size of the catalog. This will also delay the catalog publishing stage of an ingest campaign.
    A better approach is to deploy the row counters as a "post-ingest" operation as explained in:

    - (**TODO** link) Deploying row counters as a post-ingest operation

.. note::

    The catalogs may also be un-published to add more tables. The relevant REST service is documented in:

    - (**TODO** link) Un-publishing databases to allow adding more tables
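To wrap up this section, here is a minimal sketch of publishing the hypothetical catalog ``test101``. The host,
port, database name, and key are illustrative placeholders:

.. code-block:: bash

    # Publish the database, making it visible to Qserv users.
    curl -X PUT "http://localhost:25081/ingest/database/test101" \
        -H "Content-Type: application/json" \
        -d '{"auth_key":"CHANGE-ME"}'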
.. _ingest-db-table-management-unpublish-db:

Un-publishing databases to allow adding more tables
---------------------------------------------------

Un-published databases, as well as previously ingested tables, will still be visible to users of Qserv.
The main purpose of this operation is to allow adding new tables to the existing catalogs.
The new tables won't be seen by users until the catalog is published back using the following REST service:

- :ref:`ingest-db-table-management-publish-db`

Databases are un-published by calling this service:

.. list-table::
    :widths: 15 85
    :header-rows: 1

    * - method
      - resource
    * - ``PUT``
      - ``/replication/config/database/<database-name>``

The name of the database is provided as the parameter ``<database-name>`` of the resource. The only mandatory parameter
to be sent in the JSON body of the request is:

.. list-table::
    :widths: 10 90
    :header-rows: 1

    * - attr
      - description

    * - ``admin_auth_key``
      - [ *string* ]

        **Required**: The administrator-level authentication key that is required to un-publish the database.
        The key is used to prevent unauthorized access to the service.

        **Note**: The key is different from the one used to publish the database. The elevated privileges
        are needed to reduce the risks of disrupting user access to the previously loaded and published databases.
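A hedged example of un-publishing the hypothetical catalog ``test101`` with ``curl`` (the host, port, and
administrator-level key are placeholders):

.. code-block:: bash

    # Un-publish the database to allow adding more tables.
    curl -X PUT "http://localhost:25081/replication/config/database/test101" \
        -H "Content-Type: application/json" \
        -d '{"admin_auth_key":"CHANGE-ME"}'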
.. _ingest-db-table-management-delete:

Deleting databases and tables
-----------------------------

These services can be used for deleting non-*published* databases (the ones that are still being ingested) as well as
*published* databases, or tables, including deleting all relevant persistent structures from Qserv (such as CSS records, etc.):

.. list-table::
    :widths: 15 85
    :header-rows: 1

    * - method
      - resource
    * - ``DELETE``
      - ``/ingest/database/<database-name>``
    * - ``DELETE``
      - ``/ingest/table/<database-name>/<table-name>``

To delete a non-*published* database (or a table from such a database) a client has to provide the normal-level
authentication key ``auth_key`` in a request to the service:

.. code-block::

    {
        "auth_key" : <string>
    }

The names of the database and table affected by the operation are specified in the resource's path.

Deleting databases (or tables from those databases) that have already been published requires a user to have
elevated administrator-level privileges. These privileges are associated with the authentication key ``admin_auth_key``
to be sent with a request instead of ``auth_key``:

.. code-block::

    {
        "admin_auth_key" : <string>
    }

Upon successful completion of the request (for both above-mentioned states of the database), the service will return
the standard response as explained in the section mentioned below. After that, the database (or table, depending on
the scope of the request) name can be reused for further ingests if needed.

- :ref:`Ingest error reporting`
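Two hedged examples with ``curl`` (all names, the host and port, and the keys are placeholders):

.. code-block:: bash

    # Delete a table from a database that has not been published yet:
    curl -X DELETE "http://localhost:25081/ingest/table/test101/Source" \
        -H "Content-Type: application/json" \
        -d '{"auth_key":"CHANGE-ME"}'

    # Deleting a published database requires the administrator-level key:
    curl -X DELETE "http://localhost:25081/ingest/database/test101" \
        -H "Content-Type: application/json" \
        -d '{"admin_auth_key":"CHANGE-ME"}'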
diff --git a/doc/ingest/api/reference/rest/general.rst b/doc/ingest/api/reference/rest/general.rst
new file mode 100644
index 000000000..140c031db
--- /dev/null
+++ b/doc/ingest/api/reference/rest/general.rst
@@ -0,0 +1,226 @@
General guidelines
==================

Request headers
---------------

All services require the following HTTP header to be sent with requests if a service expects a non-empty JSON
object in the request's body:

.. code-block::

    Content-Type: application/json

When requests are sent using the command-line application ``curl``, the following option must be used:

.. code-block:: bash

    curl <url> -X <method> -H "Content-Type: application/json" ...

In this case a JSON object can be specified using one of the following methods:

.. code-block:: bash

    echo '{...}' | curl <url> -X <method> -H "Content-Type: application/json" -d@-
    curl <url> -X <method> -H "Content-Type: application/json" -d '{...}'

Where ``{...}`` represents a JSON object with the details of the request. The object may not be required for some
requests. Specific requirements for this will be mentioned for each service. If the object is not required for
a particular request then the body is allowed to be empty, or it could be an empty JSON object ``{}``.

All services (without exception) return results and errors as JSON objects, as explained in the next subsection.

.. _Ingest error reporting:

Error reporting when calling the services
-----------------------------------------

.. note::

    The error reporting mechanism implemented in the System serves as a foundation for building reliable workflows.

All services explained in the document adhere to the usual conventions adopted by the Web community for designing
and using REST APIs. In particular, HTTP code 200 is returned if a request is well-formed and accepted by
the corresponding service. Any other code shall be treated as an error. However, the implementation of the System
further extends the error reporting mechanism by guaranteeing that all services perform fine-grained error reporting
in the response objects. All services of the API are guaranteed to return a JSON object if the HTTP code is 200.
The objects have the following mandatory attributes (other attributes depend on a request):

.. code-block::

    {
        "success" : <number>,
        "error" : <string>,
        "error_ext" : <object>,
        ...
    }

**Note**: depending on the service, additional attributes may be present in the response object.

Therefore, even if a request is completed with HTTP code ``200``, a client (a workflow) must inspect the above-mentioned
fields in the returned object. These are the rules for inspecting the status attributes:

- Successful completion of a request is indicated by having ``success=1`` in the response. In this case, the other
  two fields should be ignored.
- Otherwise, a human-readable explanation of the problem will be found in the ``error`` field.
- Request-specific extended information on errors is optionally provided in the ``error_ext`` field.
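The following sketch illustrates these rules in a shell script. The host and port are deployment-specific
placeholders, and ``jq`` is an external JSON-processing utility assumed to be available:

.. code-block:: bash

    # Capture a response and stop unless "success" is 1. The /meta/version
    # service is used here merely as an example of a simple GET request.
    response=$(curl -s -X GET "http://localhost:25081/meta/version")
    if [ "$(echo "${response}" | jq -r '.success')" != "1" ]; then
        echo "request failed: $(echo "${response}" | jq -r '.error')" >&2
        exit 1
    fi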
Optional warnings
^^^^^^^^^^^^^^^^^

**Note**: Warnings were introduced as of version ``12`` of the API.

REST services may also return the optional attribute ``warning`` to warn a caller about potential problems with a request.
The very presence of such a warning doesn't necessarily mean that the request failed. Users are still required
to use the above-described error reporting mechanism for inspecting the completion status of requests.
Warnings carry additional information that may be present in any response regardless of whether it succeeded or not.
It's up to a user to interpret this information based on a specific request and the context in which it was made.

Here is what to expect within the response object if a warning was reported:

.. code-block::

    {
        "success" : <number>,
        ...
        "warning" : <string>,
        ...
    }

Protocol Versioning
-------------------

The API adheres to the optional version control mechanism introduced in:

- https://rubinobs.atlassian.net/browse/DM-35456

Workflow developers are encouraged to use the mechanism to reinforce the integrity of their applications.

There are two ways the workflows can use the version numbers:

- *pull mode*: Ask the Replication Controller explicitly what version it implements and cross-check the returned
  version against the number expected by the application.
- *push mode*: Pass the expected version number as a parameter when calling services and let
  the services verify if that version matches the one of the frontend implementation.

Workflow developers are free to use neither, either of the two, or both methods of reinforcing their applications.

Pull mode
^^^^^^^^^

To support the first scenario, the API provides a special metadata service that will return
the version number (along with some other information on the frontend):

.. list-table::
    :widths: 15 85
    :header-rows: 1

    * - method
      - resource
    * - ``GET``
      - ``/meta/version``

The request object for this request is not required, or it could be an empty JSON object ``{}``.
In case of its successful completion, the service will return a JSON object that will include
the following attributes (along with the other standard attributes that are used for error reporting):

.. code-block::

    {
        "kind" : <string>,
        "name" : <string>,
        "id" : <number>,
        "instance_id" : <string>,
        "version" : <number>,
        "database_schema_version" : <number>,
        "success" : <number>,
        "warning" : <string>,
        "error" : <string>,
        "error_ext" : <object>
    }

Where the service-specific attributes are:

.. list-table::
    :widths: 10 90
    :header-rows: 1

    * - attr
      - description

    * - ``kind``
      - [ *string* ]

        The name of the service. The following name is always reported: ``replication-controller``

    * - ``name``
      - [ *string* ]

        The unique name of the frontend within a given Qserv. The current implementation will always return: ``http``

    * - ``id``
      - [ *number* ]

        A unique identifier of the Replication Controller. The number returned here may vary.

    * - ``instance_id``
      - [ *string* ]

        An identifier of the Qserv instance. A value of the attribute depends on a particular deployment of Qserv.

    * - ``version``
      - [ *number* ]

        The current version number of the API.

    * - ``database_schema_version``
      - [ *number* ]

        The schema version number of the Replication System's Database.

Example:

.. code-block:: json

    {
        "kind" : "replication-controller",
        "id" : "9037c818-4820-4b5e-9219-edbf971823b2",
        "instance_id" : "qserv_proj",
        "version" : 27,
        "database_schema_version" : 14,
        "success" : 1,
        "error" : "",
        "error_ext" : {},
        "warning" : ""
    }

Push mode
^^^^^^^^^

In the case of the second scenario, an application will pass the desired version number as
a request parameter. For the ``GET`` method the number is a part of the request's query. For example,
a request for checking the status of an ongoing contribution might look like this:

.. code-block:: bash

    curl 'http://localhost:25004/trans/contrib/1234?version=35' -X GET

For other HTTP methods used by the API, the number is required to be provided within the body
of a request as shown below:

.. code-block:: bash

    curl 'http://localhost:25004/trans/contrib' -X POST \
        -H 'Content-Type: application/json' \
        -d '{"version":35, ...}'

If the number does not match expectations, such a request will fail and the service will return the following
response. Here is an example of what will happen if the wrong version number ``29`` is specified instead
of ``35`` (the current version of the API):
.. code-block:: json

    {
        "success" : 0,
        "error" : "The requested version 29 of the API is not in the range supported by the service.",
        "error_ext": {
            "max_version" : 35,
            "min_version" : 32
        },
        "warning" : ""
    }
diff --git a/doc/ingest/api/reference/rest/index.rst b/doc/ingest/api/reference/rest/index.rst
new file mode 100644
index 000000000..880048420
--- /dev/null
+++ b/doc/ingest/api/reference/rest/index.rst
@@ -0,0 +1,11 @@
#############
REST Services
#############

.. toctree::
    :maxdepth: 4

    general
    db-table-management
    trans-management
    table-location
diff --git a/doc/ingest/api/reference/rest/table-location.rst b/doc/ingest/api/reference/rest/table-location.rst
new file mode 100644
index 000000000..057a9f380
--- /dev/null
+++ b/doc/ingest/api/reference/rest/table-location.rst
@@ -0,0 +1,287 @@
Table location services
=======================

.. _table-location-regular:

Locate regular tables
---------------------

.. warning::

    This service was incorrectly designed by requiring the name of a database to be passed in an attribute ``database``
    of the ``GET`` request's body. The same problem exists for the alternative method accepting a transaction identifier
    (attribute ``transaction_id``). This is not a standard practice, as ``GET`` requests are not supposed to have a body.
    Both problems will be fixed in the next releases of Qserv by moving the parameters into the query part of
    the URL.

The service returns connection parameters of the Worker Data Ingest Services which are available for ingesting
the regular (fully replicated) tables:

.. list-table::
    :widths: 15 85
    :header-rows: 1

    * - method
      - resource

    * - ``GET``
      - ``/ingest/regular``

The request object passed in the request's body has the following schema, in which a client would have to provide
the name of a database:

.. code-block::

    {
        "database" : <string>
    }

The database should not be published at the time the request is made. Otherwise the service will return an error.

The service also supports an alternative method accepting a transaction identifier (transactions are always associated with
the corresponding databases):

.. code-block::

    {
        "transaction_id" : <number>
    }

If the transaction identifier is provided then the transaction is required to be in the ``STARTED`` state at the time
of the request. See the section :ref:`ingest-trans-management` for more details on transactions.

In case of successful completion the service returns the following object:

.. code-block::

    {
        "locations" : [
            {
                "worker" : <string>,
                "host" : <string>,
                "host_name" : <string>,
                "port" : <number>,
                "http_host" : <string>,
                "http_host_name" : <string>,
                "http_port" : <number>
            },
            ...
        ]
    }

Where each object in the array represents a particular worker. See an explanation of the attributes in:

- :ref:`table-location-connect-params`

**Note**: If the service returns an empty array then Qserv is either not properly configured,
or it's not ready to ingest the tables.
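A hedged ``curl`` example is shown below. The host, port, and database name are illustrative placeholders. Note
that ``curl`` would normally turn a request carrying ``-d`` into a ``POST``, so the explicit ``-X GET`` is needed
to send the body with the ``GET`` method (as per the current design discussed in the warning above):

.. code-block:: bash

    curl -X GET "http://localhost:25081/ingest/regular" \
        -H "Content-Type: application/json" \
        -d '{"database":"test101"}'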
.. _table-location-chunks:

Allocate/locate chunks of the partitioned tables
------------------------------------------------

The current implementation of the system offers two services for allocating (or determining locations of existing) chunks:

- :ref:`table-location-chunks-one`
- :ref:`table-location-chunks-many`

Both techniques are explained in the current section. The choice of a particular technique depends on the requirements
of a workflow. However, the second service is recommended as it's more efficient in allocating large quantities of chunks.

Also note that once a chunk is assigned (allocated) to a particular worker node, all subsequent requests for the chunk are
guaranteed to return the same worker name as the location of the chunk. Making multiple requests for the same chunk is safe.
Chunk allocation requests require a valid super-transaction in the ``STARTED`` state. See the section
:ref:`ingest-trans-management` for more details on transactions.

.. _table-location-chunks-one:

Single chunk allocation
~~~~~~~~~~~~~~~~~~~~~~~

The following service is meant to be used for a single chunk allocation/location:

.. list-table::
    :widths: 15 85
    :header-rows: 1

    * - method
      - resource

    * - ``POST``
      - ``/ingest/chunk``

Where the request object has the following schema, in which a client would have to provide the name of a database:

.. code-block::

    {
        "database" : <string>,
        "chunk" : <number>,
        "auth_key" : <string>
    }

The service also supports an alternative method accepting a transaction identifier (transactions are always associated
with the corresponding databases):

.. code-block::

    {
        "transaction_id" : <number>,
        "chunk" : <number>,
        "auth_key" : <string>
    }

If a request succeeds, the System responds with the following JSON object:

.. code-block::

    {
        "locations" : [
            {
                "worker" : <string>,
                "host" : <string>,
                "host_name" : <string>,
                "port" : <number>,
                "http_host" : <string>,
                "http_host_name" : <string>,
                "http_port" : <number>
            },
            ...
        ]
    }

Where the object represents a worker to which the Ingest system requests the workflow to forward the chunk contributions.
See an explanation of the attributes in:

- :ref:`table-location-connect-params`
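A hedged example of allocating a single chunk with ``curl``. The host, port, chunk number, and key are illustrative
placeholders; the transaction identifier ``1630`` reuses the example transaction from the transaction management section:

.. code-block:: bash

    # Allocate (or locate) a single chunk in the scope of transaction 1630.
    curl -X POST "http://localhost:25081/ingest/chunk" \
        -H "Content-Type: application/json" \
        -d '{"transaction_id":1630,"chunk":187107,"auth_key":"CHANGE-ME"}'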
.. _table-location-chunks-many:

Multiple chunks allocation
~~~~~~~~~~~~~~~~~~~~~~~~~~

For allocating multiple chunks one would have to use the following service:

.. list-table::
    :widths: 15 85
    :header-rows: 1

    * - method
      - resource

    * - ``POST``
      - ``/ingest/chunks``

Where the request object has the following schema, in which a client would have to provide the name of a database:

.. code-block::

    {
        "database" : <string>,
        "chunks" : [<number>, <number>, ... ],
        "auth_key" : <string>
    }

Like the above-explained case of the single chunk allocation service, this one also supports an alternative method accepting
a transaction identifier (transactions are always associated with the corresponding databases):

.. code-block::

    {
        "transaction_id" : <number>,
        "chunks" : [<number>, <number>, ... ],
        "auth_key" : <string>
    }

**Note** the difference in the object schema: unlike the single-chunk allocator, this one expects an array of chunk numbers.

The resulting object has the following schema:

.. code-block::

    {
        "locations" : [
            {
                "chunk" : <number>,
                "worker" : <string>,
                "host" : <string>,
                "host_name" : <string>,
                "port" : <number>,
                "http_host" : <string>,
                "http_host_name" : <string>,
                "http_port" : <number>
            },
            ...
        ]
    }

Where each object in the array represents a particular worker. See an explanation of the attributes in:

- :ref:`table-location-connect-params`
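And a corresponding hedged sketch for the bulk variant (the same placeholder caveats apply; the chunk numbers are
hypothetical):

.. code-block:: bash

    # Allocate (or locate) several chunks at once in the scope of transaction 1630.
    curl -X POST "http://localhost:25081/ingest/chunks" \
        -H "Content-Type: application/json" \
        -d '{"transaction_id":1630,"chunks":[187107,187108,187109],"auth_key":"CHANGE-ME"}'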
.. _table-location-connect-params:

Connection parameters of the workers
------------------------------------

.. warning::

    In the current implementation of the Ingest system, values of the hostname attributes ``host_name`` and ``http_host_name``
    are captured by the worker services themselves. The names may not be in the FQDN format. Therefore this information has
    to be used with caution and only in those contexts where the reported names can be reliably mapped to the external FQDN
    or IP addresses of the corresponding hosts (or Kubernetes *pods*).

Attributes of the returned object are:

.. list-table::
    :widths: 10 90
    :header-rows: 1

    * - attr
      - description

    * - ``chunk``
      - [ *number* ]

        The unique identifier of the chunk in Qserv.

        **Note**: This attribute is reported in the chunk location/allocation requests:

        - :ref:`table-location-chunks`

    * - ``worker``
      - [ *string* ]

        The unique identifier of the worker in Qserv.

        **Note**: The worker's identifier is not the same as the worker's host name.

    * - ``host``
      - [ *string* ]

        The IP address of the worker's Ingest service that supports the proprietary binary protocol.

    * - ``host_name``
      - [ *string* ]

        The DNS name of the worker's Ingest service that supports the proprietary binary protocol.

    * - ``port``
      - [ *number* ]

        The port number of the worker's Ingest service that supports the proprietary binary protocol. This service requires
        the content of an input file to be sent directly to the service by a client. The Replication/Ingest system provides
        a ready-to-use application (**TODO**) ``qserv-replica-file INGEST`` that is based on this protocol.

    * - ``http_host``
      - [ *string* ]

        The IP address of the worker's Ingest service that supports the HTTP protocol.

    * - ``http_host_name``
      - [ *string* ]

        The DNS name of the worker's Ingest service that supports the HTTP protocol.

    * - ``http_port``
      - [ *number* ]

        The port number of the worker's Ingest service that supports the HTTP protocol. The REST server that's placed in front
        of the service allows ingesting a single file from a variety of external sources, such as a filesystem locally mounted
        at the worker's host, or a remote object store. It's also possible to push the content of a file in the request body
        either as a JSON object or as a binary stream (``multipart/form-data``).
diff --git a/doc/ingest/api/reference/rest/trans-management.rst b/doc/ingest/api/reference/rest/trans-management.rst
new file mode 100644
index 000000000..689b5279c
--- /dev/null
+++ b/doc/ingest/api/reference/rest/trans-management.rst
@@ -0,0 +1,1137 @@
.. _ingest-trans-management:

Transaction management
======================

.. note::

    - The transaction management services which modify the state of transactions are available only to authorized users.
      The authorization is based on the authentication key. The key is used to prevent unauthorized access to the services.

    - The schema of the JSON object returned for each transaction is the same for all services in the group.
      The schema is described in the section:

      - :ref:`ingest-trans-management-descriptor`

.. _ingest-trans-management-status:

Status of a transaction
-----------------------

There are two services in this group. They are documented in the dedicated sections below.

.. _ingest-trans-management-status-many:

Database transactions
^^^^^^^^^^^^^^^^^^^^^

The service returns information on many transactions in the scope of a database, or of databases selected via optional
filters passed in the request's query. The service is meant to be used by workflows for monitoring the status of
transactions and for debugging purposes. To see the actual progress of a transaction (e.g. to see the contributions
loaded into the destination table) a workflow should use the service :ref:`ingest-trans-management-status-one`.

.. list-table::
    :widths: 15 20 65
    :header-rows: 1

    * - method
      - resource
      - query parameters
    * - ``GET``
      - ``/ingest/trans``
      - ``database=<database>``

        ``family=<family>``

        ``all_databases={0|1}``

        ``is_published={0|1}``

        ``include_context={0|1}``

        ``contrib={0|1}``

        ``contrib_long={0|1}``

        ``include_log={0|1}``

        ``include_warnings={0|1}``

        ``include_retries={0|1}``

Where:

.. list-table::
    :widths: 10 90
    :header-rows: 1

    * - attr
      - description

    * - ``database``
      - [ *string* ]

        **Optional**: The name of the database to filter the transactions by. If the parameter is present and if
        it's not empty then the attributes ``family``, ``all_databases`` and ``is_published`` are ignored.

        **Default value**: ``""`` (empty string)

    * - ``family``
      - [ *string* ]

        **Optional**: The name of the database family. If the parameter is present and if
        it's not empty then the scope of a request will be narrowed to the databases which are members of the given family.
        Otherwise all databases, regardless of their family membership, will be considered.

        **Default value**: ``""`` (empty string)

        **Notes**:

        - The parameter is ignored if the parameter ``database`` is present.
        - The final selection of the databases is also conditioned by the values of the optional parameters
          ``all_databases`` and ``is_published``. See the description of the parameters for more details.

    * - ``all_databases``
      - [ *number* ]

        **Optional**: The flag which is used for further filtering of the databases selected by the parameter ``family``.
        A value of ``0`` tells the service that the parameter ``is_published`` should be used to further narrow the database
        selection to the desired subset. Any other value means no additional filtering (hence ``is_published`` is ignored),
        including all databases selected by the parameter ``family``.

        **Default value**: ``0`` (rely on the parameter ``is_published`` for filtering databases)

        **Note**: The parameter is ignored if the parameter ``database`` is present.

    * - ``is_published``
      - [ *number* ]

        **Optional**: The flag is used only if enabled by setting the previous parameter ``all_databases=0``.
        A value of ``0`` tells the service to narrow the database selection to databases which are not *published*.
        Any other value selects the *published* databases.

        **Default value**: ``0`` (select non-*published* databases)

        **Note**: The parameter is ignored if the parameter ``database`` is present or when ``all_databases=1``.

    * - ``include_context``
      - [ *number* ]

        **Optional**: The flag tells the service to include the transaction context object in the report for each transaction.
        See the documentation on services :ref:`ingest-trans-management-start` or :ref:`ingest-trans-management-end` for further
        details.

        **Default value**: ``0`` (don't include the context)

        .. warning::

            Potentially, each context object could be as large as **16 MB**. Enable this option only if you really need
            to see the contexts for all transactions. Otherwise use an alternative (single transaction) request to pull one
            transaction at a time.

    * - ``contrib``
      - [ *number* ]

        **Optional**: The flag tells the service whether the transaction contribution objects should be included
        in the report. See details on this flag in the dedicated section below.

        **Default value**: ``0`` (don't include contributions)

        .. warning::

            Even though individual contribution objects aren't large, the total number of contributions ingested
            in the scope of each transaction (and all transactions of a database, etc.) could be quite large.
            This would result in a significant amount of data reported by the service. In extreme cases, the response
            object could be **1 GB** or larger. Enable this option only if you really need to see contributions
            for the selected transactions. Otherwise use an alternative (single transaction) request to pull one transaction
            at a time: :ref:`ingest-trans-management-status-one`.

    * - ``contrib_long``
      - [ *number* ]

        **Optional**: This flag is considered only if ``contrib=1``. Setting the flag to any value other
        than ``0`` will result in returning detailed info on the contributions. Otherwise (if the value of the parameter
        is set to ``0``) only the summary report on contributions will be returned.

        **Default value**: ``0`` (return the summary report on contributions)

    * - ``include_log``
      - [ *number* ]

        **Optional**: The flag tells the service to include the transaction log in the report for each transaction.
        The log is a list of events that were generated by the system in response to the transaction management
        requests. Each entry in the log is a JSON object that includes the timestamp of the event, the event type,
        etc. See **TODO** for the details on the log entries.

        **Default value**: ``0`` (do not return the extended info on the transactions)

    * - ``include_warnings``
      - [ *number* ]

        **Optional**: The flag, if set to any value that differs from ``0``, tells the service to include MySQL warnings
        captured when loading contributions into the destination table. Warnings are reported in the context of
        the contributions, should those be allowed in the report.

        **Default value**: ``0`` (do not return the warnings)

        **Note**: The parameter is ignored if ``contrib=0`` or if ``contrib_long=0``.

    * - ``include_retries``
      - [ *number* ]

        **Optional**: The flag, if set to any value that differs from ``0``, tells the service to include the information
        on the retries to load contributions that were made during the transaction. Retries are reported in the context of
        the contributions, should those be allowed in the report.

        **Default value**: ``0`` (do not return the information on the retries)

        **Note**: The parameter is ignored if ``contrib=0`` or if ``contrib_long=0``.

This is an example of the most typical request to the service, pulling info on all transactions of ``gaia_edr3``:

.. code-block:: bash

    curl -X GET "http://localhost:25081/ingest/trans?database=gaia_edr3"

The service will return a JSON object with the summary report on the transactions:
.. code-block:: json

    {
        "success" : 1,
        "warning" : "No version number was provided in the request's query.",
        "error" : "",
        "error_ext" : {},
        "databases" : {
            "gaia_edr3" : {
                "is_published" : 0,
                "num_chunks" : 1558,
                "transactions" : [
                    {
                        "database" : "gaia_edr3",
                        "log" : [],
                        "start_time" : 1726026383559,
                        "end_time" : 0,
                        "begin_time" : 1726026383558,
                        "id" : 1632,
                        "state" : "STARTED",
                        "transition_time" : 0,
                        "context" : {}
                    },
                    {
                        "end_time" : 1727826539501,
                        "context" : {},
                        "begin_time" : 1726026383552,
                        "log" : [],
                        "transition_time" : 1727826539218,
                        "database" : "gaia_edr3",
                        "start_time" : 1726026383553,
                        "state" : "ABORTED",
                        "id" : 1631
                    },
                    {
                        "database" : "gaia_edr3",
                        "end_time" : 1727826728260,
                        "id" : 1630,
                        "transition_time" : 1727826728259,
                        "start_time" : 1726026383547,
                        "begin_time" : 1726026383546,
                        "log" : [],
                        "state" : "FINISHED",
                        "context" : {}
                    }
                ]
            }
        }
    }

**Note**: The report doesn't have any entries for the contributions, since the parameter ``contrib`` was not set to ``1``.
The log entries are also missing since the parameter ``include_log`` was not set to ``1``.
Likewise, the transaction context objects are not included in the report since the parameter ``include_context`` was not
set to ``1``.

.. _ingest-trans-management-status-one:

Single transaction finder
^^^^^^^^^^^^^^^^^^^^^^^^^

The service returns information on a single transaction identified by its unique identifier ``<id>`` passed
through the resource's path:

.. list-table::
    :widths: 15 20 65
    :header-rows: 1

    * - method
      - resource
      - query parameters
    * - ``GET``
      - ``/ingest/trans/<id>``
      - ``include_context={0|1}``

        ``contrib={0|1}``

        ``contrib_long={0|1}``

        ``include_log={0|1}``

        ``include_warnings={0|1}``

        ``include_retries={0|1}``

Where the parameters are the same as for the service :ref:`ingest-trans-management-status-many`.

This is an example of using the service for pulling info on the transaction ``1630`` and obtaining
the summary report on its contributions:

.. code-block:: bash

    curl -X GET "http://localhost:25081/ingest/trans/1630?contrib=1"

The service returns a JSON object that has the following structure (the report is truncated by removing stats
on all workers but ``db12`` for brevity):
.. code-block:: json

    {
        "databases" : {
            "gaia_edr3" : {
                "num_chunks" : 1558,
                "is_published" : 0,
                "transactions" : [
                    {
                        "id" : 1630,
                        "database" : "gaia_edr3",
                        "end_time" : 1727826728260,
                        "start_time" : 1726026383547,
                        "begin_time" : 1726026383546,
                        "transition_time" : 1727826728259,
                        "log" : [],
                        "context" : {},
                        "state" : "FINISHED",
                        "contrib" : {
                            "summary" : {
                                "num_failed_retries" : 0,
                                "num_chunk_files" : 156,
                                "last_contrib_end" : 1726026945059,
                                "num_regular_files" : 0,
                                "num_rows" : 223420722,
                                "table" : {
                                    "gaia_source" : {
                                        "num_failed_retries" : 0,
                                        "overlap" : {
                                            "num_rows" : 6391934,
                                            "num_warnings" : 0,
                                            "num_rows_loaded" : 6391934,
                                            "data_size_gb" : 5.97671127319336,
                                            "num_files" : 155,
                                            "num_failed_retries" : 0
                                        },
                                        "num_files" : 156,
                                        "num_rows_loaded" : 217028788,
                                        "num_warnings" : 0,
                                        "data_size_gb" : 201.872497558594,
                                        "num_rows" : 217028788
                                    }
                                },
                                "num_workers" : 9,
                                "first_contrib_begin" : 1726026383616,
                                "num_rows_loaded" : 223420722,
                                "worker" : {
                                    "db12" : {
                                        "num_failed_retries" : 0,
                                        "num_regular_files" : 0,
                                        "num_chunk_files" : 18,
                                        "num_rows_loaded" : 52289369,
                                        "num_warnings" : 0,
                                        "data_size_gb" : 48.6947402954102,
                                        "num_chunk_overlap_files" : 23,
                                        "num_rows" : 52289369
                                    }
                                },
                                "num_warnings" : 0,
                                "num_files_by_status" : {
                                    "LOAD_FAILED" : 0,
                                    "IN_PROGRESS" : 0,
                                    "CANCELLED" : 0,
                                    "CREATE_FAILED" : 0,
                                    "READ_FAILED" : 0,
                                    "FINISHED" : 311,
                                    "START_FAILED" : 0
                                },
                                "num_chunk_overlap_files" : 155,
                                "data_size_gb" : 207.849166870117
                            },
                            "files" : []
                        }
                    }
                ]
            }
        }
    }

**Note**: The report doesn't have any entries for individual contributions in the attribute ``files``. Only the summary info
in the attribute ``summary`` is provided.


.. _ingest-trans-management-start:

Start a transaction
-------------------

Transactions are started by this service:

.. list-table::
    :widths: 15 85
    :header-rows: 1

    * - method
      - resource
    * - ``POST``
      - ``/ingest/trans``

The following JSON object is required to be sent in the body of a request:

.. code-block::

    {
        "database" : <string>,
        "context" : <object>,
        "auth_key" : <string>
    }

Where:

.. list-table::
    :widths: 10 90
    :header-rows: 1

    * - attr
      - description

    * - ``database``
      - [ *string* ]

        **Required**: The name of the database defining the scope of the new transaction.

    * - ``context``
      - [ *object* ]

        **Optional**: An arbitrary workflow-defined object to be stored in the persistent state of
        the Ingest System for the transaction. It's up to the workflow to decide what to store in the object.
        For example, this information could be used later for recovering from errors during the ingest, for
        general bookkeeping, data provenance, visualization purposes, etc. A value of this attribute, if provided,
        must be a valid JSON object. The object could be empty.

        **Default value**: ``{}``

        **Note**: The current implementation of the Qserv Ingest system limits the size of the context object to **16 MB**.

    * - ``auth_key``
      - [ *string* ]

        **Required**: The authentication key that is required to create transactions. The key is used to prevent
        unauthorized access to the service.

In case of successful completion of a request (see :ref:`Ingest error reporting`) the service will return
a JSON object with a description of the new transaction:

.. code-block::

    {
        "databases" : {
            <database-name> : {
                "num_chunks" : <number>,
                "transactions" : [
                    {
                        "begin_time" : <number>,
                        "context" : {...},
                        "database" : <string>,
                        "end_time" : <number>,
                        "id" : <number>,
                        "log" : [],
                        "start_time" : <number>,
                        "state" : "STARTED",
                        "transition_time" : <number>
                    }
                ]
            }
        },
        "success" : <number>,
        ...
    }

Where the attribute ``id``, representing a unique identifier of the transaction, is the most important attribute
found in the object. A value of the identifier needs to be memorized by a workflow to be used in the subsequent
requests to the transaction management services.

The attribute ``start_time`` will be set to the current time in milliseconds since the UNIX *Epoch*,
and the state of the new transaction will be set to ``STARTED``. The ``end_time`` will be ``0``. A value of
the attribute ``context`` will be the same as it was provided on the input to the service, or the default
value if none was provided.
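A hedged example of starting a transaction with ``curl``. The host, port, database name, the contents of
the context object, and the key are illustrative placeholders:

.. code-block:: bash

    # Start a new super-transaction in the scope of the (hypothetical) database
    # "test101". The context object is arbitrary and workflow-defined.
    curl -X POST "http://localhost:25081/ingest/trans" \
        -H "Content-Type: application/json" \
        -d '{"database":"test101","context":{"run":"nightly-42"},"auth_key":"CHANGE-ME"}'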
+
+In case of the successful completion of a request (see :ref:`Ingest error reporting`) the service will return
+a JSON object with a description of the new transaction:
+
+.. code-block::
+
+   {
+      "databases" : {
+         <database> : {
+            "num_chunks" : <number>,
+            "transactions" : [
+               {
+                  "begin_time" : <number>,
+                  "context" : {...},
+                  "database" : <database>,
+                  "end_time" : 0,
+                  "id" : <number>,
+                  "log" : [],
+                  "start_time" : <number>,
+                  "state" : "STARTED",
+                  "transition_time" : <number>
+               }
+            ]
+         }
+      },
+      "success" : <number>,
+      ...
+   }
+
+The attribute ``id``, representing a unique identifier of the transaction, is the most important attribute
+found in the object. A value of the identifier needs to be memorized by a workflow to be used in the subsequent
+requests to the transaction management services.
+
+The attribute ``start_time`` will be set to the current time in milliseconds since the UNIX *Epoch*,
+and the state of the new transaction will be set to ``STARTED``. The ``end_time`` will be ``0``. A value of
+the attribute ``context`` will be the same as it was provided on the input to the service, or the default
+value if none was provided.
+
+.. _ingest-trans-management-end:
+
+Commit or abort a transaction
+-----------------------------
+
+Transactions are aborted by the following service:
+
+.. list-table::
+   :widths: 15 60 25
+   :header-rows: 1
+
+   * - method
+     - resource
+     - query parameters
+   * - ``PUT``
+     - ``/ingest/trans/<id>``
+     - ``abort=1``
+
+Transactions are committed by the following service:
+
+.. list-table::
+   :widths: 15 60 25
+   :header-rows: 1
+
+   * - method
+     - resource
+     - query parameters
+   * - ``PUT``
+     - ``/ingest/trans/<id>``
+     - ``abort=0``
+
+A unique identifier of the transaction is passed into the service in the resource's path parameter ``<id>``.
+The only mandatory parameter of the request query is ``abort``. The value of the parameter is ``0`` to tell the service
+that the transaction has to be committed normally. Any other number will be interpreted as a request to abort the transaction.
+
+Other parameters defining a request are passed via the request's body:
+
+.. code-block::
+
+   {
+      "context" : <object>,
+      "auth_key" : <string>
+   }
+
+Where:
+
+.. list-table::
+   :widths: 10 90
+   :header-rows: 1
+
+   * - attr
+     - description
+
+   * - ``context``
+     - [ *object* ]
+
+       **Optional**: An arbitrary workflow-defined object to be stored in the persistent state of
+       the Ingest System for the transaction. It's up to the workflow to decide what to store in the object.
+       For example, this information could be used later for recovering from errors during the ingest, for
+       general bookkeeping, data provenance, visualization purposes, etc. A value of this attribute, if provided,
+       must be a valid JSON object. The object could be empty.
+
+       **Default value**: ``{}``
+
+       **Notes**:
+
+       - A value provided in the attribute will replace the initial value specified (if any) at the transaction
+         start time (see :ref:`ingest-trans-management-start`).
+       - The current implementation of the Qserv Ingest system limits the size of the context object to **16 MB**.
+
+   * - ``auth_key``
+     - [ *string* ]
+
+       **Required**: The authentication key that is required to end transactions. The key is used to prevent
+       unauthorized access to the service.
+
+
+Upon successful completion of either request (see :ref:`Ingest error reporting`) the service will return an updated
+status of the transaction in a JSON object, as explained in the section :ref:`ingest-trans-management-start`.
+
+State transitions of the transactions:
+
+- Aborted transactions will end up in the ``ABORTED`` state.
+- Transactions that were committed will end up in the ``FINISHED`` state.
+- In case of any problems encountered during an attempt to end a transaction, other states may also be reported
+  by the service.
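+
+For illustration, the transaction ``1630`` could be committed with ``curl`` as shown below (a sketch only;
+the host, port and the authorization key are placeholders to be adjusted for a specific deployment).
+Setting ``abort=1`` instead would abort the transaction:
+
+.. code-block:: bash
+
+   # The updated context object is optional. The authorization key value is a placeholder.
+   curl -X PUT "http://localhost:25881/ingest/trans/1630?abort=0" \
+        -H "Content-Type: application/json" \
+        -d '{"context":{"note":"loading completed"},"auth_key":"CHANGEME"}'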
+
+It's also safe to repeat either of the requests. The service will complain if the transaction is not in
+the ``STARTED`` state at the time the request is received by the service.
+
+More information on the statuses of transactions can be found at:
+
+- :ref:`ingest-trans-management-status`
+
+.. _ingest-trans-management-descriptor:
+
+Transaction descriptor
+----------------------
+
+.. note::
+
+   This section uses a database ``gaia_edr3`` and transaction ``1630`` as an example.
+
+The content of a JSON object returned by the services varies depending on the presence of the optional parameters:
+
+- ``include_context={0|1}``
+- ``contrib={0|1}``
+- ``contrib_long={0|1}``
+- ``include_log={0|1}``
+- ``include_warnings={0|1}``
+- ``include_retries={0|1}``
+
+Subsections below describe the gradual expansion of the JSON object returned by the services as the optional parameters
+are set to ``1``.
+
+.. _ingest-trans-management-descriptor-short:
+
+Shortest form
+^^^^^^^^^^^^^
+
+The shortest form of the JSON object returned by the services when all optional parameters are set to ``0`` is:
+
+.. code-block::
+
+   {
+      "databases" : {
+         "gaia_edr3" : {
+            "is_published" : <0|1>,
+            "num_chunks" : <number>,
+            "transactions" : [
+               {
+                  "id" : 1630,
+                  "database" : "gaia_edr3",
+                  "begin_time" : <number>,
+                  "start_time" : <number>,
+                  "end_time" : <number>,
+                  "transition_time" : <number>,
+                  "state" : <string>,
+                  "context" : <object>,
+                  "log" : <array>
+               }
+            ]
+         }
+      }
+   }
+
+Where:
+
+.. list-table::
+   :widths: 10 90
+   :header-rows: 1
+
+   * - attr
+     - description
+
+   * - ``is_published``
+     - [ *number* ]
+
+       The flag tells whether the database is *published* or not.
+
+   * - ``num_chunks``
+     - [ *number* ]
+
+       The total number of chunks in the database, regardless of whether any contributions were made into the chunks
+       in the context of any transaction. Chunks need to be registered in Qserv before the corresponding MySQL tables
+       can be populated with data. This information is meant to be used for the monitoring and QA purposes.
+
+   * - ``id``
+     - [ *number* ]
+
+       The unique identifier of the transaction.
+
+   * - ``database``
+     - [ *string* ]
+
+       The name of the database the transaction is associated with.
+
+   * - ``begin_time``
+     - [ *number* ]
+
+       The timestamp of the transaction creation in milliseconds since the UNIX *Epoch*. The value is
+       set by the service when the transaction is registered in the system and assigned
+       the state ``IS_STARTING``. The value is guaranteed not to be ``0``.
+
+   * - ``start_time``
+     - [ *number* ]
+
+       The timestamp of the transaction start in milliseconds since the UNIX *Epoch*. The value is
+       set by the service when the transaction is started (gets into the ``STARTED`` state).
+       The value is ``0`` while the transaction is still in the ``IS_STARTING`` state.
+
+   * - ``end_time``
+     - [ *number* ]
+
+       The timestamp of the transaction end in milliseconds since the UNIX *Epoch*. The value is
+       set by the service when the transaction is ended (committed, aborted or failed). A value
+       of the attribute is ``0`` if the transaction is still active.
+
+   * - ``transition_time``
+     - [ *number* ]
+
+       The timestamp of the last state transition in milliseconds since the UNIX *Epoch*. The value is
+       set by the service when the transaction gets into states ``IS_FINISHING`` (the committing process
+       was initiated) or ``IS_ABORTING`` (the aborting process was initiated). The value would be set
+       to ``0`` before that.
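+
+As an example of how a workflow could consume the descriptor, the following sketch polls the current state
+of the transaction ``1630`` (assuming the ``jq`` tool is available on the client side):
+
+.. code-block:: bash
+
+   # Extract the state of the (single) transaction found in the response object.
+   curl -s "http://localhost:25881/ingest/trans/1630" \
+       | jq -r '.databases["gaia_edr3"].transactions[0].state'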
+
+   * - ``state``
+     - [ *string* ]
+
+       The current state of the transaction. The possible values and their meanings are explained in
+       the dedicated section:
+
+       - :ref:`ingest-trans-management-states`
+
+   * - ``context``
+     - [ *object* ]
+
+       The object that was provided by a workflow at the transaction start time, or updated at the transaction
+       commit/abort time. The object could be empty. The object could be used for the recovery from errors during
+       the ingest, for general bookkeeping, data provenance, visualization purposes, etc.
+
+   * - ``log``
+     - [ *array* ]
+
+       The array of log entries. Each entry is a JSON object that has the following attributes:
+
+       - ``id`` [ *number* ] - the unique identifier of the log entry
+       - ``transaction_state`` [ *string* ] - the state of the transaction at the time the log entry was generated
+       - ``name`` [ *string* ] - the name of the event that triggered the log entry
+       - ``time`` [ *number* ] - the timestamp of the event in milliseconds since the UNIX *Epoch*
+       - ``data`` [ *object* ] - the data associated with the event
+
+
+.. _ingest-trans-management-descriptor-contrib-summary:
+
+With a summary of contributions
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Setting the query parameter ``contrib=1`` (regardless of whether ``contrib_long`` is set to ``0`` or ``1``)
+will result in expanding the transaction objects with the ``summary`` object. The object will
+include the summary info on all contributions made in the scope of the transaction.
+
+The following object illustrates the idea (where most of the previously explained attributes and all
+worker-level stats but the one for ``db12`` are omitted for brevity):
+
+.. code-block::
+
+   "transactions" : [
+      {
+         "contrib" : {
+            "summary" : {
+               "first_contrib_begin" : 1726026383616,
+               "last_contrib_end" : 1726026945059,
+               "num_rows" : 223420722,
+               "num_rows_loaded" : 223420722,
+               "num_regular_files" : 0,
+               "num_chunk_files" : 156,
+               "num_failed_retries" : 0,
+               "num_workers" : 9,
+               "table" : {
+                  "gaia_source" : {
+                     "data_size_gb" : 201.872497558594,
+                     "num_rows_loaded" : 217028788,
+                     "num_rows" : 217028788,
+                     "num_files" : 156,
+                     "num_failed_retries" : 0,
+                     "num_warnings" : 0,
+                     "overlap" : {
+                        "data_size_gb" : 5.97671127319336,
+                        "num_rows" : 6391934,
+                        "num_rows_loaded" : 6391934,
+                        "num_files" : 155,
+                        "num_failed_retries" : 0,
+                        "num_warnings" : 0
+                     }
+                  }
+               },
+               "worker" : {
+                  "db12" : {
+                     "data_size_gb" : 48.6947402954102,
+                     "num_rows" : 52289369,
+                     "num_rows_loaded" : 52289369,
+                     "num_regular_files" : 0,
+                     "num_chunk_files" : 18,
+                     "num_chunk_overlap_files" : 23,
+                     "num_failed_retries" : 0,
+                     "num_warnings" : 0
+                  }
+               }
+            }
+         }
+      }
+   ]
+
+The ``summary`` object includes three sets of attributes:
+
+- The general stats on the contributions made in the scope of the transaction.
+- The stats on the contributions made into the table ``gaia_source`` across all workers.
+- The stats on the contributions made into tables by the worker ``db12``.
+
+These are the general (transaction-level) stats:
+
+.. list-table::
+   :header-rows: 1
+
+   * - attr
+     - description
+
+   * - ``first_contrib_begin``
+     - [ *number* ]
+
+       The timestamp of the first contribution in milliseconds since the UNIX *Epoch*. This is the time when a processing of the contribution started.
+
+   * - ``last_contrib_end``
+     - [ *number* ]
+
+       The timestamp of the last contribution in milliseconds since the UNIX *Epoch*. This is the time when a processing of the contribution ended.
+
+   * - ``num_rows``
+     - [ *number* ]
+
+       The total number of rows parsed in all input contributions made in the scope of the transaction.
+
+   * - ``num_rows_loaded``
+     - [ *number* ]
+
+       The total number of rows that were actually loaded into the destination table(s) in all contributions made in the scope of the transaction.
+
+       - **Note**: Normally the number of rows loaded should be equal to the number of rows parsed. If the numbers differ, it means that some
+         rows were rejected during the ingest process. The workflow should always monitor mismatches in these values and trigger
+         alerts (see the sketch after this table).
+
+   * - ``num_regular_files``
+     - [ *number* ]
+
+       The total number of regular files (not chunk files) parsed in all input contributions.
+
+   * - ``num_chunk_files``
+     - [ *number* ]
+
+       The total number of chunk files parsed in all input contributions.
+
+   * - ``num_failed_retries``
+     - [ *number* ]
+
+       The total number of retries that failed during the ingest process.
+
+       - **Note**: In most cases it's okay that the number of failed retries is not zero. The system is designed to retry
+         the ingest of the failed contributions. A problem arises when the number of such failures detected in the scope of
+         a single contribution exceeds a limit set in the Ingest system. The workflow should always monitor
+         the number of failed retries and trigger alerts if the number is too high.
+
+   * - ``num_workers``
+     - [ *number* ]
+
+       The total number of workers that were involved in the ingest process.
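+
+The following sketch illustrates how a workflow could implement the monitoring suggested in the notes above
+(assuming the ``jq`` tool is available on the client side):
+
+.. code-block:: bash
+
+   # Pull the contribution summary of the transaction 1630.
+   summary=$(curl -s "http://localhost:25881/ingest/trans/1630?contrib=1" \
+       | jq '.databases["gaia_edr3"].transactions[0].contrib.summary')
+
+   # Compare the number of rows parsed with the number of rows actually loaded.
+   num_rows=$(jq '.num_rows' <<< "$summary")
+   num_rows_loaded=$(jq '.num_rows_loaded' <<< "$summary")
+   if [ "$num_rows" -ne "$num_rows_loaded" ]; then
+       echo "alert: $((num_rows - num_rows_loaded)) row(s) were rejected" >&2
+   fi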
+
+
+With detailed info on contributions
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Setting the query parameters ``contrib=1`` and ``contrib_long=1`` will result in expanding the ``contrib`` object
+with the ``files`` array. Each entry (a JSON object) in the array represents a single contribution. The objects provide
+the detailed info on all contributions made in the scope of the transaction.
+
+**Note**: Extended info on warnings and retries posted during contribution loading is still disabled in this case.
+To enable warnings use the parameter ``include_warnings=1``. To enable retries use the parameter ``include_retries=1``.
+
+The following object illustrates the idea, where all but one contribution were removed for brevity:
+
+.. code-block::
+
+   "transactions" : [
+      {
+         "contrib" : {
+            "files" : [
+               {
+                  "id" : 2651966,
+                  "async" : 1,
+                  "database" : "gaia_edr3",
+                  "table" : "gaia_source",
+                  "worker" : "db13",
+                  "chunk" : 675,
+                  "overlap" : 0,
+                  "transaction_id" : 1630,
+
+                  "status" : "FINISHED",
+                  "create_time" : 1726026383616,
+                  "start_time" : 1726026383619,
+                  "read_time" : 1726026396161,
+                  "load_time" : 1726026412474,
+
+                  "url" : "http://sdfqserv001:18080/gaia_edr3/gaia_source/files/chunk_675.txt",
+                  "http_method" : "GET",
+                  "http_headers" : [],
+                  "http_data" : "",
+                  "tmp_file" : "/qserv/data/ingest/gaia_edr3-gaia_source-675-1630-7570-6e63-d0b6-6934.csv",
+
+                  "max_num_warnings" : 64,
+                  "max_retries" : 4,
+
+                  "charset_name" : "latin1",
+                  "dialect_input" : {
+                     "fields_enclosed_by" : "\\0",
+                     "lines_terminated_by" : "\\n",
+                     "fields_escaped_by" : "\\\\",
+                     "fields_terminated_by" : ","
+                  },
+
+                  "num_bytes" : 793031392,
+                  "num_rows" : 776103,
+                  "num_rows_loaded" : 776103,
+
+                  "http_error" : 0,
+                  "error" : "",
+                  "system_error" : 0,
+                  "retry_allowed" : 0,
+
+                  "num_warnings" : 0,
+                  "warnings" : [],
+                  "num_failed_retries" : 0,
+                  "failed_retries" : []
+               }
+            ]
+         }
+      }
+   ]
+
+The most important (for the ingest workflows) attributes of the contribution object are:
+
+.. list-table::
+   :header-rows: 1
+
+   * - attr
+     - description
+
+   * - ``status``
+     - [ *string* ]
+
+       The status of the contribution request. The possible values are:
+
+       - ``IN_PROGRESS``: The transient state of a request before it's ``FINISHED`` or failed.
+       - ``CREATE_FAILED``: The request was received and rejected right away (incorrect parameters, etc.).
+       - ``START_FAILED``: The request couldn't start after being pulled from a queue due to changed conditions.
+       - ``READ_FAILED``: Reading/preprocessing of the input file failed.
+       - ``LOAD_FAILED``: Loading into MySQL failed.
+       - ``CANCELLED``: The request was explicitly cancelled by the ingest workflow (ASYNC contributions only).
+       - ``FINISHED``: The request succeeded.
+
+   * - ``create_time``
+     - [ *number* ]
+
+       The timestamp when the contribution request was received (milliseconds since the UNIX *Epoch*).
+       A value of the attribute is guaranteed not to be ``0``.
+
+   * - ``start_time``
+     - [ *number* ]
+
+       The timestamp when the contribution request was started (milliseconds since the UNIX *Epoch*).
+       A value of the attribute is ``0`` before the processing starts.
+
+   * - ``read_time``
+     - [ *number* ]
+
+       The timestamp when the Ingest service finished reading/preprocessing the input file (milliseconds since the UNIX *Epoch*).
+       A value of the attribute is ``0`` before the reading starts.
+
+   * - ``load_time``
+     - [ *number* ]
+
+       The timestamp when the Ingest service finished loading the contribution into the MySQL table (milliseconds since the UNIX *Epoch*).
+       A value of the attribute is ``0`` before the loading starts.
+
+   * - ``url``
+     - [ *string* ]
+
+       The URL of the input file that was used to create the contribution. Depending on the source of the data, the URL **scheme** could
+       be ``http``, ``https``, ``file``, etc.
+
+       **Note** that there is no guarantee that the URL will be valid after the contribution is processed.
+
+   * - ``max_num_warnings``
+     - [ *number* ]
+
+       The maximum number of MySQL warnings to be captured after loading the contribution into the MySQL table.
+       The number may correspond to a value that was explicitly set by the workflow when making a contribution request.
+       Otherwise the default number configured in the system is assumed.
+
+   * - ``max_retries``
+     - [ *number* ]
+
+       The maximum number of retries allowed for the contribution. The number may correspond to a value that was explicitly set by the workflow
+       when making a contribution request. Otherwise the default number configured in the system is assumed.
+
+   * - ``num_bytes``
+     - [ *number* ]
+
+       The total number of bytes in the input file. The value is set by the service after it finishes reading
+       the file and before it starts loading the data into the MySQL table.
+
+   * - ``num_rows``
+     - [ *number* ]
+
+       The total number of rows parsed by the ingest service in the input file.
+
+   * - ``num_rows_loaded``
+     - [ *number* ]
+
+       The total number of rows loaded into the MySQL table. Normally the number of rows loaded should be equal to the number of rows parsed.
+       If the numbers differ, it means that some rows were rejected during the ingest process. The workflow should always monitor any
+       mismatches in these values and trigger alerts.
+
+   * - ``http_error``
+     - [ *number* ]
+
+       The HTTP error code captured by the service when pulling data of the contribution from a remote Web server.
+       This applies to the corresponding URL **schemes**. The value is set only if an error was detected.
+
+   * - ``error``
+     - [ *string* ]
+
+       The error message captured by the service during the contribution processing. The value is set only if an error was detected.
+
+   * - ``system_error``
+     - [ *number* ]
+
+       The system error code captured by the service during the contribution processing. The value is set only if an error was detected.
+
+   * - ``retry_allowed``
+     - [ *number* ]
+
+       The flag that tells if the contribution is allowed to be retried. The value is set by the service when the contribution
+       processing fails: ``1`` if the contribution is allowed to be retried, and ``0`` otherwise.
+
+       **Important**: The workflow should always analyze a value of this attribute to decide if the contribution should be retried.
+       If the retry is not possible then the workflow should give up on the corresponding transaction, abort it, and start
+       another transaction to ingest all contributions attempted in the scope of the aborted one.
+
+   * - ``num_warnings``
+     - [ *number* ]
+
+       The total number of MySQL warnings captured after loading the contribution into the MySQL table.
+
+       **Note**: The number is reported regardless of whether the parameter ``include_warnings=1`` was set in the request or not.
+
+   * - ``warnings``
+     - [ *array* ]
+
+       The array of MySQL warnings captured after loading the contribution into the MySQL table. Each entry is a JSON object
+       that represents a single warning (see the format below).
+
+       **Notes**:
+
+       - The array is populated only if the parameter ``include_warnings=1`` was set in the request.
+       - The maximum number of warnings captured is limited by the value of the attribute ``max_num_warnings``.
+
+   * - ``num_failed_retries``
+     - [ *number* ]
+
+       The total number of retries that failed during the contribution processing.
+
+       **Note**: The number is reported regardless of whether the parameter ``include_retries=1`` was set in the request or not.
+
+   * - ``failed_retries``
+     - [ *array* ]
+
+       The array of failed retries captured during the contribution processing. Each entry is a JSON object that has the following attributes:
+
+       - ``id``: *number* - the unique identifier of the failed retry
+       - ``time``: *number* - the timestamp of the failed retry in milliseconds since the UNIX *Epoch*
+       - ``error``: *string* - the error message associated with the failed retry
+
+       **Notes**:
+
+       - The array is populated only if the parameter ``include_retries=1`` was set in the request.
+       - The maximum number of failed retries captured is limited by the value of the attribute ``max_retries``.
+
+The format of the entries in the collection ``warnings`` is presented below:
+
+.. list-table::
+   :header-rows: 1
+
+   * - attr
+     - description
+
+   * - ``level``
+     - [ *string* ]
+
+       The severity of the warning reported by MySQL. Allowed values: ``Note``, ``Warning``, ``Error``.
+
+   * - ``code``
+     - [ *number* ]
+
+       The numeric error code indicating a reason for the observed problem.
+
+   * - ``message``
+     - [ *string* ]
+
+       The human-readable explanation for the problem.
+
+More details on the values can be found in the MySQL documentation:
+
+- https://dev.mysql.com/doc/refman/8.4/en/show-warnings.html
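+
+Putting this together, a workflow could pull the detailed descriptors of the contributions and inspect the ones
+that didn't succeed before deciding on retries (a sketch, assuming the ``jq`` tool is available on the client side):
+
+.. code-block:: bash
+
+   # Report the identifier, status, error message and the retry flag of every
+   # contribution of the transaction 1630 that hasn't finished successfully.
+   curl -s "http://localhost:25881/ingest/trans/1630?contrib=1&contrib_long=1" \
+       | jq '.databases["gaia_edr3"].transactions[0].contrib.files[]
+             | select(.status != "FINISHED")
+             | {id, status, error, retry_allowed}'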
+
+.. _ingest-trans-management-states:
+
+Transaction states
+------------------
+
+Transactions have well-defined states and a well-defined state transition algorithm. Normally, the Ingest System moves a transaction
+from one state to another in response to the explicit transaction management requests made by a workflow. In some cases
+the Replication/Ingest system may also change the states.
+
+The following table explains possible state transitions of a transaction:
+
+.. list-table::
+   :widths: 10 80 10
+   :header-rows: 1
+
+   * - state
+     - description
+     - next states
+   * - ``IS_STARTING``
+     - The initial (transient) state assigned to a transaction right after it's registered in the system
+       in response to a request to start a transaction: :ref:`ingest-trans-management-start`.
+       This transient state should change to ``STARTED`` or ``START_FAILED``.
+       The former state is assigned to a transaction that was successfully started, the latter
+       to a transaction that failed to start.
+
+     - ``STARTED``
+
+       ``START_FAILED``
+
+   * - ``STARTED``
+     - The active state of a transaction that is ready to accept data ingest requests.
+       When the system receives a request to commit or abort the transaction (see :ref:`ingest-trans-management-end`)
+       the state would transition to the corresponding transient states ``IS_FINISHING`` or ``IS_ABORTING``.
+     - ``IS_FINISHING``
+
+       ``IS_ABORTING``
+
+   * - ``IS_FINISHING``
+     - The transient state assigned to a transaction that is in the process of being committed.
+       Depending on the database options specified by a workflow, the transaction may stay in this state
+       for a while.
+       The state will change to ``FINISHED`` in case of the successful completion of a request, or it may
+       land in the ``FINISH_FAILED`` state in case of any problems encountered during the request
+       execution. A transaction may also get into the ``IS_ABORTING`` state if a workflow issues the abort
+       request while the transaction is being finished.
+
+     - ``FINISHED``
+
+       ``FINISH_FAILED``
+
+       ``IS_ABORTING``
+
+   * - ``IS_ABORTING``
+     - The transitional state triggered by the transaction abort request (see :ref:`ingest-trans-management-end`).
+     - ``ABORTED``
+
+       ``ABORT_FAILED``
+
+   * - ``FINISHED``
+     - The final state of a transaction that was successfully committed.
+     -
+
+   * - ``ABORTED``
+     - The final state of a transaction that was successfully aborted.
+     -
+
+   * - ``START_FAILED``
+     - The (inactive) state of a transaction that failed to start. The state allows
+       a workflow to initiate the transaction abort request.
+     - ``IS_ABORTING``
+
+   * - ``FINISH_FAILED``
+     - The (inactive) state of a transaction that failed to be committed. The state allows
+       a workflow to initiate the transaction abort request.
+     - ``IS_ABORTING``
+
+   * - ``ABORT_FAILED``
+     - The (inactive) state of a transaction that failed to be aborted. The state allows
+       a workflow to initiate another transaction abort request (or requests).
+     - ``IS_ABORTING``
+
diff --git a/doc/ingest/api/reference/tools.rst b/doc/ingest/api/reference/tools.rst
new file mode 100644
index 000000000..58f2f4895
--- /dev/null
+++ b/doc/ingest/api/reference/tools.rst
@@ -0,0 +1,11 @@
+######################
+The Command Line Tools
+######################
+
+Error reporting in the command-line tools
+=========================================
+
+All command-line tools return ``0`` to indicate the successful completion of the requested operation.
+Other values shall be treated as errors. The error messages are printed to the standard error stream.
+Additional information on the error can be found in the standard output stream.
+
diff --git a/doc/ingest/index.rst b/doc/ingest/index.rst
index c9cbfaca1..3f1822377 100644
--- a/doc/ingest/index.rst
+++ b/doc/ingest/index.rst
@@ -5,46 +5,9 @@ Ingesting catalogs
 ##################
 
-.. toctree::
-   :maxdepth: 2
+.. toctree::
+   :maxdepth: 2
 
-   api/index
-   qserv-ingest/index
-
-============
-Introduction
-============
-
-Unlike traditional RDBMS systems, Qserv does not support direct ingestion of data via
-SQL ``INSERT`` statements. Neither one can create databases or tables directly via SQL DDL
-statements like ``CREATE DATABASE``, ``CREATE TABLE`` and similar. Instead, data must be ingested
-into Qserv using a collection of the REST services. The services represent the Qserv Ingest
-API (covered in `The Ingest Workflow Developer's Guide `_) which provides the functionaly complete
-set of tools and instructions needed for ingesting and managing data in Qserv. There are several
-reasons for this design choice:
-
-- Implementing a parser for the SQL DDL and DML statements is a complex and time-consuming process.
-  Implementing a correct semantic of the SQL statements in a realm of the distributed database
-  is even more dounting task.
-- The performace of the SQL-based ingest protocol is not sufficient for the high-throughput data ingestion.
-  - **Note**: Qserv is designed to handle the data volumes of the order of many Petabytes.
-- The REST services (unlike the simple text-based SQL statements) allow for more *structural* data formats
-  for user inputs such as schemas (``JSON``) and data (``CSV``). Verifying the syntactical and semantical
-  correctness of the data is easier when the data are structured.
-- The REST services provide a reliable and transparent mechanism for managing and tracking the distributed
-  state of the data products within Qserv.
-- Many operations on the REST services can be made idempotent and can be easily retried in case of failures.
-- By not being bound to a particular SQL dialect, the REST services provide a more flexible and portable
-  interface for the data ingestion. The API can be extended to support new types of the data management requests,
-  new data formats and data sources as needed without changing the core of the Qserv engine.
-
-The API serves as a foundation for designing and implementing the data ingestion processes that
-are loosely called the *ingest workflows*. There may be many such workflows depending on a particular
-use case, the amount of data to be ingested, data delivery requirements, and the overall complexity
-of the data.
-
-Read `The Ingest Workflow Developer's Guide `_ for further details on the REST services and their
-usage. An explanation of a simple Kubernetes-based ingest workflow application `qserv-ingest `_
-is also provided in this documentation portal.
-
-Also note that a simple ingest API is provided by :ref:`http-frontend` for integsting and managing user tables.
+   introduction
+   api/index
+   qserv-ingest/index
diff --git a/doc/ingest/introduction.rst b/doc/ingest/introduction.rst
new file mode 100644
index 000000000..cbddbe0ba
--- /dev/null
+++ b/doc/ingest/introduction.rst
@@ -0,0 +1,39 @@
+.. _ingest-introduction:
+
+============
+Introduction
+============
+
+Unlike traditional RDBMS systems, Qserv does not support direct ingestion of data via
+SQL ``INSERT`` statements. Nor can one create databases or tables directly via SQL DDL
+statements like ``CREATE DATABASE``, ``CREATE TABLE`` and similar. Instead, data must be ingested
+into Qserv using a collection of the REST services. The services represent the Qserv Ingest
+API (covered in `The Ingest Workflow Developer's Guide `_), which provides the functionally complete
+set of tools and instructions needed for ingesting and managing data in Qserv. 
There are several
+reasons for this design choice:
+
+- Implementing a parser for the SQL DDL and DML statements is a complex and time-consuming process.
+  Implementing correct semantics of the SQL statements in the realm of a distributed database
+  is an even more daunting task.
+- The performance of the SQL-based ingest protocol is not sufficient for the high-throughput data ingestion.
+
+  - **Note**: Qserv is designed to handle data volumes on the order of many petabytes.
+
+- The REST services (unlike the simple text-based SQL statements) allow for more *structured* data formats
+  for user inputs such as schemas (``JSON``) and data (``CSV``). Verifying the syntactic and semantic
+  correctness of the data is easier when the data are structured.
+- The REST services provide a reliable and transparent mechanism for managing and tracking the distributed
+  state of the data products within Qserv.
+- Many operations on the REST services can be made idempotent and can be easily retried in case of failures.
+- By not being bound to a particular SQL dialect, the REST services provide a more flexible and portable
+  interface for the data ingestion. The API can be extended to support new types of data management requests,
+  new data formats and data sources as needed without changing the core of the Qserv engine.
+
+The API serves as a foundation for designing and implementing the data ingestion processes that
+are loosely called the *ingest workflows*. There may be many such workflows depending on a particular
+use case, the amount of data to be ingested, data delivery requirements, and the overall complexity
+of the data.
+
+Read `The Ingest Workflow Developer's Guide `_ for further details on the REST services and their
+usage. An explanation of a simple Kubernetes-based ingest workflow application `qserv-ingest `_
+is also provided in this documentation portal.
+
+Also note that a simple ingest API is provided by :ref:`http-frontend` for ingesting and managing user tables.