
A summary of issues about CAPI

Tao Luo edited this page Dec 9, 2019 · 1 revision
  • About documentation
  • About data structure related to input and output
  • About programming model for input and output
  • About compiling
  • About multithreading
  • About interface

About documentation

  • There is little documentation about the C-API at present, and the example provided is relatively simple, so beginners develop with the C-API slowly. Some users resolve their problems by reading the source code.
  • Users are unclear about the working principles and workflow of the C-API (#2853), and some still ask how to write a data reader (#5393) with it. Users want more detailed documentation covering basic operations such as loading a model, preparing the expected input data, running the model, and exporting its output.
  • When using the C-API, the most important thing is to figure out the input and output data formats. Users want more documentation about the data-definition APIs, such as paddle_matrix_create_sparse. (#2697)
  • Currently there is only one English introduction to using the C-API, yet users still ask related questions, so a Chinese introduction is recommended.
  • Can a model trained with another framework (such as TensorFlow or Caffe) use Paddle's C-API for inference? I tried to find the answer in the documentation and could not, although there was a discussion in an issue.
  • While a model is running it emits some unnecessary logs; how can these logs be suppressed?

Present situation:

  • Regarding the working principles and workflow of the C-API, Zhang Chao and Cao Ying have a well-written PR under models: #372
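The basic load → feed → forward → fetch cycle that users ask for could be sketched roughly as follows. The function names follow the legacy `paddle/capi` headers, but treat the exact signatures, the model filename, the input width, and the `read_file` helper as illustrative assumptions rather than a definitive recipe:

```c
/* Sketch of a minimal C-API inference flow (CPU mode).
 * "merged_model.paddle", the 784-wide input, and read_file() are
 * placeholders, not part of the real API. */
#include <paddle/capi.h>
#include <stdlib.h>

int main(void) {
  /* 1. Initialize Paddle once per process. */
  char* argv[] = {"--use_gpu=False"};
  paddle_init(1, argv);

  /* 2. Load a merged model file into memory and build the machine. */
  long size = 0;
  void* buf = read_file("merged_model.paddle", &size); /* hypothetical helper */
  paddle_gradient_machine machine;
  paddle_gradient_machine_create_for_inference_with_parameters(&machine, buf, size);

  /* 3. Prepare one dense input row. */
  paddle_arguments in_args = paddle_arguments_create_none();
  paddle_arguments_resize(in_args, 1);
  paddle_matrix mat = paddle_matrix_create(/*height=*/1, /*width=*/784, /*useGpu=*/false);
  paddle_real* row = NULL;
  paddle_matrix_get_row(mat, 0, &row);     /* fill row[0..783] with your data */
  paddle_arguments_set_value(in_args, 0, mat);

  /* 4. Run the forward pass and fetch the output matrix. */
  paddle_arguments out_args = paddle_arguments_create_none();
  paddle_gradient_machine_forward(machine, in_args, out_args, /*isTrain=*/false);
  paddle_matrix prob = paddle_matrix_create_none();
  paddle_arguments_get_value(out_args, 0, prob);
  return 0;
}
```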

About data structure related to input and output

  • Can the C-API support input that mixes multiple slot types (#3182)? There is currently no such case in the demos. #4303
  • How does the C-API load multiple different models for inference? #4297

About programming model for input and output

  • How can one check whether the model output is empty? #6207
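One plausible emptiness check, assuming the legacy matrix API (`paddle_matrix_get_shape` and `kPD_NO_ERROR` appear in the C-API headers, but the exact usage here is a sketch, with `out_args` taken from a previous forward pass):

```c
/* Sketch: after forward(), inspect the shape of the fetched output. */
uint64_t height = 0, width = 0;
paddle_matrix prob = paddle_matrix_create_none();
paddle_arguments_get_value(out_args, 0, prob);
if (paddle_matrix_get_shape(prob, &height, &width) != kPD_NO_ERROR ||
    height == 0 || width == 0) {
  /* output is missing or empty */
}
```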

About compiling

  • Some users learn the C-API by referring to this document. Although users can download binary files directly from CI, there are a few reasons they still have to compile locally:
    • The binary file on CI is generated on CentOS 6.3, and users may be on another version or system. (At the weekend I checked the files on CI and found only the CentOS 6.3 GPU binary file.) #4302
    • Users do not know which compiler options the binary file was generated with. Some users download the binary directly and run the demos but always fail; the reason is that the binary was built with WITH_AVX, while the user's machine does not support AVX. #3543
  • When users compile the C-API locally, they do not know which options to use. For example, some users generate the Makefile with WITH_C_API=ON, WITH_SWIG_PY=OFF, WITH_PYTHON=ON but then hit problems when compiling. (I tried this at the weekend; there was no problem.)
  • In the capi directory there are three usage examples. Although all three have a CMakeLists.txt, they are incomplete: they do not specify the include paths or the libraries to link, so building them directly produces errors. #4409
  • Failure to link the library. #2863
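As a reference for the missing include and link settings, a build line might look like the following. `PADDLE_ROOT` is an assumed install prefix, not an official variable, and the library name follows the capi build output (`libpaddle_capi_shared`); both should be checked against the actual package:

```shell
# Illustrative only: paths depend on where the C-API package was installed.
PADDLE_ROOT=/path/to/paddle_capi
gcc main.c -o infer \
    -I"$PADDLE_ROOT/include" \
    -L"$PADDLE_ROOT/lib" \
    -lpaddle_capi_shared -lpthread
```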

About multithreading

  • paddle_init is global and may be called repeatedly; repeated calls are turned into no-ops by a static variable inside paddle_init. This approach may be fine for CPU multithreading, but it is problematic for GPU multithreading. #4297
  • When the C-API library performs GPU multithreaded inference, a cublas status: not initialized error occurs (#5669), which may be a multithreading bug: the GPU resources of threads other than the main thread are not initialized.
  • Users report that multithreaded inference loses data. #4227 #4294 #4040 #5433

About interface

  • The interface for GPU data input and output is not user friendly. When using the GPU for inference and exporting data, we first have to call paddle_matrix_get_row to obtain a pointer of type paddle_real, and then call cudaMemcpy(void* dst, const void* src, size_t count, cudaMemcpyKind kind) to copy the data. The process is very tedious. #5490
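The tedious manual copy described above looks roughly like this. This is a sketch: the variable names are illustrative, and the assumption is that with a GPU matrix the row pointer refers to device memory, so a device-to-host copy is needed:

```c
/* Sketch: copying one GPU output row to the host by hand. */
paddle_real* gpu_row = NULL;
paddle_matrix_get_row(prob, 0, &gpu_row);      /* device pointer in GPU mode */
size_t bytes = width * sizeof(paddle_real);    /* width from the output shape */
paddle_real* host_row = (paddle_real*)malloc(bytes);
cudaMemcpy(host_row, gpu_row, bytes, cudaMemcpyDeviceToHost);
/* ... use host_row ... */
free(host_row);
```

A friendlier interface would hide this copy behind a single accessor instead of exposing raw device pointers to the user.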