[CLIP4STR] Integrate CLIP4STR #20

Bin-NV · 2025-01-02T09:09:15Z

Support CLIP4STR inference using the visual-branch and text-branch TensorRT engines together
Support uppercase and lowercase output

Tyler-D · 2025-01-06T07:55:54Z

include/nvocdr.h

+  bool ocrnet_only_alnum = false;
+  bool ocrnet_only_lowercase = false;
+  char* ocrnet_vocab_file;
+  int ocrnet_vocab_size = 32000;


Just curious, why can't we use a dict_file or vocabulary file to determine these parameters:

ocrnet_only_alnum, ocrnet_only_lowercase, ocrnet_vocab_file, ocrnet_vocab_size?

The raw output from CLIP4STR includes 0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~
so the ocrnet_only_alnum and ocrnet_only_lowercase flags control whether uppercase letters and symbols are filtered out.

ocrnet_vocab_file includes 26k words, we only need 32000 words here
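
To make the two options in this thread concrete, here is a minimal sketch of both: filtering driven by the two boolean flags, and filtering driven by a character set such as one loaded from a dict file. The function names `filterByFlags` and `filterByCharset` are hypothetical and do not come from the PR. The sketch also assumes that "lowercase only" folds uppercase letters to lowercase rather than dropping them; the PR may do otherwise.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper (not from the PR): apply the two flags to a raw prediction.
// Assumption: only_lowercase folds 'A'..'Z' to 'a'..'z' instead of dropping them.
std::string filterByFlags(const std::string& raw, bool only_alnum, bool only_lowercase)
{
    std::string out;
    out.reserve(raw.size());
    for (unsigned char c : raw) {
        if (only_alnum && !std::isalnum(c)) {
            continue;  // drop symbols, punctuation and spaces
        }
        out.push_back(static_cast<char>(only_lowercase ? std::tolower(c) : c));
    }
    return out;
}

// Hypothetical alternative: keep only characters listed in a charset,
// for example the contents of a dict file read into a string.
std::string filterByCharset(const std::string& raw, const std::string& charset)
{
    std::string out;
    out.reserve(raw.size());
    for (char c : raw) {
        if (charset.find(c) != std::string::npos) {
            out.push_back(c);  // character is part of the allowed set
        }
    }
    return out;
}
```

With the charset variant, a single dict file describes the allowed output, so no separate flags are needed. The flag variant needs no extra file but can only express the two fixed cases.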

Tyler-D · 2025-01-06T08:03:40Z

include/nvocdr.h

+  char* ocrnet_vocab_file;
+  int ocrnet_vocab_size = 32000;
+  // char* charset_train = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ!\"#$%&'()*+,-./:;<=>?@[\\]^_`{|}~";
+  // char* charset_test = "0123456789abcdefghijklmnopqrstuvwxyz";


Remove the unused code.

Tyler-D · 2025-01-06T08:05:43Z

src/OCRNetEngine.h

 };

+
+class SimpleTokenizer 


Could you move the tokenizer to a separate file?

Tyler-D · 2025-01-06T08:09:42Z

src/OCRNetEngine.h

+            initTokenizer(bpe_path, vocab_size);
+        }
+
+        void initTokenizer(const std::string& bpe_path, int vocab_size)


Could you separate the declaration and definition in two files?
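
A rough sketch of what the requested split could look like, with the class declaration in a header and the member definitions in a source file. Both parts are shown in one block, separated by comments, so the example compiles as a single unit. The file names, the `mMerges` member, the `mergeCount` accessor, and the body of `initTokenizer` (one entry per line, stop after `vocab_size` entries) are assumptions for illustration; only the class name and the `initTokenizer` signature appear in the diff.

```cpp
// ---- SimpleTokenizer.h (hypothetical file name): declaration only ----
#include <cstddef>
#include <string>
#include <vector>

class SimpleTokenizer {
public:
    SimpleTokenizer(const std::string& bpe_path, int vocab_size);
    void initTokenizer(const std::string& bpe_path, int vocab_size);
    std::size_t mergeCount() const;  // hypothetical accessor

private:
    std::vector<std::string> mMerges;  // hypothetical member
};

// ---- SimpleTokenizer.cpp (hypothetical file name): definitions ----
#include <fstream>
#include <stdexcept>

SimpleTokenizer::SimpleTokenizer(const std::string& bpe_path, int vocab_size)
{
    initTokenizer(bpe_path, vocab_size);
}

// Assumed behaviour: read non-empty lines until vocab_size entries are stored.
void SimpleTokenizer::initTokenizer(const std::string& bpe_path, int vocab_size)
{
    std::ifstream file(bpe_path);
    if (!file) {
        throw std::runtime_error("cannot open vocab file: " + bpe_path);
    }
    mMerges.clear();
    std::string line;
    while (static_cast<int>(mMerges.size()) < vocab_size && std::getline(file, line)) {
        if (!line.empty()) {
            mMerges.push_back(line);
        }
    }
}

std::size_t SimpleTokenizer::mergeCount() const
{
    return mMerges.size();
}
```

Keeping the definitions out of the header means OCRNetEngine.h only needs to include the small declaration, and changes to the tokenizer body do not force every includer to recompile.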

Tyler-D · 2025-01-06T08:11:52Z

src/OCRNetEngine.h

    private:
        std::unique_ptr<TRTEngine> mEngine;
        std::vector<std::string> mDict;
        bool mUDFlag;
        DecodeMode mDecodeMode;

-        // int mDecodeOutputBufferIndex;
+        // CLIP4STR
+        std::unique_ptr<TRTEngine> mTextEngine;


Basically, since this already breaks the notion of "one model, one engine", I would suggest creating a new class that inherits from OCRNetEngine for CLIP4STR.

ok, will do
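
One way the suggested refactor might look, as a sketch only. `TRTEngine` and `OCRNetEngine` here are simplified stand-ins, since the real classes are not shown in this review, and the subclass name `CLIP4STREngine`, the constructor arguments, and `engineCount` are hypothetical. The point is that the base class keeps a single engine while the subclass owns the additional text-branch engine.

```cpp
#include <memory>
#include <string>
#include <utility>

// Stand-in for the project's TRTEngine; it only remembers the engine path.
class TRTEngine {
public:
    explicit TRTEngine(std::string path) : mPath(std::move(path)) {}
    const std::string& path() const { return mPath; }

private:
    std::string mPath;
};

// Simplified base class: one model, one engine.
class OCRNetEngine {
public:
    explicit OCRNetEngine(const std::string& engine_path)
        : mEngine(std::make_unique<TRTEngine>(engine_path)) {}
    virtual ~OCRNetEngine() = default;

    virtual int engineCount() const { return 1; }  // hypothetical, for illustration

protected:
    std::unique_ptr<TRTEngine> mEngine;
};

// Hypothetical subclass: the visual branch reuses the base engine slot,
// and the text branch lives only in the CLIP4STR-specific class.
class CLIP4STREngine : public OCRNetEngine {
public:
    CLIP4STREngine(const std::string& visual_path, const std::string& text_path)
        : OCRNetEngine(visual_path),
          mTextEngine(std::make_unique<TRTEngine>(text_path)) {}

    int engineCount() const override { return 2; }
    const TRTEngine& visualEngine() const { return *mEngine; }
    const TRTEngine& textEngine() const { return *mTextEngine; }

private:
    std::unique_ptr<TRTEngine> mTextEngine;
};
```

With this layout, callers that hold a pointer to `OCRNetEngine` are unaffected, and the `mTextEngine` member no longer has to appear in the base class.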

[CLIP4STR] Integrate CLIP4STR

5f22703

Bin-NV requested a review from Tyler-D January 2, 2025 09:09

Tyler-D requested changes Jan 6, 2025
