
Generating embeddings without serving up the model checkpoint each time #17

Open
erichare opened this issue Feb 10, 2020 · 2 comments

@erichare

Hello,

Gobbli is a fantastic package, and I've been trying to use it in some of my work. One issue: it seems like the BERT checkpoint is being loaded on every call to embed(), which makes embedding generation take 20-30 seconds per call on my machine.

Is there a way to "serve up" this model so that subsequent calls to embed() don't have to load the model checkpoint each time? Or would this require quite a bit of restructuring?
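
For reference, here is roughly the calling pattern I have in mind. This is just a sketch, and the class and attribute names (EmbedInput, X_embedded, etc.) are how I understand gobbli's embedding API rather than verbatim from the docs:

```python
# Rough sketch of the usage pattern in question; gobbli class and
# attribute names here are assumed from my reading of the embedding
# API and may not match exactly.
import gobbli.io
from gobbli.model.bert import BERT

model = BERT()
model.build()  # one-time setup (image/weights download)

batches = [["first document"], ["second document"]]

# Each embed() call appears to reload the BERT checkpoint inside the
# container, so every iteration pays the ~20-30 second startup cost.
for batch in batches:
    embed_input = gobbli.io.EmbedInput(X=batch)
    embed_output = model.embed(embed_input)
    print(embed_output.X_embedded)
```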

@jasonnance
Collaborator

I'm glad you're finding gobbli useful!

You're correct that each call to embed() does a lot of work, although it's not just loading the checkpoint -- it's also writing all your data to disk and reading it inside the container, then writing all the embeddings to disk in the container and reading them back outside. Depending on how big your dataset is, that I/O might be taking more of the time than the checkpoint load.

There isn't currently a way around this -- I'd consider it a fundamental limitation of gobbli's design. If latency is important to you, you may want to look into something like https://github.com/hanxiao/bert-as-service, which is better suited to serving low-latency responses. gobbli is only intended for experimental/batch workloads -- it's designed to help you quickly determine whether a model will work in a production situation rather than to serve a production model.
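
To make the comparison concrete, here's a minimal sketch of the bert-as-service approach, assuming you've installed and started its server separately per that project's README. The checkpoint stays loaded in the long-running server process, so each client call only pays inference cost:

```python
# Minimal bert-as-service client sketch (assumes the server was started
# separately, e.g.:
#   bert-serving-start -model_dir /path/to/uncased_L-12_H-768_A-12 -num_worker=1
# so the checkpoint is already loaded in a long-running process).
from bert_serving.client import BertClient

bc = BertClient()  # connects to the running server
vecs = bc.encode(["first document", "second document"])
print(vecs.shape)  # (2, 768) for BERT-base
```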

It would be theoretically possible to rework gobbli's model Docker containers into e.g. REST API services (as opposed to single-run batch processes) which could be spun up once and reused across calls. However, that would be a fair amount of work, since we'd essentially have to build a mostly-the-same-but-slightly-different API server within the constraints of a host of different Python environments. I don't see that happening any time soon.
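
To illustrate the idea (purely hypothetical -- none of this exists in gobbli), each model container's entrypoint would become a small HTTP service that loads the checkpoint once at startup and then serves requests from memory:

```python
# Hypothetical sketch only: gobbli does not ship anything like this.
# The expensive checkpoint load happens once at startup; each request
# then only pays inference cost.
from flask import Flask, jsonify, request


def load_model():
    """Stand-in for the slow step: loading the real BERT checkpoint."""

    class DummyModel:
        def embed(self, texts):
            # Placeholder; a real service would run the model here.
            return [[0.0, 0.0, 0.0] for _ in texts]

    return DummyModel()


app = Flask(__name__)
model = load_model()  # loaded exactly once, when the container starts


@app.route("/embed", methods=["POST"])
def embed():
    texts = request.get_json()["texts"]
    return jsonify({"embeddings": model.embed(texts)})


if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8000)
```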

I'll leave this issue open for discussion for a bit, but I don't think there's much we can do about it in the near-term.

@erichare
Author

Thank you so much for that response, Jason. I had a suspicion this was the case, and it makes sense why it's a technical challenge.
