Pointed out by a number of people including Katherine Tian.
Sewon's reply / points:
You are right that FactScore does not work in a batch. I think it should be possible to modify the code to make it work in a batch. My recommendation is to identify which part of the pipeline is the speed bottleneck, and then parallelize the slowest parts one at a time. There are four possible bottlenecks: (1) atomic fact generation, (2) GTR retrieval, (3) InstLLAMA generation, (4) NPM verification.
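One way to locate the slow stage is to time each one in isolation. A minimal sketch (the four stage functions here are hypothetical stand-ins for the pipeline steps, not actual FactScore APIs):

```python
import time
from contextlib import contextmanager

@contextmanager
def timed(name):
    # Print how long the wrapped block took.
    start = time.perf_counter()
    yield
    print(f"{name}: {time.perf_counter() - start:.2f}s")

# The calls below are hypothetical stand-ins for the four stages.
with timed("atomic fact generation"):
    facts = generate_atomic_facts(generation)
with timed("GTR retrieval"):
    passages = retrieve_passages(facts)
with timed("InstLLAMA generation"):
    answers = run_inst_llama(facts, passages)
with timed("NPM verification"):
    scores = run_npm(facts, passages)
```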
If (1) is the bottleneck: the bottleneck is the OpenAI API, possibly because you are sharing the API key with many others and running into rate-limit errors. You can check whether this is the case by always printing these lines. If so, the best strategy is to use another API key that is not shared by others. Beyond that, it's not straightforward to speed this part up. (In the past, we've heard from users that this was the main speed bottleneck in their cases, but it's definitely possible it's not the case for you.)
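If rate limits are indeed the problem, a retry loop with exponential backoff at least makes the stalls visible. A minimal sketch, assuming the pre-1.0 `openai` SDK (adjust the exception import for newer SDK versions):

```python
import time
import openai
from openai.error import RateLimitError  # pre-1.0 openai SDK

def call_with_backoff(prompt, max_retries=5):
    # Retry an OpenAI call with exponential backoff so rate-limit
    # errors are printed instead of silently slowing the pipeline.
    for attempt in range(max_retries):
        try:
            return openai.ChatCompletion.create(
                model="gpt-3.5-turbo",
                messages=[{"role": "user", "content": prompt}],
            )
        except RateLimitError:
            wait = 2 ** attempt
            print(f"Rate limit hit; retrying in {wait}s")
            time.sleep(wait)
    raise RuntimeError("exceeded retry budget for the OpenAI API")
```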
If (2) is the bottleneck, you can batch the encoding of the query vectors in passage retrieval.
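GTR checkpoints are available through sentence-transformers, whose `encode()` already accepts a list of queries and batches internally. A sketch (the checkpoint name here is an assumption; substitute whichever GTR model the retriever actually loads):

```python
from sentence_transformers import SentenceTransformer

# Checkpoint name is an assumption; use the GTR model the
# retriever actually loads.
encoder = SentenceTransformer("sentence-transformers/gtr-t5-large")

queries = ["first atomic fact", "second atomic fact", "third atomic fact"]
# One batched call replaces a per-query encoding loop.
query_vectors = encoder.encode(queries, batch_size=32)
```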
If (3) is the bottleneck, you can make the `_generate` function work in a batch.
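A generic sketch of batched decoding with Hugging Face transformers, not the repo's actual `_generate` code (the checkpoint name is a placeholder; point it at the Inst-LLAMA weights the repo prepares):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder checkpoint; substitute the Inst-LLAMA weights.
name = "huggyllama/llama-7b"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name)

# Left-padding keeps generated tokens aligned at the end of each row.
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "left"

prompts = ["True or False? <fact 1> ...", "True or False? <fact 2> ..."]
inputs = tokenizer(prompts, return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=32)
texts = tokenizer.batch_decode(outputs, skip_special_tokens=True)
```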
If (4) is the bottleneck, you can make NPM work in a batch, or skip NPM by specifying `retrieval+llama` instead of `retrieval+llama+npm`, which I think should give decent results.
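For skipping NPM, the change is just the estimator name. A hedged example of the command-line invocation (the flag names follow the repo's README; the input path is a placeholder, so check your local version):

```bash
# "retrieval+llama" drops the NPM verification stage.
python -m factscore.factscorer --input_path data.jsonl --model_name "retrieval+llama"
```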