
[Feature request]: Handle Splitting? #1055

Open

rehandaphedar opened this issue Oct 19, 2024 · 2 comments

Labels: enhancement (New feature or request)

@rehandaphedar

What do you need?

It would be great if fabric could automatically handle splitting/chunking for text that is too large for a given model.

From what I understand, this would need the following (a rough sketch of the first two items follows the list):

  • Information about the token limit of the model being used
  • A way to count the tokens in a specific request
  • A range of splitting options (character, word, sentence, recursive, semantic, etc.) to choose from
  • Possibly a way to select the LLM used for semantic splitting
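
As a rough illustration of the first two items, a naive version could look something like the sketch below (Go, since that is what fabric itself is written in). Everything here is an assumption for illustration: `approxTokens` uses the crude ~4-characters-per-token heuristic rather than a real tokenizer, and `splitByTokens` is a hypothetical helper, not an existing fabric function.

```go
package main

import (
	"fmt"
	"strings"
)

// approxTokens estimates the token count of s with the crude
// ~4-characters-per-token heuristic. A real implementation would use the
// target model's own tokenizer, since counts vary per model.
// (Hypothetical helper, not fabric's API.)
func approxTokens(s string) int {
	return (len(s) + 3) / 4
}

// splitByTokens is a naive word-level splitter: it packs whitespace-
// separated words into chunks whose estimated token count stays at or
// below maxTokens.
func splitByTokens(text string, maxTokens int) []string {
	var chunks []string
	var b strings.Builder
	for _, word := range strings.Fields(text) {
		if b.Len() > 0 && approxTokens(b.String())+approxTokens(word) > maxTokens {
			chunks = append(chunks, b.String())
			b.Reset()
		}
		if b.Len() > 0 {
			b.WriteByte(' ')
		}
		b.WriteString(word)
	}
	if b.Len() > 0 {
		chunks = append(chunks, b.String())
	}
	return chunks
}

func main() {
	for i, c := range splitByTokens(strings.Repeat("lorem ipsum dolor ", 100), 50) {
		fmt.Printf("chunk %d: ~%d tokens\n", i, approxTokens(c))
	}
}
```

The word/sentence/recursive/semantic options from the list would then slot in as alternative splitters behind the same interface.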
@mattjoyce
Contributor

I understand the concept of chunking, but how would it work in this environment?
Say I have a large file and pipe it to `fabric -p summarize`: it splits the file in various ways, summarizes each chunk, and joins the results?

I'm sceptical about the efficacy and utility, but interested to hear your thoughts.

@rehandaphedar
Author

> it splits the file in various ways, summarizes each chunk, and joins the results?

Yes, though not just for summarising. I was thinking that the patterns could be modified to inform the LLM that the given input is part of a larger input, and possibly to include the outputs of previous chunks (in the case of, e.g., sliding-window approaches).

I'm not sure how difficult the advanced splitting options would be to implement, for example those that feed previous chunks' output as input to the next chunk. However, simple splitting should hopefully be easy to implement, and it would be very helpful even with its inefficiencies.
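
To make the sliding-window case concrete, a minimal sketch of the driver loop might look like this (again Go, with everything hypothetical: `runPattern` stands in for however fabric would actually execute a pattern against the model, and the prompt wording is just an example):

```go
package main

import (
	"fmt"
	"strings"
)

// processSequentially is an illustrative sliding-window driver: every
// chunk's prompt tells the model it is seeing one part of a larger input
// and carries the previous chunk's output as context. The per-chunk
// outputs are simply concatenated at the end. runPattern is a
// hypothetical stand-in for whatever fabric would use to run a pattern.
func processSequentially(chunks []string, runPattern func(prompt string) (string, error)) (string, error) {
	var outputs []string
	prev := ""
	for i, chunk := range chunks {
		var p strings.Builder
		fmt.Fprintf(&p, "This is part %d of %d of a larger input.\n", i+1, len(chunks))
		if prev != "" {
			p.WriteString("Output for the previous part:\n" + prev + "\n")
		}
		p.WriteString("Input:\n" + chunk)

		out, err := runPattern(p.String())
		if err != nil {
			return "", err
		}
		prev = out
		outputs = append(outputs, out)
	}
	return strings.Join(outputs, "\n\n"), nil
}

func main() {
	// Dummy runPattern that only echoes, to show the control flow.
	echo := func(prompt string) (string, error) {
		return fmt.Sprintf("[output for a %d-char prompt]", len(prompt)), nil
	}
	result, err := processSequentially([]string{"first chunk", "second chunk"}, echo)
	if err != nil {
		panic(err)
	}
	fmt.Println(result)
}
```

Simple splitting would just be the same loop with the `prev` context dropped.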

Regarding combining the outputs, I think even just concatenating them would still be very helpful. I'm not aware of how other programs handle it, though.

Regarding utility, handling chunking would be extremely useful: one could run, for example, `pdftotext book.pdf - | fabric -p extract_wisdom` and similar commands without worrying about the token limit.
