by Brayton Hall
The goal is to use state-of-the-art transformers in novel ways, and quite possibly on novels themselves. Specifically, it is to test how transformers with many parameters can be fine-tuned with very little labeled data (i.e., few-shot learning), so that a large language model can learn, by itself, to approximate a task such as summarizing the entirety of War and Peace (or any long-form document), with output options along different embeddings to capture desired themes. The potential of this kind of few-shot learning, in conjunction with active learning, is enormous.
Summarizing the entirety of War and Peace via zero-shot chunked summarization.
The following summary corresponds to roughly the first 5,120 words of War and Peace, or about 1/100 of the novel. The chunk size could be adjusted, and the model fine-tuned on certain 'themes' or embeddings produced with a bit of manual supervision and active learning.
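To make the chunking step concrete, here is a minimal sketch of zero-shot chunked summarization using the Hugging Face transformers library. The checkpoint (facebook/bart-large-cnn), the per-chunk word count, and the file name are assumptions for illustration, not the post's actual setup; BART reads at most about 1,024 tokens at a time, so each chunk is kept well under that limit and the per-chunk summaries are concatenated to cover a longer span such as the first ~5,120 words.

```python
# Sketch of zero-shot chunked summarization (assumed model and chunk size).
from transformers import pipeline

# facebook/bart-large-cnn is an assumption; any summarization checkpoint works.
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

def chunk_words(text: str, words_per_chunk: int = 700):
    """Split a long document into fixed-size word chunks small enough
    to fit within the model's ~1,024-token context window."""
    words = text.split()
    for i in range(0, len(words), words_per_chunk):
        yield " ".join(words[i : i + words_per_chunk])

def summarize_long_document(text: str) -> str:
    """Zero-shot summary: summarize each chunk independently, then
    join the partial summaries into one running summary."""
    partial_summaries = [
        summarizer(chunk, max_length=120, min_length=30, truncation=True)[0]["summary_text"]
        for chunk in chunk_words(text)
    ]
    return " ".join(partial_summaries)

if __name__ == "__main__":
    # war_and_peace.txt is a placeholder for a local plain-text copy of the novel.
    with open("war_and_peace.txt", encoding="utf-8") as f:
        full_text = f.read()
    # Summarize only the first ~5,120 words, about 1/100 of the novel.
    first_slice = " ".join(full_text.split()[:5_120])
    print(summarize_long_document(first_slice))
```

The joined chunk summaries could themselves be fed back through the summarizer for a shorter second-pass summary, which is one simple way to scale this approach to the whole novel.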