From 102c582fb17b987a79d19def18e1d84aa2003d28 Mon Sep 17 00:00:00 2001
From: Omar Khattab
Date: Wed, 4 Sep 2024 10:32:50 -0700
Subject: [PATCH] Update 2024.09.impact.md

---
 2024.09.impact.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/2024.09.impact.md b/2024.09.impact.md
index 17001a2..3fe63d9 100644
--- a/2024.09.impact.md
+++ b/2024.09.impact.md
@@ -48,7 +48,7 @@ Instead, think at least two steps ahead. Identify the path most people are likel
 
 What might this look like in practice? Let's revisit the ColBERT case study. The obvious way to build efficient retrievers with BERT is to encode documents into a vector. Interestingly, there was only limited IR work doing that by late 2019. For example, the best-cited work in this category (DPR) only had its first preprint released in April 2020. Given this, you might think that the right thing to do in 2019 was to build a great single-vector IR model via BERT. In contrast, thinking just two steps ahead would be to ask: everyone will be building single-vector methods sooner or later, so where will this single-vector approach get fundamentally stuck? And indeed, that question led to the [late interaction](https://x.com/lateinteraction/status/1736804963760976092) paradigm and [widely-used models](https://huggingface.co/colbert-ir/colbertv2.0).
 
-As another example, we could use [DSPy](https://github.com/stanfordnlp/dspy). In February 2022, as prompting is becoming decently powerful, it was clear that people will want to do retrieval-based QA with prompting, not with fine-tuning like it used to be. A natural thing to do would be to build a method for just that. Thinking two steps ahead would be to ask: where will such approaches get stuck? Ultimately, retrieve-then-generate (or "RAG") approaches are perhaps the simplest possible pipeline involving LMs. For the same reasons people will be interested in it, they would increasingly be interested in (i) expressing more complex modular compositions and (ii) figuring out how the resulting sophisticated pipelines should be supervised, via automated prompting or finetuning of the underlying LMs. That's DSPy.
+As another example, we could use [DSPy](https://github.com/stanfordnlp/dspy). In February 2022, as prompting was becoming decently powerful, it was clear that people would want to do retrieval-based QA with prompting, not with fine-tuning as they used to. A natural thing to do would be to build a method for just that. Thinking two steps ahead would be to ask: where will such approaches get stuck? Ultimately, retrieve-then-generate (or "RAG") approaches are perhaps the simplest possible pipeline involving LMs. For the same reasons people would be interested in it, it was clear that they would increasingly be interested in (i) expressing more complex modular compositions and (ii) figuring out how the resulting sophisticated pipelines should be supervised or optimized, via automated prompting or finetuning of the underlying LMs. That's DSPy.
 
 The second half of this guideline is "iterate fast". This was perhaps the very first piece of research advice I received from my advisor Matei Zaharia, in week one of my PhD: by identifying a version of the problem you can iterate quickly on and receive feedback (e.g., latency or validation scores), you greatly improve your chances of solving hard problems. This is especially important if you will be thinking two steps ahead, which is already hard and uncertain enough.
@@ -59,7 +59,7 @@ At this point, you’ve identified a good problem and then iterated until you di
 
 A common first step is to release the paper as a preprint on arXiv and then post a “thread” (or similar) announcing it. When you do this, make sure your thread begins with a concrete, substantial, and accessible claim. The goal isn’t to tell people that you released a paper; that doesn't carry inherent value. The goal is to communicate your key argument in a direct and vulnerable but engaging way, in the form of a specific statement that people can agree or disagree with. (Yes, I know this is hard, but it is necessary.)
 
-Perhaps more importantly, this whole process does not end after the first "release". It starts with the release. Given that now you're investing in projects, not just papers, your ideas _and of your scientific communication_ persist year-long, well beyond isolated paper releases. Let me illustrate why this matters. When I help grad students “tweet” about their work, it's not uncommon that their initial post doesn’t get as much traction as hoped. Students typically assume this validates their fear of posting about their research and take it as yet another sign that they should just move on to the next paper. Obviously, this is not correct.
+Perhaps more importantly, this whole process does not end after the first "release". It starts with the release. Given that you're now investing in projects, not just papers, your ideas _and your scientific communication_ persist over years, well beyond isolated paper releases. Let me illustrate why this matters. When I help grad students “tweet” about their work, it's not uncommon that their initial post doesn’t get as much traction as hoped. Students typically assume this validates their fear of posting about their research and take it as yet another sign that they should just move on to the next paper. Obviously, this is not correct.
 
 A lot of personal experience, second-hand experience, and observation suggests that this is a place where persistence is massively helpful (and, by the way, exceedingly rare). With few exceptions, gaining traction for good ideas requires telling people the key things many times in different contexts, and evolving your thoughts and how you communicate them, either until the community can absorb these ideas or until the field reaches the right stage of development to appreciate them.