Merge branch 'current' into cberger_add_git_strategies_blog

dbt-labs · Feb 21, 2025 · 4ef880b
2 parents 04bdd8b + 768822b
commit 4ef880b

Showing 109 changed files with 2,634 additions and 671 deletions.
diff --git a/website/api/get-discourse-comments.js b/website/api/get-discourse-comments.js
@@ -9,10 +9,12 @@ const PREVIEW_ENV = 'deploy-preview-'
 // Set API endpoint and headers
 let discourse_endpoint = `https://discourse.getdbt.com`
 let headers = {
-  'Accept': 'application/json',
-  'Api-Key': DISCOURSE_DEVBLOG_API_KEY,
-  'Api-Username': DISCOURSE_USER_SYSTEM,
-}    
+  Accept: "application/json",
+  "Api-Key": DISCOURSE_DEVBLOG_API_KEY,
+  "Api-Username": DISCOURSE_USER_SYSTEM,
+  // Cache comments in the browser (max-age) & CDN (s-maxage) for 1 day
+  "Cache-Control": "max-age=86400, s-maxage=86400, stale-while-revalidate",
+};    
 
 async function getDiscourseComments(request, response) {
   let topicId, comments, DISCOURSE_TOPIC_ID;

diff --git a/website/api/get-discourse-topics.js b/website/api/get-discourse-topics.js
@@ -9,10 +9,12 @@ async function getDiscourseTopics(request, response) {
     // Set API endpoint and headers
     let discourse_endpoint = `https://discourse.getdbt.com`
     let headers = {
-      'Accept': 'application/json',
-      'Api-Key': DISCOURSE_API_KEY,
-      'Api-Username': DISCOURSE_USER,
-    }
+      Accept: "application/json",
+      "Api-Key": DISCOURSE_API_KEY,
+      "Api-Username": DISCOURSE_USER,
+      // Cache topics in the browser (max-age) & CDN (s-maxage) for 1 day
+      "Cache-Control": "max-age=86400, s-maxage=86400, stale-while-revalidate",
+    };
 
     const query = buildQueryString(body)
     if(!query) throw new Error('Unable to build query string.')
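
Note on the `Cache-Control` value added in both API files: directives in this header are separated by commas, and a directive either carries a value (`max-age=86400`) or stands alone. A minimal sketch of assembling such a value, assuming a hypothetical helper `buildCacheControl` that is not part of this commit or of any library:

```javascript
// Hypothetical helper (illustration only, not part of this commit):
// turns { directive: value } pairs into a Cache-Control header value.
function buildCacheControl(directives) {
  return Object.entries(directives)
    // Skip directives that are switched off or unset
    .filter(([, value]) => value !== false && value != null)
    // `true` means a bare directive; anything else becomes name=value
    .map(([name, value]) => (value === true ? name : `${name}=${value}`))
    // Directives are comma-separated
    .join(", ");
}

const ONE_DAY_SECONDS = 60 * 60 * 24; // 86400

const cacheControl = buildCacheControl({
  "max-age": ONE_DAY_SECONDS, // browser cache lifetime
  "s-maxage": ONE_DAY_SECONDS, // shared cache (CDN) lifetime
  "stale-while-revalidate": true,
});

console.log(cacheControl);
```

Building the string from named pairs keeps the separator handling in one place, which may be easier to review than a hand-typed literal repeated across files.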

diff --git a/website/blog/2025-01-21-wish-i-had-a-control-plane-for-my-renovation.md b/website/blog/2025-01-21-wish-i-had-a-control-plane-for-my-renovation.md
@@ -47,7 +47,7 @@ Here’s the challenge: monitoring tools, by their nature, look backward. They
 
 [dbt Cloud](https://www.getdbt.com/product/dbt-cloud) unifies these perspectives into a single [control plane](https://www.getdbt.com/blog/data-control-plane-introduction), bridging proactive and retrospective capabilities:
 
-- **Proactive planning**: In dbt, you declare the desired [state](https://docs.getdbt.com/reference/node-selection/syntax#state-selection) of your data before jobs even run &mdash; your architectural plans are baked into the pipeline.
+- **Proactive planning**: In dbt, you declare the desired [state](https://docs.getdbt.com/reference/node-selection/state-selection) of your data before jobs even run &mdash; your architectural plans are baked into the pipeline.
 - **Retrospective insights**: dbt Cloud surfaces [job logs](https://docs.getdbt.com/docs/deploy/run-visibility), performance metrics, and test results, providing the same level of insight as traditional monitoring tools.
 
 But the real power lies in how dbt integrates these two perspectives. Transformation logic (the plans) and monitoring (the inspections) are tightly connected, creating a continuous feedback loop where issues can be identified and resolved faster, and pipelines can be optimized more effectively.

diff --git a/website/blog/2025-01-23-levels-of-sql-comprehension.md b/website/blog/2025-01-23-levels-of-sql-comprehension.md
@@ -12,7 +12,6 @@ date: 2025-01-23
 is_featured: true
 ---
 
-
 Ever since [dbt Labs acquired SDF Labs last week](https://www.getdbt.com/blog/dbt-labs-acquires-sdf-labs), I've been head-down diving into their technology and making sense of it all. The main thing I knew going in was "SDF understands SQL". It's a nice pithy quote, but the specifics are *fascinating.*
 
 For the next era of Analytics Engineering to be as transformative as the last, dbt needs to move beyond being a [string preprocessor](https://en.wikipedia.org/wiki/Preprocessor) and into fully comprehending SQL. **For the first time, SDF provides the technology necessary to make this possible.** Today we're going to dig into what SQL comprehension actually means, since it's so critical to what comes next.
@@ -145,6 +144,8 @@ In introducing these concepts, we’re still just scratching the surface. There'
 - How this is all going to roll into a step change in the experience of working with data
 - What it means for doing great data work
 
-Over the coming days, you'll be hearing more about all of this from the dbt Labs team - both familiar faces and our new friends from SDF Labs.
+To learn more, check out [The key technologies behind SQL Comprehension](/blog/sql-comprehension-technologies). 
+
+Over the coming days, you'll hear more about all of this from the dbt Labs team - both familiar faces and our new friends from SDF Labs.
 
 This is a special moment for the industry and the community. It's alive with possibilities, with ideas, and with new potential. We're excited to navigate this new frontier with all of you.
diff --git a/website/blog/2025-02-19-faster-project-parsing-with-rust.md b/website/blog/2025-02-19-faster-project-parsing-with-rust.md
@@ -0,0 +1,77 @@
+---
+title: "Parser, Better, Faster, Stronger: A peek at the new dbt engine"
+description: "Remember how dbt felt when you had a small project? You pressed enter and stuff just happened immediately? We're bringing that back."
+slug: faster-project-parsing-with-rust
+
+authors: [joel_labes]
+
+tags: [data ecosystem]
+hide_table_of_contents: false
+
+date: 2025-02-19
+is_featured: true
+---
+Remember how dbt felt when you had a small project? You pressed enter and stuff just happened immediately? We're bringing that back.
+
+<Lightbox src="/img/blog/2025-02-19-faster-project-parsing-with-rust/parsing_10k.gif" width="100%" title="Benchmarking tip: always try to get data that's good enough that you don't need to do statistics on it" />
+
+After a [series of deep dives](/blog/the-levels-of-sql-comprehension) into the [guts of SQL comprehension](/blog/sql-comprehension-technologies), let's talk about speed a little bit. Specifically, I want to talk about one of the most annoying slowdowns as your project grows: project parsing.
+
+When you're waiting a few seconds or a few minutes for things to start happening after you invoke dbt, it's because parsing isn't finished yet. But Lukas' [SDF demo at last month's webinar](https://www.getdbt.com/resources/webinars/accelerating-dbt-with-sdf) didn't have a big wait, so why not?
+
+<!-- truncate -->
+
+## A primer on parsing
+
+Parsing your project (remember: [not your SQL](/blog/the-levels-of-sql-comprehension)!) is how dbt builds the dependency graph of models and macros. If you've ever looked at a `manifest.json` and noticed all the `depends_on` blocks, that's what we're talking about.
+
+Without the resolved dependencies, dbt can't filter down to a subset of your project – this is why parsing is always an all-or-nothing affair. You can't do `dbt parse --select my_model+` because parsing is what works out what's on the other side of that plus. (Of course, most projects use partial parsing, so they are not starting from scratch every time.)
+
+All those refs and macros are defined in Jinja. I don't know if you've ever thought about how Jinja gets from curly braces into text, but it's pretty weird! It's actually a two-step process: first it gets converted into Python code, and then that Python code is *itself run to generate a string*!
+
+This is kinda slow. Not so much as a one-off, but a project with 10,000 nodes might have 15,000-20,000 dependencies, so every millisecond adds up.
+
+## What if we wanted it to be faster?
+
+Since running the code is slow, one way to get results faster is to not run the code. Since v1.0, dbt's parser has [used a static analyzer](https://github.com/dbt-labs/dbt-core/blob/main/docs/guides/parsing-vs-compilation-vs-runtime.md#:~:text=Simple%20Jinja%2DSQL%20models%20(using%20just%20ref()%2C%20source()%2C%20%26/or%20config()%20with%20literal%20inputs)%20are%20also%20statically%20analyzed%2C%20using%20a%20thing%20we%20built.%20This%20is%20very%20fast%20(~0.3%20ms)) to resolve refs when possible, which is [about 3x faster](https://docs.getdbt.com/reference/parsing#:~:text=For%20now%2C%20the%20static%20parser,speedup%20in%20the%20model%20parser) than going through the whole rigmarole above.
+
+<Lightbox src="/img/blog/2025-02-19-faster-project-parsing-with-rust/evaluation_strategies_1.png" width="100%" />
+
+The other way you could get the result faster is to run the code faster.
+
+The original author of Jinja also wrote [minijinja](https://github.com/mitsuhiko/minijinja) – a Rust implementation of a subset of the original Jinja library.
+
+This is not the post for a deep dive on *why* Rust and Python have such different performance characteristics, but the key takeaway is that [minijinja can *fully evaluate* a ref 30 times faster](https://github.com/mitsuhiko/minijinja/tree/main/benchmarks) than today's dbt can even *statically analyze* it.
+
+<Lightbox src="/img/blog/2025-02-19-faster-project-parsing-with-rust/evaluation_strategies_2.png" width="100%" />
+
+Our analysis in the leadup to dbt v1.0 showed that the static analyzer could handle 60% of models. Evaluating refs 30x faster in 60% of models would itself be great.
+
+But recall that static analysis was the workaround for evaluating Jinja being slow. Since **we can now evaluate Jinja faster than we can statically analyze it**, let's just<sup>†</sup> evaluate everything!
+
+<sup>†</sup>The word "just" is doing a *lot* of heavy lifting here. In practice, there's a lot happening behind the scenes to get both the performance of minijinja and the ability to process the full range of capabilities of a dbt project. Another story for another day.
+
+## What does this mean in practice?
+
+As you saw at the top of the post, I've been running some synthetic projects against an early build of the new dbt engine, and it's pretty snappy - **parsing a 10,000-model project in under 600ms**. Let's see how it goes with some other common project sizes:
+
+<Lightbox src="/img/blog/2025-02-19-faster-project-parsing-with-rust/parse_time_comparison_linear.png" width="100%" title="You might have to squint, but I promise there's a yellow line on each of those groups" />
+
+Even a 20,000-model project finished parsing in about a second. The equivalent cold parse took well over a minute, and a partial parse (with no changed files) took about 12 seconds.
+
+Let's look at one more comparison: **100k models. I need to break out the log scale for this one:**
+
+<Lightbox src="/img/blog/2025-02-19-faster-project-parsing-with-rust/parse_time_comparison_log.png" width="100%" />
+
+The new dbt engine parsed our 100,000-model example project in under 10 seconds, compared with almost 20 minutes.
+
+Let me be clear: I do not think you should put 100,000 models into your project! I mostly ran that one for the lols. But back in the realm of project sizes that actually exist:
+
+- If your project isn't currently eligible for partial parsing, cold parses in Rust are fast enough to make it a moot point.
+- Regardless of how your project parses today, your project will feel like it's a couple of orders of magnitude smaller than it is.
+
+## We're just getting started
+
+Speed is just one benefit to come from this integration, and pales in comparison to, say, [the importance of logical plans](https://roundup.getdbt.com/p/the-power-of-a-plan-how-logical-plans). But it sure is fun!
+
+The teams are still hard at work integrating the two tools, and we'll have more to share on how the developer experience will change thanks to SDF's tech at our [Developer Day event in March](https://www.getdbt.com/resources/webinars/dbt-developer-day).
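
Note on the post's parsing primer: the reason `--select my_model+` cannot be answered before parsing finishes can be sketched as a small graph walk. The `+` suffix asks for a model's descendants, and those are only known once every node's `depends_on` has been resolved. This is a minimal sketch under stated assumptions: the `nodes` object is a hypothetical, heavily simplified stand-in for a manifest, and `buildChildrenMap` and `selectWithDescendants` are illustrative names, not dbt functions.

```javascript
// Hypothetical, simplified stand-in for parsed project nodes.
// Each node lists the nodes it depends on (its parents).
const nodes = {
  stg_orders: { depends_on: { nodes: [] } },
  stg_customers: { depends_on: { nodes: [] } },
  orders: { depends_on: { nodes: ["stg_orders"] } },
  customer_orders: { depends_on: { nodes: ["orders", "stg_customers"] } },
};

// Invert depends_on (child -> parents) into a parent -> children map.
// This needs every node to be resolved first, which is why the whole
// project has to be parsed before any selection can happen.
function buildChildrenMap(allNodes) {
  const children = new Map(Object.keys(allNodes).map((name) => [name, []]));
  for (const [name, node] of Object.entries(allNodes)) {
    for (const parent of node.depends_on.nodes) {
      children.get(parent).push(name);
    }
  }
  return children;
}

// Illustrative equivalent of selecting `name+`: the node plus all of
// its descendants, found with a breadth-first walk.
function selectWithDescendants(allNodes, name) {
  const children = buildChildrenMap(allNodes);
  const selected = new Set([name]);
  const queue = [name];
  while (queue.length > 0) {
    const current = queue.shift();
    for (const child of children.get(current)) {
      if (!selected.has(child)) {
        selected.add(child);
        queue.push(child);
      }
    }
  }
  return [...selected];
}

const selection = selectWithDescendants(nodes, "stg_orders");
console.log(selection);
```

In this toy project, selecting `stg_orders` with its descendants picks up `orders` and `customer_orders` but not `stg_customers`, and nothing in `stg_orders` itself says so: the answer lives in the other nodes' dependency lists.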
diff --git a/website/blog/ctas.yml b/website/blog/ctas.yml
@@ -30,8 +30,8 @@
   subheader: Catch up on Coalesce 2024 and register to access a select number of on-demand sessions.
   button_text: Register and watch
   url: https://coalesce.getdbt.com/register/online 
-- name: spring_launch_2025
-  header: 2025 dbt Cloud Launch Showcase
-  subheader: Join us on March 19th or 20th to hear from our executives and product leaders about the latest features landing in dbt.
+- name: developer_day_2025
+  header: dbt Developer Day 
+  subheader: Join us on March 19th or 20th to hear from dbt Labs product leads about exciting new and coming-soon features designed to supercharge data developer workflows.
   button_text: Save your seat
-  url: https://www.getdbt.com/resources/webinars/2025-dbt-cloud-launch-showcase
+  url: https://www.getdbt.com/resources/webinars/dbt-developer-day
diff --git a/website/blog/metadata.yml b/website/blog/metadata.yml
@@ -2,11 +2,11 @@
 featured_image: ""
 
 # This CTA lives in right sidebar on blog index
-featured_cta: "spring_launch_2025"
+featured_cta: "developer_day_2025"
 
 # Show or hide hero title, description, cta from blog index
-show_title: true
-show_description: true
+show_title: false
+show_description: false
 hero_button_url: "/blog/welcome"
 hero_button_text: "Start here"
 hero_button_new_tab: false