feat: add first figure
jaanphare committed Apr 7, 2024
1 parent 6dc81b7 commit 29e0dec
Showing 19 changed files with 365 additions and 437 deletions.
39 changes: 38 additions & 1 deletion README.md
@@ -109,7 +109,21 @@ dbt run --threads 8
3. Verify that you can query the data on the command line:

```bash
duckdb -c "SELECT * FROM '/Users/me/data/syh_dr/syhdr_commercial_inpatient_2016.parquet'"
```

This should show the data:

```
┌──────────────────────┬──────────────────────┬───┬──────────────────────┬──────────────────────┬──────────────────────┐
│ CAST(PERSON_ID AS … │ CAST(PERSON_WGHT A… │ … │ CAST(CPT_PRCDR_CD_… │ CAST(replace(repla… │ CAST(replace(repla… │
│ uint64 │ decimal(18,3) │ │ varchar │ float │ float │
├──────────────────────┼──────────────────────┼───┼──────────────────────┼──────────────────────┼──────────────────────┤
...
├──────────────────────┴──────────────────────┴───┴──────────────────────┴──────────────────────┴──────────────────────┤
│ 386816 rows (40 shown) 101 columns (5 shown) │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
```

## To build a specific data model

@@ -125,4 +139,27 @@ Use `--select` in dbt to select models, e.g. in order to build all histograms:

```bash
dbt run --select "*histogram*"
```

Check the total payment amount:

```bash
❯ duckdb -c "SELECT SUM(Payment * Count) FROM '/Users/me/data/syh_dr/insurance_plan_payment_histogram.parquet'"
┌────────────────────────┐
│ sum((Payment * Count)) │
│         double         │
├────────────────────────┤
│    8570849798.39355    │
└────────────────────────┘
```
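
The query above weights each histogram bin's payment by the number of records in that bin. As a minimal sketch of the same arithmetic outside DuckDB, assuming rows shaped like `{Payment, Count}` (the helper `totalPayment` and the sample values are hypothetical and are not taken from the dataset):

```javascript
// Hypothetical sample rows; the real values live in the parquet file.
const rows = [
  { Payment: 100, Count: 3 },
  { Payment: 250, Count: 2 },
  { Payment: 1000, Count: 1 },
];

// Same arithmetic as SUM(Payment * Count): each bin contributes payment x count.
function totalPayment(rows) {
  return rows.reduce((sum, r) => sum + r.Payment * r.Count, 0);
}

console.log(totalPayment(rows)); // 1800
```

Summing `Payment` alone would count each bin only once, which is why the check multiplies by `Count` first.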

# Contributors

@wesleycheung0, @jaanli, @sumanthkaja (reach out if you also want to volunteer on this and have worked with dbt, duckdb, healthcare data, or Observable Framework! We will be working with large language models next)

# Copyright
(c) 2024 All Bets LLC, a wholly-owned subsidiary of One Fact Foundation (a 501(c)(3) nonprofit). This legal structure is required by the United States' Internal Revenue Service to allow non-profit organizations to engage in the creation of open source software, which outside of non-profits is typically done by for-profit companies and requires significant taxation.

# Contact

File an issue here or email `[email protected]`.
Binary file not shown.
99 changes: 0 additions & 99 deletions docs/example-dashboard.md

This file was deleted.

75 changes: 0 additions & 75 deletions docs/example-report.md

This file was deleted.

94 changes: 58 additions & 36 deletions docs/index.md
@@ -45,51 +45,73 @@ toc: false
</style>

<div class="hero">
<h1>Hello, Observable Framework</h1>
<h2>Welcome to your new project! Edit&nbsp;<code style="font-size: 90%;">docs/index.md</code> to change this page.</h2>
<a href="https://observablehq.com/framework/getting-started">Get started<span style="display: inline-block; margin-left: 0.25rem;">↗︎</span></a>
<h1>Synthetic Healthcare Data</h1>
<h2>Welcome to nonprofit research into healthcare thanks to the Agency for Healthcare Research and Quality in the United States' Department of Health and Human Services!</h2>
<a href="https://www.ahrq.gov/sites/default/files/wysiwyg/data/SyH-DR-Codebook.pdf">Learn more about variables in the data<span style="display: inline-block; margin-left: 0.25rem;">↗︎</span></a>
</div>

<div class="grid grid-cols-2" style="grid-auto-rows: 504px;">
<div class="card">${
resize((width) => Plot.plot({
title: "Your awesomeness over time 🚀",
subtitle: "Up and to the right!",
width,
y: {grid: true, label: "Awesomeness"},
marks: [
Plot.ruleY([0]),
Plot.lineY(aapl, {x: "Date", y: "Close", tip: true})
]
}))
}</div>
<div class="card">${
resize((width) => Plot.plot({
title: "How big are penguins, anyway? 🐧",
width,
grid: true,
x: {label: "Body mass (g)"},
y: {label: "Flipper length (mm)"},
color: {legend: true},
marks: [
Plot.linearRegressionY(penguins, {x: "body_mass_g", y: "flipper_length_mm", stroke: "species"}),
Plot.dot(penguins, {x: "body_mass_g", y: "flipper_length_mm", stroke: "species", tip: true})
]
}))
}</div>
</div>

---

```js
const aapl = FileAttachment("aapl.csv").csv({typed: true});
const penguins = FileAttachment("penguins.csv").csv({typed: true});
import {DuckDBClient} from "npm:@observablehq/duckdb";
const db = DuckDBClient.of({data: FileAttachment("data/insurance_plan_payment_histogram.parquet")});
```


```js
const orderInsurance = [
'Commercial',
'Medicaid',
'Medicare',
];
```

```js
const paymentData = await db.query(`
SELECT Payment, count, Insurance FROM data
`);
```

```js
function paymentChart(paymentData, width) {
// Create a histogram with a logarithmic base.
return Plot.plot({
width,
marginLeft: 60,
x: { type: "log", domain: [1, 1000000] }, // Set the domain of the x-axis to be fixed between 1 and 1,000,000
y: { axis: null }, // Hide the y-axis
color: { legend: "swatches", columns: 1, domain: orderInsurance },
marks: [
Plot.rectY(
paymentData,
Plot.binX(
{ y: "sum" },
{
x: "Payment",
y: "count",
fill: "Insurance",
order: orderInsurance,
thresholds: d3
.ticks(Math.log10(1), Math.log10(1000000), 40)
.map((d) => +(10 ** d).toPrecision(3)),
tip: true,
}
)
),
],
});
}
```
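
The `thresholds` expression in `paymentChart` places bin edges evenly in log10 space, so each decade of payment gets the same number of bins, and the `sum` reducer adds up `count` within each bin. A minimal sketch of that idea without `d3` or Plot, assuming a fixed number of bins per decade (the helpers `logThresholds` and `sumCountsByBin` and the sample rows are hypothetical, and `d3.ticks` may pick a different step than the one used here):

```javascript
// Bin edges evenly spaced in log10 space between 10^minExp and 10^maxExp.
function logThresholds(minExp, maxExp, perDecade) {
  const edges = [];
  const steps = (maxExp - minExp) * perDecade;
  for (let i = 0; i <= steps; i++) {
    // Round to 3 significant digits, as the chart code does.
    edges.push(+(10 ** (minExp + i / perDecade)).toPrecision(3));
  }
  return edges;
}

// Sum the `count` column per bin, where bin i covers [edges[i], edges[i + 1]).
function sumCountsByBin(rows, edges) {
  const sums = new Array(edges.length - 1).fill(0);
  for (const { Payment, count } of rows) {
    const i = edges.findIndex(
      (e, k) => k < edges.length - 1 && Payment >= e && Payment < edges[k + 1]
    );
    if (i >= 0) sums[i] += count; // rows outside the fixed domain are ignored
  }
  return sums;
}

// One bin per decade: [1, 10, 100, 1000, 10000, 100000, 1000000]
const edges = logThresholds(0, 6, 1);

// Made-up rows for illustration only.
const sample = [
  { Payment: 5, count: 2 },
  { Payment: 50, count: 4 },
  { Payment: 75, count: 1 },
  { Payment: 2000000, count: 9 },
];

console.log(sumCountsByBin(sample, edges)); // [2, 5, 0, 0, 0, 0]
```

Log-spaced edges keep small and very large payments visible on the same axis, which a linear binning over 1 to 1,000,000 would not.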

<div class="card"> <h2>Payment distributions vary by insurance type</h2> <h3>Amounts paid for the inpatient health care of 1.2 million people in 2016, representing about $8.57 billion in health care costs. The dataset was created by the Agency for Healthcare Research and Quality and contains synthetic patient data categorized by insurance type.</h3> <h3> <code style="font-size: 90%;"><a href="https://github.com/onefact/synthetic-healthcare-data/blob/6dc81b75277f349d112bccc0a8db61d9b2240c4e/healthcare_data/models/figures/insurance_plan_payment_histogram.sql">Code for data transform</a></code></h3> ${resize((width) => paymentChart(paymentData, width))} </div>

---

## Next steps

Here are some ideas of things you could try…

We need ideas! Take a look at the source code at https://github.com/onefact/synthetic-healthcare-data and email us at [email protected] if you want to volunteer on this. We will be looking at large language models and algorithmic fairness metrics in health care settings next.
<!--
<div class="grid grid-cols-4">
<div class="card">
Chart your own data using <a href="https://observablehq.com/framework/lib/plot"><code>Plot</code></a> and <a href="https://observablehq.com/framework/javascript/files"><code>FileAttachment</code></a>. Make it responsive using <a href="https://observablehq.com/framework/javascript/display#responsive-display"><code>resize</code></a>.
@@ -112,4 +134,4 @@ Here are some ideas of things you could try…
<div class="card">
Visit <a href="https://github.com/observablehq/framework">Framework on GitHub</a> and give us a star. Or file an issue if you’ve found a bug!
</div>
</div>
</div> -->