checkpoint
ohnorobo committed Sep 21, 2023
1 parent 00dbc6c commit c58db62
Showing 3 changed files with 78 additions and 4 deletions.
50 changes: 48 additions & 2 deletions docs/base_tables.md
@@ -11,12 +11,13 @@ There is one table for each scan type.
- `firehook-censoredplanet:base.discard_scan`
- `firehook-censoredplanet:base.http_scan`
- `firehook-censoredplanet:base.https_scan`
- `firehook-censoredplanet:base.satellite_scan`

## Partitioning and Clustering

The tables are time-partitioned along the `date` field.

-The tables are clustered along the `country` and then `asn` fields.
+The tables are clustered along the `[server|resolver]_country` and then `[server|resolver]_asn` fields.
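
Since the tables are partitioned on `date` and clustered on country and then ASN, a query that filters on those columns should let BigQuery skip unrelated partitions and blocks. The sketch below only assembles the SQL text. The helper name `build_scan_query` and the example filter values are hypothetical, and the assumption that country values are alphabetic codes is not confirmed by this document.

```python
from datetime import date


def build_scan_query(table: str, prefix: str, day: date, country: str, asn: int) -> str:
    """Build SQL that filters on the partition column and both clustering columns.

    `prefix` is "server" or "resolver", matching the
    `[server|resolver]_country` / `[server|resolver]_asn` column naming.
    """
    if prefix not in ("server", "resolver"):
        raise ValueError("prefix must be 'server' or 'resolver'")
    if not country.isalpha():
        # Assumption: country values are alphabetic codes such as "AE".
        raise ValueError("country is assumed to be an alphabetic code")
    # The docs list tables as project:dataset.table; standard SQL uses dots.
    table_ref = table.replace(":", ".")
    return (
        f"SELECT COUNT(*) AS measurements\n"
        f"FROM `{table_ref}`\n"
        f"WHERE date = DATE '{day.isoformat()}'\n"
        f"  AND {prefix}_country = '{country.upper()}'\n"
        f"  AND {prefix}_asn = {int(asn)}"
    )


# Hypothetical filter values, for illustration only.
query = build_scan_query(
    "firehook-censoredplanet:base.satellite_scan",
    "resolver",
    date(2023, 9, 21),
    "ae",
    5384,
)
print(query)
```

Filtering on `date` first, then country, then ASN follows the partitioning and clustering order described above.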

## Table Format

@@ -82,6 +83,10 @@ The json data is processed into a flat table format which looks like this.

We intend to add more columns in the future.





## Original Data Format

The Censored Planet data is stored in .json files with one measurement per line.
@@ -139,4 +144,45 @@ Data from before 2021-04-25 is parsed from the [Hyperquack V1 format](https://gi
"stateful_block": false,
"tag": "2021-05-30T01:01:01"
}
```

### DNS Data

The DNS (Satellite) data includes the following alternative set of columns. Many are identical to the Hyperquack columns.

domain STRING NULLABLE
domain_category STRING NULLABLE
domain_is_control BOOLEAN NULLABLE
date DATE NULLABLE
start_time TIMESTAMP NULLABLE
end_time TIMESTAMP NULLABLE
retry INTEGER NULLABLE
resolver_ip STRING NULLABLE
resolver_name STRING NULLABLE
resolver_is_trusted BOOLEAN NULLABLE
resolver_netblock STRING NULLABLE
resolver_asn INTEGER NULLABLE
resolver_as_name STRING NULLABLE
resolver_as_full_name STRING NULLABLE
resolver_as_class STRING NULLABLE
resolver_country STRING NULLABLE
resolver_organization STRING NULLABLE
resolver_non_zero_rcode_rate FLOAT NULLABLE
resolver_private_ip_rate FLOAT NULLABLE
resolver_zero_ip_rate FLOAT NULLABLE
resolver_connect_error_rate FLOAT NULLABLE
resolver_invalid_cert_rate FLOAT NULLABLE
received_error STRING NULLABLE
received_rcode INTEGER NULLABLE
answers RECORD REPEATED
success BOOLEAN NULLABLE
anomaly BOOLEAN NULLABLE
domain_controls_failed BOOLEAN NULLABLE
average_confidence FLOAT NULLABLE
untagged_controls BOOLEAN NULLABLE
untagged_response BOOLEAN NULLABLE
excluded BOOLEAN NULLABLE
exclude_reason STRING NULLABLE
has_type_a BOOLEAN NULLABLE
measurement_id STRING NULLABLE
source STRING NULLABLE
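
The column list above follows a `name TYPE MODE` layout, so it can be loaded into a structured form for inspection. The following is a minimal sketch that parses a short excerpt of that list. The names `SchemaField` and `parse_schema` are hypothetical and not part of the pipeline.

```python
from typing import List, NamedTuple


class SchemaField(NamedTuple):
    name: str
    field_type: str
    mode: str


def parse_schema(text: str) -> List[SchemaField]:
    """Parse lines of the form `name TYPE MODE` into SchemaField tuples."""
    fields = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        fields.append(SchemaField(*parts))
    return fields


# Excerpt copied from the column list above.
SCHEMA_EXCERPT = """\
domain STRING NULLABLE
resolver_asn INTEGER NULLABLE
answers RECORD REPEATED
"""

fields = parse_schema(SCHEMA_EXCERPT)
repeated = [f.name for f in fields if f.mode == "REPEATED"]
print(repeated)
```

In the full list, `answers` is the only column with mode `REPEATED`. Every other column is a nullable scalar.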
3 changes: 2 additions & 1 deletion docs/merged_reduced_scans_table.md
@@ -38,4 +38,5 @@ Reduced Scans
| outcome | STRING | What was the [outcome](outcome.md) of the individual measurement eg `read/timeout` |
| count | INTEGER | How many measurements fit the exact pattern of this row? |
| unexpected_count | INTEGER | Count of measurements with an unexpected outcome |

| hostname | STRING | The domain name of the DNS resolver. (Only used in DNS) eg. `ns1.uts.ae` |
| reg_hostname | STRING | The domain name of the DNS resolver without subdomains. (Only used in DNS) eg. `uts.ae` |
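
The relationship between `hostname` and `reg_hostname` can be illustrated with a deliberately naive sketch that keeps only the last two labels of the name. This is an assumption for illustration, not the pipeline's method. It reproduces the `ns1.uts.ae` example above but gives wrong answers for multi-label public suffixes such as `co.uk`, which need a Public Suffix List lookup. The function name `naive_reg_hostname` is hypothetical.

```python
def naive_reg_hostname(hostname: str) -> str:
    """Return the last two labels of a hostname (naive; ignores public suffixes)."""
    labels = hostname.rstrip(".").lower().split(".")
    return ".".join(labels[-2:])


reg = naive_reg_hostname("ns1.uts.ae")
print(reg)
```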
29 changes: 28 additions & 1 deletion docs/outcome.md
@@ -84,5 +84,32 @@ Mismatch Errors are used when the connection is successful, but the content rece

## DNS Outcomes

-The Satellite data uses its own unique set of outcomes, and does not use stages.
+The Satellite data uses its own unique set of outcomes, and does not use stages. The outcomes are based on

| Outcome | Explanation |
| ----------------------- | ----------- |
| ✅ip.matchip | |
| ✅ip.matchasn | |
| ip.invalid | |
| ip.empty | |
| ✅tls.validcert | |
| tls.connerror | |
| tls.baddomain | |
| tls.badca | |
| blockpage | |
| dns.connrefused | |
| dns.error | |
| dns.hostunreach | |
| dns.msgsize | |
| dns.timedout | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
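
The outcome strings in the table share a `category.detail` shape, with an optional leading ✅. A minimal sketch of splitting them apart follows. It assumes the ✅ prefix marks an expected (non-anomalous) result, which this table does not state outright. The names `ParsedOutcome` and `parse_outcome` are hypothetical.

```python
from typing import NamedTuple


class ParsedOutcome(NamedTuple):
    expected: bool  # assumption: a leading ✅ marks an expected result
    category: str   # e.g. "ip", "tls", "dns", or "blockpage"
    detail: str     # text after the first dot, or "" if there is none


def parse_outcome(outcome: str) -> ParsedOutcome:
    """Split an outcome such as '✅ip.matchip' into its parts."""
    marker = "✅"
    expected = outcome.startswith(marker)
    body = outcome[len(marker):] if expected else outcome
    category, _, detail = body.partition(".")
    return ParsedOutcome(expected, category, detail)


for raw in ["✅ip.matchip", "tls.badca", "blockpage", "dns.timedout"]:
    print(parse_outcome(raw))
```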