Commit

Merge zoneinfo_parse history into parse-zoneinfo
Merged the original history.
djc committed Apr 15, 2024
2 parents 95f4885 + 8d74e9c commit 0075b74
Showing 11 changed files with 1,098 additions and 23 deletions.
1 change: 0 additions & 1 deletion .gitignore
@@ -1,3 +1,2 @@
target
Cargo.lock
*.swp
15 changes: 15 additions & 0 deletions .travis.yml
@@ -0,0 +1,15 @@
language: rust
rust:
- 1.31.0
- stable
- beta
- nightly

os:
- linux
- osx
- windows

matrix:
  allow_failures:
    - rust: nightly
3 changes: 1 addition & 2 deletions Cargo.toml
@@ -1,12 +1,11 @@
[package]
name = "parse-zoneinfo"
version = "0.3.0"
authors = ["Djzin <[email protected]>"]
description = "Parse zoneinfo files from the IANA database"
keywords = ["date", "time", "timezone", "zone", "calendar"]
repository = "https://github.com/djzin/parse-zoneinfo"
readme = "README.md"
license = "MIT"
keywords = ["date", "time", "timezone", "zone", "calendar"]

[dependencies.regex]
version = "1.3.1"
21 changes: 21 additions & 0 deletions LICENCE
@@ -0,0 +1,21 @@
The MIT License (MIT)

Copyright (c) 2016 Benjamin Sago

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
77 changes: 75 additions & 2 deletions README.md
@@ -2,5 +2,78 @@

Parse-zoneinfo is a fork of [`zoneinfo_parse`][zoneinfo_parse], adjusted so that it no longer depends on the `datetime` crate. It is used by [`chrono-tz`][chrono_tz].

[zoneinfo_parse]: https://github.com/rust-datetime/zoneinfo-parse
[chrono_tz]: https://github.com/djzin/chrono-tz
Rust library for reading the text files comprising the [zoneinfo database][w], which records time zone changes and offsets across the world from multiple sources.

The zoneinfo database is distributed in one of two formats: a raw text format with one file per continent, and a compiled binary format with one file per time zone. This crate deals with the former; for the latter, see the [`zoneinfo_compiled` crate][zc] instead.

The database itself is maintained by IANA. For more information, see [IANA’s page on the time zone database][iana]. You can also find the text files themselves in [the tz repository][tz].

[iana]: https://www.iana.org/time-zones
[tz]: https://github.com/eggert/tz
[w]: https://en.wikipedia.org/wiki/Tz_database
[zc]: https://github.com/rust-datetime/zoneinfo-compiled

### [View the Rustdoc](https://docs.rs/zoneinfo_parse)

# Installation

This crate works with [Cargo](https://crates.io). Add the following to your `Cargo.toml` dependencies section:

```toml
[dependencies]
datetime = "0.5"
zoneinfo_parse = "0.5"
```

The earliest version of Rust that this crate is tested against is [Rust v1.31.0](https://blog.rust-lang.org/2018/12/06/Rust-1.31-and-rust-2018.html).

# Usage

The zoneinfo files contain `Zone`, `Rule`, and `Link` information. Each type of line corresponds to a variant of the `line::Line` enum.

To get started, here are a few lines representing what time is like in the `Europe/Madrid` time zone:

    # Zone  NAME           GMTOFF    RULES  FORMAT  [UNTIL]
    Zone    Europe/Madrid  -0:14:44  -      LMT     1901 Jan 1 0:00s
                           0:00      Spain  WE%sT   1946 Sep 30
                           1:00      Spain  CE%sT   1979
                           1:00      EU     CE%sT

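The first line is a comment. The second starts with `Zone`, so we know it begins the definition of a zone. The remaining three lines have no keyword or name of their own; they continue the definition started on the second line.

Parsing these five lines returns the following five results: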

- A `line::Line::Space` for the comment, because the line doesn’t contain any information (but isn’t strictly *invalid* either).
- A `line::Line::Zone` for the first `Zone` entry. This contains a `Zone` struct that holds the name of the zone. All the other fields are stored in the `ZoneInfo` struct.
- A `line::Line::Continuation` for the next entry. This is different from the line above as it doesn’t contain a name field; it only has the information in a `ZoneInfo` struct.
- The fourth line contains the same types of data as the third.
- As does the fifth.
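
The classification described above can be sketched without the crate at all. The following is a minimal, std-only illustration: `LineKind`, `classify`, and `madrid_kinds` are hypothetical names invented for this sketch, and the logic only looks at the first token of each line. It is not how the crate's parser works and it does not extract any fields.

```rust
/// Hypothetical stand-in for the variants of `line::Line`. It records only
/// the kind of each line, not the parsed fields.
#[derive(Debug, PartialEq, Clone, Copy)]
enum LineKind {
    Space,
    Zone,
    Continuation,
    Rule,
    Link,
}

/// Classify one line by its first whitespace-separated token.
/// Assumption: anything that is not blank, a comment, or a keyword line
/// is treated as a continuation of the previous `Zone` line.
fn classify(line: &str) -> LineKind {
    let trimmed = line.trim();
    if trimmed.is_empty() || trimmed.starts_with('#') {
        return LineKind::Space;
    }
    match trimmed.split_whitespace().next() {
        Some("Zone") => LineKind::Zone,
        Some("Rule") => LineKind::Rule,
        Some("Link") => LineKind::Link,
        _ => LineKind::Continuation,
    }
}

/// Classify the five `Europe/Madrid` lines from the example above.
fn madrid_kinds() -> Vec<LineKind> {
    let text = "\
# Zone NAME GMTOFF RULES FORMAT [UNTIL]
Zone Europe/Madrid -0:14:44 - LMT 1901 Jan 1 0:00s
0:00 Spain WE%sT 1946 Sep 30
1:00 Spain CE%sT 1979
1:00 EU CE%sT";
    text.lines().map(classify).collect()
}

fn main() {
    for kind in madrid_kinds() {
        println!("{:?}", kind);
    }
}
```

Running it prints one kind per input line: a `Space`, a `Zone`, and three `Continuation`s, matching the list above.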

Lines with rule definitions look like this:

    # Rule  NAME   FROM  TO    TYPE  IN   ON  AT      SAVE  LETTER/S
    Rule    Spain  1917  only  -     May  5   23:00s  1:00  S
    Rule    Spain  1917  1919  -     Oct  6   23:00s  0     -
    Rule    Spain  1918  only  -     Apr  15  23:00s  1:00  S
    Rule    Spain  1919  only  -     Apr  5   23:00s  1:00  S

All these lines follow the same pattern: A `line::Line::Rule` that contains a `Rule` struct, which has a field for each column of data.
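
To make "a field for each column" concrete, here is a rough std-only sketch. `RuleColumns` and `split_rule` are hypothetical names for this illustration; the crate's own `Rule` struct stores typed values (such as `Year` and `Month`) rather than raw strings, so treat this only as a picture of the column layout.

```rust
/// Hypothetical, simplified record with one string field per column of a
/// `Rule` line. The crate's real `Rule` struct uses richer types.
#[derive(Debug, PartialEq)]
struct RuleColumns<'a> {
    name: &'a str,
    from: &'a str,
    to: &'a str,
    kind: &'a str,    // the TYPE column
    month: &'a str,   // the IN column
    day: &'a str,     // the ON column
    time: &'a str,    // the AT column
    save: &'a str,    // the SAVE column
    letters: &'a str, // the LETTER/S column
}

/// Split a `Rule` line into its ten whitespace-separated columns.
/// Returns `None` if the keyword or the column count does not match.
fn split_rule(line: &str) -> Option<RuleColumns<'_>> {
    let cols: Vec<&str> = line.split_whitespace().collect();
    if cols.len() != 10 || cols[0] != "Rule" {
        return None;
    }
    Some(RuleColumns {
        name: cols[1],
        from: cols[2],
        to: cols[3],
        kind: cols[4],
        month: cols[5],
        day: cols[6],
        time: cols[7],
        save: cols[8],
        letters: cols[9],
    })
}

fn main() {
    let rule = split_rule("Rule Spain 1917 only - May 5 23:00s 1:00 S");
    println!("{:?}", rule);
}
```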

Finally, there are lines that link one zone to another’s name:

    Link  Europe/Prague  Europe/Bratislava

The `Link` struct simply contains the names of both the existing and new time zones.


## Interpretation

Once the input lines have been parsed, they must be *interpreted* to form a table of time zone data.

The easiest way to do this is with a `TableBuilder`. You add lines to the builder one at a time, and it returns an error as soon as it detects that something is wrong, such as a duplicate or a missing entry. When all the lines have been fed to the builder, call its `build` method to produce a `Table` containing fields for the rule, zone, and link lines.
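
The idea of failing early on duplicate or missing entries can be illustrated with a toy builder. Everything in this sketch (`ToyTableBuilder`, `ToyTable`, `BuildError`, and their methods) is hypothetical and uses only the standard library; the real `TableBuilder` works on parsed lines and may check different conditions at different times. In particular, this sketch assumes a zone is added before any link that points to it.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical error type for the toy builder.
#[derive(Debug, PartialEq)]
enum BuildError {
    DuplicateZone(String),
    DuplicateLink(String),
    MissingZone(String),
}

/// Toy builder that tracks zone names and link names only.
#[derive(Default)]
struct ToyTableBuilder {
    zones: HashSet<String>,
    links: HashMap<String, String>, // new name -> existing zone name
}

/// Toy table produced by the builder.
struct ToyTable {
    zones: HashSet<String>,
    links: HashMap<String, String>,
}

impl ToyTableBuilder {
    /// Register a zone, rejecting a name that was already added.
    fn add_zone(&mut self, name: &str) -> Result<(), BuildError> {
        if !self.zones.insert(name.to_string()) {
            return Err(BuildError::DuplicateZone(name.to_string()));
        }
        Ok(())
    }

    /// Register a link from `new` to `existing`, rejecting links to
    /// unknown zones and names that are already taken.
    fn add_link(&mut self, existing: &str, new: &str) -> Result<(), BuildError> {
        if !self.zones.contains(existing) {
            return Err(BuildError::MissingZone(existing.to_string()));
        }
        if self.zones.contains(new) || self.links.contains_key(new) {
            return Err(BuildError::DuplicateLink(new.to_string()));
        }
        self.links.insert(new.to_string(), existing.to_string());
        Ok(())
    }

    /// Finish building and hand over the collected data.
    fn build(self) -> ToyTable {
        ToyTable { zones: self.zones, links: self.links }
    }
}

impl ToyTable {
    /// Resolve a name to the zone it refers to, following a link if needed.
    fn resolve(&self, name: &str) -> Option<&str> {
        self.zones
            .get(name)
            .or_else(|| self.links.get(name))
            .map(String::as_str)
    }
}

fn main() {
    let mut builder = ToyTableBuilder::default();
    builder.add_zone("Europe/Prague").unwrap();
    builder.add_link("Europe/Prague", "Europe/Bratislava").unwrap();
    let table = builder.build();
    println!("{:?}", table.resolve("Europe/Bratislava"));
}
```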



## Example program

This crate is used to produce the data for the [`zoneinfo-data` crate](https://github.com/rust-datetime/zoneinfo-data). For an example of its use, see the bundled [data crate builder](https://github.com/rust-datetime/zoneinfo-parse/tree/master/data-crate-builder).
2 changes: 1 addition & 1 deletion examples/benchmark.rs
@@ -3,7 +3,7 @@ extern crate parse_zoneinfo;
use parse_zoneinfo::line::{Line, LineParser};
use parse_zoneinfo::table::TableBuilder;

// This function is needed until zoneinfo_parse handles comments correctly.
// This function is needed until parse_zoneinfo handles comments correctly.
// Technically a '#' symbol could occur between double quotes and should be
// ignored in this case, however this never happens in the tz database as it
// stands.
36 changes: 36 additions & 0 deletions src/lib.rs
@@ -1,3 +1,39 @@
//! Rust library for reading the text files comprising the [zoneinfo
//! database][w], which records time zone changes and offsets across the world
//! from multiple sources.
//!
//! The zoneinfo database is distributed in one of two formats: a raw text
//! format with one file per continent, and a compiled binary format with one
//! file per time zone. This crate deals with the former; for the latter, see
//! the [`zoneinfo_compiled` crate][zc] instead.
//!
//! The database itself is maintained by IANA. For more information, see
//! [IANA’s page on the time zone database][iana]. You can also find the text
//! files themselves in [the tz repository][tz].
//!
//! [iana]: https://www.iana.org/time-zones
//! [tz]: https://github.com/eggert/tz
//! [w]: https://en.wikipedia.org/wiki/Tz_database
//! [zc]: https://github.com/rust-datetime/zoneinfo-compiled
//!
//! ## Outline
//!
//! Reading a zoneinfo text file is split into three stages:
//!
//! - **Parsing** individual lines of text into `Lines` is done by the `line`
//! module;
//! - **Interpreting** these lines into a complete `Table` is done by the
//! `table` module;
//! - **Calculating transitions** from this table is done by the `transitions`
//! module.
#![warn(missing_copy_implementations)]
//#![warn(missing_docs)]
#![warn(nonstandard_style)]
#![warn(trivial_numeric_casts)]
#![warn(unreachable_pub)]
#![warn(unused)]

extern crate regex;

pub mod line;
80 changes: 78 additions & 2 deletions src/line.rs
@@ -1,3 +1,78 @@
//! Parsing zoneinfo data files, line-by-line.
//!
//! This module provides functions that take a line of input from a zoneinfo
//! data file and attempt to parse it, returning the details of the line if
//! it is parsed successfully. Each line is classified as a `Rule`, `Link`,
//! `Zone`, or `Continuation` line.
//!
//! `Line` is the type that holds zoneinfo line data. To try to parse a
//! string, create a `LineParser` and pass the string to its `parse_str`
//! method, as the examples below do.
//!
//! ## Examples
//!
//! Parsing a `Rule` line:
//!
//! ```
//! # extern crate parse_zoneinfo;
//! # fn main() {
//! use parse_zoneinfo::line::*;
//!
//! let parser = LineParser::new();
//! let line = parser.parse_str("Rule EU 1977 1980 - Apr Sun>=1 1:00u 1:00 S");
//!
//! assert_eq!(line, Ok(Line::Rule(Rule {
//!     name: "EU",
//!     from_year: Year::Number(1977),
//!     to_year: Some(Year::Number(1980)),
//!     month: Month::April,
//!     day: DaySpec::FirstOnOrAfter(Weekday::Sunday, 1),
//!     time: TimeSpec::HoursMinutes(1, 0).with_type(TimeType::UTC),
//!     time_to_add: TimeSpec::HoursMinutes(1, 0),
//!     letters: Some("S"),
//! })));
//! # }
//! ```
//!
//! Parsing a `Zone` line:
//!
//! ```
//! # fn main() {
//! use parse_zoneinfo::line::*;
//!
//! let parser = LineParser::new();
//! let line = parser.parse_str("Zone Australia/Adelaide 9:30 Aus AC%sT 1971 Oct 31 2:00:00");
//!
//! assert_eq!(line, Ok(Line::Zone(Zone {
//!     name: "Australia/Adelaide",
//!     info: ZoneInfo {
//!         utc_offset: TimeSpec::HoursMinutes(9, 30),
//!         saving: Saving::Multiple("Aus"),
//!         format: "AC%sT",
//!         time: Some(ChangeTime::UntilTime(
//!             Year::Number(1971),
//!             Month::October,
//!             DaySpec::Ordinal(31),
//!             TimeSpec::HoursMinutesSeconds(2, 0, 0).with_type(TimeType::Wall))
//!         ),
//!     },
//! })));
//! # }
//! ```
//!
//! Parsing a `Link` line:
//!
//! ```
//! use parse_zoneinfo::line::*;
//!
//! let parser = LineParser::new();
//! let line = parser.parse_str("Link Europe/Istanbul Asia/Istanbul");
//! assert_eq!(line, Ok(Line::Link(Link {
//!     existing: "Europe/Istanbul",
//!     new: "Asia/Istanbul",
//! })));
//! ```
use std::str::FromStr;
// we still support rust that doesn't have the inherent methods
#[allow(deprecated, unused_imports)]
@@ -851,6 +926,7 @@ impl LineParser {
fn parse_rule<'a>(&self, input: &'a str) -> Result<Rule<'a>, Error> {
if let Some(caps) = self.rule_line.captures(input) {
let name = caps.name("name").unwrap().as_str();

let from_year = caps.name("from").unwrap().as_str().parse()?;

// The end year can be ‘only’ to indicate that this rule only
@@ -948,7 +1024,7 @@ impl LineParser {
})
}

fn parse_zone<'a>(&self, input: &'a str) -> Result<Zone<'a>, Error> {
pub fn parse_zone<'a>(&self, input: &'a str) -> Result<Zone<'a>, Error> {
if let Some(caps) = self.zone_line.captures(input) {
let name = caps.name("name").unwrap().as_str();
let info = self.zoneinfo_from_captures(caps)?;
@@ -961,7 +1037,7 @@ impl LineParser {
}
}

fn parse_link<'a>(&self, input: &'a str) -> Result<Link<'a>, Error> {
pub fn parse_link<'a>(&self, input: &'a str) -> Result<Link<'a>, Error> {
if let Some(caps) = self.link_line.captures(input) {
let target = caps.name("target").unwrap().as_str();
let name = caps.name("name").unwrap().as_str();
23 changes: 9 additions & 14 deletions src/structure.rs
@@ -71,7 +71,7 @@ impl Structure for Table {
}
}

TableStructure { mappings: mappings }
TableStructure { mappings }
}
}

@@ -94,7 +94,7 @@ impl<'table> IntoIterator for TableStructure<'table> {

Iter {
structure: self,
keys: keys,
keys,
}
}
}
@@ -110,20 +110,15 @@ impl<'table> Iterator for Iter<'table> {
type Item = TableStructureEntry<'table>;

fn next(&mut self) -> Option<Self::Item> {
loop {
let key = match self.keys.pop() {
Some(k) => k,
None => return None,
};
let key = self.keys.pop()?;

// Move the strings out into an (automatically-sorted) vector.
let values = self.structure.mappings[key].iter().cloned().collect();
// Move the strings out into an (automatically-sorted) vector.
let values = self.structure.mappings[key].iter().cloned().collect();

return Some(TableStructureEntry {
name: key,
children: values,
});
}
Some(TableStructureEntry {
name: key,
children: values,
})
}
}

40 changes: 39 additions & 1 deletion src/table.rs
@@ -1,8 +1,46 @@
//! Collecting parsed zoneinfo data lines into a set of time zone data.
//!
//! This module provides the `Table` struct, which is able to take parsed
//! lines of input from the `line` module and coalesce them into a single
//! set of data.
//!
//! It’s not as simple as it seems, because the zoneinfo data lines refer to
//! each other through strings: lines of the form “link zone A to B” could be
//! *parsed* successfully but still fail to be *interpreted* successfully if
//! “B” doesn’t exist. So it has to check every step of the way—nothing wrong
//! with this, it’s just a consequence of reading data from a text file.
//!
//! This module only deals with constructing a table from data: any analysis
//! of the data is done elsewhere.
//!
//!
//! ## Example
//!
//! ```
//! use parse_zoneinfo::line::{Zone, Link, LineParser};
//! use parse_zoneinfo::table::{TableBuilder};
//!
//! let parser = LineParser::new();
//! let zone = parser.parse_zone("Zone Pacific/Auckland 11:39:04 - LMT 1868 Nov 2").unwrap();
//! let link = parser.parse_link("Link Pacific/Auckland Antarctica/McMurdo").unwrap();
//!
//! let mut builder = TableBuilder::new();
//! builder.add_zone_line(zone).unwrap();
//! builder.add_link_line(link).unwrap();
//! let table = builder.build();
//!
//! assert!(table.get_zoneset("Pacific/Auckland").is_some());
//! assert!(table.get_zoneset("Antarctica/McMurdo").is_some());
//! assert!(table.get_zoneset("UTC").is_none());
//! ```
use std::collections::hash_map::{Entry, HashMap};
use std::error::Error as ErrorTrait;
use std::fmt;

use line::{self, ChangeTime, DaySpec, Month, TimeType, Year};
use line::{self, ChangeTime, DaySpec, Month, Year};

use crate::line::TimeType;

/// A **table** of all the data in one or more zoneinfo files.
#[derive(PartialEq, Debug, Default)]