diff --git a/README.md b/README.md
index 12d3591..28b97ff 100644
--- a/README.md
+++ b/README.md
@@ -1,4 +1,130 @@
 # xbrl-parser
+
+[![GoDoc](https://godoc.org/github.com/polygon-io/xbrl-parser?status.svg)](https://godoc.org/github.com/polygon-io/xbrl-parser)
+
 A Go library to parse xbrl documents into their facts, contexts, and units.
 
-**Waring:** This library is under development and may change dramatically!
+This library is based around the [XBRL 2.1 spec](https://www.xbrl.org/Specification/XBRL-2.1/REC-2003-12-31/XBRL-2.1-REC-2003-12-31+corrected-errata-2013-02-20.html).
+It implements support for parsing basic facts (not tuples of facts), contexts and units through the `xml.Unmarshaler` interface.
+
+See the package example in the godocs for how to unmarshal into the `XBRL` struct.
+
+This library supports basic validation that checks for malformed facts and broken references between facts and contexts/units (see `XBRL.Validate()`),
+but it does _not_ implement full semantic validation of XBRL documents.
+
+There are no abstractions added on top of the XBRL data structure, which makes this library flexible and simple,
+but it also means you might have to read up a bit on how XBRL works to take full advantage of it.
+
+To give you a head start, here are some basics about XBRL:
+
+## What is XBRL?
+
+At a high level, XBRL is an XML spec intended to be the standard for digital business reporting.
+
+Digital business reporting is a very broad topic and encompasses many things, from financial statements to compliance reporting and more.
+
+XBRL is able to support so many use cases by defining a framework and relying on supplemental taxonomies to describe schemas for specific use cases.
+For example, the [US GAAP Financial Reporting Taxonomy](https://xbrl.us/xbrl-taxonomy/2021-us-gaap/) defines schemas for
+facts that relate to US GAAP financial reporting, which is used heavily in quarterly reports submitted to the SEC, among many other things.
+
+Let's jump right into the main components of an XBRL document...
+
+### Facts
+
+Facts (also referred to as [items](https://www.xbrl.org/Specification/XBRL-2.1/REC-2003-12-31/XBRL-2.1-REC-2003-12-31+corrected-errata-2013-02-20.html#_4.6)) are the core of an XBRL document.
+Each fact represents a single business measurement.
+Here's an example of a fact whose schema is defined in the [US GAAP Financial Reporting Taxonomy](https://xbrl.us/xbrl-taxonomy/2021-us-gaap/):
+
+```xml
+1.41
+```
+
+A fact by itself is only a fragment of a useful piece of information.
+In the above example we see that earnings per share is `1.41`,
+but we need more context around when this fact was true and how it was measured.
+
+In many cases (for example in an SEC quarterly report),
+there might be more than one instance of a `us-gaap:EarningsPerShareBasic` fact in the document (or any fact for that matter).
+This happens because a quarterly report compares EPS for the current quarter with EPS from the previous quarter.
+
+The above fact doesn't directly tell us in which quarter EPS was `1.41`. That's what contexts are for...
+
+### Contexts
+
+A [Context](https://www.xbrl.org/Specification/XBRL-2.1/REC-2003-12-31/XBRL-2.1-REC-2003-12-31+corrected-errata-2013-02-20.html#_4.7)
+describes a business entity, period of time, and an optional scenario (this library doesn't currently support scenarios, so we're going to gloss over them).
+
+When a fact references a context, it gives the fact more detail to help us understand what it means.
+
+Note that many facts can reference the same context where it makes sense to do so.
+
+The fact in the above example references a context called "c1". Let's see what that context might look like:
+```xml
+
+
+ 0000320193
+
+
+ 2020-12-27
+ 2021-03-27
+
+
+```
+
+With the information in this context, we now know that in Q1 of 2021 (between 2020-12-27 and 2021-03-27), Apple Inc.'s (CIK 0000320193) EPS was 1.41.
+
+We're closer to having a useful piece of information now, but there's one thing we're still missing.
+EPS is 1.41...what? What unit are we measuring it in?
+
+### Units
+
+A [Unit](https://www.xbrl.org/Specification/XBRL-2.1/REC-2003-12-31/XBRL-2.1-REC-2003-12-31+corrected-errata-2013-02-20.html#_4.8)
+describes a unit of measure for a numeric fact.
+
+A unit can represent something simple like number of shares,
+something slightly more complex like dollars per share,
+or any other kind of unit you can think of.
+
+Just like contexts, more than one fact can reference the same unit when it makes sense to do so.
+
+Note that only numeric facts have units.
+Sometimes a fact is a block of text, for which a unit doesn't make sense.
+
+Let's look at what a simple unit like number of shares might look like:
+```xml
+
+ shares
+
+```
+
+That's great and all, but the fact in the above example references a unit called "u1".
+Let's see what that more complex unit might look like:
+```xml
+
+
+
+ iso4217:USD
+
+
+ shares
+
+
+
+```
+
+This unit is a ratio of two simple units: USD / shares.
+
+And with that we now fully understand the fact from the example above:
+
+In Q1 of 2021 (between 2020-12-27 and 2021-03-27), Apple Inc.'s (CIK 0000320193) EPS was 1.41 dollars per share.
+
+---
+
+### Wrapping Up
+
+That was a _very_ brief overview of the XBRL format.
+Hopefully it gives you enough to understand the basics of how to get information out of an XBRL document.
+
+If you need to dig a little deeper, the models in this library are well documented and contain links to their definitions in the XBRL spec for your reference.
+Beyond that, the [official spec](https://www.xbrl.org/Specification/XBRL-2.1/REC-2003-12-31/XBRL-2.1-REC-2003-12-31+corrected-errata-2013-02-20.html)
+is a bit large, but it's pretty clear and will almost definitely have the information you need (and probably a lot you don't need too!).
diff --git a/example_unmarshal_test.go b/example_unmarshal_test.go
new file mode 100644
index 0000000..782a3b7
--- /dev/null
+++ b/example_unmarshal_test.go
@@ -0,0 +1,56 @@
+package xbrl_test
+
+import (
+	"encoding/xml"
+	"fmt"
+
+	"github.com/polygon-io/xbrl-parser"
+)
+
+const doc = `
+
+
+
+
+ 0000320193
+
+
+ 2021-04-16
+
+
+
+ 727
+
+
+ shares
+
+`
+
+func Example() {
+	var processed xbrl.XBRL
+
+	if err := xml.Unmarshal([]byte(doc), &processed); err != nil {
+		panic(err)
+	}
+
+	fact := processed.Facts[0]
+	if !fact.IsValid() {
+		panic("fact invalid!")
+	}
+
+	factType := fact.Type()
+	numericValue, err := fact.NumericValue()
+	if err != nil {
+		panic(err)
+	}
+
+	// Resolve the context and unit that the fact references by ID.
+	factContext := processed.ContextsByID[fact.ContextRef]
+	factUnit := processed.UnitsByID[*fact.UnitRef]
+
+	fmt.Printf("Fact: %s:%s (type: %s)\n", fact.XMLName.Space, fact.XMLName.Local, factType)
+	fmt.Printf(" %.0f %s on %s\n", numericValue, factUnit.String(), *factContext.Period.Instant)
+
+	// Output: Fact: ci:assets (type: non_fraction)
+	// 727 shares on 2021-04-16
+}
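The README's walk-through (a fact points at a context and, if numeric, at a unit) can also be sketched without the library. The snippet below is a minimal illustration using only the standard library. The `Fact`, `Context`, and `Unit` types and the `Describe` and `Label` helpers are hypothetical stand-ins invented for this sketch; they are not the types or methods of `xbrl-parser`. The values are the ones used in the README's EPS example.

```go
package main

import (
	"fmt"
	"strings"
)

// Unit is a hypothetical, simplified unit model (not the library's type).
// A simple unit has only Numerator measures; a ratio unit also has Denominator measures.
type Unit struct {
	Numerator   []string
	Denominator []string
}

// Label renders a unit as text, e.g. "shares" or "iso4217:USD / shares".
func (u Unit) Label() string {
	num := strings.Join(u.Numerator, " * ")
	if len(u.Denominator) == 0 {
		return num
	}
	return num + " / " + strings.Join(u.Denominator, " * ")
}

// Context is a hypothetical, simplified context: an entity and a duration period.
type Context struct {
	EntityID  string
	StartDate string
	EndDate   string
}

// Fact is a hypothetical, simplified fact. UnitRef is empty for non-numeric facts.
type Fact struct {
	Name       string
	Value      string
	ContextRef string
	UnitRef    string
}

// Describe joins a fact with the context and unit it references.
// It returns an error when a reference does not resolve.
func Describe(f Fact, contexts map[string]Context, units map[string]Unit) (string, error) {
	ctx, ok := contexts[f.ContextRef]
	if !ok {
		return "", fmt.Errorf("fact %s references unknown context %q", f.Name, f.ContextRef)
	}
	out := fmt.Sprintf("%s = %s", f.Name, f.Value)
	if f.UnitRef != "" {
		unit, ok := units[f.UnitRef]
		if !ok {
			return "", fmt.Errorf("fact %s references unknown unit %q", f.Name, f.UnitRef)
		}
		out += " " + unit.Label()
	}
	return fmt.Sprintf("%s for entity %s from %s to %s", out, ctx.EntityID, ctx.StartDate, ctx.EndDate), nil
}

func main() {
	contexts := map[string]Context{
		"c1": {EntityID: "0000320193", StartDate: "2020-12-27", EndDate: "2021-03-27"},
	}
	units := map[string]Unit{
		"u1": {Numerator: []string{"iso4217:USD"}, Denominator: []string{"shares"}},
	}
	fact := Fact{Name: "us-gaap:EarningsPerShareBasic", Value: "1.41", ContextRef: "c1", UnitRef: "u1"}

	description, err := Describe(fact, contexts, units)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(description)
	// Prints: us-gaap:EarningsPerShareBasic = 1.41 iso4217:USD / shares for entity 0000320193 from 2020-12-27 to 2021-03-27
}
```

Keeping contexts and units in maps keyed by ID mirrors how many facts can share one context or unit, and makes a broken reference an explicit error instead of a silent zero value.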