-
Notifications
You must be signed in to change notification settings - Fork 1
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
docs/introduction: import welcome from cuelang.org
Imported from master branch at commit c06bde0 file: /content/en/docs/about/_index.md with only those markup changes required to satisfy the current alpha processing chain. https://cuelang.org/issue/2604 captures making this page preprocessor-aware. The diff is: $ diff -wu <(git show c06bde0:content/en/docs/about/_index.md) content/docs/introduction/welcome/en.md --- /dev/fd/63 2023-09-19 20:33:16.302252821 +0100 +++ content/docs/introduction/welcome/en.md 2023-09-19 20:33:12.410262509 +0100 @@ -1,8 +1,8 @@ -+++ -title = "About" -weight = 100 -description = "How did CUE come about and what are its principles." -+++ +--- +title: "Welcome" +weight: 10 +description: "How did CUE come about and what are its principles." +--- ## Intro @@ -107,40 +107,34 @@ The middle column shows a possible schema for any municipality. On the right one sees a mix between data and schema as is exemplary of CUE. -{{< blocks/sidebyside >}} -<div class="col"> +{{< columns >}} Data -{{< highlight go >}} +```cue moscow: { name: "Moscow" pop: 11.92M capital: true } -{{< /highlight >}} -</div> - -<div class="col"> +``` +{{< columns-separator >}} Schema -{{< highlight go >}} +```cue municipality: { name: string pop: int capital: bool } -{{< /highlight >}} -</div> - -<div class="col"> +``` +{{< columns-separator >}} CUE -{{< highlight go >}} +```cue largeCapital: { name: string pop: >5M capital: true } -{{< /highlight >}} -</div> -{{< /blocks/sidebyside >}} +``` +{{< /columns >}} In general, in CUE one starts with a broad definition of a type, describing all possible instances. For cue-lang/docs-and-content#81. Preview-Path: /docs/introduction/welcome/ Signed-off-by: Jonathan Matthews <[email protected]> Change-Id: Ief0da7495fa46325d1d26e2629272738019f5335 Dispatch-Trailer: {"type":"trybot","CL":1169105,"patchset":3,"ref":"refs/changes/05/1169105/3","targetBranch":"alpha"}
- Loading branch information
1 parent
962841f
commit 1f59f37
Showing
2 changed files
with
614 additions
and
8 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,11 +1,314 @@ | ||
--- | ||
title: Welcome | ||
title: "Welcome" | ||
weight: 10 | ||
description: "How did CUE come about and what are its principles." | ||
--- | ||
|
||
## Heading 2 example | ||
## Intro | ||
|
||
This page will include a general welcome to CUE the open source project. | ||
CUE is an open-source data validation language and inference engine | ||
with its roots in logic programming. | ||
Although the language is not a general-purpose programming language, | ||
it has many applications, such as | ||
data validation, data templating, configuration, querying, | ||
code generation and even scripting. | ||
The inference engine can be used to validate | ||
data in code or to include it as part of a code generation pipeline. | ||
|
||
It will likely not contain any sub pages. | ||
A key thing that sets CUE apart from its peer languages | ||
is that it merges types and values into a single concept. | ||
Whereas in most languages types and values are strictly distinct, | ||
CUE orders them in a single hierarchy (a lattice, to be precise). | ||
This is a very powerful concept that allows CUE to do | ||
many fancy things. | ||
It also simplifies matters. | ||
For instance, there is no need for generics and enums, sum types | ||
and null coalescing are all the same thing. | ||
|
||
|
||
## Applications | ||
|
||
CUE's design ensures that combining CUE values in any | ||
order always gives the same result | ||
(it is associative, commutative and idempotent). | ||
This makes CUE particularly well-suited for cases where CUE | ||
constraints are combined from different sources: | ||
|
||
- Data validation: different departments or groups can each | ||
define their own constraints to apply to the same set of data. | ||
|
||
- Code extraction and generation: extract CUE definitions from | ||
multiple sources (Go code, Protobuf), combine them into a single | ||
definition, and use that to generate definitions in another | ||
format (e.g. OpenAPI). | ||
|
||
- Configuration: values can be combined from different sources | ||
without one having to import the other. | ||
|
||
The ordering of values also allows set containment analysis of entire | ||
configurations. | ||
Where most validation systems are limited to checking whether a concrete | ||
value matches a schema, CUE can validate whether any instance of | ||
one schema is also an instance of another (is it backwards compatible?), | ||
or compute a new schema that represents all instances that match | ||
two other schema. | ||
|
||
|
||
## History | ||
|
||
Although it is a very different language, the roots of CUE lie in GCL, | ||
the dominant configuration language in use at Google as of this writing. | ||
It was originally designed to configure Borg, the predecessor of Kubernetes. | ||
In fact, the original idea was to use graph unification as used in CUE for GCL. | ||
One of the authors of GCL had extensive experience with such systems and | ||
experienced the benefit of being able to compute and reason with types for the | ||
creation of powerful tooling. | ||
|
||
The graph unification model CUE is based on | ||
was in common use in computational linguistics at that time and was | ||
successfully used to manage grammars and lexicons of over 100k lines of | ||
declarative definitions. | ||
These were effectively very large | ||
configurations of something as irregular and complex as a human language. | ||
A property of these systems were that the types, or constraints, one | ||
defines validate the data while simultaneously reducing boilerplate. | ||
Overall, this approach seemed to be extremely well-suited | ||
for cloud configuration. | ||
|
||
However, the early design of GCL went for something simpler that coincidentally | ||
was also incompatible with the notion of graph unification. | ||
This simpler approach proved insufficient, but it was already too late to | ||
move to the earlier foreseen approach. | ||
Instead, an inheritance-based override model was adopted. | ||
Its complexity made the earlier foreseen tooling intractable | ||
and they never materialized. | ||
The same holds for the GCL offsprings that copied its model. | ||
|
||
CUE goes back to the original idea of using a constraint-based approach and | ||
also makes an effort to incorporate lessons learned from 15 years of GCL usage. | ||
This also includes lessons learned from offsprings and different approaches to | ||
configuration altogether. | ||
|
||
|
||
## Philosophy and principles | ||
|
||
### Types are Values | ||
|
||
CUE does not distinguish between values and types. | ||
This is a powerful notion that allows CUE to define ultra-detailed | ||
constraints, but it also simplifies things considerably: | ||
there is no separate schema or data definition language to learn | ||
and related language constructs such as sum types, enums, | ||
and even null coalescing collapse onto a single construct. | ||
|
||
Below is a demonstration of this concept. | ||
On the left one can see a JSON object (in CUE syntax) with some properties | ||
about the city of Moscow. | ||
The middle column shows a possible schema for any municipality. | ||
On the right one sees a mix between data and schema as is exemplary of CUE. | ||
|
||
{{< columns >}} | ||
Data | ||
```cue | ||
moscow: { | ||
name: "Moscow" | ||
pop: 11.92M | ||
capital: true | ||
} | ||
``` | ||
{{< columns-separator >}} | ||
Schema | ||
```cue | ||
municipality: { | ||
name: string | ||
pop: int | ||
capital: bool | ||
} | ||
``` | ||
{{< columns-separator >}} | ||
CUE | ||
```cue | ||
largeCapital: { | ||
name: string | ||
pop: >5M | ||
capital: true | ||
} | ||
``` | ||
{{< /columns >}} | ||
|
||
In general, in CUE one starts with a broad definition of a type, describing | ||
all possible instances. | ||
One then narrows down these definitions, possibly by combining constraints | ||
from different sources (departments, users), until a concrete data instance | ||
remains. | ||
|
||
|
||
### Push, not pull, constraints | ||
|
||
CUE's constraints act as data validators, but also double as | ||
a mechanism to reduce boilerplate. | ||
This is a powerful approach, but requires some different thinking. | ||
With traditional inheritance approaches one specifies the templates that | ||
are to be inherited from at each point they should be used. | ||
In CUE, instead, one selects a set of nodes in the configuration to which | ||
to apply a template. | ||
This selection can be at a different point in the configuration altogether. | ||
|
||
Another way to view this, a JSON configuration, say, can be | ||
defined as a sequence of path-leaf values. | ||
For instance, | ||
```json | ||
{ | ||
"a": 3, | ||
"b": { | ||
"c": "foo" | ||
} | ||
} | ||
``` | ||
|
||
could be represented as | ||
``` | ||
"a": 3 | ||
"b": "c": "foo" | ||
``` | ||
All the information of the original JSON file is retained in this | ||
representation. | ||
|
||
CUE generalizes this notion to the following pattern: | ||
``` | ||
<set of nodes>: <constraints> | ||
``` | ||
Each field declaration in CUE defines a set of nodes to which to apply | ||
a specific constraint. | ||
Because order doesn't matter, multiple constraints can be applied to the | ||
same nodes, all of which need to apply simultaneously. | ||
Such constraints may even be in different files. | ||
But they may never contradict each other: | ||
if one declaration says a field is `5`, another may not override it to be `6`. | ||
Declaring a field to be both `>5` and `<10` is valid, though. | ||
|
||
This approach is more restricted than full-blown inheritance; | ||
it may not be possible to reuse existing configurations. | ||
On the other hand, it is also a more powerful boilerplate remover. | ||
For instance, suppose each job in a set needs to use a specific | ||
template. | ||
Instead of having to spell this out at each point, | ||
one can declare this separately in a one blanket statement. | ||
|
||
So instead of | ||
``` | ||
jobs: { | ||
foo: acmeMonitoring & { /* ... */ } | ||
bar: acmeMonitoring & { /* ... */ } | ||
baz: acmeMonitoring & { /* ... */ } | ||
} | ||
``` | ||
one can write | ||
|
||
``` | ||
jobs: [string]: acmeMonitoring | ||
jobs: { | ||
foo: { /* ... */ } | ||
bar: { /* ... */ } | ||
baz: { /* ... */ } | ||
} | ||
``` | ||
There is no need to repeat the reference to the monitoring template for | ||
each job, as the first already states that all jobs _must_ use `acmeMonitoring`. | ||
Such requirements can be specified across files. | ||
|
||
This approach not only reduces the boilerplate contained in `acmeMonitoring` | ||
but also removes the repetitiveness of having to specify | ||
this template for each job in `jobs`. | ||
At the same time, this statement act as a type enforcement. | ||
This dual function is a key aspect of CUE and | ||
typed feature structure languages in general. | ||
|
||
This approach breaks down, of course, if the restrictions in | ||
`acmeMonitoring` are too stringent and jobs need to override them. | ||
To this extent, CUE provides mechanisms to allow defaults, opt-out, and | ||
soft constraints. | ||
|
||
|
||
### Separate configuration from computation | ||
|
||
There comes a time that one (seemingly) will need do complex | ||
computations to generate some configuration data. | ||
But simplicity of a configuration language can be paramount when one quickly | ||
needs to make changes. | ||
These are obviously conflicting interests. | ||
|
||
CUE takes the stance that computation and configuration should | ||
be separated. | ||
And CUE actually makes this easy. | ||
The data that needs to be computed can be generated outside of CUE | ||
and put in a file that is to be mixed in. | ||
The data can even be generated in CUE's scripting layer and automatically | ||
injected in a configuration pipeline. | ||
Both approaches rely on CUE's property that the order in which this data gets | ||
added is irrelevant. | ||
|
||
|
||
|
||
### Be useful at all scales | ||
|
||
The usefulness of a language may depend on the scale of the project. | ||
Having too many different languages can put a cognitive strain on | ||
developers, though, and migrating from one language to another as | ||
scaling requirements change can be very costly. | ||
CUE aims to minimize these costs | ||
by covering a myriad of data- and configuration-related tasks at all scales. | ||
|
||
**Small scale** | ||
At small scales, reducing boilerplate in configurations is not necessarily | ||
the best thing to do. | ||
Even at a small scale, however, repetition can be error prone. | ||
For such cases, CUE can define schema to validate otherwise | ||
typeless data files. | ||
|
||
**Medium scale** | ||
As soon the desire arises to reduce boilerplate, the `cue` tool can | ||
help to automatically rewrite configurations. | ||
See the Quick and Dirty section of the | ||
[Kubernetes tutorial](/docs/tutorials/kubernetes) | ||
for an example using the `import` and `trim` tool. | ||
Thousands of lines can be obliterated automatically using this approach. | ||
|
||
**Large scale** | ||
CUE's underlying formalism was developed for large-scale configuration. | ||
Its import model incorporates best practices for large-scale engineering | ||
and it is optimized for automation. | ||
A key to this is advanced tooling. | ||
The mathematical model underlying CUE's operations allows for | ||
automation that is untractable for most other approaches. | ||
CUE's `trim` command is an example of this. | ||
|
||
|
||
### Tooling | ||
|
||
Automation is key. | ||
Nowadays, a good chunk of code gets generated, analyzed, reformatted, | ||
and so on by machines. | ||
The CUE language, APIs, and tooling have been designed to allow for | ||
machine manipulation. | ||
Aspects of this are: | ||
|
||
- make the language easy to scan and parse, | ||
- restrictions on imports, | ||
- allow any piece of data to be split across files and generated | ||
from different sources, | ||
- define packages at the directory level, | ||
- and of course its value and type model. | ||
|
||
The order independence also plays a key role in this. | ||
It allows combining constraints from various sources without having | ||
to define any order in which they are to be applied to get | ||
predictable results. | ||
|
||
|
||
<!-- something about this? | ||
Not turing complete. | ||
Run in contexts where cost is hard to attribute. | ||
Easier to make claims about termination (smart contracts). | ||
--> |
Oops, something went wrong.