docs/introduction: import welcome from cuelang.org

Imported from master branch at commit c06bde0 file: /content/en/docs/about/_index.md with only those markup changes required to satisfy the current alpha processing chain. https://cuelang.org/issue/2604 captures making this page preprocessor-aware. The diff is: $ diff -wu <(git show c06bde0:content/en/docs/about/_index.md) content/docs/introduction/welcome/en.md --- /dev/fd/63 2023-09-19 20:33:16.302252821 +0100 +++ content/docs/introduction/welcome/en.md 2023-09-19 20:33:12.410262509 +0100 @@ -1,8 +1,8 @@ -+++ -title = "About" -weight = 100 -description = "How did CUE come about and what are its principles." -+++ +--- +title: "Welcome" +weight: 10 +description: "How did CUE come about and what are its principles." +--- ## Intro @@ -107,40 +107,34 @@ The middle column shows a possible schema for any municipality. On the right one sees a mix between data and schema as is exemplary of CUE. -{{< blocks/sidebyside >}} -<div class="col"> +{{< columns >}} Data -{{< highlight go >}} +```cue moscow: { name: "Moscow" pop: 11.92M capital: true } -{{< /highlight >}} -</div> - -<div class="col"> +``` +{{< columns-separator >}} Schema -{{< highlight go >}} +```cue municipality: { name: string pop: int capital: bool } -{{< /highlight >}} -</div> - -<div class="col"> +``` +{{< columns-separator >}} CUE -{{< highlight go >}} +```cue largeCapital: { name: string pop: >5M capital: true } -{{< /highlight >}} -</div> -{{< /blocks/sidebyside >}} +``` +{{< /columns >}} In general, in CUE one starts with a broad definition of a type, describing all possible instances. For cue-lang/docs-and-content#81. Preview-Path: /docs/introduction/welcome/ Signed-off-by: Jonathan Matthews <[email protected]> Change-Id: Ief0da7495fa46325d1d26e2629272738019f5335 Dispatch-Trailer: {"type":"trybot","CL":1169105,"patchset":3,"ref":"refs/changes/05/1169105/3","targetBranch":"alpha"}
cue-lang · Sep 19, 2023 · 1f59f37 · 1f59f37
1 parent 962841f
commit 1f59f37
Show file tree

Hide file tree

Showing 2 changed files with 614 additions and 8 deletions.
diff --git a/content/docs/introduction/welcome/en.md b/content/docs/introduction/welcome/en.md
@@ -1,11 +1,314 @@
 ---
-title: Welcome
+title: "Welcome"
 weight: 10
+description: "How did CUE come about and what are its principles."
 ---
 
-## Heading 2 example
+## Intro
 
-This page will include a general welcome to CUE the open source project.
+CUE is an open-source data validation language and inference engine
+with its roots in logic programming.
+Although the language is not a general-purpose programming language,
+it has many applications, such as
+data validation, data templating, configuration, querying,
+code generation and even scripting.
+The inference engine can be used to validate
+data in code or to include it as part of a code generation pipeline.
 
-It will likely not contain any sub pages.
+A key thing that sets CUE apart from its peer languages
+is that it merges types and values into a single concept.
+Whereas in most languages types and values are strictly distinct,
+CUE orders them in a single hierarchy (a lattice, to be precise).
+This is a very powerful concept that allows CUE to do
+many fancy things.
+It also simplifies matters.
+For instance, there is no need for generics and enums, sum types
+and null coalescing are all the same thing.
 
+
+## Applications
+
+CUE's design ensures that combining CUE values in any
+order always gives the same result
+(it is associative, commutative and idempotent).
+This makes CUE particularly well-suited for cases where CUE
+constraints are combined from different sources:
+
+- Data validation: different departments or groups can each
+define their own constraints to apply to the same set of data.
+
+- Code extraction and generation: extract CUE definitions from
+multiple sources (Go code, Protobuf), combine them into a single
+definition, and use that to generate definitions in another
+format (e.g. OpenAPI).
+
+- Configuration: values can be combined from different sources
+  without one having to import the other.
+
+The ordering of values also allows set containment analysis of entire
+configurations.
+Where most validation systems are limited to checking whether a concrete
+value matches a schema, CUE can validate whether any instance of
+one schema is also an instance of another (is it backwards compatible?),
+or compute a new schema that represents all instances that match
+two other schema.
+
+
+## History
+
+Although it is a very different language, the roots of CUE lie in GCL,
+the dominant configuration language in use at Google as of this writing.
+It was originally designed to configure Borg, the predecessor of Kubernetes.
+In fact, the original idea was to use graph unification as used in CUE for GCL.
+One of the authors of GCL had extensive experience with such systems and
+experienced the benefit of being able to compute and reason with types for the
+creation of powerful tooling.
+
+The graph unification model CUE is based on
+was in common use in computational linguistics at that time and was
+successfully used to manage grammars and lexicons of over 100k lines of
+declarative definitions.
+These were effectively very large
+configurations of something as irregular and complex as a human language.
+A property of these systems were that the types, or constraints, one
+defines validate the data while simultaneously reducing boilerplate.
+Overall, this approach seemed to be extremely well-suited
+for cloud configuration.
+
+However, the early design of GCL went for something simpler that coincidentally
+was also incompatible with the notion of graph unification.
+This simpler approach proved insufficient, but it was already too late to
+move to the earlier foreseen approach.
+Instead, an inheritance-based override model was adopted.
+Its complexity made the earlier foreseen tooling intractable
+and they never materialized.
+The same holds for the GCL offsprings that copied its model.
+
+CUE goes back to the original idea of using a constraint-based approach and
+also makes an effort to incorporate lessons learned from 15 years of GCL usage.
+This also includes lessons learned from offsprings and different approaches to
+configuration altogether.
+
+
+## Philosophy and principles
+
+### Types are Values
+
+CUE does not distinguish between values and types.
+This is a powerful notion that allows CUE to define ultra-detailed
+constraints, but it also simplifies things considerably:
+there is no separate schema or data definition language to learn
+and related language constructs such as sum types, enums,
+and even null coalescing collapse onto a single construct.
+
+Below is a demonstration of this concept.
+On the left one can see a JSON object (in CUE syntax) with some properties
+about the city of Moscow.
+The middle column shows a possible schema for any municipality.
+On the right one sees a mix between data and schema as is exemplary of CUE.
+
+{{< columns >}}
+Data
+```cue
+moscow: {
+  name:    "Moscow"
+  pop:     11.92M
+  capital: true
+}
+```
+{{< columns-separator >}}
+Schema
+```cue
+municipality: {
+  name:    string
+  pop:     int
+  capital: bool
+}
+```
+{{< columns-separator >}}
+CUE
+```cue
+largeCapital: {
+  name:    string
+  pop:     >5M
+  capital: true
+}
+```
+{{< /columns >}}
+
+In general, in CUE one starts with a broad definition of a type, describing
+all possible instances.
+One then narrows down these definitions, possibly by combining constraints
+from different sources (departments, users), until a concrete data instance
+remains.
+
+
+### Push, not pull, constraints
+
+CUE's constraints act as data validators, but also double as
+a mechanism to reduce boilerplate.
+This is a powerful approach, but requires some different thinking.
+With traditional inheritance approaches one specifies the templates that
+are to be inherited from at each point they should be used.
+In CUE, instead, one selects a set of nodes in the configuration to which
+to apply a template.
+This selection can be at a different point in the configuration altogether.
+
+Another way to view this, a JSON configuration, say, can be
+defined as a sequence of path-leaf values.
+For instance,
+```json
+{
+  "a": 3,
+  "b": {
+    "c": "foo"
+  }
+}
+```
+
+could be represented as
+```
+"a": 3
+"b": "c": "foo"
+```
+All the information of the original JSON file is retained in this
+representation.
+
+CUE generalizes this notion to the following pattern:
+```
+<set of nodes>: <constraints>
+```
+Each field declaration in CUE defines a set of nodes to which to apply
+a specific constraint.
+Because order doesn't matter, multiple constraints can be applied to the
+same nodes, all of which need to apply simultaneously.
+Such constraints may even be in different files.
+But they may never contradict each other:
+if one declaration says a field is `5`, another may not override it to be `6`.
+Declaring a field to be both `>5` and `<10` is valid, though.
+
+This approach is more restricted than full-blown inheritance;
+it may not be possible to reuse existing configurations.
+On the other hand, it is also a more powerful boilerplate remover.
+For instance, suppose each job in a set needs to use a specific
+template.
+Instead of having to spell this out at each point,
+one can declare this separately in a one blanket statement.
+
+So instead of
+```
+jobs: {
+  foo: acmeMonitoring & { /* ... */ }
+  bar: acmeMonitoring & { /* ... */ }
+  baz: acmeMonitoring & { /* ... */ }
+}
+```
+one can write
+
+```
+jobs: [string]: acmeMonitoring
+
+jobs: {
+    foo: { /* ... */ }
+    bar: { /* ... */ }
+    baz: { /* ... */ }
+}
+```
+There is no need to repeat the reference to the monitoring template for
+each job, as the first already states that all jobs _must_ use `acmeMonitoring`.
+Such requirements can be specified across files.
+
+This approach not only reduces the boilerplate contained in `acmeMonitoring`
+but also removes the repetitiveness of having to specify
+this template for each job in `jobs`.
+At the same time, this statement act as a type enforcement.
+This dual function is a key aspect of CUE and
+typed feature structure languages in general.
+
+This approach breaks down, of course, if the restrictions in
+`acmeMonitoring` are too stringent and jobs need to override them.
+To this extent, CUE provides mechanisms to allow defaults, opt-out, and
+soft constraints.
+
+
+### Separate configuration from computation
+
+There comes a time that one (seemingly) will need do complex
+computations to generate some configuration data.
+But simplicity of a configuration language can be paramount when one quickly
+needs to make changes.
+These are obviously conflicting interests.
+
+CUE takes the stance that computation and configuration should
+be separated.
+And CUE actually makes this easy.
+The data that needs to be computed can be generated outside of CUE
+and put in a file that is to be mixed in.
+The data can even be generated in CUE's scripting layer and automatically
+injected in a configuration pipeline.
+Both approaches rely on CUE's property that the order in which this data gets
+added is irrelevant.
+
+
+
+### Be useful at all scales
+
+The usefulness of a language may depend on the scale of the project.
+Having too many different languages can put a cognitive strain on
+developers, though, and migrating from one language to another as
+scaling requirements change can be very costly.
+CUE aims to minimize these costs
+by covering a myriad of data- and configuration-related tasks at all scales.
+
+**Small scale**
+At small scales, reducing boilerplate in configurations is not necessarily
+the best thing to do.
+Even at a small scale, however, repetition can be error prone.
+For such cases, CUE can define schema to validate otherwise
+typeless data files.
+
+**Medium scale**
+As soon the desire arises to reduce boilerplate, the `cue` tool can
+help to automatically rewrite configurations.
+See the Quick and Dirty section of the
+[Kubernetes tutorial](/docs/tutorials/kubernetes)
+for an example using the `import` and `trim` tool.
+Thousands of lines can be obliterated automatically using this approach.
+
+**Large scale**
+CUE's underlying formalism was developed for large-scale configuration.
+Its import model incorporates best practices for large-scale engineering
+and it is optimized for automation.
+A key to this is advanced tooling.
+The mathematical model underlying CUE's operations allows for
+automation that is untractable for most other approaches.
+CUE's `trim` command is an example of this.
+
+
+### Tooling
+
+Automation is key.
+Nowadays, a good chunk of code gets generated, analyzed, reformatted,
+and so on by machines.
+The CUE language, APIs, and tooling have been designed to allow for
+machine manipulation.
+Aspects of this are:
+
+- make the language easy to scan and parse,
+- restrictions on imports,
+- allow any piece of data to be split across files and generated
+  from different sources,
+- define packages at the directory level,
+- and of course its value and type model.
+
+The order independence also plays a key role in this.
+It allows combining constraints from various sources without having
+to define any order in which they are to be applied to get
+predictable results.
+
+
+<!-- something about this?
+Not turing complete.
+Run in contexts where cost is hard to attribute.
+Easier to make claims about termination (smart contracts).
+-->