This living document serves as a community- and consensus-driven glossary of terms for C-Accel 2019 teams in tracks A and B, with the hope of accelerating convergence on common goals and measures of success. In cases where agreement has not yet been reached or where a plurality of views is beneficial, multiple versions can be maintained, and a term in a single version may be defined multiple times by indicating which teams use each definition. This will enable teams to interoperate without enforcing a single artificial view.
- AI (Artificial Intelligence)
- Coreference Resolution
- Decision Making
- Embedding
- FAIR
- GeoAI (Geographic Artificial Intelligence)
- Geoparsing
- GIR (Geographic Information Retrieval)
- Geospatial Interoperability
- Knowledge
- Knowledge Extraction
- Knowledge Graph (KG)
- Knowledge Graph Schema
- Link Prediction
- Linked Data
- Named-Entity Recognition
- Natural Language Processing
- Ontology
- Ontology Alignment
- Open Knowledge Network (OKN)
- OWL (Web Ontology Language)
- Public Health Informatics
- Question Answering
- RDF (Resource Description Framework)
- Reasoning
- Representation Learning
- Semantic Interoperability
- Semantic Web
- Spatial Decision Support
- Spatial Decision Support Systems (SDSS)
In contrast to natural intelligence, which is an emergent property of evolved organic life, artificial intelligence is the research field of building machines and programming computers to perform actions that seem intelligent, such as 'learning', 'problem solving' and 'reasoning'. Artificial Intelligence has many subdisciplines, including [Machine Learning], which relies on statistical and mathematical models to analyze data, and [Knowledge Representation], which studies methods and algorithms to encode knowledge (such as in knowledge graphs and ontologies) by means of formal logic, and to reason over it. In computer science today, one prominent approach to [Machine Learning] uses [Artificial Neural Networks], which are computational models inspired by the behavior of human neurons and can be trained to solve certain types of problems even in environments that contain noise. Techniques developed in artificial intelligence have been used in applications such as image classification and segmentation, expert systems, language translation, self-driving cars, and more.
[Used by: A-6677]
Coreference resolution, or instance matching more generally, is the task of finding all expressions in a text that refer to the same entity. More specifically, in the Semantic Web domain, given two knowledge graphs G1 and G2 as input, instance matching is defined as the process of comparing instance i1 in G1 and instance i2 in G2 based on some similarity measure in order to assess whether i1 and i2 are actually one and the same entity. Usually, the higher the similarity between two instances, the higher the probability that they refer to the same real-world entity.
[Used by: A-6677]
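The similarity-based comparison described above can be sketched as follows. This is a minimal illustration, not a method prescribed by any team: the instances, the choice of Jaccard similarity over (property, value) pairs, and the `THRESHOLD` value are all assumptions made for the example.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: size of the intersection over size of the union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def instance_similarity(i1: dict, i2: dict) -> float:
    """Compare two instances by the (property, value) pairs they share."""
    return jaccard(set(i1.items()), set(i2.items()))


# Hypothetical instances taken from two knowledge graphs G1 and G2.
g1_instance = {"label": "santa barbara", "type": "City", "state": "California"}
g2_instance = {"label": "santa barbara", "type": "City", "country": "USA"}
g2_other = {"label": "santa cruz", "type": "City", "country": "USA"}

THRESHOLD = 0.4  # assumed cut-off; in practice it would be tuned on labeled data

score_same = instance_similarity(g1_instance, g2_instance)   # 2 shared pairs of 4
score_other = instance_similarity(g1_instance, g2_other)     # 1 shared pair of 5
is_match = score_same >= THRESHOLD
```

Real instance matchers typically combine several measures (string, numeric, spatial, structural) rather than relying on exact equality of property values as this sketch does.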
Decision making can be regarded as an outcome of mental processes (cognitive process) leading to the selection of a course of action among several alternatives. The output can be an action or an opinion of choice. From a cognitive perspective, the decision making process is a continuous process integrated in the interaction with the environment. From a normative perspective, the analysis of individual decisions is concerned with the logic of decision making and rationality and the invariant choice it leads to. At another level, it might be regarded as a problem solving activity which is terminated when a satisfactory solution is found. Logical decision making is an important part of all science-based professions, where specialists apply their knowledge in a given area to making informed decisions.
Source of Description: http://en.wikipedia.org/wiki/Decision-making
[Used by: A-7908]
In AI, embedding refers to the practice of mapping data drawn from a defined vocabulary in a high-dimensional space to vectors of real numbers in a lower-dimensional space. An embedding can represent words, phrases, sentences, paragraphs, images, sound, entities and relations in a knowledge graph, and so on.
[Used by: A-6677]
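As a rough illustration of the mapping described above, the sketch below assigns each item of a small vocabulary a vector of real numbers and compares vectors by cosine similarity. The vocabulary, the dimension `DIM`, and the random values are placeholders; in practice the vectors are learned from data rather than drawn at random.

```python
import math
import random

random.seed(0)  # make the toy vectors reproducible

vocabulary = ["river", "lake", "city", "town"]
DIM = 3  # assumed embedding dimension, smaller than the vocabulary size

# Embedding table: one real-valued vector per vocabulary item.
# Random values stand in for vectors that would normally be learned.
embedding = {
    word: [random.uniform(-1.0, 1.0) for _ in range(DIM)] for word in vocabulary
}


def embed(word: str) -> list:
    """Look up the vector assigned to a vocabulary item."""
    return embedding[word]


def cosine(u: list, v: list) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


similarity = cosine(embed("river"), embed("lake"))
```

With learned embeddings, related items such as "river" and "lake" would be expected to end up closer together than unrelated ones; with the random vectors used here, the similarity value carries no meaning.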
FAIR is a set of guiding principles to make data Findable, Accessible, Interoperable, and Reusable. The term FAIR was launched at a Lorentz workshop in 2014, which produced a set of principles that were published in 2016.
These include 15 principles and 14 metrics meant to quantify the readiness of data in terms of FAIR.
URI: https://www.force11.org/group/fairgroup/fairprinciples
[Used by: A-7152]
GeoAI is the combination of geography and artificial intelligence (AI). GeoAI leverages the state-of-the-art in AI to address geospatial challenges, such as weather prediction, land use and land cover mapping, and traffic forecasting. GeoAI also contributes to the Artificial Intelligence community by introducing spatiotemporal knowledge into methods design. One example is using spatially-explicit reinforcement learning to summarize places in a knowledge graph.
[Used by: A-6677]
Geoparsing is a special toponym resolution process of converting free-text descriptions of places (such as "twenty miles northeast of Jalalabad") into unambiguous geographic identifiers, such as geographic coordinates expressed as latitude-longitude.
[Used by: A-7136]
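The conversion of a relative place description into coordinates can be sketched as below, assuming the text has already been parsed into a distance, a direction, and a place name. The one-entry gazetteer, its approximate coordinates, and the function names are illustrative assumptions; the offset uses the standard spherical destination-point formula.

```python
import math

EARTH_RADIUS_KM = 6371.0
KM_PER_MILE = 1.609344

# Toy gazetteer; the coordinates are approximate and for illustration only.
GAZETTEER = {"jalalabad": (34.43, 70.45)}

# Compass directions mapped to bearings in degrees clockwise from north.
BEARINGS = {
    "north": 0.0, "northeast": 45.0, "east": 90.0, "southeast": 135.0,
    "south": 180.0, "southwest": 225.0, "west": 270.0, "northwest": 315.0,
}


def offset_point(lat, lon, distance_km, bearing_deg):
    """Destination on a sphere given a start point, distance, and initial bearing."""
    phi1, lam1 = math.radians(lat), math.radians(lon)
    theta = math.radians(bearing_deg)
    delta = distance_km / EARTH_RADIUS_KM  # angular distance
    phi2 = math.asin(
        math.sin(phi1) * math.cos(delta)
        + math.cos(phi1) * math.sin(delta) * math.cos(theta)
    )
    lam2 = lam1 + math.atan2(
        math.sin(theta) * math.sin(delta) * math.cos(phi1),
        math.cos(delta) - math.sin(phi1) * math.sin(phi2),
    )
    return math.degrees(phi2), math.degrees(lam2)


def resolve(miles, direction, place):
    """Resolve '<miles> miles <direction> of <place>' to (latitude, longitude)."""
    lat0, lon0 = GAZETTEER[place.lower()]
    return offset_point(lat0, lon0, miles * KM_PER_MILE, BEARINGS[direction.lower()])


lat, lon = resolve(20, "northeast", "Jalalabad")
```

A full geoparser would also have to recognize the place name in the text and disambiguate it among places that share the same name, which this sketch leaves out.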
Geographic information retrieval, or geographical information retrieval, is an extension of information retrieval with geographic information. GIR aims at solving textual queries that are best approached from a geographical perspective, such as "How many earthquakes occurred from January to March last year in California?". It is common in GIR to separate the text indexing and analysis from the geographic indexing. Semantic similarity and word-sense disambiguation are important components of GIR. To identify place names, GIR often relies on gazetteers, which are geographical dictionaries or directories used in conjunction with a map or atlas.
[Used by: A-6677]
Interoperability measures the ability of different computer systems to communicate and access, exchange, integrate, and (re)use data and other resources autonomously without human intervention. Geospatial interoperability deals with the interoperability among geospatial systems, and software in particular. Standardization is a typical means to achieve interoperability. For instance, the International Organization for Standardization (ISO) has defined a series of metadata standards (ISO 19115, ISO 19139) to standardize the organization of metadata, and the Open Geospatial Consortium (OGC) has developed a wide range of web service specifications that define standardized interfaces for the representation, retrieval and parsing of geospatial data of various types.
[Used by: A-6677]
A term typically referring to the second concept from the top of a DIKW (Data, Information, Knowledge, Wisdom) model pyramid. The term is used to indicate internalized, synthesized, organized, or familiar information that can readily be applied, for instance, in domain transfer tasks, and used to put new information into perspective. In the context of symbolic representations, the term is often used to contrast methods that aim at making semantics machine-understandable from those that merely operate on a syntactic level. In the context of knowledge graphs and in contrast to epistemology, the term is loosely used to describe statements (about the world) irrespective of whether these statements are factual.
[Used by: A-6677]
Knowledge extraction is the creation of knowledge from structured (relational databases, XML) and unstructured (text, documents, images) sources. The resulting knowledge needs to be in a machine-readable and machine-interpretable format and must represent knowledge in a manner that facilitates inferencing. It requires either the reuse of existing formal knowledge (reusing identifiers or ontologies) or the generation of a schema based on the source data.
[Used by: A-7136, A-6677]
A combination of technologies, specifications, and data cultures for densely interconnecting (Web-scale) data across domains in a way that both humans and machines can read and reason over. The term knowledge graph itself does not prescribe any particular technology stack. More formally, a knowledge graph (as a set of statements) can be thought of as a node- and edge-labeled directed multigraph. The largest publicly available knowledge graph is the so-called Linked Data cloud based on the RDF/Semantic Web technology stack.
[Used by: A-6677]
A Knowledge Graph Schema informs the structure of a knowledge graph. It can be expressed, for instance, by means of an ontology using OWL, or the SHACL Shapes Constraint Language, both of which are W3C standards. Knowledge Graph Schemas can often be understood to be knowledge graphs of classes (types) and their relationships.
[Used by: A-6677]
Link Prediction is a field of research aimed at identifying missing relationships (labeled edges) in knowledge graphs. Link Prediction typically includes entity prediction, which is to predict the subject given the relation and the object, or to predict the object given the subject and the relation. It can also involve relation prediction, which aims to predict possible relations between two entities. Knowledge graph embedding techniques are a widely studied approach to Link Prediction.
[Used by: A-6677]
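The entity prediction task described above can be sketched with a translational scoring function in the style of TransE, one of several knowledge graph embedding models and chosen here only for illustration. The entities, the relation, and the hand-picked two-dimensional vectors are assumptions; real systems learn the vectors from the graph.

```python
import math

# Hand-picked 2-d toy embeddings; real systems learn these from the graph.
entity_vec = {
    "Santa Barbara": [1.0, 0.0],
    "Los Angeles": [1.2, 0.1],
    "California": [1.0, 1.0],
    "Nevada": [3.0, 2.0],
}
relation_vec = {"locatedIn": [0.0, 1.0]}


def transe_distance(head: str, relation: str, tail: str) -> float:
    """Distance between (head + relation) and tail; lower means more plausible."""
    h, r, t = entity_vec[head], relation_vec[relation], entity_vec[tail]
    return math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def predict_tail(head: str, relation: str) -> list:
    """Rank all other entities as candidate objects for (head, relation, ?)."""
    candidates = [e for e in entity_vec if e != head]
    return sorted(candidates, key=lambda t: transe_distance(head, relation, t))


ranking = predict_tail("Santa Barbara", "locatedIn")
```

Predicting the subject given the relation and the object works the same way with the roles of head and tail swapped; relation prediction would instead rank the entries of `relation_vec` for a fixed pair of entities.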
Linked data currently constitutes the largest publicly available knowledge graph, expressed using W3C standards RDF and OWL. The Semantic Web field has been creating linked data since about 2007. https://lod-cloud.net/ lists over 1,200 interconnected knowledge graphs which are publicly accessible. A count from 2015 identified over 37 billion RDF triples, i.e., node-edge-node knowledge graph statements which could be retrieved from the World Wide Web.
[Used by: A-6677]
Named-entity recognition (NER), also known as entity identification, entity chunking and entity extraction, is a subtask of information extraction that seeks to locate named entity mentions in unstructured text and classify them into pre-defined categories such as person names, organizations, locations, medical codes, time expressions, quantities, monetary values, percentages, etc.
[Used by: A-7136, A-6677]
Natural language processing (NLP) is a subfield of linguistics, computer science, information engineering, and artificial intelligence concerned with the interactions between computers and human (natural) languages, in particular how to program computers to process and analyze large amounts of natural language data.
[Used by: A-7136, A-6677]
A shared domain model, usually expressed using the W3C standard Web Ontology Language (OWL). OWL can be serialized in RDF; that is, an OWL document can be understood to be a knowledge graph of classes (types) and their relationships. Ontologies can constitute knowledge graph schemas.
[Used by: A-6677]
Ontology matching/alignment or Knowledge Graph Schema matching aims at finding correspondences between semantically related entities of different ontologies. These correspondences may stand for equivalence as well as other relations, such as subsumption or disjointness, between ontology entities. Ontology entities, in turn, usually denote the named entities of ontologies, such as classes, properties or individuals. However, these entities may also be more complex expressions, such as formulas, concept definitions, queries or term building expressions. Ontology matching results, called alignments, can thus express with various degrees of precision the relations between the ontologies under consideration. Therefore, ontology alignment can be used for various tasks, such as ontology merging, query answering, data translation or for browsing the semantic web. For example, the library can take advantage of alignments for automatically ordering a book and the seller can use them for checking the availability of a reference by the library. Matching ontologies enables the knowledge and data expressed in the matched ontologies to interoperate. It is thus of utmost importance for the applications whose interoperability is jeopardised by heterogeneous ontologies.
[Used by: A-6677]
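A very simple matcher that proposes equivalence correspondences from label similarity alone can be sketched as follows. The class labels of the two ontologies, the use of `difflib.SequenceMatcher` as the similarity measure, and the threshold of 0.8 are assumptions for illustration; practical matchers also exploit structure, instances, and background knowledge.

```python
from difflib import SequenceMatcher

# Hypothetical class labels from two ontologies, e.g. a library and a seller.
library_classes = ["Book", "Author", "Publisher"]
seller_classes = ["book", "Writer", "PublishingHouse", "Authors"]


def label_similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def align(source: list, target: list, threshold: float = 0.8) -> list:
    """Return (source, target, score) for the best match of each source label
    whose score reaches the threshold."""
    alignment = []
    for s in source:
        best = max(target, key=lambda t: label_similarity(s, t))
        score = label_similarity(s, best)
        if score >= threshold:
            alignment.append((s, best, round(score, 2)))
    return alignment


alignment = align(library_classes, seller_classes)
```

In this toy run "Publisher" stays unmatched and the synonym pair "Author" and "Writer" is not recognized, which shows why purely lexical matching is only a starting point and why correspondences other than equivalence, such as subsumption, need richer techniques.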
It may make sense to distinguish between open-knowledge networks and open knowledge-networks. The first case implies a social structure that provides the means to make knowledge openly available. The second case, in contrast, describes the structure of the data and the mode of contributions to such a structure. To avoid conflicts with the narrower and more technical notion of a knowledge graph, we define an open knowledge network in the first sense. An OKN is a network of partners across academia, industry, and government that sets out to foster the publication, retrieval, and integration of openly available knowledge. Such a network may also foster a culture of opening up otherwise siloed knowledge.
[Used by: A-6677]
A W3C standard for expressing ontologies, in particular as knowledge graph schemas. OWL can be serialized in RDF; that is, an OWL document can be understood to be a knowledge graph of classes (types) and their relationships. OWL is based on description logic, which essentially is a decidable sublanguage of first-order predicate logic. As such, OWL allows for logical (deductive) reasoning, and an OWL document together with an (RDF) knowledge graph constitutes a knowledge base in the sense used in the subfield of Artificial Intelligence known as "Knowledge Representation and Reasoning". For standards documents, see https://www.w3.org/TR/2012/REC-owl2-primer-20121211/
[Used by: A-6677]
Public health informatics: the systematic application of information and computer science and technology to public health practice, research, and learning. It is one of the subdomains of health informatics.
[Used by: A-7136]
In the field of [natural language processing], Question Answering (QA) refers to the methods, processes, and systems which allow users to ask questions in the form of natural language sentences and receive one or more answers, often in the form of sentences. Almost all QA systems answer a given question based on their internal [knowledge bases] (KB). According to the nature of such knowledge bases, current QA research can be classified into three categories: unstructured data-based QA (e.g. QA systems based on unstructured text), semi-structured table-based QA (e.g. QA systems based on tables from Wikipedia pages without any schema information), and structured-KB-based QA (so-called [semantic parsing]). In the Semantic Web field, question answering over knowledge graphs is considered one type of structured-KB-based QA, which aims at translating a natural language question into a machine-understandable program.
[Used by: A-6677]
The Resource Description Framework is a W3C standard (2004; revised as RDF 1.1 in 2014) for expressing linked data and knowledge graphs. A corresponding knowledge graph schema can be expressed as an ontology in OWL. A knowledge graph is expressed in RDF as a set of RDF triples, that is, of node-edge-node relations in the knowledge graph. Nodes and edges are identified using IRIs. As part of the RDF standard, RDF Schema provides vocabulary for node and edge types (called classes and properties, respectively) and simple relationships between them. For more complex relationships, the Web Ontology Language OWL can be used. For standards documents, see https://www.w3.org/TR/rdf11-primer/
[Used by: A-6677]
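The idea of a knowledge graph as a set of node-edge-node triples identified by IRIs can be sketched in plain Python as below. This is not an RDF library: the `http://example.org/` IRIs and the `match` helper are hypothetical, literal values are omitted for brevity, and only the `rdf:type` IRI comes from the RDF vocabulary itself.

```python
# IRIs under example.org are placeholders invented for this sketch.
EX = "http://example.org/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# A graph is a set of (subject, predicate, object) triples.
graph = {
    (EX + "SantaBarbara", RDF_TYPE, EX + "City"),
    (EX + "SantaBarbara", EX + "locatedIn", EX + "California"),
    (EX + "California", RDF_TYPE, EX + "State"),
}


def match(triples, s=None, p=None, o=None):
    """Return the triples that fit a pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


about_santa_barbara = match(graph, s=EX + "SantaBarbara")
typed_nodes = [t[0] for t in match(graph, p=RDF_TYPE)]
```

Because the graph is a set, adding the same statement twice has no effect, which mirrors the set semantics of RDF graphs. Pattern matching of this kind is the basic building block of query languages such as SPARQL.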
Computational reasoning can come in many forms, and the term is often used ambiguously. The W3C Semantic Web standards RDF and OWL natively support deductive reasoning, which is based on formal logic and the logical consequences that can be derived from given facts, rules, and/or other statements made in formal logic. For instance, given the statement "Black Beauty is a horse" and the statement "every horse is a mammal", we can arrive at the logical consequence that "Black Beauty is a mammal". The example just given can be expressed in RDF, and the RDF standard prescribes this kind of reasoning. The Web Ontology Language OWL can be used to express more complex relationships that can then also be used for deductive reasoning. Deductive reasoning is a prominent method in some subfields of Artificial Intelligence, in particular in Knowledge Representation and Reasoning.
[Used by: A-6677]
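The Black Beauty example can be sketched as a small forward-chaining procedure over triples. The two rules applied mirror the RDF Schema treatment of class membership and subclass relationships; the short string names standing in for IRIs, the function name, and the extra statement "every mammal is an animal" are assumptions added for illustration.

```python
TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"

facts = {
    ("BlackBeauty", TYPE, "Horse"),   # "Black Beauty is a horse"
    ("Horse", SUBCLASS, "Mammal"),    # "every horse is a mammal"
    ("Mammal", SUBCLASS, "Animal"),   # added here to show chained inference
}


def deductive_closure(triples: set) -> set:
    """Apply two rules until no new triple appears:
    (x type C) and (C subClassOf D)  =>  (x type D)
    (C subClassOf D) and (D subClassOf E)  =>  (C subClassOf E)
    """
    closure = set(triples)
    while True:
        new = set()
        for (s, p, o) in closure:
            for (s2, p2, o2) in closure:
                if p2 == SUBCLASS and o == s2 and p in (TYPE, SUBCLASS):
                    new.add((s, p, o2))
        if new <= closure:  # fixpoint reached, nothing left to derive
            return closure
        closure |= new


inferred = deductive_closure(facts) - facts
```

Here `inferred` contains the statement that Black Beauty is a mammal, along with the consequences of the added third fact. Reasoners for OWL follow the same principle of deriving consequences that necessarily hold, but support a much richer set of constructs.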
The aim of representation learning is to learn representations of data that make it easy for [machine learning] models to extract information for different downstream tasks. In contrast to features harvested by labor-intensive feature engineering, these representations are automatically learned by different well-designed [machine learning]/[deep learning] methods. Good representations of data are expected to convey human priors about the world. For instance, some representations can be shared across tasks. This means a learned representation should be able to capture most of the information hidden in the data, even if it is distributed. Representation learning can disentangle underlying explanatory factors in deep and abstract ways.
[Used by: A-6677]
Semantic interoperability focuses on the semantic understanding of data and software for better and smarter data integration and synthesis. Common interoperability solutions define the syntax for structuring a dataset or invoking an operation, that is, the inputs, outputs, and number of parameters. Semantic interoperability moves interoperability up to the next level, where different computer systems not only agree on a protocol to exchange data, but also have a shared understanding of the semantics. Semantic interoperability is achieved by machine reasoning on top of a controlled, shared vocabulary or an ontology that is used to annotate data and is encoded in machine-understandable formats, such as RDF or OWL.
[Used by: A-6677]
A field of research concerned with developing methods and tools for efficient data sharing, discovery, integration, and reuse. The community is strongly aligned with W3C standards such as RDF, OWL and SPARQL for expressing and manipulating knowledge graphs. Linked Data, which constitutes the largest public knowledge graph currently available, emerged from the Semantic Web.
[Used by: A-6677]
Spatial decision support is the computational or informational assistance for making better informed decisions about problems with a geographic or spatial component. This support assists with the development, evaluation and selection of proper policies, plans, scenarios, projects, interventions, or solution strategies. Spatial decision making faces various decision complexities such as:
- Spatial nature and temporal development of phenomena and processes;
- Complex multi-dimensional and heterogeneous data describing decision situations;
- Large or extremely large data sets that include data in numerical, map, image, text, and other forms;
- Large number of available alternatives or a need to generate decision alternatives "on the fly" according to the changing situation;
- Multiple participants with different and often conflicting interests;
- Multiple categories of knowledge involved, including expert knowledge and layman knowledge.
Source of Description: Rob Raskin, http://geoanalytics.net/VisA-SDS-2006/
URI: http://sdsportal.sdsconsortium.org/ontology/?n=SDSSAbout:SDS
[Used by: A-7908]
Spatial Decision Support Systems (SDSS) combine spatial and non-spatial data, the analysis and visualization functions of Geographic Information Systems (GIS), and decision models in specific domains, to compute the characteristics of problem solutions, facilitate the evaluation of solution alternatives and the assessment of their trade-offs.
Source: P. Jankowski, Spatial decision support systems, in: K.K. Kemp (Ed.), Encyclopedia of Geographic Information Science, SAGE Publications, Inc., Thousand Oaks, California, 2008, http://sk.sagepub.com/reference/geoinfoscience.
URI: http://sdsportal.sdsconsortium.org/ontology/?n=SDSSAbout:SDSS
[Used by: A-7908]