A simple wrapper for using Solr with LinkML schemas
This provides a convenience layer for working with a Solr database whose schema is defined in LinkML. It binds slots in your schema to query fields, and binds result documents back to classes in your object model.
```python
from tests.test_models.books import *
from linkml_solr import SolrQueryEngine, SolrEndpoint
...
schema = YAMLGenerator(schemafile).schema
qe = SolrQueryEngine(schema=schema,
                     endpoint=SolrEndpoint(url='http://localhost:8983/solr/books'))
result = qe.search(target_class=Book, genre_s='scifi')
for book in result.items:
    print(f'Book: {book.name} :: {book}')
```
Unlike querying with the native pysolr API directly (compare the raw pysolr sketch below), this will:

- validate query inputs, including keys (which your IDE will also be aware of)
- instantiate results as instances of classes in your object model
- provide mappings from abstracted domain model concepts
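For comparison, querying the same core with pysolr alone looks roughly like this: each hit comes back as a plain dict, and nothing checks field names against the schema (the URL and field names here simply mirror the example above).

```python
import pysolr

# Raw pysolr equivalent of the query above: no schema-aware validation,
# and each hit is an untyped dict rather than a Book instance.
solr = pysolr.Solr('http://localhost:8983/solr/books')
for doc in solr.search('genre_s:scifi'):
    print(f"Book: {doc.get('name')} :: {doc}")
```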
See tests/test_models/books.yaml for an example schema
The schema must be specified as a LinkML schema. Note that LinkML is more expressive than Solr schemas, so not all constructs can be used. However, certain inferences are performed when compiling to a Solr schema; for example, you can use inheritance, and leaf classes will have all of their slots (including inherited ones) inferred.
Your schemas should be relatively "flat and wide": prefer denormalization over nesting.

When designing your schema, consider the two different paradigms supported:

- one core per schema, with document records having the union of all fields
- one core per class (sketched just after this list)
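For the one-core-per-class layout, a natural pattern is one SolrQueryEngine per core, all sharing the same schema. The sketch below is only illustrative: it assumes a hypothetical second core named `authors`, and builds the schema object the same way as the query example above.

```python
from linkml_solr import SolrQueryEngine, SolrEndpoint
...
# One query engine per core/class; 'authors' is a hypothetical second core
# sitting alongside the 'books' core from the earlier example.
schema = YAMLGenerator(schemafile).schema
book_qe = SolrQueryEngine(schema=schema,
                          endpoint=SolrEndpoint(url='http://localhost:8983/solr/books'))
author_qe = SolrQueryEngine(schema=schema,
                            endpoint=SolrEndpoint(url='http://localhost:8983/solr/authors'))
```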
Note: you can use the linkml-model-enrichment toolkit to auto-infer schemas from data
In the future there will be ways to annotate your schema to give hints when building Solr indexers, etc.
Use the LinkML Python generator:

```bash
gen-python books.yaml > books.py
```

See tests/test_models/books.py for an example.
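As a rough, heavily simplified sketch of the kind of class the generator emits (the real generated module derives from LinkML runtime base classes and carries class/slot URI metadata; the slot names below are illustrative rather than copied from books.yaml):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Book:
    # Illustrative slots only; gen-python also emits LinkML metadata
    # (class URIs, slot definitions) and a richer base class.
    id: Optional[str] = None
    name: Optional[str] = None
    genre: Optional[str] = None
    summary: Optional[str] = None
```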
This starts a server, pre-creates a core called "books", and loads a Solr schema derived from the LinkML schema:

```bash
lsolr start-server -C books -s books.yaml
```
This wraps a Docker container. If you do not wish to use Docker, start Solr in the usual way.

TODO: docs on how to do this
```bash
lsolr bulkload -C books -s books.yaml books1.tsv books2.tsv ...
```

See tests/test_query.py for an example.
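Conceptually, a TSV bulk load boils down to turning each row into a Solr document and posting it to the core. The sketch below is not lsolr's actual implementation, just an illustration using plain pysolr and the standard csv module; the file name and column names are hypothetical.

```python
import csv
import pysolr

# Rough illustration only: each TSV row becomes one Solr document in the 'books' core.
solr = pysolr.Solr('http://localhost:8983/solr/books', always_commit=True)
with open('books1.tsv') as f:
    docs = list(csv.DictReader(f, delimiter='\t'))
solr.add(docs)
```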
```
lsolr --help
Usage: lsolr [OPTIONS] COMMAND [ARGS]...

  Main

  Args:
      verbose (int): Verbose. quiet (bool): Quiet.

  Returns:
      None.

Options:
  -v, --verbose
  -q, --quiet TEXT
  --help            Show this message and exit.

Commands:
  bulkload       Convert multiple golr yaml schemas to linkml :param files:...
  create-schema
  start-server
```
```
lsolr bulkload --help
Usage: lsolr bulkload [OPTIONS] [FILES]...

  Convert multiple golr yaml schemas to linkml :param files: :param schema:
  :return:

Options:
  -s, --schema TEXT  Path to schema.
  -C, --core TEXT    solr core.
  -f, --format TEXT  input format.
  -u, --url TEXT     solr url.
  --help             Show this message and exit.
```
More documentation coming soon. For now, consult the tests.
See the Makefile:

```makefile
tests/test_models/amigo.yaml: linkml_solr/utils/golr_schema_utils.py
	pipenv run python $< tests/test_golr/*yaml > $@
```
Alpha code. Functionality is very incomplete. Still to come:
- write
- customizable dynamic mapping
- automatic de-nesting/de-normalization
- autogen of model-specific API
- expose additional solr functionality