diff --git a/.github/workflows/R-CMD-check.yaml b/.github/workflows/R-CMD-check.yaml index 5641b54..e8bb9f9 100644 --- a/.github/workflows/R-CMD-check.yaml +++ b/.github/workflows/R-CMD-check.yaml @@ -46,7 +46,7 @@ jobs: - name: Install lamindb run: | - pip install lamindb[bionty] + pip install lamindb[bionty,wetlab] - name: Log in to Lamin run: | @@ -54,7 +54,7 @@ jobs: - name: Set cellxgene as default instance run: | - lamin load laminlabs/cellxgene + lamin connect laminlabs/cellxgene - uses: r-lib/actions/setup-r-dependencies@v2 with: diff --git a/.github/workflows/pkgdown.yaml b/.github/workflows/pkgdown.yaml index 05ed44c..e5b9677 100644 --- a/.github/workflows/pkgdown.yaml +++ b/.github/workflows/pkgdown.yaml @@ -44,7 +44,7 @@ jobs: - name: Install lamindb run: | - pip install lamindb[bionty] + pip install lamindb[bionty,wetlab] - name: Log in to Lamin run: | @@ -52,7 +52,7 @@ jobs: - name: Set cellxgene as default instance run: | - lamin load laminlabs/cellxgene + lamin connect laminlabs/cellxgene - name: Build site run: pkgdown::build_site_github_pages(new_process = FALSE, install = FALSE) @@ -60,7 +60,7 @@ jobs: - name: Deploy to GitHub pages 🚀 if: github.event_name != 'pull_request' - uses: JamesIves/github-pages-deploy-action@v4.5.0 + uses: JamesIves/github-pages-deploy-action@v4 with: clean: false branch: gh-pages diff --git a/CHANGELOG.md b/CHANGELOG.md index 4d714ef..ad3a10d 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -6,9 +6,9 @@ * Read user settings from env file created by lamin Python package (PR #2, PR #8). -* Render a pkgdown website (PR #13). +* Add `to_string()` and `print()` methods to the `Record` class and (incomplete) `describe()` method to the `Artifact()` class (PR #22). -* Add `to_string()` and `print()` methods to the `Record` class and (incomplete) `describe()` method to the `Artifact()` class (PR #22) +* Add `to_string()` and `print()` methods to remaining classes (PR #31) ## MAJOR CHANGES @@ -19,26 +19,44 @@ - Linting action. 
- Commands for roxygenizing (`/document`) and restyling the source code (`/style`). -* Allow unauthenticated users to connect to an instance if they ran `lamin load ` beforehand (PR #19). +* Allow unauthenticated users to connect to an instance if they ran `lamin connect ` beforehand (PR #19). ## MINOR CHANGES -* Update `README` with new set up instructions and simplify (PR #14). - * Do not complain when foreign keys are not found in a record, but also do not complain when they are (PR #13). -* Further simplify the `README`, and move the detailed usage description to a separate vignette (PR #13). +* Define a current user and current instance with lamin-cli prior to testing and generating documentation in the CI (PR #23). + +## TESTING + +* Add a simple unit test which queries laminlabs/lamindata (PR #27). + +* Added unit test for the InstanceAPI class (PR #30). + +## DOCUMENTATION + +* Update `README` with new set up instructions and simplify (PR #14). * Add a `pkgdown` website to the project (PR #13). +* Further simplify the `README`, and move the detailed usage description to a separate vignette (PR #13). + * Generate vignettes using Quarto (PR #13). -* Define a current user and current instance with lamin-cli prior to testing and generating documentation in the CI (PR #23). +* Add vignette to showcase laminr usage (PR #18). + +* Replace all mentions of `lamin load` with `lamin connect` (PR #29). + +* Improve the `README` (PR #29). ## BUG FIXES * Fixed the parsing of the env files in `~/.lamin` due to changes in the lamindb-setup Python package (PR #12). +* Return `NULL` when a record's related field is empty (PR #28). + +* Add alternative error message when no message is returned from the API (PR #30). + # laminr v0.0.1 Initial POC implementation of the LaminDB API client for R. 
diff --git a/DESCRIPTION b/DESCRIPTION index 2209960..309f304 100644 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -26,5 +26,8 @@ Imports: Suggests: anndata, quarto, - s3 (>= 1.1.0) + s3 (>= 1.1.0), + testthat (>= 3.0.0), + withr VignetteBuilder: quarto +Config/testthat/edition: 3 diff --git a/R/Artifact.R b/R/Artifact.R index c602889..1536c6d 100644 --- a/R/Artifact.R +++ b/R/Artifact.R @@ -8,10 +8,9 @@ ArtifactRecord <- R6::R6Class( # nolint object_name_linter "ArtifactRecord", inherit = Record, public = list( - #' Load the artifact into memory - #' #' @description - #' This currently only supports AnnData artifacts. + #' Load the artifact into memory. This currently only supports AnnData + #' artifacts. #' #' @return The artifact load = function() { @@ -26,10 +25,9 @@ ArtifactRecord <- R6::R6Class( # nolint object_name_linter cli_abort(paste0("Unsupported accessor: ", artifact_accessor)) } }, - #' Cache the artifact to the local filesystem - #' #' @description - #' This currently only supports S3 storage. + #' Cache the artifact to the local filesystem. This currently only supports + #' S3 storage. 
#' #' @return The path to the cached artifact cache = function() { @@ -50,6 +48,10 @@ ArtifactRecord <- R6::R6Class( # nolint object_name_linter cli_abort(paste0("Unsupported storage type: ", artifact_storage$type)) } }, + #' @description + #' Print a more detailed description of an `ArtifactRecord` + #' + #' @param style Logical, whether the output is styled using ANSI codes describe = function(style = TRUE) { provenance_fields <- c( storage = "root", diff --git a/R/Field.R b/R/Field.R index a4d544f..2f68952 100644 --- a/R/Field.R +++ b/R/Field.R @@ -45,6 +45,39 @@ Field <- R6::R6Class( # nolint object_name_linter private$.related_field_name <- related_field_name private$.related_registry_name <- related_registry_name private$.related_module_name <- related_module_name + }, + #' @description + #' Print a `Field` + #' + #' @param style Logical, whether the output is styled using ANSI codes + print = function(style = TRUE) { + cli::cat_line(self$to_string(style)) + }, + #' @description + #' Create a string representation of a `Field` + #' + #' @param style Logical, whether the output is styled using ANSI codes + #' + #' @return A `cli::cli_ansi_string` if `style = TRUE` or a character vector + to_string = function(style = FALSE) { + field_strings <- make_key_value_strings( + self, + c( + "field_name", + "column_name", + "type", + "registry_name", + "module_name", + "through", + "is_link_table", + "relation_type", + "related_field_name", + "related_registry_name", + "related_module_name" + ) + ) + + make_class_string("Field", field_strings, style = style) } ), private = list( diff --git a/R/Instance.R b/R/Instance.R index 1f1b7b5..e10cdcb 100644 --- a/R/Instance.R +++ b/R/Instance.R @@ -104,6 +104,114 @@ Instance <- R6::R6Class( # nolint object_name_linter #' Get the names of the modules. Example: `c("core", "bionty")`. get_module_names = function() { names(private$.module_classes) + }, + #' Get instance settings. 
+ get_settings = function() { + private$.settings + }, + #' Get instance API. + get_api = function() { + private$.api + }, + #' @description + #' Print an `Instance` + #' + #' @param style Logical, whether the output is styled using ANSI codes + print = function(style = TRUE) { + + registries <- self$get_module("core")$get_registries() + + is_link_table <- purrr::map(registries, "is_link_table") |> + unlist() + + standard_lines <- purrr::map_chr( + names(registries)[!is_link_table], + function(.registry) { + cli::col_blue(paste0(" $", registries[[.registry]]$class_name)) + } + ) + + link_lines <- purrr::map_chr( + names(registries)[is_link_table], + function(.registry) { + cli::col_blue(paste0(" ", .registry)) + } + ) + + lines <- c( + cli::style_bold(cli::col_green(private$.settings$name)), + cli::style_italic(cli::col_magenta(" Core registries")), + standard_lines, + cli::style_italic(cli::col_magenta(" Core link tables")), + link_lines + ) + + module_names <- self$get_module_names() + module_names <- module_names[module_names != "core"] + + if (length(module_names) > 0) { + lines <- c( + lines, + cli::style_italic(cli::col_magenta(" Additional modules")), + cli::col_blue(paste0(" ", module_names)) + ) + } + + if (isFALSE(style)) { + lines <- cli::ansi_strip(lines) + } + + purrr::walk(lines, cli::cat_line) + }, + #' @description + #' Create a string representation of an `Instance` + #' + #' @param style Logical, whether the output is styled using ANSI codes + #' + #' @return A `cli::cli_ansi_string` if `style = TRUE` or a character vector + to_string = function(style = FALSE) { + registries <- self$get_module("core")$get_registries() + + is_link_table <- purrr::map(registries, "is_link_table") |> + unlist() + + mapping <- list( + "CoreRegistries" = paste0( + "[", + paste( + paste0( + "$", + purrr::map_chr(registries[!is_link_table], "class_name") + ), + collapse = ", " + ), + "]" + ), + "CoreLinkTables" = paste0( + "[", + paste(names(registries[is_link_table]), 
collapse = ", "), + "]" + ) + ) + + module_names <- self$get_module_names() + module_names <- module_names[module_names != "core"] + + if (length(module_names) > 0) { + mapping["AdditionalModules"] <- paste0( + "[", + paste(module_names, collapse = ", "), + "]" + ) + } + + key_value_strings <- make_key_value_strings( + mapping, quote_strings = FALSE + ) + + make_class_string( + private$.settings$name, key_value_strings, style = style + ) } ), private = list( diff --git a/R/InstanceAPI.R b/R/InstanceAPI.R index 2d46c44..68e95ab 100644 --- a/R/InstanceAPI.R +++ b/R/InstanceAPI.R @@ -18,6 +18,7 @@ InstanceAPI <- R6::R6Class( # nolint object_name_linter private$.api_client <- laminr.api::ApiClient$new(instance_settings$api_url) private$.default_api <- laminr.api::DefaultApi$new(private$.api_client) }, + #' @description #' Get the schema for the instance. get_schema = function(id) { schema <- try( @@ -35,7 +36,9 @@ InstanceAPI <- R6::R6Class( # nolint object_name_linter return(schema) }, + #' @description #' Get a record from the instance. + #' #' @importFrom jsonlite toJSON get_record = function(module_name, registry_name, @@ -87,12 +90,42 @@ InstanceAPI <- R6::R6Class( # nolint object_name_linter )) } - return(record) + content } ), private = list( .instance_settings = NULL, - .api_client = NULL, - .default_api = NULL + process_response = function(response, request_type) { + content <- httr::content(response) + if (httr::http_error(response)) { + if (is.list(content) && "detail" %in% names(content)) { + cli_abort(content$detail) + } else { + cli_abort("Failed to {request_type} from instance. 
Output: {content}") + } + } + + content + }, + #' @description + #' Print an `API` + #' + #' @param style Logical, whether the output is styled using ANSI codes + print = function(style = TRUE) { + cli::cat_line(self$to_string(style)) + }, + #' @description + #' Create a string representation of an `API` + #' + #' @param style Logical, whether the output is styled using ANSI codes + #' + #' @return A `cli::cli_ansi_string` if `style = TRUE` or a character vector + to_string = function(style = FALSE) { + field_strings <- make_key_value_strings( + private$.instance_settings, c("api_url", "id", "schema_id") + ) + + make_class_string("API", field_strings, style = style) + } ) ) diff --git a/R/InstanceSettings.R b/R/InstanceSettings.R index abf45c2..65cb296 100644 --- a/R/InstanceSettings.R +++ b/R/InstanceSettings.R @@ -37,15 +37,33 @@ InstanceSettings <- R6::R6Class( # nolint object_name_linter "db_user_password", # api "lamindb_version" # api ) - missing_column <- setdiff(expected_keys, names(settings)) - if (length(missing_column) > 0) { - cli_abort("Missing column: ", missing_column) + missing_keys <- setdiff(expected_keys, names(settings)) + if (length(missing_keys) > 0) { + cli_abort("Missing key{?s}: {missing_keys}") } - unexpected_columns <- setdiff(names(settings), c(expected_keys, optional_keys)) - if (length(unexpected_columns) > 0) { - cli_abort("Unexpected column: ", unexpected_columns) + unexpected_keys <- setdiff(names(settings), c(expected_keys, optional_keys)) + if (length(unexpected_keys) > 0) { + cli_abort("Unexpected key{?s}: {unexpected_keys}") } private$.settings <- settings + }, + #' @description + #' Print an `InstanceSettings` + #' + #' @param style Logical, whether the output is styled using ANSI codes + print = function(style = TRUE) { + cli::cat_line(self$to_string(style)) + }, + #' @description + #' Create a string representation of an `InstanceSettings` + #' + #' @param style Logical, whether the output is styled using ANSI codes + #' + #' 
@return A `cli::cli_ansi_string` if `style = TRUE` or a character vector + to_string = function(style = FALSE) { + field_strings <- make_key_value_strings(private$.settings) + + make_class_string("InstanceSettings", field_strings, style = style) } ), private = list( diff --git a/R/Module.R b/R/Module.R index 6b53f5d..a2a6cb0 100644 --- a/R/Module.R +++ b/R/Module.R @@ -74,17 +74,94 @@ Module <- R6::R6Class( # nolint object_name_linter ) |> set_names(names(module_schema)) }, + #' @description #' Get the registries in the module. get_registries = function() { private$.registry_classes }, + #' @description #' Get a registry by name. get_registry = function(registry_name) { private$.registry_classes[[registry_name]] }, + #' @description #' Get the names of the registries in the module. E.g. `c("User", "Artifact")`. get_registry_names = function() { names(private$.registry_classes) + }, + #' @description + #' Print a `Module` + #' + #' @param style Logical, whether the output is styled using ANSI codes + print = function(style = TRUE) { + registries <- self$get_registries() + + is_link_table <- purrr::map(registries, "is_link_table") |> + unlist() + + standard_lines <- purrr::map_chr( + names(registries)[!is_link_table], + function(.registry) { + cli::col_blue(paste0(" $", registries[[.registry]]$class_name)) + } + ) + + link_lines <- purrr::map_chr( + names(registries)[is_link_table], + function(.registry) { + cli::col_blue(paste0(" ", .registry)) + } + ) + + lines <- c( + cli::style_bold(cli::col_green(private$.module_name)), + cli::style_italic(cli::col_magenta(" Registries")), + standard_lines, + cli::style_italic(cli::col_magenta(" Link tables")), + link_lines + ) + + if (isFALSE(style)) { + lines <- cli::ansi_strip(lines) + } + + purrr::walk(lines, cli::cat_line) + }, + #' @description + #' Create a string representation of a `Module` + #' + #' @param style Logical, whether the output is styled using ANSI codes + #' + #' @return A `cli::cli_ansi_string` if `style 
= TRUE` or a character vector + to_string = function(style = FALSE) { + registries <- self$get_registries() + + is_link_table <- purrr::map(registries, "is_link_table") |> + unlist() + + registry_strings <- make_key_value_strings( + list( + "Registries" = paste0( + "[", + paste( + paste0( + "$", + purrr::map_chr(registries[!is_link_table], "class_name") + ), + collapse = ", " + ), + "]" + ), + "LinkTables" = paste0( + "[", + paste(names(registries[is_link_table]), collapse = ", "), + "]" + ) + ), + quote_strings = FALSE + ) + + make_class_string(private$.module_name, registry_strings, style = style) } ), private = list( diff --git a/R/Record.R b/R/Record.R index cfc9b1d..e893db6 100644 --- a/R/Record.R +++ b/R/Record.R @@ -95,9 +95,19 @@ Record <- R6::R6Class( # nolint object_name_linter ) } }, + #' @description + #' Print a `Record` + #' + #' @param style Logical, whether the output is styled using ANSI codes print = function(style = TRUE) { cli::cat_line(self$to_string(style)) }, + #' @description + #' Create a string representation of a `Record` + #' + #' @param style Logical, whether the output is styled using ANSI codes + #' + #' @return A `cli::cli_ansi_string` if `style = TRUE` or a character vector to_string = function(style = FALSE) { important_fields <- c( "uid", @@ -122,34 +132,12 @@ Record <- R6::R6Class( # nolint object_name_linter important_fields, setdiff(names(record_fields), important_fields) ) - field_strings <- purrr::map_chr(field_names, function(.name) { - value <- record_fields[[.name]] + field_strings <- make_key_value_strings(record_fields, field_names) - if (is.null(value)) { - return(NA_character_) - } - - if (is.character(value)) { - value <- paste0("'", value, "'") - } - - paste0( - cli::col_blue(.name), cli::col_br_blue("="), cli::col_yellow(value) - ) - }) |> - purrr::discard(is.na) - - string <- paste0( - cli::style_bold(cli::col_green(private$.registry$class_name)), "(", - paste(field_strings, collapse = ", "), - ")" + 
make_class_string( + private$.registry$class_name, field_strings, + style = style ) - - if (isFALSE(style)) { - string <- cli::ansi_strip(string) - } - - return(string) } ), private = list( @@ -163,7 +151,7 @@ Record <- R6::R6Class( # nolint object_name_linter } else if (key %in% private$.registry$get_field_names()) { field <- private$.registry$get_field(key) - ## TODO: use related_registry_class$get_records instead + # refetch the record to get the related data related_data <- private$.api$get_record( module_name = field$module_name, registry_name = field$registry_name, @@ -171,10 +159,17 @@ Record <- R6::R6Class( # nolint object_name_linter select = key )[[key]] + # return NULL if the related data is NULL + if (is.null(related_data)) { + return(NULL) + } + + # if the related data is not NULL, create a record class for it related_module <- private$.instance$get_module(field$related_module_name) related_registry <- related_module$get_registry(field$related_registry_name) related_registry_class <- related_registry$get_record_class() + # if the relation type is one-to-many or many-to-many, iterate over the list if (field$relation_type %in% c("one-to-one", "many-to-one")) { related_registry_class$new(related_data) } else { diff --git a/R/Registry.R b/R/Registry.R index b76a806..9286898 100644 --- a/R/Registry.R +++ b/R/Registry.R @@ -48,6 +48,7 @@ Registry <- R6::R6Class( # nolint object_name_linter api = api ) }, + #' @description #' Get a record by ID or UID. get = function(id_or_uid, include_foreign_keys = FALSE, verbose = FALSE) { data <- private$.api$get_record( @@ -60,21 +61,110 @@ Registry <- R6::R6Class( # nolint object_name_linter private$.record_class$new(data = data) }, + #' @description #' Get the fields in the registry. get_fields = function() { private$.fields }, + #' @description #' Get a field by name. get_field = function(field_name) { private$.fields[[field_name]] }, + #' @description #' Get the field names in the registry. 
get_field_names = function() { names(private$.fields) }, + #' @description #' Get the record class for the registry. get_record_class = function() { private$.record_class + }, + #' @description + #' Print a `Registry` + #' + #' @param style Logical, whether the output is styled using ANSI codes + print = function(style = TRUE) { + fields <- self$get_fields() + # Remove hidden fields + fields <- fields[grep("^_", names(fields), value = TRUE, invert = TRUE)] + # Remove link fields + fields <- fields[grep("^links_", names(fields), value = TRUE, invert = TRUE)] + + relational_fields <- purrr::map(fields, "relation_type") |> + unlist() |> + names() + + simple_lines <- purrr::map_chr( + setdiff(names(fields), relational_fields), + function(.field) { + paste0( + cli::col_blue(paste0(" ", .field)), ": ", + cli::col_grey(fields[[.field]]$type) + ) + } + ) + + relational_lines <- purrr::map_chr(relational_fields, function(.field) { + field_object <- fields[[.field]] + paste0( + cli::col_blue(paste0(" ", .field)), ": ", + cli::col_grey(paste0( + field_object$related_registry_name, + " (", field_object$relation_type, ")" + )) + ) + }) + + lines <- c( + cli::style_bold(cli::col_green(private$.class_name)), + cli::style_italic(cli::col_br_magenta(" Simple fields")), + simple_lines, + cli::style_italic(cli::col_br_magenta(" Relational fields")), + relational_lines + ) + + if (isFALSE(style)) { + lines <- cli::ansi_strip(lines) + } + + purrr::walk(lines, cli::cat_line) + }, + #' @description + #' Create a string representation of a `Registry` + #' + #' @param style Logical, whether the output is styled using ANSI codes + #' + #' @return A `cli::cli_ansi_string` if `style = TRUE` or a character vector + to_string = function(style = FALSE) { + fields <- self$get_fields() + # Remove hidden fields + fields <- fields[grep("^_", names(fields), value = TRUE, invert = TRUE)] + # Remove link fields + fields <- fields[grep("^links_", names(fields), value = TRUE, invert = TRUE)] + + 
relational_fields <- purrr::map(fields, "relation_type") |> + unlist() |> + names() + + field_strings <- make_key_value_strings( + list( + "SimpleFields" = paste0( + "[", + paste(setdiff(names(fields), relational_fields), collapse = ", "), + "]" + ), + "RelationalFields" = paste0( + "[", + paste(relational_fields, collapse = ", "), + "]" + ) + ), + quote_strings = FALSE + ) + + make_class_string(private$.class_name, field_strings, style = style) } ), private = list( diff --git a/R/UserSettings.R b/R/UserSettings.R index 35238f7..a0c474b 100644 --- a/R/UserSettings.R +++ b/R/UserSettings.R @@ -30,6 +30,24 @@ UserSettings <- R6::R6Class( # nolint object_name_linter cli_abort("Unexpected column: ", unexpected_columns) } private$.settings <- settings + }, + #' @description + #' Print a `UserSettings` + #' + #' @param style Logical, whether the output is styled using ANSI codes + print = function(style = TRUE) { + cli::cat_line(self$to_string(style)) + }, + #' @description + #' Create a string representation of a `UserSettings` + #' + #' @param style Logical, whether the output is styled using ANSI codes + #' + #' @return A `cli::cli_ansi_string` if `style = TRUE` or a character vector + to_string = function(style = FALSE) { + field_strings <- make_key_value_strings(private$.settings) + + make_class_string("UserSettings", field_strings, style = style) } ), private = list( diff --git a/R/connect.R b/R/connect.R index 893a9cf..7fc5594 100644 --- a/R/connect.R +++ b/R/connect.R @@ -5,7 +5,7 @@ #' #' Note that prior to connecting to an instance, you need to authenticate with #' `lamin login`. If no slug is provided, the default instance is loaded, which is -#' set by running `lamin load `. +#' set by running `lamin connect `. #' #' @param slug The instance slug `account_handle/instance_name` or URL. #' If the instance is owned by you, it suffices to pass the instance name. @@ -47,13 +47,13 @@ connect <- function(slug = NULL) { paste0( "Could not load default instance. 
Either:\n", " - Provide a slug. For example: `connect(\"laminlabs/cellxgene\")`)\n", - " - Set a default instance by running `lamin load `." + " - Set a default instance by running `lamin connect `." ) } else { paste0( "No default user or instance is loaded! Either:\n", " - Call `lamin login` to set a default user.\n", - " - Call `lamin load ` to set a default instance." + " - Call `lamin connect ` to set a default instance." ) } cli_abort(error_msg) diff --git a/R/printing.R b/R/printing.R new file mode 100644 index 0000000..5465688 --- /dev/null +++ b/R/printing.R @@ -0,0 +1,66 @@ +#' Make key value strings +#' +#' Generate a vector of styled strings representing key-value pairs. Any +#' `NULL`/`NA` values are not returned. +#' +#' @param mapping Any object for which values can be retrieved using names with +#' `mapping[[name]]` +#' @param names Vector of names to create strings for. Defaults to all names of +#' `mapping` +#' @param quote_strings Logical, whether to quote string values +#' +#' @return A vector of `cli::cli_ansi_string` objects +#' @noRd +make_key_value_strings <- function(mapping, names = NULL, quote_strings = TRUE) { + if (is.null(names)) { + names <- names(mapping) + } + + purrr::map_chr(names, function(.name) { + value <- mapping[[.name]] + + if (is.null(value)) { + return(NA_character_) + } + + if (quote_strings && is.character(value)) { + value <- paste0("'", value, "'") + } + + if (is.list(value)) { + list_strings <- make_key_value_strings(value) + value <- paste0( + "list(", cli::ansi_strip(paste(list_strings, collapse = ", ")), ")" + ) + } + + paste0( + cli::col_blue(.name), cli::col_br_blue("="), cli::col_yellow(value) + ) + }) |> + purrr::discard(is.na) +} + +#' Make a string representation of a class +#' +#' @param class_name Name of the class +#' @param field_strings A vector of formatted name strings as produced by +#' `make_key_value_strings` +#' @param style Whether or not to return a styled string +#' +#' @return A 
`cli::cli_ansi_string` object if `style = TRUE`, otherwise a +#' character vector +#' @noRd +make_class_string <- function(class_name, field_strings, style = TRUE) { + string <- paste0( + cli::style_bold(cli::col_green(class_name)), "(", + paste(field_strings, collapse = ", "), + ")" + ) + + if (isFALSE(style)) { + string <- cli::ansi_strip(string) + } + + return(string) +} diff --git a/R/settings_load.R b/R/settings_load.R index de8a027..444cc25 100644 --- a/R/settings_load.R +++ b/R/settings_load.R @@ -9,7 +9,7 @@ instance_settings_file <- .settings_store__current_instance_settings_file() } if (!file.exists(instance_settings_file)) { - cli_abort("No instance is loaded! Call `lamin load ` to load an instance.") + cli_abort("No instance is loaded! Call `lamin connect ` to load an instance.") } settings_store <- tryCatch( diff --git a/README.md b/README.md index 4b61ce1..53ce96c 100644 --- a/README.md +++ b/README.md @@ -22,72 +22,82 @@ Install the development version from GitHub: remotes::install_github("laminlabs/laminr") ``` -Install the Lamin CLI and authenticate: +You will also need to install `lamindb`: ``` bash -pip install lamin-cli -lamin login +pip install lamindb[bionty,wetlab] ``` -> [!TIP] -> -> You can get your token from the [LaminDB web -> interface](https://lamin.ai/settings). +## Connect to an instance + +To connect to a LaminDB instance, you will first need to run +`lamin login` OR `lamin connect ` in the terminal. This will +create a directory in your home directory called `.lamin` with the +necessary credentials. 
-## Quick start +``` bash +lamin login +lamin connect laminlabs/cellxgene +``` -Let’s first connect to a LaminDB instance: +Then, you can connect to the instance using the `laminr::connect()` +function: ``` r library(laminr) db <- connect("laminlabs/cellxgene") +db ``` -Get an artifact: - -``` r -artifact <- db$Artifact$get("KBW89Mf7IGcekja2hADu") -artifact -``` + cellxgene Instance(modules='c('core', 'bionty')') - Artifact(uid='KBW89Mf7IGcekja2hADu', description='Myeloid compartment', key='cell-census/2024-07-01/h5ads/fe52003e-1460-4a65-a213-2bb1a508332f.h5ad', version='2024-07-01', _accessor='AnnData', id=3659, transform_id=22, size=691757462, is_latest=TRUE, created_by_id=1, _hash_type='md5-n', type='dataset', created_at='2024-07-12T12:34:10.345829+00:00', n_observations=51552, updated_at='2024-07-12T12:40:48.837026+00:00', run_id=27, suffix='.h5ad', visibility=1, _key_is_virtual=FALSE, hash='SZ5tB0T4YKfiUuUkAL09ZA', storage_id=2) +## Query the instance -Access some of its fields: +You can use the `db` object to query the instance: ``` r -artifact$id +artifact <- db$Artifact$get("KBW89Mf7IGcekja2hADu") ``` - [1] 3659 +You can print the record: ``` r -artifact$uid +artifact ``` - [1] "KBW89Mf7IGcekja2hADu" + Artifact(uid='KBW89Mf7IGcekja2hADu', description='Myeloid compartment', key='cell-census/2024-07-01/h5ads/fe52003e-1460-4a65-a213-2bb1a508332f.h5ad', storage_id=2, version='2024-07-01', _accessor='AnnData', id=3659, transform_id=22, size=691757462, is_latest=TRUE, created_by_id=1, type='dataset', _hash_type='md5-n', n_observations=51552, created_at='2024-07-12T12:34:10.345829+00:00', updated_at='2024-07-12T12:40:48.837026+00:00', run_id=27, suffix='.h5ad', visibility=1, _key_is_virtual=FALSE, hash='SZ5tB0T4YKfiUuUkAL09ZA') + +Or call the `$describe()` method to get a summary: ``` r -artifact$key +artifact$describe() ``` - [1] "cell-census/2024-07-01/h5ads/fe52003e-1460-4a65-a213-2bb1a508332f.h5ad" + Artifact(uid='KBW89Mf7IGcekja2hADu', description='Myeloid 
compartment', key='cell-census/2024-07-01/h5ads/fe52003e-1460-4a65-a213-2bb1a508332f.h5ad', storage_id=2, version='2024-07-01', _accessor='AnnData', id=3659, transform_id=22, size=691757462, is_latest=TRUE, created_by_id=1, type='dataset', _hash_type='md5-n', n_observations=51552, created_at='2024-07-12T12:34:10.345829+00:00', updated_at='2024-07-12T12:40:48.837026+00:00', run_id=27, suffix='.h5ad', visibility=1, _key_is_virtual=FALSE, hash='SZ5tB0T4YKfiUuUkAL09ZA') + Provenance + $storage = 's3://cellxgene-data-public' + $transform = 'Census release 2024-07-01 (LTS)' + $run = '2024-07-16T12:49:41.81955+00:00' + $created_by = 'sunnyosun' -Fetch related fields: +## Access fields -``` r -artifact$storage$root -``` +You can access its fields as follows: - [1] "s3://cellxgene-data-public" +- `artifact$id`: 3659 +- `artifact$uid`: KBW89Mf7IGcekja2hADu +- `artifact$key`: + cell-census/2024-07-01/h5ads/fe52003e-1460-4a65-a213-2bb1a508332f.h5ad -``` r -artifact$created_by$handle -``` +You can also fetch related fields: + +- `artifact$storage$root`: s3://cellxgene-data-public +- `artifact$created_by$handle`: sunnyosun - [1] "sunnyosun" +## Load the artifact -Load the artifact: +You can directly load the artifact to access its data: ``` r artifact$load() diff --git a/README.qmd b/README.qmd index 870e7de..f51d06f 100644 --- a/README.qmd +++ b/README.qmd @@ -24,53 +24,66 @@ Install the development version from GitHub: remotes::install_github("laminlabs/laminr") ``` -Install the Lamin CLI and authenticate: +You will also need to install `lamindb`: ```bash -pip install lamin-cli -lamin login +pip install lamindb[bionty,wetlab] ``` -:::{.callout-tip} -You can get your token from the [LaminDB web interface](https://lamin.ai/settings). -::: +## Connect to an instance -## Quick start +To connect to a LaminDB instance, you will first need to run `lamin login` OR `lamin connect ` in the terminal. This will create a directory in your home directory called `.lamin` with the necessary credentials. 
-Let's first connect to a LaminDB instance: +```bash +lamin login +lamin connect laminlabs/cellxgene +``` + +Then, you can connect to the instance using the `laminr::connect()` function: ```{r setup} library(laminr) db <- connect("laminlabs/cellxgene") +db ``` -Get an artifact: +## Query the instance + +You can use the `db` object to query the instance: ```{r get_artifact} artifact <- db$Artifact$get("KBW89Mf7IGcekja2hADu") -artifact ``` -Access some of its fields: +You can print the record: -```{r print_simple_fields} -artifact$id +```{r print_artifact} +artifact +``` -artifact$uid +Or call the `$describe()` method to get a summary: -artifact$key +```{r describe_artifact} +artifact$describe() ``` -Fetch related fields: +## Access fields -```{r print_related_fields} -artifact$storage$root +You can access its fields as follows: -artifact$created_by$handle -``` +* `artifact$id`: `r artifact$id` +* `artifact$uid`: `r artifact$uid` +* `artifact$key`: `r artifact$key` + +You can also fetch related fields: + +* `artifact$storage$root`: `r artifact$storage$root` +* `artifact$created_by$handle`: `r artifact$created_by$handle` + +## Load the artifact -Load the artifact: +You can directly load the artifact to access its data: ```{r load_artifact} artifact$load() diff --git a/man/connect.Rd b/man/connect.Rd index bfb645e..2ab8880 100644 --- a/man/connect.Rd +++ b/man/connect.Rd @@ -14,7 +14,7 @@ If no slug is provided, the default instance is loaded.} \description{ Note that prior to connecting to an instance, you need to authenticate with \verb{lamin login}. If no slug is provided, the default instance is loaded, which is -set by running \verb{lamin load }. +set by running \verb{lamin connect }. } \examples{ \dontrun{ diff --git a/tests/testthat.R b/tests/testthat.R new file mode 100644 index 0000000..16b645d --- /dev/null +++ b/tests/testthat.R @@ -0,0 +1,12 @@ +# This file is part of the standard setup for testthat. +# It is recommended that you do not modify it. 
+#
+# Where should you do additional test configuration?
+# Learn more about the roles of various files in:
+# * https://r-pkgs.org/testing-design.html#sec-tests-files-overview
+# * https://testthat.r-lib.org/articles/special-files.html
+
+library(testthat)
+library(laminr)
+
+test_check("laminr")
diff --git a/tests/testthat/helper-setup_lamindata_instance.R b/tests/testthat/helper-setup_lamindata_instance.R
new file mode 100644
index 0000000..66bfb60
--- /dev/null
+++ b/tests/testthat/helper-setup_lamindata_instance.R
@@ -0,0 +1,42 @@
+local_setup_lamindata_instance <- function(env = parent.frame()) {
+  root_dir <- withr::local_file(tempfile(), .local_envir = env)
+  withr::local_envvar(c(LAMIN_SETTINGS_DIR = root_dir), .local_envir = env)
+
+  # create a temporary directory for the settings
+  lamin_dir <- file.path(root_dir, ".lamin")
+  dir.create(lamin_dir, recursive = TRUE, showWarnings = FALSE)
+
+  # generate user settings
+  user_settings <- list(
+    email = "null",
+    password = "null",
+    access_token = "null",
+    api_key = "null",
+    uid = "00000000",
+    uuid = "null",
+    handle = "anonymous",
+    name = "null"
+  )
+  user_lines <- paste0("lamin_user_", names(user_settings), "=", unlist(user_settings))
+  writeLines(user_lines, file.path(lamin_dir, "current_user.env"))
+
+  # generate instance settings
+  instance_settings <- list(
+    owner = "laminlabs",
+    name = "lamindata",
+    api_url = "https://us-east-1.api.lamin.ai",
+    storage_root = "s3://lamindata",
+    storage_region = "us-east-1",
+    db = "null",
+    schema_str = "bionty,wetlab",
+    schema_id = "097186c3e91c01ced47aa3e01a3c1515",
+    id = "037ba1e08d804f91a90275a47735076a",
+    git_repo = "null",
+    keep_artifacts_local = "False"
+  )
+  instance_lines <- paste0("lamindb_instance_", names(instance_settings), "=", unlist(instance_settings))
+  writeLines(instance_lines, file.path(lamin_dir, "current_instance.env"))
+  writeLines(instance_lines, file.path(lamin_dir, "instance--laminlabs--lamindata.env"))
+
+  root_dir
+}
diff --git a/tests/testthat/test-connect_lamindata.R b/tests/testthat/test-connect_lamindata.R
new file mode 100644
index 0000000..126ac79
--- /dev/null
+++ b/tests/testthat/test-connect_lamindata.R
@@ -0,0 +1,27 @@
+skip_if_offline()
+
+test_that("Connecting to lamindata works", {
+  local_setup_lamindata_instance()
+
+  # try to connect to lamindata
+  db <- connect("laminlabs/lamindata")
+
+  # check whether schema was parsed and classes were created
+  expect_equal(db$Artifact$name, "artifact")
+
+  # try to fetch a record
+  artifact <- db$Artifact$get("mePviem4DGM4SFzvLXf3")
+
+  expect_equal(artifact$uid, "mePviem4DGM4SFzvLXf3")
+  expect_equal(artifact$suffix, ".csv")
+
+  # try to fetch related field
+  created_by <- artifact$created_by
+  expect_equal(created_by$handle, "sunnyosun")
+
+  # access a related field which is empty for this record
+  expect_null(artifact$type) # one to one
+
+  expect_type(artifact$wells, "list") # one-to-many
+  expect_length(artifact$wells, 0)
+})
diff --git a/tests/testthat/test-instance_api.R b/tests/testthat/test-instance_api.R
new file mode 100644
index 0000000..433c0cc
--- /dev/null
+++ b/tests/testthat/test-instance_api.R
@@ -0,0 +1,105 @@
+skip_if_offline()
+
+broken_instance_settings <- function() {
+  InstanceSettings$new(
+    list(
+      owner = "foo",
+      name = "bar",
+      id = "...",
+      schema_str = "foo,bar",
+      schema_id = "...",
+      git_repo = "...",
+      keep_artifacts_local = TRUE,
+      api_url = "https://foo.lamin.ai"
+    )
+  )
+}
+
+test_that("get_schema works", {
+  local_setup_lamindata_instance()
+
+  instance_file <- .settings_store__instance_settings_file("laminlabs", "lamindata")
+  instance_settings <- .settings_load__load_instance_settings()
+
+  api <- InstanceAPI$new(instance_settings)
+
+  # try to get the schema
+  schema <- api$get_schema()
+
+  expect_named(schema, c("core", "bionty", "wetlab"))
+
+  expect_true(all(c("run", "user", "param", "artifact", "storage") %in% names(schema$core)))
+
+  expect_named(schema$core$artifact, c("fields_metadata", "class_name", "is_link_table"))
+})
+
+test_that("get_schema fails gracefully", {
+  instance_settings <- broken_instance_settings()
+
+  api <- InstanceAPI$new(instance_settings)
+
+  expect_error(api$get_schema(), regexp = "Could not resolve host: foo.lamin.ai")
+})
+
+test_that("get_record works", {
+  local_setup_lamindata_instance()
+
+  instance_file <- .settings_store__instance_settings_file("laminlabs", "lamindata")
+  instance_settings <- .settings_load__load_instance_settings()
+
+  api <- InstanceAPI$new(instance_settings)
+
+  # try to get a record
+  artifact <- api$get_record("core", "artifact", "mePviem4DGM4SFzvLXf3")
+
+  expect_true(all(c("uid", "size", "hash", "description", "type") %in% names(artifact)))
+})
+
+test_that("test get_record fails gracefully with incorrect host", {
+  instance_settings <- broken_instance_settings()
+
+  api <- InstanceAPI$new(instance_settings)
+
+  # try to get a record
+  expect_error(
+    api$get_record("core", "artifact", "mePviem4DGM4SFzvLXf3"),
+    regexp = "Could not resolve host: foo.lamin.ai"
+  )
+})
+
+test_that("get_record with select works", {
+  local_setup_lamindata_instance()
+
+  instance_file <- .settings_store__instance_settings_file("laminlabs", "lamindata")
+  instance_settings <- .settings_load__load_instance_settings()
+
+  api <- InstanceAPI$new(instance_settings)
+
+  # try to get a record
+  artifact <- api$get_record("core", "artifact", "mePviem4DGM4SFzvLXf3", select = "storage")
+
+  expect_true(all(c("uid", "size", "hash", "description", "type") %in% names(artifact)))
+
+  expect_true(all(c("uid", "type", "region", "root") %in% names(artifact$storage)))
+})
+
+test_that("get_record fails gracefully", {
+  local_setup_lamindata_instance()
+
+  instance_file <- .settings_store__instance_settings_file("laminlabs", "lamindata")
+  instance_settings <- .settings_load__load_instance_settings()
+
+  api <- InstanceAPI$new(instance_settings)
+
+  # nolint start: commented_code
+  # TODO: improve error messages for these cases
+  expect_error(
+    api$get_record("core", "artifact", "foobar")#,
+    # regexp = "Error getting record: list index out of range"
+  )
+  expect_error(
+    api$get_record("core", "artifact", "mePviem4DGM4SFzvLXf3", select = "foo"),
+    # regexp = "Error getting record: invalid select field: foo"
+  )
+  # nolint end: commented_code
+})
diff --git a/vignettes/usage.Rmd b/vignettes/usage.Rmd
new file mode 100644
index 0000000..2db22b6
--- /dev/null
+++ b/vignettes/usage.Rmd
@@ -0,0 +1,109 @@
+---
+title: "Usage"
+output: rmarkdown::html_vignette
+vignette: >
+  %\VignetteIndexEntry{Usage}
+  %\VignetteEncoding{UTF-8}
+  %\VignetteEngine{knitr::rmarkdown}
+---
+
+```{r, include = FALSE}
+knitr::opts_chunk$set(
+  collapse = TRUE,
+  comment = "#>"
+)
+```
+
+LaminDB is an open-source data framework for biology. You can find out about some of its features in the [documentation of the lamindb Python package](https://docs.lamin.ai/introduction).
+
+This vignette will show you how to use the `laminr` package to interact with LaminDB.
+
+## Initial setup
+
+As part of a first-time setup, you will need to install `laminr`, the Python `lamin-cli` package, and set up an instance for first use.
+
+```bash
+pip install lamin-cli
+lamin connect laminlabs/cellxgene
+```
+
+```R
+install.packages("remotes")
+remotes::install_github("laminlabs/laminr")
+```
+
+## Connect to a LaminDB instance
+
+This vignette uses the [`laminlabs/cellxgene`](https://lamin.ai/laminlabs/cellxgene) instance, which is a LaminDB instance that interfaces with the CELLxGENE data.
+
+You can connect to the instance using the `connect` R function:
+
+```{r connect}
+library(laminr)
+
+db <- connect("laminlabs/cellxgene")
+```
+
+By printing the instance, you can see which registries are available, including Artifact, Collection, Feature, etc. Each of these registries has a corresponding [Python class](https://docs.lamin.ai/lamindb).
+
+```{r print_instance}
+db
+```
+
+All of the 'core' registries are directly available from the `db` object, while registries from other modules can be accessed via `db$`, e.g.:
+
+```{r get_module}
+db$bionty
+```
+
+The `bionty` and other registries also have corresponding [Python classes](https://docs.lamin.ai/bionty).
+
+## Registry
+
+A registry is used to query, store and manage data. For instance, the `Artifact` registry stores datasets and models as files, folders, or arrays.
+
+You can see which functions you can use to interact with the registry by printing the registry object:
+
+```{r get_artifact_registry}
+db$Artifact
+```
+
+For instance, you can fetch an Artifact by ID or UID. For example, Artifact [KBW89Mf7IGcekja2hADu](https://lamin.ai/laminlabs/cellxgene/artifact/KBW89Mf7IGcekja2hADu) is an AnnData object containing myeloid cells.
+
+```{r get_artifact}
+artifact <- db$Artifact$get("KBW89Mf7IGcekja2hADu")
+```
+
+You can view its metadata by printing the object:
+
+```{r print_artifact}
+artifact
+```
+
+Or get more detailed information by calling the `$describe()` method:
+
+```{r describe_artifact}
+artifact$describe()
+```
+
+You can access its fields as follows:
+
+* `artifact$id`: `r artifact$id`
+* `artifact$uid`: `r artifact$uid`
+* `artifact$key`: `r artifact$key`
+
+Or fetch data from related registries:
+
+* `artifact$storage`: `r artifact$storage$to_string()`
+* `artifact$created_by`: `r artifact$created_by$to_string()`
+
+Finally, for Artifact objects, you can directly download the data to a local cache or load it into memory using `$cache()` and `$load()`, respectively.
+
+```{r cache_artifact}
+artifact$cache()
+artifact$load()
+```
+
+:::{.callout-note}
+Only S3 storage and AnnData accessors are supported at the moment. If additional storage and data accessors are desired, please open an issue on the [laminr GitHub repository](https://github.com/laminlabs/laminr/issues).
+:::
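
The test helper `local_setup_lamindata_instance()` in this change writes its settings files as flat `prefix_key=value` lines (for example `lamindb_instance_owner=laminlabs`). As a rough illustration of that layout, the sketch below writes a small file in the same shape and reads it back into a named list. `read_env_file()` is a hypothetical helper introduced here for illustration only; it is not laminr's own settings parser, which may handle types and missing values differently.

```r
# Hypothetical helper (not part of laminr): parse "prefix_key=value" lines
# into a named list of character values.
read_env_file <- function(path, prefix) {
  lines <- readLines(path)
  # position of the first "=" on each line (-1 when there is none)
  eq_pos <- regexpr("=", lines, fixed = TRUE)
  keep <- eq_pos > 0
  lines <- lines[keep]
  eq_pos <- eq_pos[keep]
  # split on the first "=" only, so values may themselves contain "="
  keys <- substr(lines, 1, eq_pos - 1)
  values <- substr(lines, eq_pos + 1, nchar(lines))
  # drop the leading prefix, e.g. "lamindb_instance_owner" -> "owner"
  has_prefix <- startsWith(keys, prefix)
  keys[has_prefix] <- substring(keys[has_prefix], nchar(prefix) + 1)
  stats::setNames(as.list(values), keys)
}

# Write a small file in the same layout the test helper uses, then read it back.
settings <- list(owner = "laminlabs", name = "lamindata", db = "null")
env_file <- tempfile(fileext = ".env")
writeLines(
  paste0("lamindb_instance_", names(settings), "=", unlist(settings)),
  env_file
)

parsed <- read_env_file(env_file, prefix = "lamindb_instance_")
str(parsed)
```

Note that every value comes back as a string, including placeholders such as `"null"`; any conversion to `NULL`, logicals, or other types would be a separate step that this sketch does not attempt.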