
Improve the UX for bulk uploading and processing of large number of files #2124

Open
sunu opened this issue Feb 8, 2022 · 9 comments

@sunu
Contributor

sunu commented Feb 8, 2022

Currently, while uploading a large number of files or a large archive containing many files through alephclient or the UI, the overall user experience is not great. Here's a list of potential issues the user might face:

  • The progress bar indicating the processing status of the uploaded documents might get stuck at 99% or 100% (Document ingestion gets stuck effectively at 100% #1839) without much insight into what's wrong or how to proceed further
  • Some of the files might fail to be processed without leaving any hint to the uploader or the viewer.
    • This results in an incomplete dataset and users have no way of knowing that the dataset is incomplete. This is problematic if the completeness of the dataset is crucial for an investigation.
    • There is no way to upload only the files that failed to be processed without re-uploading the entire set of documents or manually making a list of the failed documents and re-uploading them
    • There is no way for uploaders or Aleph admins to see an overview of processing errors to figure out why some files are failing to be processed without going through docker logs (which is not very user-friendly)
  • While uploading a large number of documents through alephclient, some documents might fail to upload due to network errors, timeouts etc. In that case we want to upload the missing files to Aleph by comparing the current directory to the files already on Aleph (Check if alephclient crawldir was 100% successful or not alephclient#35), but that's currently not possible. The same goes for new files in a folder containing lots of files; there is no way to upload only the new files to Aleph without uploading the whole folder (Allow users to upload/crawl new files only alephclient#34)
  • When large archives or mailboxes are uploaded to Aleph, ingest-file workers run into rate limits from GCS (Implement rate limits on writes to blob storage #3882)
@sunu
Contributor Author

sunu commented Feb 8, 2022

Some ideas on how to improve the experience:

@brrttwrks

+1 about both the UI and alephclient/API - I think from a journo's perspective, the UI should give a clear indication of state at any given level and what lies underneath, but alephclient should also provide a way to use that info to re-ingest all or only the failed documents, or to pipe errors to other CLI tools to analyze or further process the info:

alephclient stream-entities --failed -f <foreign_id> | jq '.' ...

or something similar.

@jlstro
Contributor

jlstro commented Feb 9, 2022

We could also think of a way to manually exclude files or folders when using alephclient crawldir? Similar to a gitignore file?

@sunu
Contributor Author

sunu commented Feb 9, 2022

We could also think of a way to manually exclude files or folders when using alephclient crawldir? Similar to a gitignore file?

I have added that issue now @jlstro (alephdata/alephclient#39)

@brrttwrks

The ability to have include and exclude patterns like rsync would be sweet, via a switch or an include/exclude file that accepts some basic regex like gitignore files or rsync.

@akrymets

Hi to everyone!
Any news on this topic?
Sometimes uploading thousands of files to an investigation is painful.
Thanks!

lyz-code added a commit to lyz-code/blue-book that referenced this issue Sep 7, 2023
Gain early map control with scouts, then switch into steppe lancers and front siege, and finally drop a castle in their face once you've clicked up to Imperial.

- [Example Hera vs Mr.Yo in TCI](https://yewtu.be/watch?v=20bktCBldcw)

feat(aleph#Ingest gets stuck): Ingest gets stuck

It looks like Aleph doesn't yet give an easy way to debug it. This can be seen in the following issues:

- [Improve the UX for bulk uploading and processing of large number of files](alephdata/aleph#2124)
- [Document ingestion gets stuck effectively at 100%](alephdata/aleph#1839)
- [Display detailed ingestion status to see if everything is alright and when the collection is ready](alephdata/aleph#1525)

Some interesting ideas I've extracted while diving into these issues are:

- You can also upload files using the [`alephclient` python command line tool](https://github.com/alephdata/alephclient)
- Some of the files might fail to be processed without leaving any hint to the uploader or the viewer.
  - This results in an incomplete dataset and users have no way of knowing that the dataset is incomplete. This is problematic if the completeness of the dataset is crucial for an investigation.
  - There is no way to upload only the files that failed to be processed without re-uploading the entire set of documents or manually making a list of the failed documents and re-uploading them
  - There is no way for uploaders or Aleph admins to see an overview of processing errors to figure out why some files are failing to be processed without going through docker logs (which is not very user-friendly)
- There was an attempt to [improve the way ingest-files manages the pending tasks](alephdata/aleph#2127); it's merged into the [release/4.0.0](https://github.com/alephdata/ingest-file/tree/release/4.0.0) branch, but it has [not yet reached `main`](alephdata/ingest-file#423).

There are some tickets that attempt to address these issues on the command line:

- [Allow users to upload/crawl new files only](alephdata/alephclient#34)
- [Check if alephclient crawldir was 100% successful or not](alephdata/alephclient#35)

I think it's interesting either to contribute to `alephclient` to solve those issues or, if that's too complicated, to create a small python script that detects which files were not uploaded and tries to re-ingest them, and/or to open issues that will prevent future ingests from failing.
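
For example, a minimal bash sketch of that detection step, assuming you can export the list of already ingested file names (e.g. from the collection's document listing) into `ingested.txt`; the paths and file names here are hypothetical:

```bash
# List the files present locally (relative paths) and sort them
find ./documents -type f -printf '%P\n' | sort > local.txt

# Sort the list of file names already ingested in Aleph
sort ingested.txt > remote.txt

# Files present locally but missing in Aleph: candidates for re-upload
comm -23 local.txt remote.txt > missing.txt
```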

feat(ansible_snippets#Ansible condition that uses a regexp): Ansible condition that uses a regexp

```yaml
- name: Check if an instance name or hostname matches a regex pattern
  when: inventory_hostname is not match('molecule-.*')
  fail:
    msg: "not a molecule instance"
```

feat(ansible_snippets#Ansible-lint doesn't find requirements): Ansible-lint doesn't find requirements

It may be because you're using `requirements.yaml` instead of `requirements.yml`. Create a temporary link from one file to the other, run the command and then remove the link.

It will work from then on even if you remove the link. `¯\(°_o)/¯`
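
A minimal sketch of that workaround, assuming you run it from the role's root directory:

```bash
# Temporarily expose requirements.yaml under the name ansible-lint expects
ln -s requirements.yaml requirements.yml
ansible-lint
rm requirements.yml
```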

feat(ansible_snippets#Run task only once): Run task only once

Add `run_once: true` on the task definition:

```yaml
- name: Do a thing on the first host in a group.
  debug:
    msg: "Yay only prints once"
  run_once: true
```

feat(aws_snippets#Invalidate a cloudfront distribution): Invalidate a cloudfront distribution

```bash
aws cloudfront create-invalidation --paths "/pages/about" --distribution-id my-distribution-id
```

feat(bash_snippets#Remove the lock screen in ubuntu): Remove the lock screen in ubuntu

Create the `/usr/share/glib-2.0/schemas/90_ubuntu-settings.gschema.override` file with the next content:

```ini
[org.gnome.desktop.screensaver]
lock-enabled = false
[org.gnome.settings-daemon.plugins.power]
idle-dim = false
```

Then reload the schemas with:

```bash
sudo glib-compile-schemas /usr/share/glib-2.0/schemas/
```

feat(bash_snippets#How to deal with HostContextSwitching alertmanager alert): How to deal with HostContextSwitching alertmanager alert

A context switch is described as the kernel suspending execution of one process on the CPU and resuming execution of some other process that had previously been suspended. A context switch is required for every interrupt and every task that the scheduler picks.

Context switching can be due to multitasking, interrupt handling, and user/kernel mode switching. The interrupt rate will naturally go up if there is higher network or disk traffic. It also depends on the application, which may be invoking system calls every now and then.

If the cores/CPUs are not sufficient to handle the load of the threads created by the application, that will also result in context switching.

It is not a cause for concern until performance breaks down; it is expected that the CPU will do context switching. One shouldn't look at this data first, since there are many other statistics that should be analyzed before looking into kernel activities. Verify the CPU, memory and network usage during this time.
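
For example, a quick first pass could look like this (assuming the `sysstat` package is installed for `sar`):

```bash
# Overall CPU, memory, swap and context-switch rates (the cs column)
vmstat 1 5

# Network throughput per interface
sar -n DEV 1 5
```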

You can see which process is causing the issue with `pidstat`. The first listing shows per-process rates; the second one, generated with `pidstat -wt`, also breaks them down per thread:

```bash
# Per-process voluntary (cswch/s) and involuntary (nvcswch/s) context switches,
# sampled every 3 seconds, 10 times
pidstat -w 3 10

10:15:24 AM     UID     PID     cswch/s         nvcswch/s       Command
10:15:27 AM     0       1       162656.7        16656.7         systemd
10:15:27 AM     0       9       165451.04       15451.04        ksoftirqd/0
10:15:27 AM     0       10      158628.87       15828.87        rcu_sched
10:15:27 AM     0       11      156147.47       15647.47        migration/0
10:15:27 AM     0       17      150135.71       15035.71        ksoftirqd/1
10:15:27 AM     0       23      129769.61       12979.61        ksoftirqd/2
10:15:27 AM     0       29      2238.38         238.38          ksoftirqd/3
10:15:27 AM     0       43      1753            753             khugepaged
10:15:27 AM     0       443     1659            165             usb-storage
10:15:27 AM     0       456     1956.12         156.12          i915/signal:0
10:15:27 AM     0       465     29550           29550           kworker/3:1H-xfs-log/dm-3
10:15:27 AM     0       490     164700          14700           kworker/0:1H-kblockd
10:15:27 AM     0       506     163741.24       16741.24        kworker/1:1H-xfs-log/dm-3
10:15:27 AM     0       594     154742          154742          dmcrypt_write/2
10:15:27 AM     0       629     162021.65       16021.65        kworker/2:1H-kblockd
10:15:27 AM     0       715     147852.48       14852.48        xfsaild/dm-1
10:15:27 AM     0       886     150706.86       15706.86        irq/131-iwlwifi
10:15:27 AM     0       966     135597.92       13597.92        xfsaild/dm-3
10:15:27 AM     81      1037    2325.25         225.25          dbus-daemon
10:15:27 AM     998     1052    118755.1        11755.1         polkitd
10:15:27 AM     70      1056    158248.51       15848.51        avahi-daemon
10:15:27 AM     0       1061    133512.12       455.12          rngd
10:15:27 AM     0       1110    156230          16230           cupsd
10:15:27 AM     0       1192    152298.02       1598.02         sssd_nss
10:15:27 AM     0       1247    166132.99       16632.99        systemd-logind
10:15:27 AM     0       1265    165311.34       16511.34        cups-browsed
10:15:27 AM     0       1408    10556.57        1556.57         wpa_supplicant
10:15:27 AM     0       1687    3835            3835            splunkd
10:15:27 AM     42      1773    3728            3728            Xorg
10:15:27 AM     42      1996    3266.67         266.67          gsd-color
10:15:27 AM     0       3166    32036.36        3036.36         sssd_kcm
10:15:27 AM     119349  3194    151763.64       11763.64        dbus-daemon
10:15:27 AM     119349  3199    158306          18306           Xorg
10:15:27 AM     119349  3242    15.28           5.8             gnome-shell

pidstat -wt 3 10  > /tmp/pidstat-t.out

Linux 4.18.0-80.11.2.el8_0.x86_64 (hostname)    09/08/2020  _x86_64_    (4 CPU)

10:15:15 AM   UID      TGID       TID   cswch/s   nvcswch/s  Command
10:15:19 AM     0         1         -   152656.7   16656.7   systemd
10:15:19 AM     0         -         1   152656.7   16656.7   |__systemd
10:15:19 AM     0         9         -   165451.04  15451.04  ksoftirqd/0
10:15:19 AM     0         -         9   165451.04  15451.04  |__ksoftirqd/0
10:15:19 AM     0        10         -   158628.87  15828.87  rcu_sched
10:15:19 AM     0         -        10   158628.87  15828.87  |__rcu_sched
10:15:19 AM     0        23         -   129769.61  12979.61  ksoftirqd/2
10:15:19 AM     0         -        23   129769.61  12979.33  |__ksoftirqd/2
10:15:19 AM     0        29         -   32424.5    2445      ksoftirqd/3
10:15:19 AM     0         -        29   32424.5    2445      |__ksoftirqd/3
10:15:19 AM     0        43         -   334        34        khugepaged
10:15:19 AM     0         -        43   334        34        |__khugepaged
10:15:19 AM     0       443         -   11465      566       usb-storage
10:15:19 AM     0         -       443   6433       93        |__usb-storage
10:15:19 AM     0       456         -   15.41      0.00      i915/signal:0
10:15:19 AM     0         -       456   15.41      0.00      |__i915/signal:0
10:15:19 AM     0       715         -   19.34      0.00      xfsaild/dm-1
10:15:19 AM     0         -       715   19.34      0.00      |__xfsaild/dm-1
10:15:19 AM     0       886         -   23.28      0.00      irq/131-iwlwifi
10:15:19 AM     0         -       886   23.28      0.00      |__irq/131-iwlwifi
10:15:19 AM     0       966         -   19.67      0.00      xfsaild/dm-3
10:15:19 AM     0         -       966   19.67      0.00      |__xfsaild/dm-3
10:15:19 AM    81      1037         -   6.89       0.33      dbus-daemon
10:15:19 AM    81         -      1037   6.89       0.33      |__dbus-daemon
10:15:19 AM     0      1038         -   11567.31   4436      NetworkManager
10:15:19 AM     0         -      1038   1.31       0.00      |__NetworkManager
10:15:19 AM     0         -      1088   0.33       0.00      |__gmain
10:15:19 AM     0         -      1094   1340.66    0.00      |__gdbus
10:15:19 AM   998      1052         -   118755.1   11755.1   polkitd
10:15:19 AM   998         -      1052   32420.66   25545     |__polkitd
10:15:19 AM   998         -      1132   0.66       0.00      |__gdbus
```

Then, with the help of the PID causing the issue, you can get the details of all its system calls, for example with `strace`:

```bash
# Count the system calls made by the offending PID (replace <PID>);
# strace -c prints a summary table when you stop it with Ctrl-C
strace -c -p <PID>
```

Let this command run for a few minutes while the load/context switch rates are high. It is safe to run on a production system, and you can also run it on a healthy system to get a comparative baseline. Through strace one can debug and troubleshoot the issue by looking at the system calls the process has made.

feat(bash_snippets#Redirect stderr of all subsequent commands of a script to a file): Redirect stderr of all subsequent commands of a script to a file

```bash
{
    somecommand
    somecommand2
    somecommand3
} 2>&1 | tee -a "$DEBUGLOG"  # both stdout and stderr of the grouped commands are appended to the log
```

feat(diffview#Use the same binding to open and close the diffview windows): Use the same binding to open and close the diffview windows

```lua
vim.keymap.set('n', 'dv', function()
  if next(require('diffview.lib').views) == nil then
    vim.cmd('DiffviewOpen')
  else
    vim.cmd('DiffviewClose')
  end
end)
```

fix(gitea#Using `paths-filter` custom action): Using `paths-filter` custom action to skip job actions

```yaml
jobs:
  test:
    if: "!startsWith(github.event.head_commit.message, 'bump:')"
    name: Test
    runs-on: ubuntu-latest
    steps:
      - name: Checkout the codebase
        uses: https://github.com/actions/checkout@v3

      - name: Check if we need to run the molecule tests
        uses: https://github.com/dorny/paths-filter@v2
        id: filter
        with:
          filters: |
            molecule:
              - 'defaults/**'
              - 'tasks/**'
              - 'handlers/**'
              - 'templates/**'
              - 'molecule/**'
              - 'requirements.yaml'
              - '.github/workflows/tests.yaml'

      - name: Run Molecule tests
        if: steps.filter.outputs.molecule == 'true'
        run: make molecule
```

You can find more examples on how to use `paths-filter` [here](https://github.com/dorny/paths-filter#examples).

feat(gitsigns): Introduce gitsigns

[Gitsigns](https://github.com/lewis6991/gitsigns.nvim) is a neovim plugin to create git decorations similar to the vim plugin [gitgutter](https://github.com/airblade/vim-gitgutter) but written purely in Lua.

Installation:

Add to your `plugins.lua` file:

```lua
  use {'lewis6991/gitsigns.nvim'}
```

Install it with `:PackerInstall`.

Configure it in your `init.lua` with:

```lua
-- Configure gitsigns
require('gitsigns').setup({
  on_attach = function(bufnr)
    local gs = package.loaded.gitsigns

    local function map(mode, l, r, opts)
      opts = opts or {}
      opts.buffer = bufnr
      vim.keymap.set(mode, l, r, opts)
    end

    -- Navigation
    map('n', ']c', function()
      if vim.wo.diff then return ']c' end
      vim.schedule(function() gs.next_hunk() end)
      return '<Ignore>'
    end, {expr=true})

    map('n', '[c', function()
      if vim.wo.diff then return '[c' end
      vim.schedule(function() gs.prev_hunk() end)
      return '<Ignore>'
    end, {expr=true})

    -- Actions
    map('n', '<leader>gs', gs.stage_hunk)
    map('n', '<leader>gr', gs.reset_hunk)
    map('v', '<leader>gs', function() gs.stage_hunk {vim.fn.line('.'), vim.fn.line('v')} end)
    map('v', '<leader>gr', function() gs.reset_hunk {vim.fn.line('.'), vim.fn.line('v')} end)
    map('n', '<leader>gS', gs.stage_buffer)
    map('n', '<leader>gu', gs.undo_stage_hunk)
    map('n', '<leader>gR', gs.reset_buffer)
    map('n', '<leader>gp', gs.preview_hunk)
    map('n', '<leader>gb', function() gs.blame_line{full=true} end)
    -- Note: the next mapping reuses <leader>gb and therefore overrides the previous one
    map('n', '<leader>gb', gs.toggle_current_line_blame)
    map('n', '<leader>gd', gs.diffthis)
    map('n', '<leader>gD', function() gs.diffthis('~') end)
    map('n', '<leader>ge', gs.toggle_deleted)

    -- Text object
    map({'o', 'x'}, 'ih', ':<C-U>Gitsigns select_hunk<CR>')
  end
})
```

Usage:

Some interesting bindings:

- `]c`: Go to next diff chunk
- `[c`: Go to previous diff chunk
- `<leader>gs`: Stage chunk, it works both in normal and visual mode
- `<leader>gr`: Restore chunk from index, it works both in normal and visual mode
- `<leader>gp`: Preview diff, you can use it with `]c` and `[c` to see all the chunk diffs
- `<leader>gb`: Show the git blame of the line as a shadowed comment

fix(grafana): Install grafana

```yaml
---
version: "3.8"
services:
  grafana:
    image: grafana/grafana-oss:${GRAFANA_VERSION:-latest}
    container_name: grafana
    restart: unless-stopped
    volumes:
      - data:/var/lib/grafana
    networks:
      - grafana
      - monitorization
      - swag
    env_file:
      - .env
    depends_on:
      - db
  db:
    image: postgres:${DATABASE_VERSION:-15}
    restart: unless-stopped
    container_name: grafana-db
    environment:
      - POSTGRES_DB=${GF_DATABASE_NAME:-grafana}
      - POSTGRES_USER=${GF_DATABASE_USER:-grafana}
      - POSTGRES_PASSWORD=${GF_DATABASE_PASSWORD:?database password required}
    networks:
      - grafana
    volumes:
      - db-data:/var/lib/postgresql/data
    env_file:
      - .env

networks:
  grafana:
    external:
      name: grafana
  monitorization:
    external:
      name: monitorization
  swag:
    external:
      name: swag

volumes:
  data:
    driver: local
    driver_opts:
      type: none
      o: bind
      device: /data/grafana/app
  db-data:
    driver: local
    driver_opts:
      type: none
      o: bind
      device: /data/grafana/database
```

Where the `monitorization` network is the one where Prometheus and the rest of the monitoring stack listen, and `swag` is the network to the gateway proxy.

It uses the `.env` file to store the required [configuration](#configure-grafana). To connect Grafana with Authentik you need to add the following variables:

```bash

GF_AUTH_GENERIC_OAUTH_ENABLED="true"
GF_AUTH_GENERIC_OAUTH_NAME="authentik"
GF_AUTH_GENERIC_OAUTH_CLIENT_ID="<Client ID from above>"
GF_AUTH_GENERIC_OAUTH_CLIENT_SECRET="<Client Secret from above>"
GF_AUTH_GENERIC_OAUTH_SCOPES="openid profile email"
GF_AUTH_GENERIC_OAUTH_AUTH_URL="https://authentik.company/application/o/authorize/"
GF_AUTH_GENERIC_OAUTH_TOKEN_URL="https://authentik.company/application/o/token/"
GF_AUTH_GENERIC_OAUTH_API_URL="https://authentik.company/application/o/userinfo/"
GF_AUTH_SIGNOUT_REDIRECT_URL="https://authentik.company/application/o/<Slug of the application from above>/end-session/"
GF_AUTH_OAUTH_AUTO_LOGIN="true"
GF_AUTH_GENERIC_OAUTH_ROLE_ATTRIBUTE_PATH="contains(groups[*], 'Grafana Admins') && 'Admin' || contains(groups[*], 'Grafana Editors') && 'Editor' || 'Viewer'"
```

In the configuration above you can see an example of a role mapping. Upon login, this configuration looks at the groups of which the current user is a member. If any of the specified group names are found, the user will be granted the resulting role in Grafana.

In the example shown above, one of the specified group names is "Grafana Admins". If the user is a member of this group, they will be granted the "Admin" role in Grafana. If the user is not a member of the "Grafana Admins" group, it moves on to see if the user is a member of the "Grafana Editors" group. If they are, they are granted the "Editor" role. Finally, if the user is not found to be a member of either of these groups, it falls back to granting the "Viewer" role.

Also make sure in your configuration that `root_url` is set correctly, otherwise your redirect URL might get processed incorrectly. For example, if your Grafana instance is running on the default configuration and is accessible behind a reverse proxy at https://grafana.company, your redirect URL will end up looking like this: https://grafana.company/. If you get a `user does not belong to org` error when trying to log into Grafana for the first time via OAuth, check if you have an organization with the ID of 1; if not, you have to add the following to your Grafana config:

```ini
[users]
auto_assign_org = true
auto_assign_org_id = <id-of-your-default-organization>
```

Once you've made sure that the oauth works, go to `/admin/users` and remove the `admin` user.

feat(grafana#Configure grafana): Configure grafana

Grafana has default and custom configuration files. You can customize your Grafana instance by modifying the custom configuration file or by using environment variables. To see the list of settings for a Grafana instance, refer to [View server settings](https://grafana.com/docs/grafana/latest/administration/stats-and-license/#view-server-settings).

To override an option use `GF_<SectionName>_<KeyName>`, where the section name is the text within the brackets. Everything should be uppercase, and `.` and `-` should be replaced by `_`. For example, if you have these configuration settings:

```ini
instance_name = ${HOSTNAME}

[security]
admin_user = admin

[auth.google]
client_secret = 0ldS3cretKey

[plugin.grafana-image-renderer]
rendering_ignore_https_errors = true

[feature_toggles]
enable = newNavigation
```

You can override variables on Linux machines with:

```bash
export GF_DEFAULT_INSTANCE_NAME=my-instance
export GF_SECURITY_ADMIN_USER=owner
export GF_AUTH_GOOGLE_CLIENT_SECRET=newS3cretKey
export GF_PLUGIN_GRAFANA_IMAGE_RENDERER_RENDERING_IGNORE_HTTPS_ERRORS=true
export GF_FEATURE_TOGGLES_ENABLE=newNavigation
```

And in the docker compose you can edit the `.env` file. Mine looks similar to:

```bash
GRAFANA_VERSION=latest
GF_DEFAULT_INSTANCE_NAME="production"
GF_SERVER_ROOT_URL="https://your.domain.org"

GF_DATABASE_TYPE=postgres
DATABASE_VERSION=15
GF_DATABASE_HOST=grafana-db:5432
GF_DATABASE_NAME=grafana
GF_DATABASE_USER=grafana
GF_DATABASE_PASSWORD="change-for-a-long-password"
GF_DATABASE_SSL_MODE=disable

GF_AUTH_GENERIC_OAUTH_ENABLED="true"
GF_AUTH_GENERIC_OAUTH_NAME="authentik"
GF_AUTH_GENERIC_OAUTH_CLIENT_ID="<Client ID from above>"
GF_AUTH_GENERIC_OAUTH_CLIENT_SECRET="<Client Secret from above>"
GF_AUTH_GENERIC_OAUTH_SCOPES="openid profile email"
GF_AUTH_GENERIC_OAUTH_AUTH_URL="https://authentik.company/application/o/authorize/"
GF_AUTH_GENERIC_OAUTH_TOKEN_URL="https://authentik.company/application/o/token/"
GF_AUTH_GENERIC_OAUTH_API_URL="https://authentik.company/application/o/userinfo/"
GF_AUTH_SIGNOUT_REDIRECT_URL="https://authentik.company/application/o/<Slug of the application from above>/end-session/"
GF_AUTH_OAUTH_AUTO_LOGIN="true"
GF_AUTH_GENERIC_OAUTH_ROLE_ATTRIBUTE_PATH="contains(groups[*], 'Grafana Admins') && 'Admin' || contains(groups[*], 'Grafana Editors') && 'Editor' || 'Viewer'"
```

feat(grafana#Configure datasources): Configure datasources

You can manage data sources in Grafana by adding YAML configuration files in the `provisioning/datasources` directory. Each config file can contain a list of datasources to add or update during startup. If the data source already exists, Grafana reconfigures it to match the provisioned configuration file.

The configuration file can also list data sources to automatically delete, called `deleteDatasources`. Grafana deletes the data sources listed in `deleteDatasources` before adding or updating those in the datasources list.

For example to [configure a Prometheus datasource](https://grafana.com/docs/grafana/latest/datasources/prometheus/) use:

```yaml
apiVersion: 1

datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    # Access mode - proxy (server in the UI) or direct (browser in the UI).
    url: http://prometheus:9090
    jsonData:
      httpMethod: POST
      manageAlerts: true
      prometheusType: Prometheus
      prometheusVersion: 2.44.0
      cacheLevel: 'High'
      disableRecordingRules: false
      incrementalQueryOverlapWindow: 10m
      exemplarTraceIdDestinations: []
```

feat(grafana#Configure dashboards): Configure dashboards

You can manage dashboards in Grafana by adding one or more YAML config files in the `provisioning/dashboards` directory. Each config file can contain a list of dashboard providers that load dashboards into Grafana from the local filesystem.

Create one file called `dashboards.yaml` with the following contents:

```yaml
---
apiVersion: 1
providers:
  - name: default # A uniquely identifiable name for the provider
    type: file
    options:
      path: /etc/grafana/provisioning/dashboards/definitions
```

Then inside the config directory of your docker compose create the directory `provisioning/dashboards/definitions` and add the json of the dashboards themselves. You can download them from the dashboard pages. For example:

- [Node Exporter](https://grafana.com/grafana/dashboards/1860-node-exporter-full/)
- [Blackbox Exporter](https://grafana.com/grafana/dashboards/13659-blackbox-exporter-http-prober/)
- [Alertmanager](https://grafana.com/grafana/dashboards/9578-alertmanager/)

feat(grafana#Configure the plugins): Configure the plugins

To install plugins in the Docker container, complete the following steps:

- Pass the plugins you want to be installed to Docker with the `GF_INSTALL_PLUGINS` environment variable as a comma-separated list.
- This sends each plugin name to `grafana-cli plugins install ${plugin}` and installs them when Grafana starts.

For example:

```bash
docker run -d -p 3000:3000 --name=grafana \
  -e "GF_INSTALL_PLUGINS=grafana-clock-panel, grafana-simple-json-datasource" \
  grafana/grafana-oss
```

To specify the version of a plugin, add the version number to the `GF_INSTALL_PLUGINS` environment variable. For example: `GF_INSTALL_PLUGINS=grafana-clock-panel 1.0.1`.

To install a plugin from a custom URL, use the following convention to specify the URL: `<url to plugin zip>;<plugin install folder name>`. For example: `GF_INSTALL_PLUGINS=https://github.com/VolkovLabs/custom-plugin.zip;custom-plugin`.

feat(jellyfin#Forgot Password. Please try again within your home network to initiate the password reset process.): Forgot Password. Please try again within your home network to initiate the password reset process.

If you're an external Jellyfin user you can't reset your password unless you are on the LAN. This is because the password reset process is simple and insecure.

If you don't care about that and still think that the internet is a happy and safe place [here](https://wiki.jfa-go.com/docs/password-resets/) and [here](hrfee/jellyfin-accounts#12) are some instructions on how to bypass the security measure.

For more information also read [1](jellyfin/jellyfin#2282) and [2](jellyfin/jellyfin#2869).

feat(lindy): New Charleston, lindy and solo jazz videos

Charleston:

- The DecaVita Sisters:
   - [Freestyle Lindy Hop & Charleston](https://www.youtube.com/watch?v=OV6ZDuczkag)
   - [Moby "Honey"](https://www.youtube.com/watch?v=ciMFQnwfp50)

Solo Jazz:

- [Pedro Vieira at Little Big Swing Camp 2022](https://yewtu.be/watch?v=pmxn2uIVuUY)

Lindy Hop:

- The DecaVita Sisters:
   - [Compromise - agreement in the moment](https://youtu.be/3DhD2u5Eyv8?si=2WKisSvEB3Z8TVMy)
   - [Lindy hop improv](https://www.youtube.com/watch?v=qkdxcdeicLE)

feat(matrix): How to install matrix

```bash
sudo apt install -y wget apt-transport-https
sudo wget -O /usr/share/keyrings/element-io-archive-keyring.gpg https://packages.element.io/debian/element-io-archive-keyring.gpg
echo "deb [signed-by=/usr/share/keyrings/element-io-archive-keyring.gpg] https://packages.element.io/debian/ default main" | sudo tee /etc/apt/sources.list.d/element-io.list
sudo apt update
sudo apt install element-desktop
```

fix(mediatracker#alternatives): Update ryot comparison with mediatracker

[Ryot](https://github.com/IgnisDa/ryot) has a better web design, and it also has a [jellyfin scrobbler](IgnisDa/ryot#195), although it's not [yet stable](IgnisDa/ryot#187). There are other UI tweaks that are preventing me from migrating to ryot, such as [the easier media rating](IgnisDa/ryot#284) and [the percentage over five stars rating system](IgnisDa/ryot#283).

feat(molecule#Get variables from the environment): Get variables from the environment

You can configure your `molecule.yaml` file to read variables from the environment with:

```yaml
provisioner:
  name: ansible
  inventory:
    group_vars:
      all:
        my_secret: ${MY_SECRET}
```

It's useful to have a task that checks if this secret exists:

```yaml
- name: Verify that the secret is set
  fail:
    msg: 'Please export my_secret: export MY_SECRET=$(pass show my_secret)'
  run_once: true
  when: my_secret == None
```

In the CI you can set it as a secret in the repository.

feat(retroarch): Install retroarch instructions

To add the stable branch to your system type:

```bash
sudo add-apt-repository ppa:libretro/stable
sudo apt-get update
sudo apt-get install retroarch
```

Go to Main Menu/Online Updater and then update everything you can:

- Update Core Info Files
- Update Assets
- Update controller Profiles
- Update Databases
- Update Overlays
- Update GLSL Shaders

feat(vim): Update treesitter language definitions

To do so you need to run:

```vim
:TSInstall <language>
```

To update the parsers run

```vim
:TSUpdate
```

feat(vim#Telescope changes working directory when opening a file): Telescope changes working directory when opening a file

In my case it was due to a snippet I have in place to remember the folds:

```lua
vim.cmd[[
  augroup remember_folds
    autocmd!
    autocmd BufWinLeave * silent! mkview
    autocmd BufWinEnter * silent! loadview
  augroup END
]]
```

It looks like it had saved a view with the other working directory, so when a file was loaded the `cwd` changed. To solve it I created a new view with `mkview` from the correct directory.
@lyz-code

Hi, some of my users are complaining that they feel uncomfortable not knowing for sure whether all files were uploaded. I feel that until the whole UX is improved we could at least notify the user of which documents failed to upload. That way the admins could process those files manually and analyze the reason why they failed.

If you like the idea I can make a contribution to implement this

@lyz-code

lyz-code commented Mar 14, 2024

Until the issue is solved you can be notified whenever there is an error or warning in the ingest docker logs if you set up Loki and promtail (using the `json-file` docker logging driver) with the following scrape config:

```yaml
  - job_name: docker
    docker_sd_configs:
      - host: unix:///var/run/docker.sock
        refresh_interval: 5s
    relabel_configs:
      - source_labels: ['__meta_docker_container_name']
        regex: '/(.*)'
        target_label: 'container'
    pipeline_stages:
      - static_labels:
          job: docker
```

And the following alert rule:

```yaml
groups:
  - name: should_fire
    rules:
      - alert: AlephIngestError
        expr: |
          sum by (container) (count_over_time({job="docker", container="aleph_ingest-file_1"} | json | __error__=`` | severity =~ `WARNING|ERROR`[5m])) > 0
        for: 10m
        labels:
          severity: critical
        annotations:
          summary: "Errors found in the {{ $labels.container }} docker log"
          message: "Error in {{ $labels.container }}: {{ $labels.message }}"
```

@lyz-code

Until the issue is solved, and assuming that you have Loki configured, you can use the following guidelines to debug some of the ingest errors:

  • `Cannot open image data using Pillow: broken data stream when reading image files`: The log trace that has this message also contains a `trace_id` field which identifies the ingestion process. With that `trace_id` you can get the first log trace with the field `logger = "ingestors.manager"`, which will contain the file path in the `message` field. Something similar to `Ingestor [<E('9972oiwobhwefoiwefjsldkfwefa45cf5cb585dc4f1471','path_to_the_file_to_ingest.pdf')>]`
  • A traceback containing the string `Failed to process: Could not extract PDF file: FileDataError('cannot open broken document')`: This log trace has the file path in the `message` field. Something similar to `[<E('9972oiwobhwefoiwefjsldkfwefa45cf5cb585dc4f1471','path_to_the_file_to_ingest.pdf')>] Failed to process: Could not extract PDF file: FileDataError('cannot open broken document')`

Once you have the files that triggered the errors, the best way to handle them is to delete them from your investigation and ingest them again.
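
A possible way to pull those traces, assuming you query Loki with `logcli` and that the ingest container is named `aleph_ingest-file_1` as in the alert above (the trace id is a placeholder):

```bash
# All log lines of a single ingestion, identified by its trace_id
logcli query --since 24h '{container="aleph_ingest-file_1"} | json | trace_id="<trace_id>"'
```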

lyz-code added a commit to lyz-code/blue-book that referenced this issue Jul 31, 2024
- [Aider](https://aider.chat/) lets you pair program with LLMs, to edit code in your local git repository. Start a new project or work with an existing git repo. Aider works best with GPT-4o & Claude 3.5 Sonnet and can connect to almost any LLM.

feat(ai_coding): Introduce ai coding prompts

These are some useful AI prompts to help you while you code:

- create a function with type hints and docstring using google style called { } that { }
- create the tests for the function { } adding type hints and following the AAA style, where the Act section contains a `returns = (thing to test)` line or, if the function to test doesn't return any value, an `# act` comment is appended at the end of the line. Use paragraphs to separate the AAA blocks and don't add comments inside the tests for the sections

If you use [espanso](espanso.md) you can simplify filling in these prompts in the AI chats. For example:

```yaml
---
matches:
  - trigger: :function
    form: |
      Create a function with type hints and docstring using google style called [[name]] that:
      [[text]]
    form_fields:
      text:
        multiline: true
  - trigger: :tweak
    form: |
      Tweak the next code:
      [[code]]

      So that:

      [[text]]
    form_fields:
      text:
        multiline: true
      code:
        multiline: true
  - trigger: :test
    form: |
      create the tests for the function:
      [[text]]

      Following the next guidelines:

      - Add type hints
      - Follow the AAA style
      - In the Act section if the function to test returns a value always name that variable returns. If the function to test doesn't return any value append an # act comment at the end of the line.
      - Use paragraphs to separate the AAA blocks and don't add comments like # Arrange or # Act or # Act/Assert or # Assert

    form_fields:
      text:
        multiline: true
  - trigger: :refactor
    form: |
     Refactor the next code
     [[code]]
     with the next conditions
     [[conditions]]
    form_fields:
      code:
        multiline: true
      conditions:
        multiline: true
```

feat(alacritty): Introduce Alacritty

[Alacritty](https://alacritty.org/) is a modern terminal emulator that comes with sensible defaults, but allows for extensive configuration. By integrating with other applications, rather than reimplementing their functionality, it manages to provide a flexible set of features with high performance.

**[Installation](https://github.com/alacritty/alacritty/blob/master/INSTALL.md#debianubuntu)**

- Clone the repo
  ```bash
  git clone https://github.com/alacritty/alacritty.git
  cd alacritty
  ```
- [Install `rustup`](https://rustup.rs/)
  ```bash
  curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
  ```
- To make sure you have the right Rust compiler installed, run
  ```bash
  rustup override set stable
  rustup update stable
  ```
- Install the dependencies
  ```bash
  apt install cmake pkg-config libfreetype6-dev libfontconfig1-dev libxcb-xfixes0-dev libxkbcommon-dev python3
  ```
- Build the release
  ```bash
  cargo build --release
  ```
  If all goes well, this should place a binary at `target/release/alacritty`
- Move the binary to somewhere in your PATH

  ```bash
  mv target/release/alacritty ~/.local/bin
  ```
- Check the terminfo: To make sure Alacritty works correctly, either the `alacritty` or `alacritty-direct` terminfo must be used. The `alacritty` terminfo will be picked up automatically if it is installed.
  If the following command returns without any errors, the `alacritty` terminfo is already installed:

  ```bash
  infocmp alacritty
  ```

  If it is not present already, you can install it globally with the following command:

  ```bash
  sudo tic -xe alacritty,alacritty-direct extra/alacritty.info
  ```

**[Configuration](https://alacritty.org/config-alacritty.html)**

Alacritty's configuration file uses the TOML format. It doesn't create the config file for you, but it looks for one in `~/.config/alacritty/alacritty.toml`

**Not there yet**
- [Support for ligatures](https://github.com/alacritty/alacritty/issues/50)

**References**
- [Homepage](https://alacritty.org/)
- [Source](https://github.com/alacritty/alacritty)
- [Docs](https://github.com/alacritty/alacritty/blob/master/docs/features.md)

feat(aleph#Debug ingestion errors): Debug ingestion errors

Assuming that you've [set up Loki to ingest your logs](https://github.com/alephdata/aleph/issues/2124), I've so far encountered the following ingest issues:

- `Cannot open image data using Pillow: broken data stream when reading image files`: The log trace that has this message also contains a field `trace_id` which identifies the ingestion process. With that `trace_id` you can get the first log trace with the field `logger = "ingestors.manager"` which will contain the file path in the `message` field. Something similar to `Ingestor [<E('9972oiwobhwefoiwefjsldkfwefa45cf5cb585dc4f1471','path_to_the_file_to_ingest.pdf')>]`
- A traceback with the next string `Failed to process: Could not extract PDF file: FileDataError('cannot open broken document')`: This log trace has the file path in the `message` field. Something similar to `[<E('9972oiwobhwefoiwefjsldkfwefa45cf5cb585dc4f1471','path_to_the_file_to_ingest.pdf')>] Failed to process: Could not extract PDF file: FileDataError('cannot open broken document')`

I thought of making a [python script to automate the extraction of the files that triggered an error](loki.md#interact-with-loki-through-python), but in the end I extracted the file names manually as there weren't many.

Once you have the files that triggered the errors, the best way to handle them is to delete them from your investigation and ingest them again.

feat(aleph#references): add support channel

[Support chat](https://alephdata.slack.com)

feat(ansible_snippets#Set the ssh connection port using dynamic inventories): Set the ssh connection port using dynamic inventories

To specify a custom SSH port, you can use a `host_vars` or `group_vars` file. For example, create a `group_vars` directory and a file named `all.yaml` inside it:

```yaml
ansible_port: 2222
```

feat(antiracism): Add the video "El racismo no se sostiene"

[El racismo no se sostiene](https://youtube.com/shorts/5Y7novO2t_c?si=dqMGW4ALFLoXZiw3)

feat(argocd#Not there yet): Not there yet

- [Support git webhook on Applicationsets for gitea/forgejo](https://github.com/argoproj/argo-cd/issues/18798): although you could use an ugly fix adding `spec.generators[i].requeueAfterSeconds` to change the interval that ArgoCD uses to refresh the repositories, which is 3 minutes by default.

feat(bash_snippets#Fix docker error: KeyError ContainerConfig): Fix docker error: KeyError ContainerConfig

You need to run `docker-compose down` and then up again.
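
For example, from the directory that contains your `docker-compose.yml`:

```bash
# Recreate the containers from scratch
docker-compose down
docker-compose up -d
```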

feat(bash_snippets#Set static ip with nmcli): Set static ip with nmcli

```bash
nmcli con mod "your-ssid" \
  ipv4.method "manual" \
  ipv4.addresses "your_desired_ip" \
  ipv4.gateway "your_desired_gateway" \
  ipv4.dns "1.1.1.1,2.2.2.2" \
  ipv4.routes "192.168.32.0 0.0.0.0"
```

The last option (`ipv4.routes`) is there so that you can reach your LAN; change the value accordingly.

feat(bash_snippets#Fix unbound variable error): Fix unbound variable error

You can check if the variable is set and non-empty with:
```bash
[ -n "${myvariable-}" ]
```
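
A slightly fuller sketch of how this combines with `set -u` (the variable name is just an example):

```bash
#!/bin/bash
set -u  # treat references to unset variables as errors

# ${myvariable-} expands to an empty string instead of aborting when unset
if [ -n "${myvariable-}" ]; then
  echo "myvariable is set to: $myvariable"
else
  echo "myvariable is unset or empty"
fi
```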

feat(bash_snippets#With sort): Compare two semantic versions with sort

If you want to compare semantic versions on non-Debian based systems you can use `sort -V -C`:

```bash
printf "2.0.0\n2.1.0\n" | sort -V -C  # Return code 0
printf "2.2.0\n2.1.0\n" | sort -V -C  # Return code 1
```

feat(python_snippets#Compare file and directories): Compare file and directories

The filecmp module defines functions to compare files and directories, with various optional time/correctness trade-offs. For comparing files, see also the difflib module.

```python
from filecmp import dircmp

def print_diff_files(dcmp):
    for name in dcmp.diff_files:
        print("diff_file %s found in %s and %s" % (name, dcmp.left, dcmp.right))
    for sub_dcmp in dcmp.subdirs.values():
        print_diff_files(sub_dcmp)
dcmp = dircmp('dir1', 'dir2')
print_diff_files(dcmp)
```

feat(conference_organisation): Software to manage the conference

There are some open source software that can make your life easier when hosting a conference:

- [Frab](https://frab.github.io/frab/)
- [Pretalx](https://pretalx.com/p/about/)
- [Wafer](https://wafer.readthedocs.io/en/latest/)

In addition to managing the talks from the call for papers till the event itself, these tools can help the users visualise the talk schedule with [EventFahrplan](https://github.com/EventFahrplan/EventFahrplan?tab=readme-ov-file), which is what's used in the Chaos Computer Club congress.

If you also want to coordinate helpers and shifts, take a look at [Engelsystem](https://engelsystem.de/en).

feat(conventional_comments): Introduce conventional comments

[Conventional comments](https://conventionalcomments.org/) is the practice of using a specific format in review comments to express your intent and tone more clearly. It's strongly inspired by [semantic versioning](semantic_versioning.md).

Let's take the next comment:

```
This is not worded correctly.
```

By adding labels you can make your intent clear:

```
**suggestion:** This is not worded correctly.
```
Or
```
**issue (non-blocking):** This is not worded correctly.
```

Labels also prompt the reviewer to give more **actionable** comments.

```
**suggestion:** This is not worded correctly.

Can we change this to match the wording of the marketing page?
```

Labeling comments encourages collaboration and saves **hours** of undercommunication and misunderstandings. They are also parseable by machines!

**Format**

Adhering to a consistent format improves reader's expectations and machine readability.
Here's the format we propose:
```
<label> [decorations]: <subject>

[discussion]
```
- _label_ - This is a single label that signifies what kind of comment is being left.
- _subject_ - This is the main message of the comment.
- _decorations (optional)_ - These are extra decorating labels for the comment. They are surrounded by parentheses and comma-separated.
- _discussion (optional)_ - This contains supporting statements, context, reasoning, and anything else to help communicate the "why" and "next steps" for resolving the comment.
For example:
```
**question (non-blocking):** At this point, does it matter which thread has won?

Maybe to prevent a race condition we should keep looping until they've all won?
```

Can be automatically parsed into:

```json
{
  "label": "question",
  "subject": "At this point, does it matter which thread has won?",
  "decorations": ["non-blocking"],
  "discussion": "Maybe to prevent a race condition we should keep looping until they've all won?"
}
```
**Labels**

We strongly suggest using the following labels:
|                 |                                                                                                                                                                                                                                                                                           |
| --------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **praise:**     | Praises highlight something positive. Try to leave at least one of these comments per review. _Do not_ leave false praise (which can actually be damaging). _Do_ look for something to sincerely praise.                                                                                  |
| **quibble:**    | Quibbles are trivial preference-based requests. These should be non-blocking by nature. Similar to `polish` but clearly preference-based.|
| **suggestion:** | Suggestions propose improvements to the current subject. It's important to be explicit and clear on _what_ is being suggested and _why_ it is an improvement. These are non-blocking proposals. If it's blocking use `todo` instead.|
| **todo:**       | TODO's are necessary changes. Distinguishing `todo` comments from `issues` or `suggestions` helps direct the reader's attention to comments requiring more involvement. |
| **issue:**      | Issues highlight specific problems with the subject under review. These problems can be user-facing or behind the scenes. It is strongly recommended to pair this comment with a `suggestion`. If you are not sure if a problem exists or not, consider leaving a `question`.             |
| **question:**   | Questions are appropriate if you have a potential concern but are not quite sure if it's relevant or not. Asking the author for clarification or investigation can lead to a quick resolution.                                                                                            |
| **thought:**    | Thoughts represent an idea that popped up from reviewing. These comments are non-blocking by nature, but they are extremely valuable and can lead to more focused initiatives and mentoring opportunities.                                                                                |
| **chore:**      | Chores are simple tasks that must be done before the subject can be "officially" accepted. Usually, these comments reference some common process. Try to leave a link to the process description so that the reader knows how to resolve the chore.                                       |
| **note:**      | Notes are always non-blocking and simply highlight something the reader should take note of.                                       |

If you like to be a bit more expressive with your labels, you may also consider:

|    |    |
|----|----|
| **typo:** | Typo comments are like **todo:**, where the main issue is a misspelling. |
| **polish:** | Polish comments are like a **suggestion**, where there is nothing necessarily wrong with the relevant content, there's just some ways to immediately improve the quality. Similar but not exactly the same as `quibble`.|

**Decorations**

Decorations give additional context for a comment. They help further classify comments which have the same label (for example, a security suggestion as opposed to a test suggestion).
```
**suggestion (security):** I'm a bit concerned that we are implementing our own DOM purifying function here...
Could we consider using the framework instead?
```
```
**suggestion (test,if-minor):** It looks like we're missing some unit test coverage that the cat disappears completely.
```

Decorations may be specific to each organization. If needed, we recommend establishing a minimal set of decorations (leaving room for discretion) with no ambiguity.
Possible decorations include:
|                    |                                                                                                                                                                                                         |
| ------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **(non-blocking)** | A comment with this decoration **should not** prevent the subject under review from being accepted. This is helpful for organizations that consider comments blocking by default.                       |
| **(blocking)**     | A comment with this decoration **should** prevent the subject under review from being accepted, until it is resolved. This is helpful for organizations that consider comments non-blocking by default. |
| **(if-minor)**     | This decoration gives some freedom to the author that they should resolve the comment only if the change ends up being minor or trivial.                                                               |

Adding a decoration to a comment should improve understandability and maintain readability. Having a list of many decorations in one comment would conflict with this goal.

**More examples**
```
**quibble:** `little star` => `little bat`

Can we update the other references as well?
```
```
**chore:** Let's run the `jabber-walk` CI job to make sure this doesn't break any known references.
Here are [the docs](https://en.wikipedia.org/wiki/Jabberwocky) for running this job. Feel free to reach out if you need any help!
```
```
**praise:** Beautiful test!
```

**Best Practices**

Some best practices for writing helpful review feedback:

- Mentoring pays off exponentially
- Leave actionable comments
- Combine similar comments
- Replace "you" with "we"
- Replace "should" with "could"

**References**
- [Home](https://conventionalcomments.org/)

feat(data_orchestrators): Introduce data orchestrators

Data orchestration is the process of moving siloed data from multiple storage locations into a centralized repository where it can then be combined, cleaned, and enriched for activation.

Data orchestrators are web applications that make this process easy. The most popular right now are:

- Apache Airflow
- [Kestra](#kestra)
- Prefect

There are several comparison pages:

- [Geek Culture comparison](https://medium.com/geekculture/airflow-vs-prefect-vs-kestra-which-is-best-for-building-advanced-data-pipelines-40cfbddf9697)
- [Kestra's comparison to Airflow](https://kestra.io/vs/airflow)
- [Kestra's comparison to Prefect](https://kestra.io/vs/prefect)

When looking at the return on investment when choosing an orchestration tool, there are several points to consider:

- Time of installation/maintenance
- Time to write pipeline
- Time to execute (performance)

**[Kestra](kestra.md)**

Pros:

- Easier to write pipelines
- Nice looking web UI
- It has a [terraform provider](https://kestra.io/docs/getting-started/terraform)
- [Prometheus and grafana integration](https://kestra.io/docs/how-to-guides/monitoring)

Cons:

- Built in Java, so extending it might be difficult
- [Plugins are made in Java](https://kestra.io/docs/developer-guide/plugins)

Kestra offers a higher ROI globally compared to Airflow:

- Installing Kestra is easier than Airflow; it doesn't require Python dependencies, and it comes with a ready-to-use docker-compose file that uses few services, without the need to understand what an executor is in order to run tasks in parallel.
- Creating pipelines with Kestra is simple, thanks to its syntax. You don’t need knowledge of a specific programming language because Kestra is designed to be agnostic. The declarative YAML design makes Kestra flows more readable compared to Airflow’s DAG equivalent, allowing developers to significantly reduce development time.
- In this benchmark, Kestra demonstrates better execution time than Airflow under any configuration setup.

feat(kubectl_commands#Upload a file to a pod): Upload a file to a pod

```bash
kubectl cp {{ path_to_local_file }} {{ container_id }}:{{ path_to_file }}
```

feat(kubernetes#Tools to test): Add reloader to tools to test

[stakater/reloader](https://github.com/stakater/Reloader): A Kubernetes controller to watch changes in ConfigMaps and Secrets and do rolling upgrades on Pods with their associated Deployment, StatefulSet, DaemonSet and DeploymentConfig. Useful for applications that aren't clever enough to reload their configuration themselves and need a restart when a ConfigMap changes.
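
If I remember Reloader's README correctly it's enabled per workload with an annotation; treat the exact annotation key and the deployment name below as assumptions to double-check:

```bash
# Ask Reloader to watch all ConfigMaps/Secrets referenced by this deployment
kubectl annotate deployment my-app reloader.stakater.com/auto="true"
```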

feat(kubernetes_volumes#Specify a path of a configmap): Specify a path of a configmap

If you have a configmap with a key `ssh-known-hosts` and you want to mount its content in a file, in the deployment `volumeMounts` section you can use the `subPath` field:

```yaml
      - mountPath: /home/argocd/.ssh/known_hosts
        name: ssh-known-hosts
        subPath: ssh_known_hosts
        readOnly: true
```

feat(safety): Add deprecation warning

Since 2024-05-27 it requires an account to work, use [pip-audit](pip_audit.md) instead.
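
A minimal sketch of the replacement workflow with pip-audit (the requirements file name is an example):

```bash
pip install pip-audit

# Audit the packages installed in the current environment
pip-audit

# Or audit a requirements file
pip-audit -r requirements.txt
```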

feat(efs#List the size of the recovery points): List the size of the recovery points

```bash

BACKUP_VAULT_NAME="your-vault-name"

RECOVERY_POINTS=$(aws backup list-recovery-points-by-backup-vault --backup-vault-name $BACKUP_VAULT_NAME --query 'RecoveryPoints[*].[RecoveryPointArn,BackupSizeInBytes,CreationDate]' --output text)
echo -e "Creation Date\t\tRecovery Point ARN\t\t\t\t\t\t\t\t\tSize (TB)"
echo "---------------------------------------------------------------------------------------------------------------------"

while read -r RECOVERY_POINT_ARN BACKUP_SIZE_BYTES CREATION_DATE; do
    # Remove the decimal part from the epoch time
    EPOCH_TIME=$(echo $CREATION_DATE | cut -d'.' -f1)
    # Convert the creation date from epoch time to YYYY-MM-DD format
    FORMATTED_DATE=$(date -d @$EPOCH_TIME +"%Y-%m-%d")
    SIZE_TB=$(echo "scale=6; $BACKUP_SIZE_BYTES / (1024^4)" | bc)
    # echo -e "$FORMATTED_DATE\t$RECOVERY_POINT_ARN\t$SIZE_TB"
   	printf "%-16s %-80s %10.6f\n" "$FORMATTED_DATE" "$RECOVERY_POINT_ARN" "$SIZE_TB"
done <<< "$RECOVERY_POINTS"
```

feat(efs#List the size of the jobs): List the size of the jobs

To list AWS Backup jobs and display their completion dates and sizes in a human-readable format, you can use the following AWS CLI command combined with `jq` for parsing and formatting the output. This command handles cases where the backup size might be null and rounds the size to the nearest whole number in gigabytes.

```sh
aws backup list-backup-jobs --output json | jq -r '
  .BackupJobs[] |
  [
    (.CompletionDate | strftime("%Y-%m-%d %H:%M:%S")),
    (if .BackupSizeInBytes == null then "0GB" else ((.BackupSizeInBytes / 1024 / 1024 / 1024) | floor | tostring + " GB") end)
  ] |
  @tsv' | column -t -s$'\t'
```
Explanation:

- `aws backup list-backup-jobs --output json`: Lists all AWS Backup jobs in JSON format.
- `.BackupJobs[]`: Iterates over each backup job.
- `(.CompletionDate | strftime("%Y-%m-%d %H:%M:%S"))`: Converts the Unix timestamp in CompletionDate to a human-readable date format (YYYY-MM-DD HH:MM:SS).
- `(if .BackupSizeInBytes == null then "0GB" else ((.BackupSizeInBytes / 1024 / 1024 / 1024) | floor | tostring + " GB") end)`: Checks if BackupSizeInBytes is null. If it is, outputs "0GB". Otherwise, converts the size from bytes to gigabytes, rounds it down to the nearest whole number, and appends " GB".
- `| @tsv`: Formats the output as tab-separated values.
- `column -t -s$'\t'`: Formats the TSV output into a table with columns aligned.

feat(espanso): Introduce espanso

[Espanso](https://github.com/espanso/espanso) is a cross-platform Text Expander written in Rust.

A text expander is a program that detects when you type a specific keyword and replaces it with something else. This is useful in many ways:

- Save a lot of typing, expanding common sentences or fixing common typos.
- Create system-wide code snippets.
- Execute custom scripts.
- Use emojis like a pro.

**[Installation](https://espanso.org/docs/install/linux/)**

Espanso ships with a .deb package, making the installation convenient on Debian-based systems.

Start by downloading the package by running the following command inside a terminal:

```bash
wget https://github.com/federico-terzi/espanso/releases/download/v2.2.1/espanso-debian-x11-amd64.deb
```

You can now install the package using:

```bash
sudo apt install ./espanso-debian-x11-amd64.deb
```

From now on, you should have the `espanso` command available in the terminal (you can verify by running `espanso --version`).

At this point, you are ready to use `espanso` by registering it first as a Systemd service and then starting it with:

```bash
espanso service register
```

Start espanso

```bash
espanso start
```

Espanso ships with very few built-in matches to give you the maximum flexibility, but you can expand its capabilities in two ways: creating your own custom matches or [installing packages](#using-packages).

**[Configuration](https://espanso.org/docs/get-started/#configuration)**

Your configuration lives at `~/.config/espanso`. A quick way to find the path of your configuration folder is by using the following command `espanso path`.

- The files contained in the `match` directory define what Espanso should do. In other words, this is where you should specify all the custom snippets and actions (aka Matches). The `match/base.yml` file is where you might want to start adding your matches.
- The files contained in the `config` directory define how Espanso should perform its expansions. In other words, this is where you should specify all Espanso's parameters and options. The `config/default.yml` file defines the options that will be applied to all applications by default, unless an app-specific configuration is present for the current app.
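
For instance, a minimal custom match in `match/base.yml` could look like this (the trigger and replacement are just illustrative):

```yaml
matches:
  # Typing ":date" gets replaced by the current date.
  - trigger: ":date"
    replace: "{{mydate}}"
    vars:
      - name: mydate
        type: date
        params:
          format: "%Y-%m-%d"
```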

**[Using packages](https://espanso.org/docs/get-started/#understanding-packages)**

Custom matches are great, but sometimes it can be tedious to define them for every common operation, especially when you want to share them with other people.

Espanso offers an easy way to share and reuse matches with other people, packages. In fact, they are so important that Espanso includes a built-in package manager and a store, the [Espanso Hub](https://hub.espanso.org/).

**[Installing a package](https://espanso.org/docs/get-started/#installing-a-package)**

Get the id of the package from the [Espanso Hub](https://hub.espanso.org/) and then run `espanso install <<package_name>>`.

Of all the packages, I've found the following ones the most useful:

- [typofixer-en](https://hub.espanso.org/typofixer-en)
- [typofixer-es](https://hub.espanso.org/typofixer-es)
- [misspell-en-uk](https://hub.espanso.org/misspell-en-uk)

**Overwriting the snippets of a package**

For example, `typofixer-en` replaces `si` with `is`, even though `si` is a valid Spanish word. To override the fix you can create your own file at `~/.config/espanso/match/typofix_overwrite.yml` with the following content:

```yaml
matches:
  # Simple text replacement
  - trigger: "si"
    replace: "si"
```

**[Creating a package](https://espanso.org/docs/packages/creating-a-package/)**

**Auto-restart on config changes**

Set `auto_restart: true` on `~/.config/espanso/config/default.yml`.

**[Changing the search bar shortcut](https://espanso.org/docs/configuration/options/#customizing-the-search-bar)**

If the default search bar shortcut conflicts with your i3 configuration set it with:

```yaml
search_shortcut: CTRL+SHIFT+e
```

**[Hiding the notifications](https://espanso.org/docs/configuration/options/#hiding-the-notifications)**

You can hide the notifications by adding the following option to your `$CONFIG/config/default.yml` config:

```yaml
show_notifications: false
```

**Usage**

Just type and you'll see the text expanded.

You can use the search bar if you don't remember your snippets.

**References**
- [Code](https://github.com/espanso/espanso)
- [Docs](https://espanso.org/docs/get-started/)

fix(free_knowledge): Update the way of seeding ill knowledge torrents

A good way to contribute is by seeding the ill torrents. You can [generate a list of torrents that need seeding](https://annas-archive.org/torrents#generate_torrent_list) up to a limit in TB. If you follow this path, take care of IP leaking, they're

feat(gotify): Complete installation

* Create the data directories:
  ```bash
  mkdir -p /data/config/gotify/ /data/gotify
  ```
* Assuming you're using an external proxy, create the following docker compose file in `/data/config/gotify`.

  ```yaml
  ---
  version: "3"

  services:
    gotify:
      image: gotify/server
      container_name: gotify
      networks:
        - swag
      env_file:
        - .env
      volumes:
        - gotify-data:/app/data

  networks:
    swag:
      external:
        name: swag

  volumes:
    gotify-data:
      driver: local
      driver_opts:
        type: none
        o: bind
        device: /data/gotify
  ```

  With the following `.env` file:

  ```
  GOTIFY_SERVER_SSL_ENABLED=false

  GOTIFY_DATABASE_DIALECT=sqlite3
  GOTIFY_DATABASE_CONNECTION=data/gotify.db

  GOTIFY_DEFAULTUSER_NAME=admin
  GOTIFY_DEFAULTUSER_PASS=changeme

  GOTIFY_PASSSTRENGTH=10
  GOTIFY_UPLOADEDIMAGESDIR=data/images
  GOTIFY_PLUGINSDIR=data/plugins
  GOTIFY_REGISTRATION=false
  ```

* Create the service by adding a file `gotify.service` into `/etc/systemd/system/`

```
[Unit]
Description=gotify
Requires=docker.service
After=docker.service

[Service]
Restart=always
User=root
Group=docker
WorkingDirectory=/data/config/gotify
TimeoutStartSec=100
RestartSec=2s
ExecStart=/usr/bin/docker-compose -f docker-compose.yaml up
ExecStop=/usr/bin/docker-compose -f docker-compose.yaml down

[Install]
WantedBy=multi-user.target
```

* Copy the nginx configuration in your `site-confs`

  ```
  server {
      listen 443 ssl;
      listen [::]:443 ssl;

      server_name gotify.*;

      include /config/nginx/ssl.conf;

      client_max_body_size 0;

      # enable for ldap auth (requires ldap-location.conf in the location block)
      #include /config/nginx/ldap-server.conf;

      # enable for Authelia (requires authelia-location.conf in the location block)
      #include /config/nginx/authelia-server.conf;

      location / {
          # enable the next two lines for http auth
          #auth_basic "Restricted";
          #auth_basic_user_file /config/nginx/.htpasswd;

          # enable for ldap auth (requires ldap-server.conf in the server block)
          #include /config/nginx/ldap-location.conf;

          # enable for Authelia (requires authelia-server.conf in the server block)
          #include /config/nginx/authelia-location.conf;

          include /config/nginx/proxy.conf;
          include /config/nginx/resolver.conf;
          set $upstream_app gotify;
          set $upstream_port 80;
          set $upstream_proto http;
          proxy_pass $upstream_proto://$upstream_app:$upstream_port;
      }
  }
  ```
* Start the service `systemctl start gotify`
* Restart the nginx service `systemctl restart swag`
* Enable the service `systemctl enable gotify`.
* Login with the `admin` user
* Create a new user with admin permissions
* Delete the `admin` user
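
Once it's up, you can check that notifications work by pushing a test message through the REST API. A sketch, where `gotify.yourdomain.com` and the application token are placeholders (create an application in the web UI to get a token):

```bash
curl "https://gotify.yourdomain.com/message?token=<application-token>" \
  -F "title=Test" \
  -F "message=Hello from gotify" \
  -F "priority=5"
```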

**Configuration**

- [Android client](https://github.com/gotify/android)
- Linux clients
  - [command line client](#command-line-client)
  - [Dunst client](https://github.com/ztpnk/gotify-dunst)
  - [gotify-desktop](https://github.com/desbma/gotify-desktop)
  - [rofi client](https://github.com/diddypod/rotify)

**Connect it with Alertmanager**

It's not trivial to connect it to Alertmanager ([1](https://github.com/prometheus/alertmanager/issues/2120), [2](https://github.com/gotify/contrib/issues/21), [3](https://github.com/prometheus/alertmanager/issues/3729), [4](https://github.com/prometheus/alertmanager/issues/2120)). The most popular way is to use [`alertmanager_gotify_bridge`](https://github.com/DRuggeri/alertmanager_gotify_bridge?tab=readme-ov-file).

We need to tweak the docker-compose to add the bridge:

```yaml
```

**Connect it with Authentik**

Here are some guides to connect it to Authentik. The problem is that the clients you want to use must support it:

- https://github.com/gotify/server/issues/203
- https://github.com/gotify/server/issues/553

**References**

- [Docs](https://gotify.net/docs/)

feat(gpu#install cuda): Install cuda

[CUDA](https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html) is a parallel computing platform and programming model invented by NVIDIA®. It enables dramatic increases in computing performance by harnessing the power of the graphics processing unit (GPU).
If you're not using Debian 11 follow [these instructions](https://developer.nvidia.com/cuda-downloads).

**Base Installer**

```sh
wget https://developer.download.nvidia.com/compute/cuda/12.5.1/local_installers/cuda-repo-debian11-12-5-local_12.5.1-555.42.06-1_amd64.deb
sudo dpkg -i cuda-repo-debian11-12-5-local_12.5.1-555.42.06-1_amd64.deb
sudo cp /var/cuda-repo-debian11-12-5-local/cuda-*-keyring.gpg /usr/share/keyrings/
sudo add-apt-repository contrib
sudo apt-get update
sudo apt-get -y install cuda-toolkit-12-5
```

Additional installation options are detailed [here](https://developer.nvidia.com/cuda-downloads).

**Driver Installer**

To install the open kernel module flavor:

```sh
sudo apt-get install -y nvidia-kernel-open-dkms
sudo apt-get install -y cuda-drivers
```

Install cuda:

```bash
sudo apt-get install cuda
sudo reboot
```

**Install nvidia card**

Check if your card is supported in the [releases supported by your OS](https://wiki.debian.org/NvidiaGraphicsDrivers)
- [If it's supported](https://wiki.debian.org/NvidiaGraphicsDrivers)
- [If it's not supported](https://docs.kinetica.com/7.1/install/nvidia_deb/)

**Ensure the GPUs are Installed**

Install `pciutils`:

Ensure that the `lspci` command is installed (which lists the PCI devices connected to the server):

```sh
sudo apt-get -y install pciutils
```

Check Installed Nvidia Cards: Perform a quick check to determine what Nvidia cards have been installed:

```sh
lspci | grep VGA
```

The output of the `lspci` command above should be something similar to:

```
00:02.0 VGA compatible controller: Intel Corporation 4th Gen ...
01:00.0 VGA compatible controller: Nvidia Corporation ...
```

If you do not see a line that includes Nvidia, then the GPU is not properly installed. Otherwise, you should see the make and model of the GPU devices that are installed.

**Disable Nouveau**

Blacklist Nouveau in Modprobe: The `nouveau` driver is an alternative to the Nvidia drivers generally installed on the server. It does not work with CUDA and must be disabled. The first step is to edit the file at `/etc/modprobe.d/blacklist-nouveau.conf`.

Create the file with the following content:

```sh
cat <<EOF | sudo tee /etc/modprobe.d/blacklist-nouveau.conf
blacklist nouveau
blacklist lbm-nouveau
options nouveau modeset=0
alias nouveau off
alias lbm-nouveau off
EOF
```

Then, run the following commands:

```sh
echo options nouveau modeset=0 | sudo tee -a /etc/modprobe.d/nouveau-kms.conf
sudo update-initramfs -u
```

Update Grub to Blacklist Nouveau:

Backup your grub config template:

```sh
sudo cp /etc/default/grub /etc/default/grub.bak
```

Then, update your grub config template at `/etc/default/grub`. Add `rd.driver.blacklist=nouveau` and `rcutree.rcu_idle_gp_delay=1` to the `GRUB_CMDLINE_LINUX` variable. For example, change:

```sh
GRUB_CMDLINE_LINUX="quiet"
```

to:

```sh
GRUB_CMDLINE_LINUX="quiet rd.driver.blacklist=nouveau rcutree.rcu_idle_gp_delay=1"
```

Then, rebuild your grub config:

```sh
# On Debian-based systems the binary is grub-mkconfig (grub2-mkconfig on RHEL-based ones)
sudo grub-mkconfig -o /boot/grub/grub.cfg
```

**Install prerequisites**

The following prerequisites should be installed before installing the Nvidia drivers:

```sh
sudo apt-get -y install linux-headers-$(uname -r) make gcc-4.8
sudo apt-get -y install acpid dkms
```

Close X Server:

Before running the install, you should exit out of any X environment, such as Gnome, KDE, or XFCE. To exit the X session, switch to a TTY console using `Ctrl-Alt-F1` and then determine whether you are running `lightdm` or `gdm` by running:

```sh
sudo ps aux | grep -E "lightdm|gdm|kdm"
```

Depending on which is running, stop the service, running the following commands (substitute `gdm` or `kdm` for `lightdm` as appropriate):

```sh
sudo service lightdm stop
sudo init 3
```

Install Drivers Only:

To accommodate GL-accelerated rendering, OpenGL and GL Vendor Neutral Dispatch (GLVND) are now required and should be installed with the Nvidia drivers. OpenGL is an installation option in the `*.run` type of drivers. In other types of the drivers, OpenGL is enabled by default in most modern versions (dated 2016 and later). GLVND can be installed using the installer menus or via the `--glvnd-glx-client` command line flag.

This section deals with installing the drivers via the `*.run` executables provided by Nvidia.

To download only the drivers, navigate to [http://www.nvidia.com/object/unix.html](http://www.nvidia.com/object/unix.html) and click the Latest Long Lived Branch version under the appropriate CPU architecture. On the ensuing page, click Download and then click Agree and Download on the page that follows.

The Unix drivers found in the link above are also compatible with all Nvidia Tesla models.

If you'd prefer to download the full driver repository, Nvidia provides a tool to recommend the most recent available driver for your graphics card at [http://www.Nvidia.com/Download/index.aspx?lang=en-us](http://www.Nvidia.com/Download/index.aspx?lang=en-us).

If you are unsure which Nvidia devices are installed, the `lspci` command should give you that information:

```sh
lspci | grep -i "nvidia"
```

Download the recommended driver executable. Change the file permissions to allow execution:

```sh
chmod +x ./NVIDIA-Linux-$(uname -m)-*.run
```

Run the installer (for example `sudo ./NVIDIA-Linux-$(uname -m)-*.run`).

To check that the GPU is well installed and functioning properly, you can use the `nvidia-smi` command. This command provides detailed information about the installed Nvidia GPUs, including their status, utilization, and driver version.

First, ensure the Nvidia drivers are installed. Then, run:

```sh
nvidia-smi
```

If the GPU is properly installed, you should see an output that includes information about the GPU, such as its model, memory usage, and driver version. The output will look something like this:

```
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.66       Driver Version: 450.66       CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla K80           Off  | 00000000:00:1E.0 Off |                    0 |
| N/A   38C    P8    29W / 149W |      0MiB / 11441MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
```

If you encounter any errors or the GPU is not listed, there may be an issue with the installation or configuration of the GPU drivers.

**[Measure usage](https://askubuntu.com/questions/387594/how-to-measure-gpu-usage)**

For Nvidia GPUs there is a tool [nvidia-smi](https://developer.nvidia.com/system-management-interface) that can show memory usage, GPU utilization and the temperature of the GPU.
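
For example, to print utilization, memory and temperature every five seconds (a sketch using standard `nvidia-smi` query fields):

```bash
nvidia-smi \
  --query-gpu=timestamp,name,utilization.gpu,memory.used,memory.total,temperature.gpu \
  --format=csv \
  -l 5
```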

**[Load test the gpu](https://github.com/wilicc/gpu-burn)**

First make sure you have [CUDA](#install-cuda) installed, then install the `gpu_burn` tool:

```bash
git clone https://github.com/wilicc/gpu-burn
cd gpu-burn
make
```

To run a test for 60 seconds run:

```bash
./gpu_burn 60
```

**[Monitor it with Prometheus](https://developer.nvidia.com/blog/monitoring-gpus-in-kubernetes-with-dcgm/)**

[NVIDIA DCGM](https://developer.nvidia.com/dcgm) is a set of tools for managing and monitoring NVIDIA GPUs in large-scale, Linux-based cluster environments. It’s a low overhead tool that can perform a variety of functions including active health monitoring, diagnostics, system validation, policies, power and clock management, group configuration, and accounting. For more information, see the [DCGM User Guide](https://docs.nvidia.com/datacenter/dcgm/latest/dcgm-user-guide/overview.html).

You can use DCGM to expose GPU metrics to Prometheus using `dcgm-exporter`.

- [Install NVIDIA Container Kit](https://github.com/NVIDIA/nvidia-container-toolkit): The NVIDIA Container Toolkit allows users to build and run GPU accelerated containers. The toolkit includes a container runtime library and utilities to automatically configure containers to leverage NVIDIA GPUs.

  ```bash
  sudo apt-get install -y nvidia-container-toolkit
  ```

- Configure the container runtime by using the nvidia-ctk command:

  ```bash
  sudo nvidia-ctk runtime configure --runtime=docker
  ```
- Restart the Docker daemon:

  ```bash
  sudo systemctl restart docker
  ```

- Install NVIDIA DCGM: Follow the [Getting Started Guide](https://docs.nvidia.com/datacenter/dcgm/latest/user-guide/getting-started.html).

Determine the distribution name:

```bash
distribution=$(. /etc/os-release;echo $ID$VERSION_ID | sed -e 's/\.//g')
```

Download the meta-package to set up the CUDA network repository:

```bash
wget https://developer.download.nvidia.com/compute/cuda/repos/$distribution/x86_64/cuda-keyring_1.1-1_all.deb
```

Install the repository meta-data and the CUDA GPG key:

```bash
sudo dpkg -i cuda-keyring_1.1-1_all.deb
```

Update the Apt repository cache:

```bash
sudo apt-get update
```

Now, install DCGM:

```bash
sudo apt-get install -y datacenter-gpu-manager
```

Enable the DCGM systemd service (on reboot) and start it now:

```bash
sudo systemctl --now enable nvidia-dcgm
```

You should see output similar to this:

```
● dcgm.service - DCGM service
  Loaded: loaded (/usr/lib/systemd/system/dcgm.service; disabled; vendor preset: enabled)
  Active: active (running) since Mon 2020-10-12 12:18:57 PDT; 14s ago
Main PID: 32847 (nv-hostengine)
    Tasks: 7 (limit: 39321)
  CGroup: /system.slice/dcgm.service
          └─32847 /usr/bin/nv-hostengine -n

Oct 12 12:18:57 ubuntu1804 systemd[1]: Started DCGM service.
Oct 12 12:18:58 ubuntu1804 nv-hostengine[32847]: DCGM initialized
Oct 12 12:18:58 ubuntu1804 nv-hostengine[32847]: Host Engine Listener Started
```

To verify installation, use `dcgmi` to query the system. You should see a listing of all supported GPUs (and any NVSwitches) found in the system:

```bash
dcgmi discovery -l
```

Output:

```
8 GPUs found.
+--------+----------------------------------------------------------------------+
| GPU ID | Device Information                                                   |
+--------+----------------------------------------------------------------------+
| 0      | Name: A100-SXM4-40GB                                                 |
|        | PCI Bus ID: 00000000:07:00.0                                         |
|        | Device UUID: GPU-1d82f4df-3cf9-150d-088b-52f18f8654e1                |
+--------+----------------------------------------------------------------------+
| 1      | Name: A100-SXM4-40GB                                                 |
|        | PCI Bus ID: 00000000:0F:00.0                                         |
|        | Device UUID: GPU-94168100-c5d5-1c05-9005-26953dd598e7                |
+--------+----------------------------------------------------------------------+
| 2      | Name: A100-SXM4-40GB                                                 |
|        | PCI Bus ID: 00000000:47:00.0                                         |
|        | Device UUID: GPU-9387e4b3-3640-0064-6b80-5ace1ee535f6                |
+--------+----------------------------------------------------------------------+
| 3      | Name: A100-SXM4-40GB                                                 |
|        | PCI Bus ID: 00000000:4E:00.0                                         |
|        | Device UUID: GPU-cefd0e59-c486-c12f-418c-84ccd7a12bb2                |
+--------+----------------------------------------------------------------------+
| 4      | Name: A100-SXM4-40GB                                                 |
|        | PCI Bus ID: 00000000:87:00.0                                         |
|        | Device UUID: GPU-1501b26d-f3e4-8501-421d-5a444b17eda8                |
+--------+----------------------------------------------------------------------+
| 5      | Name: A100-SXM4-40GB                                                 |
|        | PCI Bus ID: 00000000:90:00.0                                         |
|        | Device UUID: GPU-f4180a63-1978-6c56-9903-ca5aac8af020                |
+--------+----------------------------------------------------------------------+
| 6      | Name: A100-SXM4-40GB                                                 |
|        | PCI Bus ID: 00000000:B7:00.0                                         |
|        | Device UUID: GPU-8b354e3e-0145-6cfc-aec6-db2c28dae134                |
+--------+----------------------------------------------------------------------+
| 7      | Name: A100-SXM4-40GB                                                 |
|        | PCI Bus ID: 00000000:BD:00.0                                         |
|        | Device UUID: GPU-a16e3b98-8be2-6a0c-7fac-9cb024dbc2df                |
+--------+----------------------------------------------------------------------+
6 NvSwitches found.
+-----------+
| Switch ID |
+-----------+
| 11        |
| 10        |
| 13        |
| 9         |
| 12        |
| 8         |
+-----------+
```

[Install the dcgm-exporter](https://github.com/NVIDIA/dcgm-exporter)

As it doesn't need any persistence I've added it to the prometheus docker compose:

```yaml
  dcgm-exporter:
    # latest didn't work
    image: nvcr.io/nvidia/k8s/dcgm-exporter:3.3.6-3.4.2-ubuntu22.04
    deploy:
      resources:
        reservations:
          devices:
            - capabilities: [gpu]
    restart: unless-stopped
    container_name: dcgm-exporter
```

And added the following scraping config in `prometheus.yml`:

```yaml
  - job_name: dcgm-exporter
    metrics_path: /metrics
    static_configs:
    - targets:
      - dcgm-exporter:9400
```

**Adding alerts**

Tweak the following alerts to your use case.

```yaml
---
groups:
- name: dcgm-alerts
  rules:
  - alert: GPUHighTemperature
    expr: DCGM_FI_DEV_GPU_TEMP > 80
    for: 5m
    labels:
      severity: critical
    annotations:
      summary: "GPU High Temperature (instance {{ $labels.instance }})"
      description: "The GPU temperature is above 80°C for more than 5 minutes.\n  VALUE = {{ $value }}\n  LABELS: {{ $labels }}"

  - alert: GPUMemoryUtilizationHigh
    expr: DCGM_FI_DEV_MEM_COPY_UTIL > 90
    for: 10m
    labels:
      severity: warning
    annotations:
      summary: "GPU Memory Utilization High (instance {{ $labels.instance }})"
      description: "The GPU memory utilization is above 90% for more than 10 minutes.\n  VALUE = {{ $value }}\n  LABELS: {{ $labels }}"

  - alert: GPUComputeUtilizationHigh
    expr: DCGM_FI_DEV_GPU_UTIL > 90
    for: 10m
    labels:
      severity: warning
    annotations:
      summary: "GPU Compute Utilization High (instance {{ $labels.instance }})"
      description: "The GPU compute utilization is above 90% for more than 10 minutes.\n  VALUE = {{ $value }}\n  LABELS: {{ $labels }}"

  - alert: GPUPowerUsageHigh
    expr: DCGM_FI_DEV_POWER_USAGE > 160
    for: 5m
    labels:
      severity: warning
    annotations:
      summary: "GPU Power Usage High (instance {{ $labels.instance }})"
      description: "The GPU power usage is above 160W for more than 5 minutes.\n  VALUE = {{ $value }}\n  LABELS: {{ $labels }}"

  - alert: GPUUnavailable
    expr: up{job="dcgm-exporter"} == 0
    for: 5m
    labels:
      severity: critical
    annotations:
      summary: "GPU Unavailable (instance {{ $labels.instance }})"
      description: "The DCGM Exporter instance is down or unreachable for more than 5 minutes.\n  LABELS: {{ $labels }}"
```

**Adding a dashboard**

I've [tweaked this dashboard](https://grafana.com/grafana/dashboards/12239-nvidia-dcgm-exporter-dashboard/) to simplify it. Check the article for the full JSON.

feat(grafana#Copy panels between dashboards): Copy panels between dashboards

On each panel on the top right you can select `copy`, then on the menu to add a new panel you can click on `Paste panel from clipboard`.

So far you [can't do this for rows](https://github.com/grafana/grafana/issues/23762).

feat(graphql): Introduce GraphQL

[GraphQL](https://graphql.org/) is a query language for APIs and a runtime for fulfilling those queries with your existing data. GraphQL provides a complete and understandable description of the data in your API, gives clients the power to ask for exactly what they need and nothing more, makes it easier to evolve APIs over time, and enables powerful developer tools.

To use it with python you can use [Ariadne](https://ariadnegraphql.org/) ([source](https://github.com/mirumee/ariadne))
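
A minimal sketch of what an Ariadne schema and resolver look like, based on the library's quickstart (names are illustrative):

```python
from ariadne import QueryType, gql, make_executable_schema
from ariadne.asgi import GraphQL

# Define the schema using the GraphQL schema definition language.
type_defs = gql(
    """
    type Query {
        hello: String!
    }
    """
)

# Bind a resolver to the Query.hello field.
query = QueryType()


@query.field("hello")
def resolve_hello(_, info):
    return "Hello world!"


schema = make_executable_schema(type_defs, query)

# Expose it as an ASGI application (run it, for example, with uvicorn).
app = GraphQL(schema, debug=True)
```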

feat(jellyfin#System.InvalidOperationException: There is an error in XML document 0, 0): Troubleshoot System.InvalidOperationException: There is an error in XML document (0, 0)

This may happen if you run out of disk and some xml file in the jellyfin data directory becomes empty. The solution is to restore that file from backup.

feat(kestra): introduce Kestra

[Kestra](https://kestra.io/) is an [open-source orchestrator](data_orchestrator.md) designed to bring Infrastructure as Code (IaC) best practices to all workflows — from those orchestrating mission-critical operations, business processes, and data pipelines to simple Zapier-style automation. Built with an API-first philosophy, Kestra enables users to define and manage data pipelines through a simple YAML configuration file. This approach frees you from being tied to a specific client implementation, allowing for greater flexibility and easier integration with various tools and services.
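
As a rough sketch of the YAML-first approach, a flow is a file with an `id`, a `namespace` and a list of `tasks`. The exact task `type` string below is an assumption that depends on your Kestra version, so check the plugin documentation:

```yaml
id: hello-world
namespace: sandbox          # Namespaces group flows, the name is illustrative
tasks:
  - id: log-greeting
    # Assumed core Log task type, verify it against your Kestra version.
    type: io.kestra.core.tasks.log.Log
    message: Hello from Kestra!
```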

Look at this [4 minute video](https://www.youtube.com/watch?v=h-P0eK2xN58) for a visual introduction.

**References**
- [Docs](https://kestra.io/docs/getting-started)
- [Home](https://kestra.io/)
- [4 minute introduction video](https://www.youtube.com/watch?v=h-P0eK2xN58)

fix(life_planning): Tweak the month planning

Add the following steps:

- Clean your agenda and get a feeling of the busyness of the month:
  - Open the orgmode month view agenda and clean it
  - Read the rest of your calendars

Then reorder the objectives in order of priority. Try to have at least one objective that improves your life.

- For each of your month and trimester objectives:
  - Decide whether it makes sense to address it this month. If not, mark it as inactive
  - Create a clear plan of action for this month on that objective.
    - Reorder the projects as needed
    - Mark as INACTIVE the ones that you don't feel need to be focused on this month.

- Refine the roadmap of each of the selected areas (change this to the trimestral planning)
- Select at least one coding project in case you enter in programming mode
- Clean your mobile browser tabs

feat(zfs#Manually create a backup): Manually create a backup

To create a snapshot of `tank/home/ahrens` that is named `friday` run:

```bash
zfs snapshot tank/home/ahrens@friday
```
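
To check that the snapshot was created you can list the existing snapshots of the dataset:

```bash
zfs list -t snapshot -r tank/home/ahrens
```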

feat(linux_snippets#Set the vim filetype syntax in a comment): Set the vim filetype syntax in a comment

Add a modeline somewhere in your file, using that file's comment syntax. For example, to force YAML highlighting in a file that uses `#` comments:

```
# vim: set ft=yaml :
```

feat(linux_snippets#Export environment variables in a crontab): Export environment variables in a crontab

If you need to expand the `PATH` in theory you can do it like this:

```
PATH=$PATH:/usr/local/bin

* * * * * /path/to/my/script
```

I've found however that sometimes this doesn't work and you need to specify it in the crontab line:

```
* * * * * PATH=$PATH:/usr/local/bin /path/to/my/script
```

feat(logcli): Introduce logcli

[`logcli`](https://grafana.com/docs/loki/latest/query/logcli/) is the command-line interface to Grafana Loki. It facilitates running LogQL queries against a Loki instance.

**[Installation](https://grafana.com/docs/loki/latest/query/logcli/#installation)**

Download the logcli binary from the [Loki releases page](https://github.com/grafana/loki/releases) and install it somewhere in your `$PATH`.

**[Usage](https://grafana.com/docs/loki/latest/query/logcli/#logcli-usage)**

By default `logcli` points to the local instance `http://localhost:3100`; if you want to query another one, export the `LOKI_ADDR` environment variable.
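
For example (the URL is a placeholder for your Loki instance):

```bash
export LOKI_ADDR=https://loki.yourdomain.com
```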

Run a query:

```bash
logcli query '{job="loki-ops/consul"}'
```

You can also set the time range and the output format:

```bash
logcli query \
     --timezone=UTC  \
     --from="2024-06-10T07:23:36Z" \
     --to="2024-06-12T16:23:58Z" \
     --output=jsonl \
     '{job="docker", container="aleph_ingest-file_1"} | json | __error__=`` | severity =~ `WARNING|ERROR` | message !~ `Queueing failed task for retry.*` | logger!=`ingestors.manager`'
```

**References**

- [Docs](https://grafana.com/docs/loki/latest/query/logcli/)

fix(loki): Don't use vector(0) on aggregation over labels

If you're doing an aggregation over a label this approach won't work because it will add a new time series with value 0. In those cases use a broader search that includes other logs from the label you're trying to aggregate and multiply it by 0. For example:

```logql
(
sum by (hostname) (
  count_over_time({job="systemd-journal", syslog_identifier="sanoid"}[1h])
)
or
sum by (hostname) (
  count_over_time({job="systemd-journal"}[1h]) * 0
)
) < 1
```

The first part of the query returns all log lines of the `sanoid` service for each `hostname`. If a hostname doesn't return any lines, that query alone won't show anything for that host. The second part counts all log lines of each `hostname`, so if the host is up it will probably be sending at least one line per hour. As we're not interested in that count, we multiply it by 0 so that the host still shows up in the result.

feat(loki#Interact with loki through python): Interact with loki through python

There is [no client library for python](https://community.grafana.com/t/how-could-i-pull-loki-records-from-a-python-script/111483/4) ([1](https://stackoverflow.com/questions/75056462/querying-loki-logs-using-python), [2](https://stackoverflow.com/questions/75056462/querying-loki-logs-using-python)) they suggest to interact with the [API](https://grafana.com/docs/loki/latest/reference/loki-http-api/) with `requests`. Although I'd rather use [`logcli`](logcli.md) with the [`sh`](python_sh.md) library.
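
A minimal sketch of querying the documented `/loki/api/v1/query_range` endpoint with `requests` (the Loki URL and LogQL query are placeholders):

```python
import time

import requests

LOKI_URL = "http://localhost:3100"  # Placeholder: your Loki instance


def query_loki(logql: str, hours: int = 1, limit: int = 1000) -> list[str]:
    """Return the raw log lines matching a LogQL query for the last `hours`."""
    now = time.time_ns()
    params = {
        "query": logql,
        "start": now - hours * 3600 * 10**9,  # Nanosecond epoch timestamps
        "end": now,
        "limit": limit,
        "direction": "backward",
    }
    response = requests.get(f"{LOKI_URL}/loki/api/v1/query_range", params=params)
    response.raise_for_status()
    streams = response.json()["data"]["result"]
    # Each stream carries [timestamp, line] pairs in its "values" field.
    return [line for stream in streams for _, line in stream["values"]]


if __name__ == "__main__":
    for line in query_loki('{job="varlogs"}'):
        print(line)
```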

feat(loki#Download the logs): Download the logs

The web UI only allows you to download the logs that are loaded in the view. If you want to download big amounts of logs you need to either use [`logcli`](logcli.md) or interact with the [API](https://grafana.com/docs/loki/latest/reference/loki-http-api/).

One user did the query in a loop:

```bash
set -x

JOB_ID=9079dc54-2f5c-4d74-a9aa-1d9eb39dd3c2

for I in $(seq 0 655); do
    FILE="logs_$I.txt"
    ID="$JOB_ID:$I"
    QUERY="{aws_job_id=\"$ID\",job=\"varlogs\"}"
    docker run grafana/logcli:main-1b6d0bf-amd64 --addr=http://localhost:3100/ -o raw -q query "$QUERY" --limit 100000 --batch 100 --forward --from "2022-09-25T10:00:00Z" > "$FILE"
done
```

feat(mediatracker#Add missing books): Add missing books

- Register an account on openlibrary.org
- Add the book there
- Then add it to MediaTracker

feat(memorious): Introduce memorious

[Memorious](https://github.com/alephdata/memorious) is a light-weight web scraping toolkit. It supports scrapers that collect structured or un-structured data. This includes the following use cases:

- Make crawlers modular and simple tasks re-usable
- Provide utility functions to do common tasks such as data storage, HTTP session management
- Integrate crawlers with the Aleph and FollowTheMoney ecosystem

**References**

- [Memorious](https://github.com/alephdata/memorious)

feat(morph_io): Introduce morph.io

[morph.io](https://morph.io/) is a web service that runs your scrapers for you.

Write your scraper in the language you know and love, push your code to GitHub, and they take care of the boring bits. Things like running your scraper regularly, alerting you if there's a problem, storing your data, and making your data available for download or through a super-simple API.

To sign in you'll need a GitHub account. This is where your scraper code is stored.

The data is stored in an SQLite database.

**Usage limits**

Right now there are very few limits. They trust that you won't abuse the service.

However, they do impose a couple of hard limits on running scrapers so they don't take up too many resources:

- max 512 MB memory
- max 24 hours run time for a single run

If a scraper runs out of memory or runs too long it will get killed automatically.

There's also a soft limit:

- max 10,000 lines of log output

If a scraper generates more than 10,000 lines of log output the scraper will continue running uninterrupted. You just won't see any more output than that. To avoid this happening simply print less stuff to the screen.

Note that they keep track of the amount of CPU time (and a whole bunch of other metrics) that you and your scrapers are using. So, if they find that you are using too much, they reserve the right to kick you out. In reality they'll first ask you nicely to stop.

**References**

- [Docs](https://morph.io/documentation)
- [Home](https://morph.io/)

feat(orgmode#<c-i> doesn't go up in the jump list): Debug <c-i> doesn't go up in the jump list

It's because [<c-i> is a synonym of <tab>](https://github.com/neovim/neovim/issues/5916), and `org_cycle` is [mapped by default to <tab>](https://github.com/nvim-orgmode/orgmode/blob/c0584ec5fbe472ad7e7556bc97746b09aa7b8221/lua/orgmode/config/defaults.lua#L146).

If you're used to using `zc` then you can disable `org_cycle` by setting the mapping `org_cycle = "<nop>"`.

feat(orgmode#Python libraries): Python libraries

**[org-rw](https://code.codigoparallevar.com/kenkeiras/org-rw)**

`org-rw` is a library designed to handle Org-mode files, offering the ability to modify data and save it back to the disk.

- **Pros**:
  - Allows modification of data and saving it back to the disk
  - Includes tests to ensure functionality

- **Cons**:
  - Documentation is lacking, making it harder to understand and use
  - The code structure is complex and difficult to read
  - Uses `unittest` instead of `pytest` (which some developers may prefer)
  - Tests are not easy to read
  - Last commit was made five months ago, indicating potential inactivity
  - [Not very popular](https://github.com/kenkeiras/org-rw), with only one contributor, three stars, and no forks

**[orgparse](https://github.com/karlicoss/orgparse)**

`orgparse` is a more popular library for parsing Org-mode files, with better community support and more contributors. However, it has significant limitations in terms of editing and saving changes.

- **Pros**:
  - More popular with 13 contributors, 43 forks, and 366 stars
  - Includes tests to ensure functionality
  - Provides some documentation, available [here](https://orgparse.readthedocs.io/en/latest/)

- **Cons**:
  - Documentation is not very comprehensive
  - Cannot write back to Org-mode files, limiting its usefulness for editing content
    - The author suggests using [inorganic](https://github.com/karlicoss/inorganic) to convert Org-mode entities to text, with examples available in doctests and the [orger](https://github.com/karlicoss/orger) library.
      - `inorganic` is not popular, with one contributor, four forks, 24 stars, and no updates in five years
      - The library is only 200 lines of code
    - The `ast` is geared towards single-pass document reading. While it is possible to modify the document object tree, writing back changes is more complicated and not a common use case for the author.
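
As a read-only sketch of what working with `orgparse` looks like (the file name and structure are illustrative):

```python
import orgparse

# Parse an Org-mode file into a tree of nodes.
root = orgparse.load("notes.org")

# Iterate over every headline of the document (root itself is a dummy top node).
for node in root[1:]:
    print(node.level, node.todo, node.heading, node.tags)
```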

**[Tree-sitter](https://tree-sitter.github.io/tree-sitter/)**

Tree-sitter is a powerful parser generator tool and incremental parsing library. It can build a concrete syntax tree for a source file and efficiently update the syntax tree as the source file is edited.

- **Pros**:
  - General enough to parse any programming language
  - Fast enough to parse on every keystroke in a text editor
  - Robust enough to provide useful results even in the presence of syntax errors
  - Dependency-free, with a runtime library written in pure C
  - Supports multiple languages through community-maintained parsers
  - Used by Neovim, indicating its reliability and effectiveness
  - Provides good documentation, available [here](https://tree-sitter.github.io/tree-sitter/using-parsers)
  - Python library, [py-tree-sitter](https://github.com/tree-sitter/py-tree-sitter), simplifies the installation process

- **Cons**:
  - Requires installation of Tree-sitter and the Org-mode language parser separately
  - The Python library does not handle the Org-mode language parser directly

To get a better grasp of Tree-sitter you can check their talks:

- [Strange Loop 2018](https://www.thestrangeloop.com/2018/tree-sitter---a-new-parsing-system-for-programming-tools.html)
- [FOSDEM 2018](https://www.youtube.com/watch?v=0CGzC_iss-8)
- [Github Universe 2017](https://www.youtube.com/watch?v=a1rC79DHpmY).

**[lazyblorg orgparser.py](https://github.com/novoid/lazyblorg/blob/master/lib/orgparser.py)**

`lazyblorg orgparser.py` is another tool for working with Org-mode files. However, I didn't look at it.

feat(pip_audit): Introduce pip-audit

[`pip-audit`](https://github.com/pypa/pip-audit) is the official pypa tool for scanning Python environments for packages with known vulnerabilities. It uses the Python Packaging Advisory Database (https://github.com/pypa/advisory-database) via the PyPI JSON API as a source of vulnerability reports.

**Installation**

```bash
pip install pip-audit
```

**Usage**

```bash
pip-audit
```

On completion, pip-audit will exit with a code indicating its status.

The current codes are:

- `0`: No known vulnerabilities were detected.
- `1`: One or more known vulnerabilities were found.

pip-audit's exit code cannot be suppressed. See [Suppressing exit codes from pip-audit](https://github.com/pypa/pip-audit?tab=readme-ov-file#suppressing-exit-codes-from-pip-audit) for supported alternatives.
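
You can also audit a requirements file instead of the current environment (the file name is illustrative):

```bash
pip-audit -r requirements.txt
```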

**References**

- [Code](https://github.com/pypa/pip-audit)

feat(qbittorrent#Trackers stuck on Updating): Troubleshoot Trackers stuck on Updating

Sometimes the issue comes from an improvable configuration. In the advanced settings:

- Ensure that there are enough [Max concurrent http announces](https://github.com/qbittorrent/qBittorrent/issues/15744): I changed it from 50 to 500.
- [Select the correct interface and Optional IP address to bind to](https://github.com/qbittorrent/qBittorrent/issues/14453). In my case I selected `tun0` as I'm using a vpn and `All IPv4 addresses` as I don't use IPv6.

feat(roadmap_adjustment#Trimester review): Trimester review

The objectives of the trimester review are:

- Identify the areas to focus on for the trimester
- Identify the tactics you want to use in those areas.
- Review the previous trimester's tactics.

The objectives are not:

- To review what you've done or why you didn't get there.

**When to do the trimester reviews**

As with the [personal integrity review](#personal-integrity-review), it's interesting to do the analysis at representative moments, as it gives them an emotional weight. You can for example use the solstices or my personal version of the solstices:

- Spring analysis (1st of March): For me spring is the real start of the year; it's when life explodes after the stillness of the winter. The sun starts to set late enough that you have light in the afternoons, the climate gets warmer, inviting you to be outside more, and nature is blooming new leaves and flowers. It is then a moment to build new projects and set the current year on track.
- Summer analysis (1st of June): I hate heat, so summer is a moment of retreat. Everyone temporarily stops their lives, we go on holidays and all social projects slow their pace. Even the news has less interesting things to report. It's so hot outside that some of us seek the cold refuge of home or remote holiday places. Days are long and people love to hang out till late, so usually you wake up later, thus having less time to actually do stuff. Even in the moments when you are alone, the heat drains your energy to be productive. It is then a moment to relax and gather forces for the next trimester. It's also perfect to develop *easy* and *chill* personal projects that have been forgotten in a drawer. Lower your expectations and just flow with what your body asks you.
- Autumn analysis (1st of September): September is another key moment for many people. We have it hardcoded in our lives since we were children, as it was the start of school. People feel energized after the summer holidays and are eager to get back to their lives and stopped projects. You're already six months into the year, so it's a good moment to review your year plan and decide how you want to invest your energy reserves.
- Winter analysis (1st of December): December is the cue that the year is coming to an end. The days grow shorter and colder; they basically invite you to enjoy a cup of tea under a blanket. It is then a good time to get into your cave, do an introspection analysis of the whole year and prepare the ground for the coming one. Some of the goals of this season are:
  - Think about everything you need to guarantee a good, solid…