Allow reindexing on startup #579

Open
dhirschfeld opened this issue Dec 23, 2022 · 1 comment
Comments

@dhirschfeld
Contributor

If your quetz server is restarted and you're not using an external database, you will lose all package information (IIUC).

You can manually reindex the server by calling the /api/channels endpoint, but it would be nice to have a configuration option that makes quetz automatically reindex all channels it knows about whenever it starts up.

@Krande

Krande commented Sep 20, 2023

Hey, I was just looking for this too. Did you find something that worked?

I made a quick attempt and it seems something like this might work?

Update

I temporarily create an admin user named "system", which I remove before the hook completes. Now it's at least possible to list the packages from the /get/<channel_name>/ endpoints, but the /api/channels/ endpoints are empty for now. I guess this is fixable by simply rebuilding the channel entries in the db as well. I'll see if I can do something about it. Also, you'll have to decide what the re-creation defaults should be for the different channels (such as channel privacy, mirror mode, etc.).

# add to ~ https://github.com/mamba-org/quetz/blob/main/quetz/main.py#L1724
@app.on_event("startup")
def start_reindex_packages_from_pkg_store():
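    # contextmanager is used below; import it here in case main.py does not already do so
    from contextlib import contextmanager

    from .tasks.reindexing import reindex_packages_from_store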

    db_manager = contextmanager(get_db)

    with TicToc("Reindex packages"):
        with db_manager(config) as db:
            dao = get_dao(db)
            admin = dao.create_user_with_role('system', 'admin')

            pkg_store = config.get_package_store()
            for channel in pkg_store.list_channels():
                # loop over all channels found in the pkg store
                reindex_packages_from_store(dao, config, channel, user_id=admin.id)
            dao.delete_user(admin.id)
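
One caveat with the hook above: if reindexing raises, the temporary "system" user is never deleted. A minimal sketch of wrapping the create/delete pair in a context manager so cleanup always runs is shown below. `temporary_admin`, `FakeDao` and `FakeUser` are hypothetical names used so the snippet runs on its own; only `create_user_with_role` and `delete_user` come from the snippet above, and their real behaviour in quetz is not verified here.

```python
import uuid
from contextlib import contextmanager
from dataclasses import dataclass, field
from typing import Dict, Iterator


@dataclass
class FakeUser:
    id: str
    username: str
    role: str


@dataclass
class FakeDao:
    """Hypothetical stand-in for the quetz DAO, for illustration only."""

    users: Dict[str, FakeUser] = field(default_factory=dict)

    def create_user_with_role(self, username: str, role: str) -> FakeUser:
        user = FakeUser(id=uuid.uuid4().hex, username=username, role=role)
        self.users[user.id] = user
        return user

    def delete_user(self, user_id: str) -> None:
        self.users.pop(user_id, None)


@contextmanager
def temporary_admin(dao, username: str = "system") -> Iterator[FakeUser]:
    """Create a short-lived admin user and delete it even if the body raises."""
    admin = dao.create_user_with_role(username, "admin")
    try:
        yield admin
    finally:
        dao.delete_user(admin.id)
```

In the startup hook, the channel loop would then sit under `with temporary_admin(dao) as admin:` instead of the explicit create and delete calls.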

# Add to ~ https://github.com/mamba-org/quetz/blob/main/quetz/pkgstores.py#L80
@abc.abstractmethod
def list_channels(self) -> List[str]:
    pass

# Azure example: https://github.com/mamba-org/quetz/blob/main/quetz/pkgstores.py#L576
def list_channels(self) -> List[str]:
    # Return a real list (not a generator) so the result matches the declared List[str]
    with self._get_fs() as fs:
        return [
            d.replace(self.container_prefix, "")
            for d in fs.ls(self.container_prefix)
            if fs.isdir(d)
        ]
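
For a store backed by a local directory, the same idea could probably be sketched with pathlib. This is only a sketch under assumptions: the function name `list_local_channels` and the `channels_dir` parameter are hypothetical, and it assumes every channel is a direct subdirectory of that root, which may not match how quetz's local store actually lays out files.

```python
from pathlib import Path
from typing import List


def list_local_channels(channels_dir: str) -> List[str]:
    """Return the names of the direct subdirectories of channels_dir, sorted."""
    root = Path(channels_dir)
    if not root.is_dir():
        # A missing root is treated as "no channels" rather than an error.
        return []
    return sorted(p.name for p in root.iterdir() if p.is_dir())
```

Returning a sorted list keeps the reindexing order stable between restarts, which makes the startup logs easier to compare.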

I also saw that currently the reindexing only works for .tar.bz2 files. To add .conda file support I made a minor update.

# https://github.com/mamba-org/quetz/blob/main/quetz/tasks/reindexing.py#L140

pkg_files = [f for f in all_files if f.endswith(".tar.bz2")]

# updated to

pkg_files = [f for f in all_files if f.endswith(".tar.bz2") or f.endswith(".conda")]
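
The same filter can be written a little more compactly, since `str.endswith` accepts a tuple of suffixes. The sketch below is standalone; `PACKAGE_EXTENSIONS` and `filter_package_files` are hypothetical names and not part of quetz.

```python
from typing import Iterable, List

# Extensions treated as conda package archives in this sketch.
PACKAGE_EXTENSIONS = (".tar.bz2", ".conda")


def filter_package_files(all_files: Iterable[str]) -> List[str]:
    """Keep only the file names that end in one of PACKAGE_EXTENSIONS."""
    return [f for f in all_files if f.endswith(PACKAGE_EXTENSIONS)]
```

Keeping the extensions in one tuple means a future package format only needs to be added in a single place.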


2 participants