Handling large repositories #92

Open
fzzyhmstrs opened this issue Jan 31, 2025 · 11 comments
Assignees
Labels
authors For developers and docs authors bug Something isn't working

Comments

@fzzyhmstrs

I'm trying to set up my first wiki on the site, and I'm running into a "Something Went Wrong" page.

Link to wiki directory in repo: https://github.com/fzzyhmstrs/fconfig/tree/master/wiki

When I look in my browser console, I see a couple of errors related to 2117-21b5f9f9cbf4f2d8.js:

TypeError: r is undefined
    B NextJS
2117-21b5f9f9cbf4f2d8.js:1:5387
    NextJS 70
        error
        l_
        callback
        nB
        nV
        aq
        aY
        a9
        aY
        a9
        aY
        a9
        aY
        a9
        aY
        a9
        aY
        a9
        aY
        a9
        aY
        a9
        aY
        a9
        aY
        a9
        aY
        a9
        aY
        a9
        aY
        a9
        aY
        a9
        aY
        a9
        aY
        a9
        aY
        a9
        aY
        a9
        aY
        a9
        aY
        a9
        aY
        a9
        aY
        a9
        aY
        a9
        aY
        a9
        aY
        a9
        aY
        a9
        aY
        a9
        aY
        a9
        aY
        a9
        aY
        is
        is
        o1
        oZ
        T

I am filling the form out like so:
Image

Upon clicking "Submit", it will think for a while and then redirect to the Something Went Wrong page.

My repository:
Image

Branch and wiki folder. The /docs folder is already used for my separately published KDoc, hence specifying and using /wiki:
Image

@fzzyhmstrs
Author

The issue may have something to do with the GitHub App. Reading through the publishing article again, I see there is an error related to not having the app installed on the repository. I did not get that error when I submitted, but looking now I don't see any apps integrated on the repo.

Image

When I logged in with GitHub, I didn't get redirected to any authorization request; the page just looped back to the "login with GitHub" message. I refreshed the page and was logged in. So something in that login step isn't working as expected. Firefox 134.0.2 on Windows.

Edit: I tried on Edge as well, with the same result; logging in doesn't trigger an App permissions request from GitHub for me.

And since the error given wasn't "Please first install our GitHub app on your repository (here).", I'm not entirely sure how I'm supposed to go about installing it "manually", if that's possible at all.

@Su5eD
Member

Su5eD commented Feb 1, 2025

As of the latest update, the app is only used for logging into the wiki and no longer needs to be installed on repositories. The issue is going to be somewhere else, but I'm not sure yet.

I'll try to register the project locally and see if I get the same error. Looking at the config, everything seems fine.

@Su5eD Su5eD self-assigned this Feb 1, 2025
@Su5eD Su5eD added bug Something isn't working authors For developers and docs authors labels Feb 1, 2025
@Su5eD
Member

Su5eD commented Feb 1, 2025

I've investigated the issue, and it seems there are multiple factors at play here.

The issue(s)

The repository you're trying to register looks to be over 2 GB even when shallow-cloned, which by itself shouldn't have been a problem. However, the bottleneck turned out not to be download speed: libgit2 is terribly sluggish at cloning large repositories. In my local test, the clone took several minutes, with the checkout phase taking significantly longer than the fetch itself.

On top of this, the form submission is proxied by the frontend instead of being sent directly to the backend. The serverless function has an execution time limit, and once that limit is reached, the server returns an error, giving the user the aforementioned "Something Went Wrong" page.

Possible solutions

The problem here is that we clone the repository's full history, rather than doing a shallow clone, so that we can display a "last edit time" stat for docs pages. Even if we sacrificed that in favor of taking up less storage space, we couldn't determine the size of the docs folder alone before the repo is cloned. If the total repo size is 10 GB while the docs folder is a mere 10 MB, we still wouldn't be able to clone it.

Likely solution for slow speeds: To avoid both running out of storage quickly and similar cloning issues, I'm considering introducing a reasonable repository size limit for projects. A few hundred MB should suffice for any project, even one with large assets.
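
A size check of this kind could run before any clone is attempted. The following is a minimal sketch, assuming the repository metadata comes from the GitHub REST API's repository endpoint, whose `size` field is reported in kilobytes. The 300 MB limit and the function name are illustrative assumptions, not the wiki's actual values.

```python
from typing import Mapping

# Hypothetical limit; the thread only says "a few hundred MB".
MAX_REPO_SIZE_MB = 300


def exceeds_size_limit(
    repo_metadata: Mapping[str, object],
    limit_mb: int = MAX_REPO_SIZE_MB,
) -> bool:
    """Return True when the reported repository size is above the limit.

    `repo_metadata` is assumed to be the parsed JSON body of
    GET https://api.github.com/repos/{owner}/{repo}, where `size` is in KB.
    """
    size_kb = int(repo_metadata.get("size", 0))
    return size_kb > limit_mb * 1024
```

The reported size describes the whole repository rather than a single folder, so a check like this cannot distinguish a small docs folder inside a large repo.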

Alternative: Instead of using libgit2, we can use the native git command-line program. However, retrieving errors from the CLI is much harder and less precise, since we're effectively limited to strings and exit codes. With an embedded library, this is much more convenient.
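
To illustrate the trade-off, here is a rough sketch of driving the native `git` program from a subprocess and turning its exit code and stderr text into coarse error categories. The function names and categories are hypothetical, and the matched substrings are examples only: git's messages are not a stable interface and may differ between versions.

```python
import subprocess
from typing import List, Optional


def shallow_clone_command(url: str, dest: str, branch: Optional[str] = None) -> List[str]:
    """Build a `git clone` invocation that fetches only the latest commit."""
    cmd = ["git", "clone", "--depth", "1", "--single-branch"]
    if branch:
        cmd += ["--branch", branch]
    return cmd + ["--", url, dest]


def classify_failure(returncode: int, stderr: str) -> str:
    """Map git's exit code and stderr text to a coarse error category."""
    if returncode == 0:
        return "ok"
    text = stderr.lower()
    # Substring checks are all we have; this is the imprecision noted above.
    if "remote branch" in text and "not found" in text:
        return "branch_not_found"
    if "not found" in text or "does not exist" in text:
        return "repository_not_found"
    if "authentication failed" in text or "could not read username" in text:
        return "auth_required"
    return "unknown"


def clone(url: str, dest: str, branch: Optional[str] = None, timeout_s: float = 60.0) -> str:
    """Run the clone and return an error category ("ok" on success)."""
    try:
        result = subprocess.run(
            shallow_clone_command(url, dest, branch),
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except FileNotFoundError:
        return "git_missing"  # the git executable is not installed
    except subprocess.TimeoutExpired:
        return "timeout"
    return classify_failure(result.returncode, result.stderr)
```

Anything that does not match a known substring falls into `"unknown"`, which is the kind of lost detail an embedded library avoids.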

Solving the error screen: I've updated the backend to use a shallow clone when first validating the repository, which should speed up the submission process. However, there is still a risk of the function timing out, and that applies to this case as well. To avoid that, we'll need to update the forms so they are submitted from the client directly to the server instead of going through the frontend.

Workaround

In your case, I can recommend creating a separate repository for hosting your mod's documentation. It may not be the most convenient solution, but given the unusually large size of the repository, our options are limited.

One last thing I noticed is that the project ID contains an underscore (_), which is not allowed by the schema. Please use a hyphen (-) instead.
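
As a rough illustration of this kind of check, the sketch below validates an ID against a hypothetical pattern (lowercase letters and digits separated by single hyphens). The real schema's rule is not quoted in this thread and may differ; the pattern, the function names, and the `fzzy_config` input are assumptions for demonstration.

```python
import re

# Hypothetical pattern; the actual schema may allow a different character set.
PROJECT_ID_PATTERN = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def is_valid_project_id(project_id: str) -> bool:
    """Return True when the whole ID matches the assumed pattern."""
    return PROJECT_ID_PATTERN.fullmatch(project_id) is not None


def suggest_project_id(project_id: str) -> str:
    """Replace underscores with hyphens, as recommended above."""
    return project_id.replace("_", "-")


print(is_valid_project_id("fzzy_config"))   # False
print(suggest_project_id("fzzy_config"))    # fzzy-config
```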

Thanks for reporting this issue btw!

@Su5eD Su5eD changed the title from "[New Wiki Creation] Submission Fails with "Something Went Wrong"" to "Handling large repositories" Feb 1, 2025
@fzzyhmstrs
Author

The vast majority of that size is multi-versioned KDoc, to the tune of 2.2 GB. I thought I had trimmed that significantly, but it seems I wasn't trimming everything I thought I was. That I can fix, but if the repo is still too big afterwards, it would be a bit of a pickle for me.

Having everything in one repo was one of the main attractions of this setup, as I can organize and work on everything in one place at one time.

Is last edit time the only reason you are making a clone, shallow or otherwise? I would assume there is a GitHub API route for seeing details about a repo's last commit; perhaps this is the CLI solution you are referring to.

Let me re-trim the documentation and see where it ends up first.
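
For reference, the GitHub REST API's "list commits" endpoint accepts `path`, `sha`, and `per_page` query parameters and returns commits newest first, so the last edit time of a file or folder can in principle be read without cloning. The sketch below only builds the request URL and parses a response; the helper names are hypothetical, and whether this fits the wiki's backend (rate limits, one request per page) is not something this thread settles.

```python
from datetime import datetime
from typing import Any, Mapping, Optional, Sequence
from urllib.parse import urlencode


def last_commit_url(owner: str, repo: str, path: str, branch: str) -> str:
    """Build the URL that lists the most recent commit touching `path`."""
    query = urlencode({"path": path, "sha": branch, "per_page": 1})
    return f"https://api.github.com/repos/{owner}/{repo}/commits?{query}"


def last_edit_time(commits: Sequence[Mapping[str, Any]]) -> Optional[datetime]:
    """Extract the committer date of the newest commit, or None if empty.

    `commits` is assumed to be the parsed JSON array returned by the URL above.
    """
    if not commits:
        return None
    date_str = commits[0]["commit"]["committer"]["date"]
    # GitHub returns ISO 8601 timestamps with a trailing "Z" for UTC.
    return datetime.fromisoformat(date_str.replace("Z", "+00:00"))
```

For example, `last_commit_url("fzzyhmstrs", "fconfig", "wiki", "master")` targets the wiki folder of the repository linked at the top of this issue.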

@fzzyhmstrs
Author

fzzyhmstrs commented Feb 1, 2025

I trimmed most of the fat off the docs and fixed the JSON issue, but I'm still getting the error even with the repo down to 130 MB (barely larger than the Cinderscapes repo and much smaller than Spectrum's, to take a couple of examples from repo forks I have lying around).

Image

Image

Image

So the few hundred MB proposed in your Possible Solutions still seems too large without other changes.

@Su5eD
Member

Su5eD commented Feb 1, 2025

Looks like the project was added successfully this time: https://moddedmc.wiki/en/project/fzzy-config/docs. Does it show up on your dev dashboard?

@fzzyhmstrs
Author

Oh, yes I do. Still got the server error when I submitted it though.

@Su5eD
Member

Su5eD commented Feb 1, 2025

There seems to be a syntax error in the frontmatter metadata of some pages:

Image

Wrapping the title in quotes should fix it. Not sure why the error page is not showing up though, I'll take a look at that.

@Su5eD
Member

Su5eD commented Feb 1, 2025

> Oh, yes I do. Still got the server error when I submitted it though.

Roughly how long did the submission take? Could it have been going for longer than 15 seconds? In that case it's going to be the timeout error that I still need to fix.

@fzzyhmstrs
Author

fzzyhmstrs commented Feb 1, 2025

I wasn't timing it, but it's plausible it took longer than 15 seconds.

I definitely see what you mean about the slowness of clones from libgit2. I did a reload after pushing some fixes, and the clone took ~5 minutes according to the built-in log.

Example Log

[I] [2025-02-01 17:28:46] [fzzy-config] Setting up project
[I] [2025-02-01 17:28:46] [fzzy-config] Cloning git repository
[T] [2025-02-01 17:28:48] [fzzy-config] Fetch progress: 0.00%
[T] [2025-02-01 17:28:48] [fzzy-config] Fetch progress: 0.09%
[T] [2025-02-01 17:28:48] [fzzy-config] Fetch progress: 0.19%
[T] [2025-02-01 17:28:48] [fzzy-config] Fetch progress: 0.28%
[T] [2025-02-01 17:28:48] [fzzy-config] Fetch progress: 0.37%
[T] [2025-02-01 17:28:48] [fzzy-config] Fetch progress: 0.46%
[T] [2025-02-01 17:28:48] [fzzy-config] Fetch progress: 0.56%
[T] [2025-02-01 17:28:48] [fzzy-config] Fetch progress: 0.65%
[T] [2025-02-01 17:28:48] [fzzy-config] Fetch progress: 0.74%
[T] [2025-02-01 17:28:48] [fzzy-config] Fetch progress: 0.83%
[T] [2025-02-01 17:28:48] [fzzy-config] Fetch progress: 0.93%
[T] [2025-02-01 17:28:48] [fzzy-config] Fetch progress: 1.02%
[T] [2025-02-01 17:28:48] [fzzy-config] Fetch progress: 1.11%

...
you get the point
...
[T] [2025-02-01 17:28:56] [fzzy-config] Fetch progress: 99.55%
[T] [2025-02-01 17:28:56] [fzzy-config] Fetch progress: 99.64%
[T] [2025-02-01 17:28:56] [fzzy-config] Fetch progress: 99.73%
[T] [2025-02-01 17:28:56] [fzzy-config] Fetch progress: 99.83%
[T] [2025-02-01 17:28:56] [fzzy-config] Fetch progress: 99.92%
[T] [2025-02-01 17:28:56] [fzzy-config] Fetch progress: 100.00%
[T] [2025-02-01 17:33:52] [fzzy-config] Checkout progress: 0.01%
[T] [2025-02-01 17:33:52] [fzzy-config] Checkout progress: 5.65%
[T] [2025-02-01 17:33:52] [fzzy-config] Checkout progress: 11.30%
[T] [2025-02-01 17:33:53] [fzzy-config] Checkout progress: 16.95%
[T] [2025-02-01 17:33:53] [fzzy-config] Checkout progress: 22.59%
[T] [2025-02-01 17:33:54] [fzzy-config] Checkout progress: 28.24%
[T] [2025-02-01 17:33:54] [fzzy-config] Checkout progress: 33.89%
[T] [2025-02-01 17:33:55] [fzzy-config] Checkout progress: 39.53%
[T] [2025-02-01 17:33:55] [fzzy-config] Checkout progress: 45.18%
[T] [2025-02-01 17:33:56] [fzzy-config] Checkout progress: 50.83%
[T] [2025-02-01 17:33:56] [fzzy-config] Checkout progress: 56.47%
[T] [2025-02-01 17:33:57] [fzzy-config] Checkout progress: 62.12%
[T] [2025-02-01 17:33:57] [fzzy-config] Checkout progress: 67.77%
[T] [2025-02-01 17:33:58] [fzzy-config] Checkout progress: 73.41%
[T] [2025-02-01 17:33:58] [fzzy-config] Checkout progress: 79.06%
[T] [2025-02-01 17:33:58] [fzzy-config] Checkout progress: 84.71%
[T] [2025-02-01 17:33:59] [fzzy-config] Checkout progress: 90.36%
[T] [2025-02-01 17:33:59] [fzzy-config] Checkout progress: 96.00%
[T] [2025-02-01 17:34:00] [fzzy-config] Checkout progress: 100.00%
[I] [2025-02-01 17:34:00] [fzzy-config] Git clone successful
[I] [2025-02-01 17:34:00] [fzzy-config] Copying project files for version 'latest'
[I] [2025-02-01 17:34:01] [fzzy-config] Done copying files
[I] [2025-02-01 17:34:02] [fzzy-config] Project setup complete

@IMB11
Contributor

IMB11 commented Feb 1, 2025

Most of the time it's actually quicker to call the git commands through a shell process than to use libgit2. libgit2 is plagued with performance issues, and most IDEs have moved away from it these days (Visual Studio was the last to stop using it, in 2020).
