Replies: 7 comments
-
I don't see a problem as long as you use the same major version and configuration (history generation etc.). You should also be able to add the new project without restart (using https://opengrok.docs.apiary.io/#reference/0/projects/add-project and https://opengrok.docs.apiary.io/#reference/0/project-metadata-management/marks-project-as-indexed). However, what do you mean by
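For illustration, the two API calls linked above could be sketched in Python roughly as follows. This is a minimal sketch, not a definitive client: the base URL, the `text/plain` content type, the empty body on the "indexed" call, and the optional bearer token are assumptions to check against the API reference for your OpenGrok version, and the function names are hypothetical.

```python
import urllib.parse
import urllib.request

# Assumed deployment URL; adjust to where the OpenGrok web app is served.
API_BASE = "http://localhost:8080/source/api/v1"


def _request(url, method, body, token=None):
    """Build (but do not send) a plain-text request, optionally with a bearer token."""
    req = urllib.request.Request(
        url,
        data=body,
        method=method,
        headers={"Content-Type": "text/plain"},  # assumption: endpoints take text/plain
    )
    if token:
        req.add_header("Authorization", f"Bearer {token}")
    return req


def build_add_project_request(project, token=None, base=API_BASE):
    """Add-project call: POST the project name as the request body."""
    return _request(f"{base}/projects", "POST", project.encode("utf-8"), token)


def build_mark_indexed_request(project, token=None, base=API_BASE):
    """Mark-as-indexed call: PUT on the project's 'indexed' resource (empty body assumed)."""
    name = urllib.parse.quote(project, safe="")
    return _request(f"{base}/projects/{name}/indexed", "PUT", b"", token)


def send(req, timeout=30):
    """Send a prepared request and return the HTTP status code."""
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status
```

A possible sequence would be `send(build_add_project_request("foo"))`, then making the index data for `foo` available, then `send(build_mark_indexed_request("foo"))`.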
-
@ahornace, thank you for your quick response. With
Those problems continue with the most recent OpenGrok releases.
-
Time to bump the log level up to see what is going on. Maybe use
These need to be debugged case by case. What exactly is missing? Xref? History? Something else?
This is caused by the directory traversal that happens for every reindex. The fix is tracked by #3077.
-
It's using
Usually, it's missing the indexed data. Git fetches the new code change, OpenGrok reindexes the repository, and the indexer reports that it finished successfully. The OpenGrok web page shows the date when indexing finished, but the change does not show up in OpenGrok when you search or open the file directly. It's difficult to debug because the logs contain no errors related to the missing indexed data.
Yes, I am aware of #3077. Thank you for sharing. Because of that issue, I assumed reindexing was, or might still be, broken, and that problems like the ones I have experienced would be expected in big repositories. #3077 made me switch from reindexing to always indexing from scratch.
I have another question, because I tried different combinations without success and the documentation is not clear. What is the workflow to index one project at a time? Is it supported? I mean, Git pulls the change for the foo repository and tells OpenGrok to index only the foo repository.
-
Grabbing the stack traces of the indexer process with
It would be nice to get to the bottom of this because this is the first time I have heard about such a problem. I mean a functional problem, not a performance one. For each file reported in the logs with The indexer traverses the whole directory tree of the given project (in
#3077 is merely a performance enhancement. What is the structure of the repositories in your big project in terms of repository types?
The indexing granularity is per project, i.e. it is not possible to index just one repository of a project.
-
Sadly, I don't have the logs to show because I changed everything to build from scratch. I am making a new OpenGrok setup; I could switch it to reindexing, wait until the problem happens, look in the logs, and report back. It could take a while. Sorry.
Let me try to reply with as much information as I am legally allowed to share. OpenGrok runs in a dedicated Ubuntu 18.04 VM with 64 GB of RAM; with less RAM, indexing fails with OOM. That gives an idea of how big the repositories are. We have around six big Git repositories. Each of them has around 20 GB of source code plus Git history. Each repository is treated as an OpenGrok project and is indexed with its Git history. The setup has a lot of OpenGrok indexing filters to optimize the indexing time and avoid Universal Ctags crashes. It's a standalone OpenGrok setup, and it's not using Docker because the documentation says Docker should not be used for big repositories. It would need to be adjusted, but I don't want to experiment as this is critical for many devs.
-
Okay, this means that the changes for #3077 should help with lowering the indexing time in your environment.
-
It is not a bug report. It's more like a question.
There are huge repositories (double-digit GB of code) where OpenGrok takes hours to finish indexing because reindexing always fails. Before OpenGrok indexes the new code fetched by Git, the indexed data is removed, so the indexing must be done from scratch. During all those hours, OpenGrok cannot be used.
Is it possible to index the new code in an OpenGrok instance running on a different machine, copy the newly indexed data, and reuse it in the production OpenGrok? Aside from the source code and the indexed data, what else needs to be copied? The configuration file?