R-Instat, probably reogrid has a problem with shape files #9160

Open
rdstern opened this issue Sep 29, 2024 · 9 comments

Comments

@rdstern
Collaborator

rdstern commented Sep 29, 2024

The Climatic > File > Import and tidy Shape files is used routinely for our climatic mapping work.

image

Here is an example of data from Zimbabwe. I found the problem with similar data from Bangladesh. I think it is with all the shape files, at least those from gadm.

image

Note that this is a challenging type of file for reogrid to handle. The one shown has only 91 rows, but the geometry variable is large enough that the data frame is over 5 MB. (The one where I found the problem was over 100 MB.)

a) The first problem - perhaps minor - is that R-Instat now runs slowly, and I assume this may have to do with refreshing the reogrid. For example, making (say) NAME_1 into a factor takes over 1 minute on the Bangladesh data.

b) So I then tried to do something about the geometry variable. I tried right-click and delete, then right-click and rename. Each time I get this error message and it then closes the data frame. Other data frames stay open.

image

@N-thony I wonder what can be done. I note it adds special types to the metadata. I don't need to be able to edit, or do anything with, the contents of the geometry variable. It works, in that the maps are drawn. I would just like reogrid to handle the data frame better - and also not throw me out of R-Instat.

I wonder if this problem may need some @ChrisMarsh82 advice?

@N-thony
Collaborator

N-thony commented Sep 30, 2024

@rdstern I suggest asking @Patowhiz to advise on this instead, as he worked on it recently.

@Patowhiz
Contributor

@rdstern @N-thony could you do these tests in RStudio? Whenever I want to know whether things are slow in the front end, I run the log file line by line in RStudio.
Reogrid gets a limited number of rows, and operations like renaming are done on the R side and then refreshed in the reogrid by reading from R.
@N-thony I think you can check, or delegate someone to check, why the rename dialog has the bug.
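
The flow described above (the grid holds only a limited number of rows, changes are made on the data side, and the grid is then refreshed by reading again) can be sketched roughly as follows. This is only an illustration in Python, not R-Instat's code: the `Backend` and `GridView` classes, their methods, and the sample values are all hypothetical stand-ins for the R side and for reogrid.

```python
class Backend:
    """Hypothetical stand-in for the R side: holds the full table."""

    def __init__(self, columns):
        # Maps column name -> list of values, in column order.
        self.columns = dict(columns)

    def rename(self, old, new):
        # Rebuild the dict so that the column order is preserved.
        self.columns = {
            (new if name == old else name): values
            for name, values in self.columns.items()
        }

    def head(self, max_rows):
        # Return only the first max_rows values of every column.
        return {name: values[:max_rows] for name, values in self.columns.items()}


class GridView:
    """Hypothetical stand-in for the grid: holds only a limited page of rows."""

    def __init__(self, backend, max_rows):
        self.backend = backend
        self.max_rows = max_rows
        self.cells = {}
        self.refresh()

    def refresh(self):
        # The grid never edits data itself; it re-reads from the backend.
        self.cells = self.backend.head(self.max_rows)


# Made-up sample data for illustration only.
backend = Backend({"NAME_1": ["Harare", "Bulawayo", "Manicaland"], "value": [1, 2, 3]})
grid = GridView(backend, max_rows=2)

backend.rename("NAME_1", "province")  # the operation happens on the data side
grid.refresh()                        # the grid then re-reads its page
print(list(grid.cells))
```

In a design like this, a slow or failing step can be on either side, which is why running the same operations without the grid is a useful first check.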

Personally I have an issue with how the ingestion and usage of shape files was conceptualised as ordinary data frames. I think shape files are not meant to be read or displayed as tables. I treat them like GeoJSON, which again is meant for applications to use in drawing maps, not for human users. Unfortunately I haven't thought of how else we could treat them in R-Instat and still be able to use them in dialogs. In the meantime, I would propose doing much of the processing (renaming, changing to factors, etc.) during the import itself, so that these operations can be done outside of the data book.

@rdstern
Collaborator Author

rdstern commented Sep 30, 2024

@Patowhiz and @N-thony I strongly suspect you may go round in circles if the whole team is involved in this fix. The error message implies to me that it is probably OK in R, and hence in RStudio. I can picture that the geometry variable might cause problems for reogrid, because it is so complex. I wonder, instead, if @ChrisMarsh82 could look at this problem? It isn't a rename problem as such: I get just the same error if I try to delete the variable. I suspect doing anything to that variable will trigger the problem.

And I assume the way reogrid handles that variable is linked to the clear slowing down of R-Instat when doing simple operations on other variables, like changing a name variable into a factor. Please note, @Patowhiz, that I don't particularly care whether it is a factor or a character variable. I care that operations on that data frame seem unreasonably slow, whatever they are, and that operations on that variable are both impossible and throw the user out of that data frame. I haven't found any other operation that does this.

On the positive side, I am really impressed with the ease with which we seem to be able to enter these data, and their neat use (that still works) in drawing maps.

@Patowhiz
Contributor

Patowhiz commented Oct 1, 2024

@rdstern thanks for the details. And as you say, I'm happy for @ChrisMarsh82 to look at the problem.

@ChrisMarsh82
Contributor

@N-thony Not sure if I fully understand this, but if the value comes back fine in the dataframe and it is just reogrid causing issues, could we set up reogrid to show '[BLOB]' (or some equivalent) and never attempt to show values for that datatype?
This would hopefully mean dialogs still work fine, as they will be using the dataframe value and not the value in reogrid.
We could even have a click on the cell bring up a pop-up memobox with the value in it, if needed.

Can you check that it's reogrid causing the issue and not the dataframe? If the dataframe is also causing issues, we could do something similar where we don't bring the value back for the column, but this would get more complicated when it comes to the dialogs that use the field.
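
As a rough illustration of the placeholder idea, here is a minimal Python sketch. It is not R-Instat or reogrid code: `display_value`, `preview`, and the sample coordinates are invented for the example. The assumption is simply that the grid asks for display text per cell, and that any non-scalar value (such as a geometry) is shown as `[BLOB]` while the underlying data stays untouched.

```python
PLACEHOLDER = "[BLOB]"


def display_value(value):
    """Return the text a grid cell would show for one value."""
    if value is None:
        return ""
    if isinstance(value, (str, int, float, bool)):
        return str(value)
    # Anything complex (lists of coordinates, nested objects) is never rendered.
    return PLACEHOLDER


def preview(columns, max_rows):
    """Build the display text for the first max_rows rows of every column."""
    return {
        name: [display_value(v) for v in values[:max_rows]]
        for name, values in columns.items()
    }


# Made-up sample: one ordinary column and one geometry-like column.
table = {
    "NAME_1": ["Harare", "Bulawayo"],
    "geometry": [
        [(31.0, -17.8), (31.1, -17.8), (31.1, -17.9)],
        [(28.5, -20.1), (28.6, -20.1), (28.6, -20.2)],
    ],
}

shown = preview(table, max_rows=1)
print(shown)
```

Only the preview is affected here; `table` still holds the full geometry, so anything that reads the data directly (the dialogs, in this thread's terms) would see the real values.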

@N-thony
Collaborator

N-thony commented Oct 2, 2024

> image
>
> @N-thony I wonder what can be done. I note it adds special types to the metadata. I don't need to be able to edit, or do anything with, the contents of the geometry variable. It works, in that the maps are drawn. I would just like reogrid to handle the data frame better - and also not throw me out of R-Instat.
>
> I wonder if this problem may need some @ChrisMarsh82 advice?

@rdstern Renaming a geometry type column should not be handled the same way as other columns. We need to enhance our rename function to properly support geometry columns as well.

@N-thony
Collaborator

N-thony commented Oct 2, 2024

@ChrisMarsh82 have a look at the screenshot below. Is this what you were suggesting yesterday in our meeting?
image

@ChrisMarsh82
Contributor

@N-thony the screenshot looks good. Does this solve the speed issues? Are you able to add a double-click on the cell so users can see the info within the multipolygon?
You mentioned rename doesn't work. Are there any other functions that we have issues with on multipolygon? Can you list them all so issues can be created?

@rdstern
Collaborator Author

rdstern commented Oct 4, 2024

@ChrisMarsh82 and @N-thony I first found the same problem when I tried to delete the variable. Then I moved to rename. I haven't tried re-ordering, or duplicate column.
