This repository has been archived by the owner on Jun 14, 2024. It is now read-only.
@clee704 I think an index provider can be a good improvement for Hyperspace. Could you review it and add some comments? Thanks!
NOTE: This is for the design discussion on the new metadata store. We will create a design proposal PR later.
Currently, Hyperspace loads all index log entries in the system directory at once whenever it needs the list of indexes (for example, in the query optimizer, refreshIndex, deleteIndex, etc.) and caches all the entries for subsequent use.
However, this is neither desirable nor scalable, because an IndexLogEntry might take up a large amount of memory when it has many source files (mainly because of the file paths).
There are some comments from @clee704 in #405. (Thanks!)
Originally posted by @clee704 in #405 (comment)
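To make the memory concern concrete, here is a minimal sketch that contrasts loading every entry up front with loading entries on demand. It is an illustration only, not Hyperspace code: the `IndexLogEntry` class below is a simplified stand-in, and `EagerIndexCache`, `LazyIndexCache`, `load_all`, `load_one`, and the in-memory `STORE` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class IndexLogEntry:
    # Hypothetical, simplified stand-in for Hyperspace's IndexLogEntry.
    name: str
    source_files: List[str] = field(default_factory=list)


class EagerIndexCache:
    """Loads and keeps every entry up front, like the behavior described above."""

    def __init__(self, load_all: Callable[[], List[IndexLogEntry]]):
        self._entries = {e.name: e for e in load_all()}

    def get(self, name: str) -> IndexLogEntry:
        return self._entries[name]

    def cached_names(self) -> List[str]:
        return sorted(self._entries)


class LazyIndexCache:
    """Loads and caches a single entry only when it is first requested."""

    def __init__(self, load_one: Callable[[str], IndexLogEntry]):
        self._load_one = load_one
        self._entries: Dict[str, IndexLogEntry] = {}

    def get(self, name: str) -> IndexLogEntry:
        if name not in self._entries:
            self._entries[name] = self._load_one(name)
        return self._entries[name]

    def cached_names(self) -> List[str]:
        return sorted(self._entries)


# Fake in-memory "system directory", for illustration only.
STORE = {
    f"index{i}": IndexLogEntry(
        f"index{i}", [f"/data/t{i}/part-{j}.parquet" for j in range(1000)]
    )
    for i in range(3)
}

eager = EagerIndexCache(lambda: list(STORE.values()))
lazy = LazyIndexCache(lambda name: STORE[name])
lazy.get("index1")  # only this entry's file paths are now held by the lazy cache
```

With the eager cache, the file paths of all three indexes are held in memory even if only one index is used; the lazy cache holds only the entries that were actually requested.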
Things to discuss
Key for retrieving indexes
Since we don't define a "table identifier" for each source dataframe, it's hard to map the relations in a query plan to the index source relations. We could do this by filtering
To support index views later, we should consider that a query plan can have multiple relations and that indexes can be built on multiple relations.
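One possible direction, sketched here purely as an assumption for discussion, is to derive a lookup key from each relation's root paths and to register an index under the key of every relation it is built on. Nothing below is an existing Hyperspace API: `relation_key`, `IndexLookup`, and the choice of hashing sorted root paths are all hypothetical.

```python
import hashlib
from collections import defaultdict
from typing import Dict, Iterable, List, Set


def relation_key(root_paths: Iterable[str]) -> str:
    """Hypothetical key: SHA-256 over the sorted root paths, trailing slashes removed."""
    normalized = sorted(p.rstrip("/") for p in root_paths)
    return hashlib.sha256("\n".join(normalized).encode("utf-8")).hexdigest()


class IndexLookup:
    """Maps relation keys to the names of indexes built on those relations."""

    def __init__(self) -> None:
        self._by_key: Dict[str, Set[str]] = defaultdict(set)

    def register(self, index_name: str, source_relations: List[List[str]]) -> None:
        # An index may be built on several relations, so register it under each key.
        for roots in source_relations:
            self._by_key[relation_key(roots)].add(index_name)

    def candidates(self, plan_relations: List[List[str]]) -> Set[str]:
        # Union of the indexes registered for every relation in the query plan.
        found: Set[str] = set()
        for roots in plan_relations:
            found |= self._by_key.get(relation_key(roots), set())
        return found


lookup = IndexLookup()
lookup.register("idx_a", [["/data/t1/"]])
lookup.register("idx_view", [["/data/t1"], ["/data/t2"]])
```

A key like this only matches when the root paths are identical, so it would narrow the set of candidate entries to load rather than replace any finer-grained check on the source files.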
Metadata format
Metadata Location
Concurrency
===
Please feel free to add any comments, ideas, or questions on this :)