Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multi source support #101

Open
wants to merge 4 commits into
base: master
Choose a base branch
from
Open

Multi source support #101

wants to merge 4 commits into from

Conversation

ntrp
Copy link
Contributor

@ntrp ntrp commented May 14, 2021

This is only a first draft to see if the general direction makes sense. Should already work for the file source while it is not implementeds for the others.

After some feedback and once agreed how we do it I will proceed implementing it for the rest of the sources

@simplesteph
Copy link
Contributor

at a high level, does it break compatibility, and what does the "directory" structure look like?

@ntrp
Copy link
Contributor Author

ntrp commented May 26, 2021

No compatibility break, if you provide a single file it will work as before. When you provide a directory to KSM it will traverse recursively and create a source for every file using the correct parser so you can even use csv and yml at the same time. This will work also for the Github source for example, you can provide a directory in a respository and the source will pull all files inside the directory.

@simplesteph
Copy link
Contributor

is there one API call at the directory level to see if the timestamp has changed, or do we have to do one API call by file discovered in that directory? I don't want to start overwhelming GitHub for example with API calls.

I guess this implementation will also be dependent on the source type, right?

@ntrp
Copy link
Contributor Author

ntrp commented May 26, 2021

I have to check better api wise but caching was already in my mind to minimize update time and load on the backend. The commit can be used to check if anything changed at all from last pull and I guess it's possible also to fetch a list of changed files via API.

@simplesteph
Copy link
Contributor

simplesteph commented May 26, 2021 via email

@trobert
Copy link
Contributor

trobert commented May 26, 2021

This is only a first draft to see if the general direction makes sense

LGTM 👍

ntrp added 2 commits June 29, 2021 20:52
I realized that the code was completely wrong, I need
to call the applySourceAcls with all values not for
every parsing context. So what I do now is, first I
collect all acls either from the reader or cache and
then apply on the result
@ntrp ntrp requested a review from trobert June 29, 2021 21:52
@ntrp
Copy link
Contributor Author

ntrp commented Jun 29, 2021

I stabilized the current code (realized it was actually wrong) and implemented directory support only for the file source, does it make sense merging that one by itself and leave the rest supporting only single file for now? I can add some documentation for it and when I manage implement multi source for the other ones create another PR.

p.s. I also need still to add some tests for the multisource support in the File source.

@ntrp ntrp marked this pull request as ready for review June 29, 2021 21:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants