Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create callback based on file descriptor monitoring #190

Open
wants to merge 61 commits into
base: main
Choose a base branch
from

Conversation

jcheng5
Copy link
Member

@jcheng5 jcheng5 commented Oct 21, 2024

No description provided.

src/fd.cpp Outdated Show resolved Hide resolved
@jcheng5
Copy link
Member Author

jcheng5 commented Oct 22, 2024

Thinking out loud: Again, impressed at your pace in getting this done. I haven't actually run the code but I will assume for now that it works as advertised.

I looked at later_posix.cpp for the first time in a long while and (re)learned that the Timer class always uses a background thread--I'd forgotten that. Sorry if I said anything too misleading.

It makes me a little itchy that there is a new thread being launched for each later_fd call. Probably not noticeable to any of our users, but it's such a common idiom to have a single thread/loop calling select/poll/epoll for many jobs that a dedicated thread per callback feels a bit shocking. I'm fine if you want to leave this the way it is for now while we complete the end-to-end scenario.

src/fd.cpp Show resolved Hide resolved
src/fd.cpp Outdated Show resolved Hide resolved
@shikokuchuo
Copy link
Contributor

Oh, couple more metadata-ish things:
1. Add yourself to DESCRIPTION as an author
2. Update NEWS.md
3. Add a few sentences about later_fd to README.md (unless you have a lot to say, then add a vignette, but I think for most people who could make use of later_fd a short note would be enough?)

Thank you Joe, I've done this in 1622227. It's really an honour to be listed alongside yourself and Winston!

@shikokuchuo
Copy link
Contributor

Just added a couple of additional things I noticed.

src/fd.cpp Outdated Show resolved Hide resolved
src/fd.cpp Show resolved Hide resolved

typedef struct ThreadArgs_s {
std::shared_ptr<std::atomic<bool>> flag;
std::unique_ptr<std::vector<int>> fds;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the purpose of this? It looks to me like you initialize it to contain all the fd's, but then ignore it, then use it as a int* to hold result values. Can you get rid of it and create a new Rcpp::LogicalVector on the stack at callback time?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I see now that it is the only place you hold the fd's. I find it confusing that it's used as both input and output. Can I suggest that you keep it to just input?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I've just had to bite the bullet here and introduce a results vector for the args struct in a32b3ea.

Although not strictly necessary, it does make things clearer and I didn't find a really good alternative that wasn't confusing in some other way.

I note it's a bit like the separate 'events' and 'revents' fields of poll() rather than a self-updating structure like the select() 'fd_sets', with poll() being the more modern design.

src/fd.cpp Outdated
Comment on lines 74 to 75
int timeoutms = infinite || args->timeout < 0 ? 1000 : (int) (args->timeout * 1000);
int repeats = infinite || args->timeout < 0 ? 0 : timeoutms / LATER_INTERVAL;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find this confusing and also I think it assumes that this thread will be promptly scheduled when it wants to wake up. I've always handled this situation (multiple little waits with an overall timeout) by adding timeoutms to the current time at the start of the overall operation, and then for each little wait, wait for the lesser of (time til expiration) and (liveness interval). For example,

bool CallbackRegistry::wait(double timeoutSecs, bool recursive) const {
ASSERT_MAIN_THREAD()
if (timeoutSecs < 0) {
// "1000 years ought to be enough for anybody" --Bill Gates
timeoutSecs = 3e10;
}
Timestamp expireTime(timeoutSecs);
Guard guard(mutex);
while (true) {
Timestamp end = expireTime;
Optional<Timestamp> next = nextTimestamp(recursive);
if (next.has_value() && *next < expireTime) {
end = *next;
}
double waitFor = end.diff_secs(Timestamp());
if (waitFor <= 0)
break;
// Don't wait for more than 2 seconds at a time, in order to keep us
// at least somewhat responsive to user interrupts
if (waitFor > 2) {
waitFor = 2;
}
condvar->timedwait(waitFor);
Rcpp::checkUserInterrupt();
}
return due();
}

(Note that the Timestamp class used there is designed specifically for this kind of waiting, it won't be affected by the system clock being messed with)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, apologies here - I had meant to write some comments for this logic!

The advantage of my method is that it just relies on arithmetic rather than calls for the system time after each loop. The disadvantage is that there isn't the one source of truth being the fixed expiry time. The slippage on the wake-up time is taken as a given as the semantics for a timeout is typically "wait for at least x ms", without the expectation for too fine a granularity.

However, I do like how the Timestamp class is implemented. The intent is at least much clearer so I've proceeded to adopt it in d8eb35f (and it also makes sense as it's already in the package), with the following notes:

  1. Made sure it runs at least once even when the timeout is set as 0.
  2. Poll interval is set to a default of 1024 ms, which means that for normal use in curl the default of 1000ms won't be broken up by this.
  3. Initialized the timestamp as soon as possible up front instead of on the thread.

src/fd.cpp Show resolved Hide resolved
src/fd.cpp Outdated Show resolved Hide resolved
R/later.R Show resolved Hide resolved
src/fd.cpp Outdated Show resolved Hide resolved
src/fd.cpp Outdated Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants