fix runtime filtering with --threads:on #45
Conversation
- the topics registry is now a threadvar instead of a global one
- raw pointers are no longer being used to bypass the GC
- the test suite is now running all tests with --threads:on, which means the single test doing that in the past can be safely removed
This is an improvement over the status quo because it removes the gcsafe warnings, but I think we need a global topics registry - it would be unexpected for the users that functions like [...] Also, [...]
How would we implement that, now that Araq wants to make {.gcsafe.} blocks run only on the main thread? nim-lang/RFCs#142
Readability and correctness are more important than performance. Also, can we not ignore and forget this PR for the next 3 months? I need it now, to get runtime log level selection in Nimbus.
- mention that the log levels available at runtime in this section are limited by a compile-time setting described in the section above
Can you clarify what other solutions were attempted? In order of preference, I can imagine some other approaches:
There are 2 different problems solved for Nimbus here: runtime log level setting with
Nimbus is compiled with
I totally would, but gcsafe checks ignore this completely and there's no interest upstream to change this. With {.gcsafe.} blocks being targeted for an even more restrictive change, I'm reluctant to use them at all.
You're an optimist. I don't think Araq is interested in / understands the need to implement inter-process communication through shared memory.
Regarding this problem, I happen to know for sure what his future plans are. We have discussed this at length on multiple occasions (the last time was just 2 days ago).
Yes, but what I meant is that proper synchronization is not required yet, because there aren't multiple threads in practice. With [...]
Why would we use raw pointers as our first option in a GC-ed language?
Can't the gcsafe override be put somewhere in chronicles, so we don't have to litter the other code? An alternative that I've used on occasion is to have a (thread-safe) queue for logging through which everything passes: log lines, settings updates, etc. This tends to be good for latency on the logging side, but requires some maintenance to process the queue (on a thread or in the event loop, for example).
- new FileOutput.open() variant for existing file descriptors
This last change allows overriding a "textblocks[file]" sink with stdout or stderr:

```nim
if len(conf.debug.logFile) != 0:
  discard defaultChroniclesStream.output.open(conf.debug.logFile, fmAppend)
else:
  discard defaultChroniclesStream.output.open(stdout)
```

(A workaround for the lack of dynamic sinks, of course.)
```nim
except KeyError:
  topic = Topic(state: Normal, logLevel: NONE)
  registerTopic(topicName, topic)
return topic
```
Sorry, I wanted to spend some time with the code myself to help you finish this, but I'll have to provide some comments only.
Please try to get back to the previous version of topicStateIMPL. This would allow you to greatly reduce the need for locking in the rest of the code. The simple idea of topicStateIMPL is that each topic receives a global variable with a statically known address, and most of the code that needs to access and manipulate the state of a particular topic can work with a simple pointer to this address. Please note that the state variable does not contain garbage-collected types and manipulating the state is just flipping an enum value, which is perfectly thread-safe. setTopicState("literal_name", ...) can be translated to a write to the known location - no locking required.
The only place where we need locking is the registration of new topics (which happens at the start of the program, before any threads are spawned). The more dynamic API (getTopicState, setTopicState(non-static string)) was added only to simplify the way the chronicle_tail tool works (because the tail tool doesn't know the names of topics in advance). Only this "dynamic" API needs locking.
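To illustrate the scheme described above, here is a minimal sketch. The names (`TopicState`, `Topic`, `chattyTopic`, `topicStateIMPL`) are hypothetical stand-ins for the generated per-topic globals; this is not chronicles' actual code.

```nim
# Hypothetical types mirroring the discussion; not chronicles' real definitions.
type
  TopicState = enum
    Normal, Enabled, Required, Disabled

  Topic = object
    state: TopicState

# Each topic gets its own global variable, so its address is known statically
# (imagine one such global generated per `topics = "chatty"` annotation):
var chattyTopic: Topic

proc topicStateIMPL(t: var Topic): ptr Topic =
  # Log statements capture a raw pointer to the global; flipping the enum
  # through it is the "write to a known location" mentioned above.
  addr t

let state = topicStateIMPL(chattyTopic)
state.state = Disabled    # a plain enum write, no registry lookup or lock
```

The registry lock is then only needed when new topics are registered or when the state is looked up by a runtime string.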
> Please try to get back to the previous version of topicStateIMPL.
I don't think the old version is correct. All the corresponding entries in the global topic registry were overwritten with Topic(state: Normal, logLevel: NONE) instances, deleting previous non-default values for those topics.
> Please note that the state variable does not contain garbage collected types and manipulating the state is just flipping an enum value, which is perfectly thread-safe.
You can't just assume that on modern architectures. You either do explicit locking or you use atomic operations: https://github.com/nim-lang/Nim/blob/master/lib/system/atomics.nim
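As a sketch of the atomic alternative (using Nim's std/atomics module, which supersedes the linked system/atomics file; the `TopicState`/`setState`/`getState` names are hypothetical):

```nim
import std/atomics

type TopicState = enum
  Normal, Disabled

# Wrapping the state in an Atomic makes cross-thread reads and writes
# well-defined instead of merely "probably fine on this architecture":
var state: Atomic[int]    # stores ord(TopicState)

proc setState(s: TopicState) =
  # Relaxed ordering suffices here: we only need atomicity of the flag
  # itself, not ordering relative to other memory operations.
  state.store(ord(s), moRelaxed)

proc getState(): TopicState =
  TopicState(state.load(moRelaxed))

setState(Disabled)
assert getState() == Disabled
```

The cost of an atomic load on the logging fast path is negligible compared to formatting and I/O.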
> setTopicState("literal_name", ...) can be translated to a write to the known location - no locking required.
Until it's executed at the same time from different threads. It's better to be defensive here and assume the worst about the users of your API.
This is a logging library. Extremely precise cache-coherent behavior of setTopicState is not really that important. Eventual consistency is good enough.
> Eventual consistency is good enough.
We were not getting it at all.
topicsMatch() received a list of Topic objects created by topicStateIMPL() - all of them having a Normal state and a log level of NONE. Then topicsMatch() checked their states and log levels as if there could be any other values in there. Somewhere along the line, the logic was scrambled.
Found the commit that messed up the logic: f929f30#diff-7810f6d32e19cbc385dbe25d827e77f9L133
With the old code, topicsMatch received pointers to global objects that were "created" at the beginning of the program. If you were always seeing NONE and Normal values for their fields, this is another bug that needs to be investigated.
```diff
@@ -282,19 +296,12 @@ template log*(stream: type,
 template logFn(name, severity) {.dirty.} =
   template `name`*(eventName: static[string],
                    props: varargs[untyped]) {.dirty.} =
     bind logIMPL, bindSym, brForceOpen
     logIMPL(instantiationInfo(), activeChroniclesStream(),
```
The reason why I have used these slightly repeated bodies is to shave a few nanoseconds off the compile times. I haven't tried to measure the effect.
> to shave a few nanoseconds from the compile times
Readability before nano-optimisations ;-)
chronicles.nim
```diff
@@ -255,6 +262,13 @@ macro logIMPL(lineInfo: static InstInfo,
 code.add newCall("logAllDynamicProperties", Stream, record)
 code.add newCall("flushRecord", record)

 code = quote do:
```
I would prefer if this is controlled by a compile-time option. Also, we can use one lock per stream I think (or should it be even per-sink/per-output?).
> I would prefer if this is controlled by a compile-time option.
Done. All locking is disabled if threading is off.
> Also, we can use one lock per stream I think (or should it be even per-sink/per-output?).
The finer-grained your locking, the more runtime overhead you have, with no actual benefit. Then you have to worry about different threads being able to interlace their setProperty() calls (or any other sub-record component).
This way, with one lock per log call, you're sure your record will reach the output in one piece and the wait should be short enough to not matter.
Worth revisiting with some actual instrumentation data on a system with 32+ cores, to see what happens when they all wait on the same lock to do their logging. At that point, we're going to ask Jacek about logging buffers and dedicated logging threads.
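The one-lock-per-log-call idea can be sketched like this (using std/locks; `emitRecord` is a hypothetical stand-in for the generated setProperty/flushRecord sequence, not chronicles' real code):

```nim
import std/locks

var logLock: Lock
initLock(logLock)

proc emitRecord(msg: string) =
  # Holding the lock for the whole record guarantees that sub-record
  # writes from different threads cannot interleave in the output.
  withLock logLock:
    stdout.write(msg)
    stdout.write('\n')

emitRecord("msg=\"hello\" topics=\"main\"")
```

With contention this serializes all logging threads, which is exactly the trade-off debated above.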
```nim
registry.topicStatesTable[name] = topic

proc topicsMatch*(logStmtLevel: LogLevel, logStmtTopics: openarray[Topic]): bool =
  lockRegistry:
```
This shouldn't lock either. All the information needed by the function is passed in as parameters.
> All the information needed by the function is passed in as parameters.
It still reads from 2 global variables that may be changed at the same time from other threads.
Hrm, ok, I'll think about this. It's OK to merge with locking for now.
```diff
@@ -255,6 +262,15 @@ macro logIMPL(lineInfo: static InstInfo,
 code.add newCall("logAllDynamicProperties", Stream, record)
 code.add newCall("flushRecord", record)

 when compileOption("threads"):
```
Sorry, I've changed my mind. We don't need locking here at all. Instead, we can unconditionally use the BufferedOutput type when compiling with threads:on. Then, the setProperty statements will populate a local buffer that will be flushed in the end with a single operation. Only the actual output device needs to be thread-safe, which is already the case for stdout.write.
The above can be achieved by modifying the function below:
https://github.com/status-im/nim-chronicles/blob/master/chronicles/log_output.nim#L155
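The buffered approach can be sketched as follows. The `BufferedOutput`, `setProperty`, and `flushRecord` names mirror the discussion, but this is a simplified illustration under stated assumptions, not the actual chronicles implementation linked above:

```nim
type BufferedOutput = object
  buf: string    # thread-local record under construction

proc setProperty(o: var BufferedOutput; key, val: string) =
  # Each property appends to the local buffer; no shared state is touched.
  o.buf.add(key & "=" & val & " ")

proc flushRecord(o: var BufferedOutput) =
  o.buf.add('\n')
  # A single write hands the complete record to the output device, so no
  # lock is needed as long as that device's write is itself thread-safe.
  stdout.write(o.buf)
  o.buf.setLen(0)

var output: BufferedOutput
output.setProperty("msg", "hello")
output.setProperty("topics", "main")
output.flushRecord()
```

Note this relies on the output device writing each buffer atomically; for a plain fd, records larger than the pipe buffer could still interleave.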
Closing in favor of #46