Tracepoint extension support #160
I wasn't sure how the zero panic verification guarantee can be checked - there isn't a how to section in the README about it. I tried to write code as panic-free as I could, but I may have missed some pieces.
Thanks for sending in this PR - I'm excited to dig in here! Unfortunately, I'm just about to leave for a vacation where I'll be completely AFK, so I'll only be able to take a look sometime in the range of ~Dec 20th to ~Dec 23rd. Just wanted to give you a heads up, so that you're not worried about the lack of movement here 😅
Hello again, and happy holidays!
I've finally found some time to sit down and give this PR a review.
Please find a plethora of comments attached.
Broadly speaking, I think this is an impressive chunk of work, implementing what appears to be a very annoying and non-trivial part of the GDB RSP. From an organizational and syntactical POV, there's nothing that a few review comments can't polish up, and the overall vision here appears to be consistent and well put together. Kudos!
That said, I do have some concerns about the amount of API surface area we're taking on here, and the feasibility of testing all of it. It's great that you've got some things working in the armv4t example, but from looking at the code (notably: obvious errors such as handlers which send responses with spaces delimiting various parts of the packet), it seems that you've got a lot of code here that theoretically works, but hasn't been directly validated.
That's not to say we shouldn't try to land this work!
I think we should certainly try to get this PR landed... but to temper expectations for end-users, I might suggest we land this code under a mod experimental, with some documentation mentioning that this code covers a lot of surface area, and may not be fully tested.
Alternatively, if you're so inclined, I'd be happy to see more investment into the armv4t example code, with some corresponding logs that show all these codepaths having been smoke-tested. Or, of course, logs from whatever project you're implementing this feature for (and ideally, a link to the implementation itself - assuming you're working on something open-source).
I wasn't sure how the zero panic verification guarantee can be checked - there isn't a how to section in the README about it. I tried to write code as panic-free as I could, but I may have missed some pieces.
CI has a check for this, but it seems that the check doesn't run if clippy fails. Oops.
I should probably re-jig CI a bit so that clippy failing still gives no_panic feedback... my apologies.
"QTDP" => _QTDP::QTDP<'a>,
"QTinit" => _QTinit::QTinit,
"QTBuffer" => _QTBuffer::QTBuffer<'a>,
// TODO: QTNotes?
not sure if it's worth singling out this one particular packet, when there are plenty of other tracepoint related packets that are presently unimplemented.
alternatively, might I suggest fully enumerating which remaining tracepoint packets are left to implement, and providing brief descriptions why they aren't necessary for v0? That would give any future folks working in this space some useful context.
i => Some(decode_hex(i).ok()?)
})}
},
req => {
Based on my reading of the packet spec, it looks like you're conflating the qTBuffer packet with the QTBuffer packets.
i.e: from https://sourceware.org/gdb/current/onlinedocs/gdb.html/Tracepoint-Packets.html
‘qTBuffer:offset,len’
Return up to len bytes of the current contents of trace buffer, starting at offset. The trace buffer is treated as if it were a contiguous collection of traceframes, as per the trace file format. The reply consists as many hex-encoded bytes as the target can deliver in a packet; it is not an error to return fewer than were asked for. A reply consisting of just l indicates that no bytes are available.
which is distinct from
‘QTBuffer:circular:value’
This packet directs the target to use a circular trace buffer if value is 1, or a linear buffer if the value is 0.
‘QTBuffer:size:size’
This packet directs the target to make the trace buffer be of size size if possible. A value of -1 tells the target to use whatever size it prefers.
I think this implementation needs to be split into 2 (or 3) files, to handle the 3 packet variants.
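To make the distinction concrete, here is a minimal sketch of parsing the three quoted packet forms into separate variants. The enum and function names are hypothetical (they are not gdbstub's actual types), and the sketch assumes the numeric fields are hex-encoded, with `-1` as the only special size value.

```rust
/// Hypothetical parsed forms of the three packets quoted above.
#[derive(Debug, PartialEq)]
pub enum TraceBufferPacket {
    /// `qTBuffer:offset,len`: read back trace buffer contents.
    Read { offset: u64, len: u64 },
    /// `QTBuffer:circular:value`: 1 selects a circular buffer, 0 a linear one.
    SetCircular(bool),
    /// `QTBuffer:size:size`: `None` stands for `-1` (target picks the size).
    SetSize(Option<u64>),
}

/// Assumption: numeric fields are hex-encoded without a `0x` prefix.
fn hex(s: &str) -> Option<u64> {
    u64::from_str_radix(s, 16).ok()
}

pub fn parse_trace_buffer_packet(packet: &str) -> Option<TraceBufferPacket> {
    // The lowercase `q` form is a query and takes `offset,len`.
    if let Some(args) = packet.strip_prefix("qTBuffer:") {
        let (offset, len) = args.split_once(',')?;
        return Some(TraceBufferPacket::Read {
            offset: hex(offset)?,
            len: hex(len)?,
        });
    }

    // The uppercase `Q` forms are "set" packets with a sub-command.
    let args = packet.strip_prefix("QTBuffer:")?;
    if let Some(value) = args.strip_prefix("circular:") {
        return match value {
            "0" => Some(TraceBufferPacket::SetCircular(false)),
            "1" => Some(TraceBufferPacket::SetCircular(true)),
            _ => None,
        };
    }
    if let Some(size) = args.strip_prefix("size:") {
        let size = if size == "-1" { None } else { Some(hex(size)?) };
        return Some(TraceBufferPacket::SetSize(size));
    }
    None
}
```

Whether this ends up as one parser with three variants or as separate packet files is a layout question; the point is only that the query and the two set forms carry different arguments.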
@@ -172,6 +172,11 @@ impl<T: Target, C: Connection> GdbStubImpl<T, C> {
}
}

if let Some(_ops) = target.support_tracepoints() {
there are quite a few other tracepoint-related features in the docs. could you explain why these are the only two that were enabled, and maybe leave a comment mentioning what other features may need to be enabled in the future if/when additional functionality is ever implemented?
// Our response has to be a hex encoded buffer that fits within
// our packet size, which means we actually have half as much space
// as our slice would indicate.
so, interestingly enough, this is not the approach gdbstub has taken thus-far when it comes to responses.
While we certainly have a fixed buffer for incoming packets (i.e: the buffer we are parsing from here), we assume that the GDB client is capable of accepting any amount of data we stream out as part of our responses. This aligns with my particular reading of the GDB spec, which discusses the size of packets the stub can accept... but doesn't make a judgement on the size of response packets the client can receive.
as such - I would suggest skipping this buffer-slicing step entirely, and instead offering the handler a callback function they can write an arbitrarily sized &[u8] into, which gdbstub can then simply stream out. If you poke around some of the other target APIs, you'll see examples of this callback-based / streaming-based pattern being employed to great effect.
alternatively, if you think that from a purely ergonomic POV, it's nicer to provide end-users with a buffer to write data into... let's make sure to pass along the entire buffer we have access to, and let downstream response-writer code stream out the corresponding hex bytes.
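As a rough illustration of the callback-based option, the sketch below has the target hand raw bytes to a callback while the stub side does the hex encoding as data arrives. Every name here (`HexResponse`, `TraceBufferRead`, `trace_buffer_read`, `DemoTarget`, `handle_read`) is hypothetical and only stands in for whatever gdbstub's real response writer and target traits look like.

```rust
/// Hypothetical stand-in for the stub's response writer: it hex-encodes
/// whatever bytes it is given and appends them to the outgoing packet.
pub struct HexResponse {
    pub out: String,
}

impl HexResponse {
    pub fn write_hex_buf(&mut self, data: &[u8]) {
        for byte in data {
            self.out.push_str(&format!("{:02x}", byte));
        }
    }
}

/// Hypothetical target-side API: the target reports raw bytes through a
/// callback and never needs to know about hex encoding or packet sizes.
pub trait TraceBufferRead {
    fn trace_buffer_read(&mut self, offset: usize, len: usize, report: &mut dyn FnMut(&[u8]));
}

pub struct DemoTarget {
    pub trace_buffer: Vec<u8>,
}

impl TraceBufferRead for DemoTarget {
    fn trace_buffer_read(&mut self, offset: usize, len: usize, report: &mut dyn FnMut(&[u8])) {
        let start = offset.min(self.trace_buffer.len());
        let end = start.saturating_add(len).min(self.trace_buffer.len());
        // Report in small chunks to show that the callback may run many times.
        for chunk in self.trace_buffer[start..end].chunks(2) {
            report(chunk);
        }
    }
}

/// Stub-side handler: streams out whatever the target reports.
pub fn handle_read(target: &mut dyn TraceBufferRead, offset: usize, len: usize) -> String {
    let mut res = HexResponse { out: String::new() };
    target.trace_buffer_read(offset, len, &mut |bytes: &[u8]| res.write_hex_buf(bytes));
    // Per the qTBuffer description, a bare `l` means no bytes are available.
    if res.out.is_empty() {
        res.out.push('l');
    }
    res.out
}
```

With this shape the "half as much space" bookkeeping disappears from the target implementation, since the encoding happens on the stub side.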
let body = buf.into_body();
match body {
[b':', b'-', actions @ ..] => {
let mut params = actions.splitn_mut(4, |b| matches!(*b, b':'));
nit: this matches should just be a comparison, since you only have one thing you're matching against
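For illustration, a small sketch of the same split written with a plain comparison. The function name `split_params` is hypothetical, and it uses the immutable `splitn` rather than `splitn_mut` to keep the example short.

```rust
/// Splits a packet body into at most four `:`-separated fields.
pub fn split_params(actions: &[u8]) -> Vec<&[u8]> {
    // `*b == b':'` expresses the same predicate as `matches!(*b, b':')`.
    actions.splitn(4, |b| *b == b':').collect()
}
```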
mut f: impl FnMut(&TracepointAction<'_, U>),
) -> Option<bool> {
let mut more = false;
let mut unparsed: Option<&mut [u8]> = Some(actions);
can't this loop be replaced with a split iterator, matching on b'S' | b'R' | b'M' | b'X' | b'-'?
actions: &mut [u8],
mut f: impl FnMut(&TracepointAction<'_, U>),
) -> Option<bool> {
let mut more = false;
this is another case where strip_suffix could simplify some logic
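A minimal sketch of what that could look like, assuming the `more` flag in the snippet above tracks a trailing `-` on the action list (my reading of the QTDP description, not something confirmed in this thread). The helper name `split_more_flag` is hypothetical.

```rust
/// Returns the action bytes without a trailing `-`, plus a flag saying
/// whether that `-` was present (i.e. more packets are expected).
pub fn split_more_flag(actions: &[u8]) -> (&[u8], bool) {
    match actions.strip_suffix(b"-") {
        Some(rest) => (rest, true),
        None => (actions, false),
    }
}
```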
Some([b'R', mask @ ..]) => {
let mask_end = mask
.iter()
.cloned()
raising my eyebrows at cloned instead of copied 👀.
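For reference, a sketch of the `copied` form. The rest of the iterator chain is cut off in the snippet above, so the predicate below (stop at the first non-hex-digit byte) is purely an assumption made for the sake of a complete example.

```rust
/// Finds where a hex-encoded register mask ends.
/// Assumption: the mask runs until the first byte that is not a hex digit.
pub fn mask_end(mask: &[u8]) -> usize {
    mask.iter()
        // `u8` is `Copy`, so `copied()` states the intent more precisely
        // than `cloned()` and fails to compile if the item type stops
        // being `Copy`.
        .copied()
        .position(|b| !b.is_ascii_hexdigit())
        .unwrap_or(mask.len())
}
```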
let status = ops.trace_experiment_status().handle_error()?;
res.write_str("T")?;
res.write_dec(if status.running { 1 } else { 0 })?;
for explanation in status.explanations.iter() {
using ManagedSlice in the return value of the target method is certainly one approach... but can't we avoid the need to expose ManagedSlice in the API entirely by having trace_experiment_status accept a callback instead?
We need to write the T0/T1 running status before we write any of the explanations. If we pass the res writing callback to the target implementation, then it would need to have some mechanism of reporting the experiment state before it can run the explanation callback. We can't do like an &mut FnOnce(ExperimentStatus)->TargetResult<&mut FnMut(ExperimentExplantion)->TargetResult<(),Self>, Self> because we can't return a borrow from the closure due to lifetime issues, and can't return a trait object by value due to it being unsized.
Thinking about it, one solution would be to split the API into trace_experiment_status and then trace_experiment_statistics or something? Does gdbstub particularly care about matching API surface 1:1 with gdb packets or would that be ok?
Does gdbstub particularly care about matching API surface 1:1 with gdb packets or would that be ok?
Reflecting the underlying packet structure to end users is actually an anti-goal of gdbstub 😄
It just so happens that many times, the protocol is relatively trivial, so the resulting API ends up matching the packets 1:1... but there are other cases where gdbstub goes out of its way to expose a more "user-friendly" API, and then takes on whatever heavy-lifting it needs to do in order to map those friendly end-user semantics onto the underlying protocol.
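A rough sketch of the split floated above: one method returns the running state, a second one reports the remaining details through a callback, and the stub stitches both into a single reply. All types and names are hypothetical, and the `tframes` / `tfree` field names follow my reading of the qTStatus documentation, so treat them as assumptions.

```rust
#[derive(Clone, Copy)]
pub enum ExperimentStatus {
    Running,
    NotRunning,
}

/// Hypothetical subset of the extra details a status reply can carry.
pub enum ExperimentStatistic {
    Frames(usize),
    FreeBytes(usize),
}

/// Hypothetical target-facing API: two simple methods instead of one
/// method that mirrors the packet layout.
pub trait TracepointStatus {
    fn trace_experiment_status(&self) -> ExperimentStatus;
    fn trace_experiment_statistics(&self, report: &mut dyn FnMut(ExperimentStatistic));
}

/// Stub side: writes the running flag first, then streams the details.
pub fn write_status(target: &dyn TracepointStatus) -> String {
    let mut res = String::from("T");
    res.push(match target.trace_experiment_status() {
        ExperimentStatus::Running => '1',
        ExperimentStatus::NotRunning => '0',
    });
    target.trace_experiment_statistics(&mut |stat: ExperimentStatistic| match stat {
        ExperimentStatistic::Frames(n) => res.push_str(&format!(";tframes:{:x}", n)),
        ExperimentStatistic::FreeBytes(n) => res.push_str(&format!(";tfree:{:x}", n)),
    });
    res
}

pub struct DemoTarget {
    pub running: bool,
    pub frames: usize,
    pub free: usize,
}

impl TracepointStatus for DemoTarget {
    fn trace_experiment_status(&self) -> ExperimentStatus {
        if self.running {
            ExperimentStatus::Running
        } else {
            ExperimentStatus::NotRunning
        }
    }

    fn trace_experiment_statistics(&self, report: &mut dyn FnMut(ExperimentStatistic)) {
        report(ExperimentStatistic::Frames(self.frames));
        report(ExperimentStatistic::FreeBytes(self.free));
    }
}
```

This keeps the ordering constraint (status before explanations) on the stub side, so the target never has to return a borrowed closure or an unsized trait object.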
src/stub/core_impl/tracepoints.rs
Outdated
let e = (|| -> Result<_, _> {
match desc {
FrameDescription::FrameNumber(n) => {
res.write_str("F ")?;
again, this sort of space isn't really a thing in the GDB RSP...
We include spaces in some of the templates for clarity; these are not part of the packet’s syntax. No GDB packet uses spaces to separate its components. For example, a template like ‘foo bar baz’ describes a packet beginning with the three ASCII bytes ‘foo’, followed by a bar, followed directly by a baz. GDB does not transmit a space character between the ‘foo’ and the bar, or between the bar and the baz.
https://sourceware.org/gdb/current/onlinedocs/gdb.html/Packets.html#Packets
This strongly implies to me that this codepath hasn't been tested...
I definitely missed that section of the docs! This codepath does actually work, however: in the armv4t example that's included in this MR I can do
(gdb) tfind 3
Found trace frame 3, tracepoint 1
#0 main () at test.c:10
10 in test.c
(gdb) tfind 2
Found trace frame 2, tracepoint 1
10 in test.c
Oh GDB... thou art truly exceptional software... /s
If we peek inside GDB and look at how it parses the packet response, we find this code
case 'F':
p = ++reply;
target_frameno = (int) strtol (p, &reply, 16);
(via https://github.com/bminor/binutils-gdb/blob/e16e638/gdb/remote.c#L14258-L14263)
And lo, reading the docs of strtol reveals this lovely feature:
Discards any whitespace characters (as identified by calling isspace) until the first non-whitespace character is found [...]
So that solves that mystery...
In any case, even though this "works", let's make sure gdbstub remains spec-compliant, in case other GDB RSP clients aren't quite so forgiving of bonus whitespace here.
Apologies for assuming that you hadn't tested this code, hopefully you understand why I might've gotten that assumption from reading just this code + the corresponding spec 😅
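For concreteness, a sketch of a spec-compliant frame reply with no separator bytes. The function name and the exact reply layout (`F` plus a hex frame number, `T` plus a hex tracepoint number, `F-1` when nothing matched) reflect my reading of the tracepoint packet docs rather than code from this PR.

```rust
/// Builds a frame-selection reply without any whitespace.
/// `found` is `Some((frame_number, tracepoint_number))` on a match.
pub fn frame_reply(found: Option<(u64, u64)>) -> String {
    match found {
        // The template's "F f" / "T t" spacing is documentation-only;
        // nothing is transmitted between the letter and the number.
        Some((frame, tracepoint)) => format!("F{:x}T{:x}", frame, tracepoint),
        None => String::from("F-1"),
    }
}
```

A client that parses with something stricter than `strtol` should accept this form, whereas `"F 3"` only works when leading whitespace happens to be skipped.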
Thanks for the feedback! I'll look over and resolve them. For some background, we're using gdbstub in order to build out debugging introspection for a project we have. Our current tooling is via Python, and we're using gdbstub via some PyO3 bindings I hacked together (which I do intend to open source eventually, once I find the time), but that also isn't very "interesting" code. I'm working on this tracepoint support in tandem with building out the rest of the debugging stack, and so the API surface here was the minimal amount I needed in order to get gdb to not error out with "not supported" and implement the tracepoint functionality. We do have functionality using this, however, so none of it should be untested code (although some parts, like |
Description
This PR adds basic tracepoint extension support to GDB stub. Closes #157.
API Stability
Checklist
rustdoc formatting looks good (via cargo doc)
examples/armv4t with RUST_LOG=trace + any relevant GDB output under the "Validation" section below
./example_no_std/check_size.sh before/after changes under the "Validation" section below
examples/armv4t
./example_no_std/check_size.sh)
Arch implementation
Validation
GDB output
loading section ".text" into memory from [0x55550000..0x55550078] Setting PC to 0x55550000 Waiting for a GDB connection on "127.0.0.1:9001"...
(The start of the cargo run output is corrupted by binary data that's printed, so I cut out the portion of the output relevant for tracepoint packets)
Before/After `./example_no_std/check_size.sh` output
Before
After