-
-
Notifications
You must be signed in to change notification settings - Fork 50
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support Host I/O operations #66
Conversation
Some questions:
|
Ah, what a pleasant surprise! Host I/O is a feature I've been meaning to implement for a while now - thanks for taking a crack at it! Before I dive into the PR (either later today, or at some point this weekend), the first thing I'll have to ask is that you split this PR into two separate ones - one for exec-file, and another for Host I/O. I suspect the former will be much easier to merge than the latter, as host I/O seems like something that'll require some more iteration. |
In fact, I implement Host I/O firstly and then exec-file. Host I/O can be implemented separately, but exec-file can't. If we implement exec-file, gdb will request to read the file by Host I/O automatically. |
That's fine. Even if the GDB client is doing something weird and trying to use unsupported Host I/O packets to fetch the reported exec-file, the worst thing that would happen is that We should still be able to implement exec-file separately, even if it'd only be "useful" in conjunction with Host I/O support. |
Do I need to send another PR, or just split this commit into two? |
Thanks. I'd prefer it if you sent a separate PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aside from the inline comments, I also had a general question:
Could you also implement the few remaining Host I/O operations, for completeness sake? Namely: pwrite
, fstat
, unlink
, and readlink
. They shouldn't require much more effort, and I'm not sure if I'd merge the PR without them.
Yes, I can implement them, but I don't know how to test them. I never meet a situation which |
I noodled around a bit in I'm not entirely sure why the client sends
This is a good point, and notably, it also applies to the PR in its current form as well. i.e: there's no reason any particular subset of vFile operations must be implemented together, as they are all building blocks from which more advanced functionality is derived (e.g: fetching the process's mappings). From the specs' perspective, the target is free to implement as many or as few of the operations as it likes, and the GDB client just has to work around a target's provided feature set. So, with that in mind, I see two different approaches on how to tackle this:
While it's certainly more verbose, I think we should go with the first option. It'd be consistent with other existing IDETs in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for implementing the missing vFile operations, and for integrating my feedback so effectively! I suspect we are only a few more iterations away from having this PR ready to merge :)
A couple general comment about the armv4t
example:
- please include at-least stub implementations for each of the vFile methods, just so it's easy to validate that they are in-fact being hit. this could be as simple as logging the provided parameters via
eprintln!
, and then returningOk
.- this
eprintln!
logging should be added to the existing open, pwrite, and close methods as well.
- this
- Feel free to leave in the pseudo-"proc fs" example, but please add a comment explaining why it's here (i.e: to support
info proc mappings
) - Please add a "scratch" memory-backed file that can be used to validate
open/write/unlink
. i.e: tweakEmu
to have apub(crate) scratch: Option<Vec<u8>>
member, have it default-initialized to some sample string (e.g:Some(b"sample scratch vFile".to_vec())
), and then write up thepread/pwrite/unlink
handlers appropriately.
Now that we're getting close to the finish line, it's also a good time to update the |
I want to implement real filesystem access in example, which may help people who want to use this feature. There are two ways to do it:
What do you think about it? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To expand on my comment regarding the API docs: one of my goals with gdbstub
is to present an API + docs that don't require users to go down the rabbit-hole of reading the actual GDB RSP. i.e; if you've never used the GDB RSP before, you should be able to get up and running with gdbstub
without having an intimate knowledge on the underlying protocol.
This is primarily accomplished via two complimentary avenues:
- wrangling the spartan GDP RSP C-style APIs (i.e: raw numbers + invariants enforced by "comments") into rich Rust APIs (structs + enums that have invariants enforced by types)
- extensively documenting the APIs, such that all invariants that aren't implicitly enforced by the Rust typesystem are explicitly documented (e.g: "return number of bytes read")
Whenever possible, it's better to encode invariants in the type system, which we've managed to do pretty well with this latest API iteration. i.e: the return type of HostIoResult<(), Self>
replaces the need to explicitly document "this method must return zero on success, or -1 + errno on error" - the Rust type system makes it impossible to return some kind of invalid value from the method.
Hopefully this provides some insight and guidance into why we've been iterating on the API so heavily. With each iteration, we've gradually shifted from the initial C-style API in your original PR, into a rich Rust-style API.
It's really a microcosm for writing low-level Rust code in general: why even write things in Rust if you're just going to end up writing code that looks like C 😄
I think this would be a great idea! One thing I should mention right off the bat though: you'd probably need to include some kind of shim for But with that out of the way: yes, I think it'd be nice to have some kind of in-tree "helper" implementation for implementing vFile (that would most-likely have to be feature-gated behind e.g: something like pub struct HostIoHelper<T: Target> {
_target: PhantomData<T>,
...
}
impl<T: Target> HostIoHelper<T> {
pub fn new() -> HostIoHelper { .. }
pub fn pread<'a>(
&mut self,
fd: u32,
count: usize,
offset: usize,
output: HostIoOutput<'a>,
) -> HostIoResult<HostIoToken<'a>, T> {
// ... impl
}
// etc... for all other methods
}
// and then in a target's vFile impl
impl target::ext::host_io::HostIoPread for MyTarget {
fn pread<'a>(
&mut self,
fd: u32,
count: u32,
offset: u32,
output: HostIoOutput<'a>,
) -> HostIoResult<HostIoToken<'a>, Self> {
// assuming they've already instantiated HostIoHelper earlier
self.host_io_helper.pread(fd, count, offset, output)
}
} Now, as for whether to use Some final considerations:
|
I do some searching in gdb's source code.
And the call path to So we don't need to cover all possible combinations of |
I think the implementation now provides a good example for users. What, |
Alright, I relent, we can keep this current implementation (with the one caveat below) 🙂
I still feel strongly that in the example code, we should at least do something like: // disallow access to /proc/, as this would give bogus reading when running `info proc <...>`
if filename.starts_with(b"/proc") {
return Err(HostIoError::Errno(HostIoErrno::ENOENT)); // or some other appropriate error code
} It's not that much extra code, and it's a nice safety-guard against folks shooting themselves in the foot and not handling |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright, just gave one more holistic pass over the PR, and left some comments. Mostly nit / doc related stuff.
I'm super excited that we're only one or two iterations away from merging this bad boy!
examples/armv4t/gdb/host_io.rs
Outdated
let path = match std::str::from_utf8(filename) { | ||
Ok(v) => v, | ||
Err(_) => return Err(HostIoError::Errno(HostIoErrno::ENOENT)), | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
technically, we should probably use std::ffi::OsStr
here, and then have some platform-specific cfg
blocks to decide how to construct the OsStr (e.g: from_bytes
on Unix, something else on Windows, etc...)
If you want to try and do things the Right Way in this PR, feel free to take a crack at it. Otherwise, this approach is fine for now, and we can punt the nitty-gritty details of how to properly handle this to the follow-up PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't rust have a universal way to handle with filename?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unlike other languages, which try to pave over platform differences when possible (Go is one example that comes to mind), Rust takes the approach of "abstract when possible, but use platform-specific behavior when a reasonable abstraction is impossible".
Essentially, Rust gives you the flexibility to be as "correct" as you'd like, without implicitly locking you in to some kind of built-in abstraction.
The "easy" approach that many applications use is to convert paths into str
, and then wrap those in a std::path::Path
. I'd wager that this works fine 95% of the time, which is why I don't mind taking this approach here.
It's moreso that there are some file paths that aren't valid str
s on Unix/Windows, and to be 100% correct, there should be platform-specific logic to convert raw &[u8]
buffers into platform-specific std::ffi::OsStr
values (which can then be wrapped in std::path::Path
).
...but that's hard, and the extra 5% might not be worth it. I just thought I'd point it out, moreso as an opportunity for "learning" rather than as something we'd actually want to dig into / properly handle.
About the How GDB creates it on Windows: And in this page https://sourceware.org/gdb/onlinedocs/gdb/struct-stat.html:
Does this mean these fields are not used? |
About the transformation from There are more Refer: rust-lang/rust#86442 |
Sure, if we're going to try and pave over platform-specific stuff, might as well do what GDB does. As part of the implementation, make sure to leave a comment "citing your sources" for why certain fields are being filled with dummy-data.
I'm not 100% sure, but I think the wording here is written from the perspective of the other kind of I/O packets defined by the GDB RSP - i.e: target-initiated file I/O. If you read it from that perspective, it makes a bit more sense, as the target will be reading data about the host's files, and things like "inode number" or "host-side uid/gid" won't have much meaning to the target. If you want to dig into the reference GDB client / gdbserver implementation to see what it does when implementing the
The |
Which source? The link I provided seems not be so persuasive🧐 |
Just find that we can't use gdbstub/src/protocol/packet.rs Lines 60 to 63 in c40145c
And when RUST_LOG=trace is enabled, pread with binary data will cause a panic:gdbstub/src/protocol/response_writer.rs Line 57 in c40145c
pwrite with binary data will lead to log of <invalid packet>
|
Oh, I don't have anything specific in mind. What I meant to say is that you should make sure to leave a inline comment that explains why the specific data is being stubbed out, potentially citing sources if need be.
Oh shit. Would you look at that. It seems that past-me assumed that all GDB RSP packets would be 7-bit clean, when the reality is that some of the later packets (such as It's a good thing you caught this now, because this is a serious bug that we need to address! To add to those examples you gave, we'd also have to consider gdbstub/src/gdbstub_impl/mod.rs Line 566 in c40145c
I did a quick audit of the The exception to that would be: #[cfg(feature = "std")]
trace!(
"--> ${}#{:02x?}",
core::str::from_utf8(&self.msg).unwrap(), // buffers are always ascii
checksum
); In this case, I'd actually swap it out for: #[cfg(feature = "std")]
trace!(
"--> ${}#{:02x?}",
String::from_utf8_lossy(&self.msg),
checksum
); i.e: use, because this is already gated behind the Long term, I'll spend a bit of time playing around with a good way to debug-print mostly ASCII strings in Also, quite importantly, now that you've pointed out how to properly handle binary data, you'll also need to fix the following code in // ...
let offset = decode_hex_buf(body.next()?).ok()?;
let data = body.next()?; // <-- incorrect
Some(vFilePwrite{fd, offset, data})
// ... Binary data sent to the client may include escaped bytes. Any instances of This means that you'll have to write a It seems that we've stumbled into a small unforseen pit of complexity, but I'm glad we spotted this before the PR was merged. |
Because it's just an example, users can fill it themselves if they really need it...🙃 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
alright, did one last holistic overview of the PR, and left the (what I think will be) the very last batch of nits.
if cfg!(feature = "paranoid_unsafe") { | ||
Ok(&mut buf[..j]) | ||
} else { | ||
unsafe { Ok(buf.get_unchecked_mut(..j)) } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ahh, my bad: debug_assert!(false)
is obviously incorrect, but we should add a debug_assert!(j <= buf.len())
before calling get_unchecked_mut
here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why there are no debug_assert
in decode_hex_buf
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no reason aside from the fact that I forgot to add them. I should probably insert them at some point, just in case...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's been a looooong journey, but I'm happy to say that I think we've finally made it 🎉
Thanks again for putting up with my endless battery of nits and comments - I sincerely hope you've come away from this PR with a better idea of gdbstub
's philosophy to code quality and correctness. This is a project that really tries to embody the Rust ethos of "Fast, Reliable, Productive. Pick Three", but pulling that off does require a bit of a shift in mindset, if you're coming from a more traditional C/C++ background.
And with that, lets merge this bad boy 🚀
Description
Support Host I/O operations
Refer: #32
API Stability
Checklist
cargo build
compiles withouterrors
orwarnings
cargo clippy
runs withouterrors
orwarnings
cargo fmt
was runexamples/armv4t
examples/armv4t
withRUST_LOG=trace
+ any relevant GDB output under the "Validation" section below./scripts/test_dead_code_elim.sh
and/or./example_no_std/check_size.sh
)Arch
implementationValidation
GDB output
armv4t output