CVE-2024-43785: gitoxide-core does not neutralize special characters for terminals #1534
Comments
Thanks a lot for setting up this issue!
Initially I thought that one would want to prevent any terminal escape sequences in A utility to perform the 'Debug-print if needed' operation efficiently, i.e. without allocation by reusing a buffer, is probably the way to go. I'd keep a single buffer, write the |
I might be misunderstanding something, but why would the debug version need to be written to a (reused) buffer instead of just being printed directly?
|
The debug version always prints with surrounding |
Couldn't one still eliminate the need for the buffer by instead checking whether the string is display-safe as is and, based on that, printing it either normally or display-safely? Either way one would need to iterate over the string twice. I would expect debug-printing to a buffer and then printing that buffer to be slower than checking whether the string is display-safe and then printing it directly in the appropriate form. |
I'd expect that too, but didn't think the complexity of attempting to predict |
I would expect that display-safe printing would need to be separate from debug printing anyway, especially on Windows, as I would expect |
I think it depends where the paths are coming from. For the most part I think the paths are from repository metadata and shown with But I'm not sure how they are shown in error messages related to the inability to check them out (which paths with most control characters, including any ANSI escape sequences, would produce when attempting to create them on Windows, at least under ordinary conditions, since these characters are generally prohibited in paths on Windows). |
Would it make sense to enforce safe terminal display at the type system level by replacing direct BStr terminal output with something like:

    pub struct SafeTermPath<'a>(&'a BStr);

    impl<'a> std::fmt::Display for SafeTermPath<'a> {
        fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
            // Central safe display logic here
            // Escape/quote as needed based on content
        }
    }

The proposed approach would enable us to
For performance-critical paths like ls-entries, we could
I'm new to the codebase, so please let me know if I'm missing any important considerations or if this approach aligns with the project's goals. |
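As a rough illustration of the wrapper idea above (a sketch under stated assumptions, not gitoxide code): `&[u8]` stands in for `&BStr` so that only the standard library is needed, and the escaping rule, showing every Unicode control character in its `escape_debug` form, is just one possible choice.

```rust
use std::fmt::{self, Write as _};

/// Hypothetical wrapper; `&[u8]` stands in for `&BStr` so the sketch needs only std.
pub struct SafeTermPath<'a>(pub &'a [u8]);

impl fmt::Display for SafeTermPath<'_> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Invalid UTF-8 becomes U+FFFD here, so this display form is lossy.
        for ch in String::from_utf8_lossy(self.0).chars() {
            if ch.is_control() {
                // Control characters (including ESC) are shown as visible escapes.
                write!(f, "{}", ch.escape_debug())?;
            } else {
                f.write_char(ch)?;
            }
        }
        Ok(())
    }
}

fn main() {
    // The escape sequences are printed as text instead of recoloring the terminal.
    println!("{}", SafeTermPath(b"docs/\x1b[91mError\x1b[0m.txt"));
}
```

Because backslashes are passed through unchanged, this neutralizes control characters but is not an unambiguous, reversible quoting scheme.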
Thanks for chiming in. I also think that would be fastest, but would also require the parts that are 'vulnerable' to be retrofitted. There probably aren't too many of these, so that should be fine. I also thought about having an So overall, I think the proposed method of using a way of writing to |
Thanks for the feedback. The performance concerns with the

    pub trait TerminalSafe: AsRef<[u8]> {
        fn needs_escaping(&self) -> bool;
    }

    pub struct TerminalBytes<B: AsRef<[u8]>> {
        inner: B,
        needs_escaping: bool,
    }

    impl<B: AsRef<[u8]>> TerminalBytes<B> {
        pub fn new(bytes: B) -> Self {
            Self {
                needs_escaping: contains_terminal_escapes(bytes.as_ref()),
                inner: bytes,
            }
        }
    }

Instead of checking every write operation (which would be slowest), this caches the safety check at creation time. When writing:

    fn print_status(writer: &mut impl Write, item: impl AsRef<[u8]> + TerminalSafe) -> io::Result<()> {
        if item.needs_escaping() {
            write_escaped(writer, item.as_ref())
        } else {
            // Fast path for known-safe content
            writer.write_all(item.as_ref())
        }
    }

This means
What are your thoughts on an approach like this instead of wrapping every Write operation? |
Thanks for sharing! In practice, each item, like a path in the index, is only written once, while it also won't have anything to escape. So I'd probably dumb it down and just write it escaped each time to avoid any extra complexity. Only |
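A minimal sketch of what "just write it escaped each time" could look like, assuming a byte-level rule: ASCII control bytes are written as `\xNN` and everything else is passed through in runs without allocating. The names `write_escaped` and `is_plain` are hypothetical here, and the sketch does not handle C1 control characters encoded in UTF-8.

```rust
use std::io::{self, Write};

/// Bytes that are written through unchanged (assumption: only ASCII controls are escaped).
fn is_plain(b: u8) -> bool {
    b >= 0x20 && b != 0x7f
}

/// Writes `bytes` to `out`, replacing each ASCII control byte with a visible `\xNN` form.
fn write_escaped<W: Write>(out: &mut W, bytes: &[u8]) -> io::Result<()> {
    let mut rest = bytes;
    while !rest.is_empty() {
        // Emit the longest run of plain bytes in a single write.
        let plain_len = rest.iter().position(|&b| !is_plain(b)).unwrap_or(rest.len());
        out.write_all(&rest[..plain_len])?;
        rest = &rest[plain_len..];
        // If a control byte follows, emit its escaped form and continue after it.
        if let Some((&b, tail)) = rest.split_first() {
            write!(out, "\\x{:02x}", b)?;
            rest = tail;
        }
    }
    Ok(())
}

fn main() -> io::Result<()> {
    let stdout = io::stdout();
    let mut out = stdout.lock();
    write_escaped(&mut out, b"plain.txt")?;
    out.write_all(b"\n")?;
    write_escaped(&mut out, b"evil\x1b[2K\rname")?;
    out.write_all(b"\n")
}
```

For the common case of a path with nothing to escape, the loop performs a single scan and a single `write_all`, so the extra cost over an unescaped write is small.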
Current behavior 😯
This issue is for tracking the public vulnerability CVE-2024-43785 (GHSA-88g2-r9rw-g55h, RUSTSEC-2024-0364).
Further details, including detailed instructions to reproduce the main effect described, are in the advisory, available in:
As noted, this vulnerability is low-risk. It was decided, following a coordinated disclosure, that informing users would be beneficial, even before a patch is available. Contributions are welcome.
Expected behavior 🤔
General expectations
When sufficiently capable of misleading the user or (though less of a threat) interfering with the operation of the terminal, special characters should be escaped in the output of `gix` and `ein` commands, except when raw output has been requested or is otherwise expected.
In addition, for paths, when displaying them in human-readable (as opposed to JSON) text in a terminal, there is little to no disadvantage to quoting them using an unambiguous scheme. I think this should happen in the situations where `git` always quotes paths, i.e., for paths that `git` quotes even if `core.quotePath` has been set to `false`.
This does not necessarily mean they need to be quoted in the same way that `git` quotes them, nor that `core.quotePath` itself needs to be implemented.
Ideas for quoting
I am hoping it may be feasible to use the type system not just to implement specific display behavior, but to distinguish between text that may need escaping and text that is known not to need it, so that there would be fewer places, including in code that may be added in the future, where forgetting to call a sanitization function or construct a safe-displaying object, or where using the wrong format specifier, would result in unintentionally outputting text that may contain terminal escape sequences. I don’t know if that’s feasible or if it’s a good idea.
Text from repositories is often `&BStr`, including when it is a path, as is the case for example in `gitoxide_core::repository::tree::format_entry()`:
https://github.com/Byron/gitoxide/blob/25a3f1b0b07c01dd44df254f46caa6f78a4d3014/gitoxide-core/src/repository/tree.rs#L181-L202
That is not the only place where paths sometimes need to be quoted.
Although the obvious way to express that something is a path is to represent it as a `Path` or `PathBuf`, I don't think that is the best approach here. If I understand correctly, at least on Windows there’s no safe guaranteed-to-succeed conversion from `&[u8]` or `&BStr` to `&OsStr` or `Path`, nor a safe guaranteed-to-succeed way to construct an `OsString` or `PathBuf` from arbitrary bytes. Furthermore, if we had a `Path`, it would still need custom printing to neutralize escape sequences.
One approach may be to do this, though this Rust code may be better understood as pseudocode in that a faster approach with fewer allocations should probably be preferred:
That is, if quoting would keep it the same except the double quote marks around it, then show it verbatim not bothering with those quotation marks, and otherwise show it with the debug quoting provided by the `BStr` implementation.
This quoting is sufficient to neutralize terminal escape sequences because it turns the escape character, most commonly represented as `\e`, `\033`, or `\x1B`, into this literal sequence (which I think is at least as good as those representations):
It seems to me that this may be beneficial even outside of the issue of escape characters. For example, if a path stored in a Git repository has pieces that aren’t valid UTF-8, then it would probably be better to show the escape sequences for those bytes as `BStr` has `Debug` do, rather than to show the Unicode substitution character as `BStr` has `Display` do.
This is a lossless encoding. It is reasonably easy to parse, in case anyone wants to do that, because which of the two forms was used for the output is discernible by whether it has a leading `"` character. In particular, even if the original had a `"` character, then the debug representation escapes it while also adding more quotes, and is therefore never equal to the original with quotes added, and thus would always be used rather than the original with quotes added.
Git behavior
Paths
`git` always quotes escape characters in paths when not told to do otherwise such as by `-z`, and will perform additional quoting if `core.quotePath` is `true`, which it is by default.
The following rehashes a fragment of the advisory to compare the behavior of `git` and `gix`, but it is not a substitute for the advisory.
I created a file whose name I specified in `bash` using the `$'` `'` notation (a more portable and fully described approach is presented in the advisory) as:
$'\033]0;Boo!\007\033[2K\r\033[91mError: Repository is corrupted. Run \033[96mEVIL_COMMAND\033[91m to attempt recovery.\033[0m'
Running `git ls-tree HEAD` shows this, which is identical to what it really outputs:
In contrast, running `gix tree entries` changes the terminal title (until it is rewritten, which some shell prompts may do), and it shows this, in bright red and bright cyan as described above, such that it wrongly appears to be the entire output of the command:
Here's a screenshot showing the appearance of both `git` and `gix` commands:
Non-path data
`git` actually sometimes allows escape sequences from a repository through, such as in author and committer information shown in the output of `git log`, as well as in changed blob contents shown in the output of `git diff`. However, it seems to avoid it in situations where it could be seriously misleading, or where it would interfere with the operation of the terminal.
Characters that could be especially misleading are represented symbolically, such as the backspace and carriage return characters. Escape sequences that could be especially misleading are simply not let through, such as attempts to reposition the cursor in the terminal, except that those are let through when the output device is not a terminal. Colorization, when allowed to change, seems always to be restored, though I have not exhaustively verified that. Attempts to conceal or mimic leading `+`, `-`, or space characters in diffs do not seem to succeed.
I believe the Git behavior is not vulnerable, even outside of the treatment of paths.
Steps to reproduce 🕹
See the "PoC" (proof of concept) section in CVE-2024-43785 (GHSA-88g2-r9rw-g55h, RUSTSEC-2024-0364).
See also the "Git behavior" section above, which includes a brief description of reproducing this and a screenshot that is not currently in the advisory, and which shows that Git is not affected.