-
Notifications
You must be signed in to change notification settings - Fork 144
RUST-1998 Remove lossy utf8 as a decoder option #550
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
3df66fd
to
328ac6d
Compare
@@ -386,13 +386,6 @@ impl crate::DateTime { | |||
} | |||
} | |||
|
|||
#[deprecated(since = "2.3.0", note = "Use try_to_rfc3339_string instead.")] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is completely unrelated but I happened to notice it and figured we should remove deprecated methods while we're doing a major version release. I did a grep and didn't find any others.
src/raw/document.rs
Outdated
@@ -531,6 +531,21 @@ impl RawDocument { | |||
let bytes = self.cstring_bytes_at(start_at)?; | |||
try_to_str(bytes) | |||
} | |||
|
|||
/// Copy this into a [`Document`], returning an error if invalid BSON is encountered. | |||
pub fn to_document(&self) -> Result<Document> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I moved this from RawDocumentBuf
since that was calling the TryInto
impl on RawDocument
anyway and this way both types can use it (RawDocumentBuf
derefs to this one).
src/raw/error.rs
Outdated
Self::MalformedValue { message: r_message }, | ||
) => l_message == r_message, | ||
(Self::Utf8EncodingError, Self::Utf8EncodingError) => true, | ||
(Self::DeError(_), Self::DeError(_)) => true, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
crate::de:Error
doesn't impl PartialEq
because it wraps std::io::Error
, which also doesn't, hence the manual impl. I figure this whole situation can be revisted for RUST-1406.
This is all needed so I can use crate::de::reader_to_vec
in RawDocumentBuf::from_reader
; the former uses the crate::de
error hierarchy, the latter the crate::raw
one.
@@ -266,7 +268,47 @@ impl<'a> RawElement<'a> { | |||
}) | |||
} | |||
|
|||
pub(crate) fn value_utf8_lossy(&self) -> Result<Option<Utf8LossyBson<'a>>> { | |||
pub fn value_utf8_lossy(&self) -> Result<RawBson> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I opted to expose this method rather than the lower-level value_utf8_lossy_inner
because in my estimation the confluence between needing to deal with invalid utf8 and needing to save the memory cost of copying a single element is very small, and outside of that this is a strictly more useful interface.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup, it'll be equivalent performance-wise, since using the wrapper type just enables behavior the same way |
96b7844
to
6930185
Compare
dc3585d
to
8679191
Compare
RUST-1998
This removes lossy utf8 decoding as a public option for the bson serde decoder, and as a consequence many of the API entry points. Dealing with lossy utf8 is now done in one of three ways depending on the specific need:
Utf8LossyDeserialization
helper type, if the user is going viaserde
and knows that particular fields might have bad data;RawElement::value_utf8_lossy
method, which allows fine-grained iteration over potentially-invalid raw data, orRawDocument::to_document_utf8_lossy
method, which is a convenience wrapper over the above to bulk-convert an entireRawDocument
at once.