Drop rustc-serialize dependency #20

Open
dtolnay opened this issue Apr 23, 2017 · 6 comments
dtolnay commented Apr 23, 2017

The rustc-serialize crate has been deprecated; see the deprecation announcement.

vmx (Member) commented Apr 23, 2017

The problem is that we are using the streaming JSON parser from rustc-serialize. Last time I checked there wasn't any good alternative, though I'll see whether this has changed in the meantime.

dtolnay (Author) commented Apr 23, 2017

The StreamDeserializer type provides streaming JSON deserialization.

extern crate serde_json;

use serde_json::{Deserializer, Value};

fn main() {
    let data = "{\"k\": 3}  {}  [0, 1, 2]";

    // You can make it a stream of any type, in this case serde_json::Value.
    let stream = Deserializer::from_str(data).into_iter::<Value>();

    for value in stream {
        println!("{}", value.unwrap());
    }
}

If this isn't what you mean, can you give an example of streaming parsing?

vmx (Member) commented Apr 23, 2017

By a streaming parser I mean one that starts parsing before the input has been fully read, so that it can parse large JSON files without keeping them completely in memory.
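To make that definition concrete, here is a toy sketch that is not from the thread and is not a JSON parser. It only illustrates the streaming property: it consumes any `Read` source one byte at a time and keeps a few counters, so memory use does not grow with input size. The function name `max_depth` is hypothetical, and the input is assumed to be well-formed JSON.

```rust
use std::io::{self, BufReader, Read};

/// Scans JSON-like input incrementally and returns the deepest nesting
/// level of arrays and objects. Only a few counters are kept in memory.
fn max_depth<R: Read>(reader: R) -> io::Result<usize> {
    let mut depth = 0usize;
    let mut max = 0usize;
    let mut in_string = false;
    let mut escaped = false;

    for byte in BufReader::new(reader).bytes() {
        let b = byte?;
        if in_string {
            // Inside a string, brackets are plain data. Track escapes so
            // that an escaped quote does not end the string.
            if escaped {
                escaped = false;
            } else if b == b'\\' {
                escaped = true;
            } else if b == b'"' {
                in_string = false;
            }
            continue;
        }
        match b {
            b'"' => in_string = true,
            b'{' | b'[' => {
                depth += 1;
                max = max.max(depth);
            }
            b'}' | b']' => depth = depth.saturating_sub(1),
            _ => {}
        }
    }
    Ok(max)
}

fn main() -> io::Result<()> {
    // Any `Read` works: a `File`, a socket, or (here) an in-memory slice.
    let input = br#"{"a": [1, {"b": "]]"}], "c": []}"#;
    println!("max depth: {}", max_depth(&input[..])?);
    Ok(())
}
```

Passing a `File` instead of the byte slice would process a file of any size with the same constant memory footprint.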

vmx (Member) commented Apr 23, 2017

To be fair, we don't use rustc-serialize that way at the moment, and to be honest I'm not even sure that's possible with rustc-serialize.

dtolnay (Author) commented Apr 23, 2017

I see. I looked at all three links in src/json_shred.rs and I think by that definition practically every use of serde_json is a streaming parser. Those links are just really complicated workarounds for not buffering the entire input into a rustc_serialize::json::Json value before decoding it, which serde_json doesn't do. Unlike in rustc_serialize, Serde's deserialize implementations can do whatever stream processing they want without those workarounds. Here is one example. The input data also doesn't have to be in memory if you use serde_json::from_reader.

If there is a specific task you want to accomplish in a streaming way I can help you work through it.

vmx (Member) commented Apr 24, 2017

After long discussions on IRC (thanks @dtolnay for having so much time for me) I have a rough template for the streaming parser that I had in mind. This code should be the basis for replacing rustc_serialize:

extern crate serde;
extern crate serde_json;

use std::fmt;
use std::fs::File;

use serde::de::{Deserialize, Deserializer, DeserializeSeed, MapAccess, SeqAccess, Visitor};


struct ValueVisitor;


impl<'de> DeserializeSeed<'de> for ValueVisitor {
    type Value = ();

    fn deserialize<D>(self, deserializer: D) -> Result<(), D::Error>
        where D: serde::Deserializer<'de>
    {
        deserializer.deserialize_any(self)
    }
}


impl<'de> Visitor<'de> for ValueVisitor {
    type Value = ();

    fn expecting(&self, formatter: &mut fmt::Formatter) -> fmt::Result {
        formatter.write_str("any valid JSON value")
    }

    #[inline]
    fn visit_bool<E>(self, value: bool) -> Result<(), E> {
        println!("vmx: visit_bool: {:?}", value);
        Ok(())
    }

    #[inline]
    fn visit_i64<E>(self, value: i64) -> Result<(), E> {
        println!("vmx: visit_i64: {:?}", value);
        Ok(())
    }

    #[inline]
    fn visit_u64<E>(self, value: u64) -> Result<(), E> {
        println!("vmx: visit_u64: {:?}", value);
        Ok(())
    }

    #[inline]
    fn visit_f64<E>(self, value: f64) -> Result<(), E> {
        println!("vmx: visit_f64: {:?}", value);
        Ok(())
    }

    #[inline]
    fn visit_str<E>(self, value: &str) -> Result<(), E>
        where
        E: serde::de::Error,
    {
        // The borrowed string is only inspected here, never copied.
        println!("vmx: visit_str: {:?}", value);
        Ok(())
    }

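    // Not reached for JSON input via `deserialize_any`: serde_json reports
    // `null` through `visit_unit`. Kept for formats that distinguish options.
    #[inline]
    fn visit_none<E>(self) -> Result<(), E> {
        println!("vmx: visit_none");
        Ok(())
    }

    #[inline]
    fn visit_some<D>(self, deserializer: D) -> Result<(), D::Error>
        where
        D: serde::Deserializer<'de>,
    {
        // Visit the wrapped value with this same visitor.
        deserializer.deserialize_any(self)
    }

    // JSON `null` arrives here when driven by `deserialize_any`.
    #[inline]
    fn visit_unit<E>(self) -> Result<(), E> {
        println!("vmx: visit_null");
        Ok(())
    }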

    #[inline]
    fn visit_seq<V>(self, mut visitor: V) -> Result<(), V::Error>
        where
        V: SeqAccess<'de>,
    {
        println!("vmx: visit_seq: start");
        while let Some(_) = visitor.next_element_seed(ValueVisitor)? {}
        println!("vmx: visit_seq: end");
        Ok(())
    }

    fn visit_map<V>(self, mut visitor: V) -> Result<(), V::Error>
        where
        V: MapAccess<'de>,
    {
        println!("vmx: visit_map: start");
        while let Some(_) = visitor.next_entry_seed(ValueVisitor, ValueVisitor)? {}
        println!("vmx: visit_map: end");
        Ok(())
    }
}


fn main() {
    // Buffer the file reads; the document itself is never held in memory.
    let data = std::io::BufReader::new(File::open("test.json").unwrap());
    // Drive the visitor over the input and fail loudly on a parse error.
    serde_json::Deserializer::from_reader(data)
        .deserialize_any(ValueVisitor)
        .unwrap();
}

vmx self-assigned this May 2, 2017