Serde framework in Rust
serde-rs
Serde is the de facto standard for serializing and deserializing data in Rust. It does not implement any data format itself; instead it provides an API, plus helpers built around it, into which a format's serializer can be plugged with minimal effort. Out of the box, Serde can serialize and deserialize common Rust data types in any of the supported formats. There are also derive macros that let you do the same for your own custom structs.
The data model
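There are two pairs of traits. A data format implements the Serializer and Deserializer traits, and each data structure implements Serialize and Deserialize. The data model is the set of types through which the two sides talk to each other.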
Types
There are a total of 29 types in the Serde data model.
Primitives:
- bool
- i8, i16, i32, i64, i128
- u8, u16, u32, u64, u128
- f32, f64
- char
String:
- UTF-8 string
- They may be transient, borrowed, or owned during deserializing.
Byte array: same as a string, but holding arbitrary bytes instead of UTF-8 text
Option type:
- Either none or some value
Unit:
- Unnamed and no data
Unit struct:
- Named and no data
Unit variant:
- Enum variant without data
Newtype struct:
- Same as a newtype struct in Rust: a named wrapper around a single value
Newtype variant:
- Enum variant wrapping a single value
Sequence type:
- Akin to a HashSet or a Vec
- Heterogeneous
- Dynamically sized
Tuple:
- Statically sized and heterogeneous
- Same as a Rust tuple
- Also covers anything whose length is known at compile time, so fixed-length arrays count too
Tuple struct:
- Named tuples
Tuple variants:
- Named tuple constructors for enums
Map:
- Variably sized, heterogeneous key-value pairs
Struct:
- Same as a Rust struct
- Keys are strings known at compile time
Struct variant:
- Enum constructor as a struct
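As a rough illustration of the list above, the sketch below declares one Rust item per struct-like and enum-like category. The type names (Marker, Millimeters, Rgb, Point, Shape) and the category helper are made up for this example, and it uses only the standard library, so it shows the shapes Serde distinguishes rather than Serde's own API.

```rust
// Hypothetical types, one per struct-like or enum-like category of the
// Serde data model. Only the standard library is used; nothing here calls Serde.

#[derive(Debug)]
struct Marker; // unit struct: named, no data

#[allow(dead_code)] // fields are only printed
#[derive(Debug)]
struct Millimeters(u8); // newtype struct: named wrapper around one value

#[allow(dead_code)]
#[derive(Debug)]
struct Rgb(u8, u8, u8); // tuple struct: named tuple

#[allow(dead_code)]
#[derive(Debug)]
struct Point { x: i32, y: i32 } // struct: keys "x" and "y" known at compile time

#[allow(dead_code)]
#[derive(Debug)]
enum Shape {
    Empty,                   // unit variant
    Dot(Millimeters),        // newtype variant
    Segment(i32, i32),       // tuple variant
    Rect { w: u32, h: u32 }, // struct variant
}

/// Returns the data model category that a `Shape` value falls into.
fn category(shape: &Shape) -> &'static str {
    match shape {
        Shape::Empty => "unit variant",
        Shape::Dot(_) => "newtype variant",
        Shape::Segment(..) => "tuple variant",
        Shape::Rect { .. } => "struct variant",
    }
}

fn main() {
    let unit: () = ();                 // unit
    let tuple: (u8, &str) = (1, "a");  // tuple: fixed size, mixed types
    let array: [u64; 3] = [1, 2, 3];   // fixed-length array: also a tuple
    let seq: Vec<i32> = vec![1, 2, 3]; // sequence: dynamically sized
    let opt: Option<i32> = Some(5);    // option
    println!("{:?} {:?} {:?} {:?} {:?}", unit, tuple, array, seq, opt);
    println!("{:?} {:?} {:?} {:?}", Marker, Millimeters(3), Rgb(0, 0, 0), Point { x: 1, y: 2 });

    let shapes = [
        Shape::Empty,
        Shape::Dot(Millimeters(3)),
        Shape::Segment(0, 1),
        Shape::Rect { w: 1, h: 2 },
    ];
    for shape in &shapes {
        println!("{:?} is a {}", shape, category(shape));
    }
}
```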
Serialization
First is Serialize and then Serializer.
So the flow goes like this for serializing:
<Data Struct> ---- > <Serialize> ---- > <Serializer> ---- > <Output>
- Serialize
This maps the data structure onto the Serde data model by invoking exactly one method of the Serializer.
The trait itself has only one method.

    pub trait Serialize {
        fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
        where
            S: Serializer;
    }

If you're implementing this yourself, you just need to find the one method in Serializer that serializes the type you're implementing it for.
Sequence types follow a three step process:
- init with the length
- serialize all the elements
- end
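
    use serde::ser::{Serialize, SerializeMap, SerializeSeq, Serializer};

    // Serde already ships this impl for Vec<T>; it is repeated here for illustration.
    impl<T> Serialize for Vec<T>
    where
        T: Serialize,
    {
        fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
        where
            S: Serializer,
        {
            // 1. init with the length
            let mut seq = serializer.serialize_seq(Some(self.len()))?;
            // 2. serialize all the elements
            for e in self {
                seq.serialize_element(e)?;
            }
            // 3. end
            seq.end()
        }
    }

    // Maps follow the same three steps, with entries instead of elements.
    // MyMap is a user-defined map type, assumed to provide len() and
    // iteration over (key, value) pairs.
    impl<K, V> Serialize for MyMap<K, V>
    where
        K: Serialize,
        V: Serialize,
    {
        fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
        where
            S: Serializer,
        {
            let mut map = serializer.serialize_map(Some(self.len()))?;
            for (k, v) in self {
                map.serialize_entry(k, v)?;
            }
            map.end()
        }
    }

For tuples, the length is already known from the type, so it need not be serialized into the output.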
Regular structs and tuple structs are serialized the same way as sequences and tuples. Newtype structs and unit structs behave more like primitives: each is a single Serializer call.
For enums, match on the variant and then serialize that variant with the Serializer method for its kind (unit, newtype, tuple, or struct).
- Serializer
This implements the mapping from the Serde data model to the output format. The type it is implemented for typically holds little more than an output field, and the impl supplies the many serialize_* methods. Apart from that, it may also have a few helpers here and there.
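The flow above can be sketched with a toy pair of traits. MiniSerialize, MiniSerializer, TextSerializer, Level, and to_text are names invented for this sketch; they are simplified, std-only stand-ins and not Serde's real API, which is fallible (it returns Result), uses associated Ok and Error types, and has many more methods. The text format produced is JSON-like but does no string escaping.

```rust
// Toy model of: Data Struct -> Serialize -> Serializer -> Output.
// All names are hypothetical, std-only stand-ins for Serde's real traits.

/// Format side: maps data model calls onto an output format.
trait MiniSerializer {
    fn serialize_bool(&mut self, v: bool);
    fn serialize_i64(&mut self, v: i64);
    fn serialize_str(&mut self, v: &str);
    fn serialize_unit_variant(&mut self, variant: &'static str);
    fn begin_seq(&mut self, len: Option<usize>);
    fn seq_element<T: MiniSerialize + ?Sized>(&mut self, index: usize, v: &T);
    fn end_seq(&mut self);
}

/// Data side: maps a Rust value onto the data model.
trait MiniSerialize {
    fn serialize<S: MiniSerializer>(&self, s: &mut S);
}

// Primitives invoke exactly one serializer method.
impl MiniSerialize for bool {
    fn serialize<S: MiniSerializer>(&self, s: &mut S) { s.serialize_bool(*self) }
}
impl MiniSerialize for i64 {
    fn serialize<S: MiniSerializer>(&self, s: &mut S) { s.serialize_i64(*self) }
}
impl MiniSerialize for str {
    fn serialize<S: MiniSerializer>(&self, s: &mut S) { s.serialize_str(self) }
}

// Sequences follow the three steps: init with the length, elements, end.
impl<T: MiniSerialize> MiniSerialize for Vec<T> {
    fn serialize<S: MiniSerializer>(&self, s: &mut S) {
        s.begin_seq(Some(self.len()));
        for (i, e) in self.iter().enumerate() {
            s.seq_element(i, e);
        }
        s.end_seq();
    }
}

// Enums match on the variant and serialize that variant.
enum Level { Low, High }
impl MiniSerialize for Level {
    fn serialize<S: MiniSerializer>(&self, s: &mut S) {
        match self {
            Level::Low => s.serialize_unit_variant("Low"),
            Level::High => s.serialize_unit_variant("High"),
        }
    }
}

/// A JSON-like text serializer: just an output buffer plus methods.
struct TextSerializer { output: String }

impl MiniSerializer for TextSerializer {
    fn serialize_bool(&mut self, v: bool) { self.output += if v { "true" } else { "false" }; }
    fn serialize_i64(&mut self, v: i64) { self.output += &v.to_string(); }
    fn serialize_str(&mut self, v: &str) {
        // No escaping: good enough for a sketch, not for real JSON.
        self.output.push('"');
        self.output += v;
        self.output.push('"');
    }
    fn serialize_unit_variant(&mut self, variant: &'static str) { self.serialize_str(variant) }
    fn begin_seq(&mut self, _len: Option<usize>) { self.output.push('['); }
    fn seq_element<T: MiniSerialize + ?Sized>(&mut self, index: usize, v: &T) {
        if index > 0 { self.output.push(','); }
        v.serialize(self);
    }
    fn end_seq(&mut self) { self.output.push(']'); }
}

/// Helper: run a value through the toy serializer and return the output.
fn to_text<T: MiniSerialize + ?Sized>(value: &T) -> String {
    let mut s = TextSerializer { output: String::new() };
    value.serialize(&mut s);
    s.output
}

fn main() {
    println!("{}", to_text(&vec![1i64, 2, 3]));
    println!("{}", to_text("hi"));
    println!("{} {}", to_text(&Level::Low), to_text(&Level::High));
}
```

The point of the split is that Vec<T> and Level never mention the text format, and TextSerializer never mentions Vec or Level; swapping in a different MiniSerializer would change the output without touching the data types.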
Deserialization
The flow is like this:
                             passed to the deserializer and driven
                           +---------------------------------------+
              wrap         |                             create    |
<Data format> ---- > <Deserializer> ---- > <Deserialize> ---- > <Visitor> ---- > <Data Struct>
The Deserialize trait basically hands the Deserializer a Visitor. The Deserializer then drives that Visitor with whatever it parses from the data format, and that is what moves the parsing forward.
- Deserialize
The trait looks like this. The body of an impl should be fairly straightforward. It'll pass a Visitor for the type it expects, use the result and maybe do something useful with it, then maybe call some more visitors, and finally spit out a result.

    pub trait Deserialize<'de>: Sized {
        fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
        where
            D: Deserializer<'de>;
    }

- Deserializer
This supports two kinds of entry points, which enable different kinds of deserialization.
- deserialize_any: Look at the input and decide for yourself. This is mostly used for self-describing formats such as JSON, where you can see what the next type is going to be.
- deserialize_*: Use these methods to deserialize non-self-describing formats such as Postcard. They act as hints that tell the deserializer what to deserialize next. Such formats can't deserialize something like serde_json::Value, which depends on deserialize_any.
Internally, it keeps track of the input it has read and deserializes it piece by piece. The trait methods just hand the parsed values over to the Visitor.
The reason it has a lifetime is that it'll internally try to keep as many references to the original data as possible, allowing the deserialized values to borrow from the input instead of copying it.
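The idea of keeping references to the original data can be pictured with a std-only sketch. Pair and parse_pair are hypothetical names, and this is not Serde code; it only illustrates a parsed value whose fields are slices of the input buffer, with the lifetime named 'de to echo Deserialize<'de>.

```rust
// Hypothetical std-only illustration of a parsed value borrowing from its input.
// `'de` names the lifetime of the input buffer, echoing Deserialize<'de>.

#[derive(Debug, PartialEq)]
struct Pair<'de> {
    key: &'de str,   // points into the input, no allocation
    value: &'de str, // likewise
}

/// Splits "key=value" and returns slices of `input` rather than copies.
/// Returns None when the input contains no '='.
fn parse_pair<'de>(input: &'de str) -> Option<Pair<'de>> {
    let (key, value) = input.split_once('=')?;
    Some(Pair { key, value })
}

fn main() {
    let input = String::from("name=serde");
    let pair = parse_pair(&input).expect("input contains '='");
    // The key starts at the same address as the input buffer: nothing was copied.
    assert!(pair.key.as_ptr() == input.as_ptr());
    println!("{:?}", pair);
}
```

Because Pair borrows from input, the compiler will not let pair outlive input; that constraint is what the lifetime parameter expresses.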
- Visitor
A Visitor is passed to the Deserializer so that the Deserializer can call the visit method matching each Serde data model type it encounters, letting the Visitor convert that value into the target type. Any visit method the Visitor leaves unimplemented is treated as an error.
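The interplay between Deserialize, Deserializer, and Visitor can be sketched with toy traits. MiniVisitor, MiniDeserializer, MiniDeserialize, TokenDeserializer, and I64Visitor are names invented for this sketch; they are simplified, std-only stand-ins and not Serde's real traits, which carry a 'de lifetime, associated error types, and many more methods. The toy input format is a single token that is either an integer or a string.

```rust
// Toy model of: Data format -> Deserializer -> Deserialize -> Visitor -> Data Struct.
// All names are hypothetical, std-only stand-ins for Serde's real traits.

/// Data side: one method per data model type, each defaulting to an error.
trait MiniVisitor: Sized {
    type Value;
    fn expecting(&self) -> &'static str;
    fn visit_i64(self, _v: i64) -> Result<Self::Value, String> {
        Err(format!("unexpected integer, expected {}", self.expecting()))
    }
    fn visit_str(self, _v: &str) -> Result<Self::Value, String> {
        Err(format!("unexpected string, expected {}", self.expecting()))
    }
}

/// Format side: two kinds of entry points.
trait MiniDeserializer: Sized {
    /// Self-describing entry point: inspect the input and pick the visit method.
    fn deserialize_any<V: MiniVisitor>(self, visitor: V) -> Result<V::Value, String>;
    /// Hinted entry point: the caller states that an integer comes next.
    fn deserialize_i64<V: MiniVisitor>(self, visitor: V) -> Result<V::Value, String>;
}

/// Data side: hands a Visitor to the Deserializer.
trait MiniDeserialize: Sized {
    fn deserialize<D: MiniDeserializer>(d: D) -> Result<Self, String>;
}

/// Wraps the data format: here, one token of text.
struct TokenDeserializer<'a> { input: &'a str }

impl<'a> MiniDeserializer for TokenDeserializer<'a> {
    fn deserialize_any<V: MiniVisitor>(self, visitor: V) -> Result<V::Value, String> {
        match self.input.parse::<i64>() {
            Ok(n) => visitor.visit_i64(n),
            Err(_) => visitor.visit_str(self.input),
        }
    }
    fn deserialize_i64<V: MiniVisitor>(self, visitor: V) -> Result<V::Value, String> {
        let n = self.input.parse::<i64>().map_err(|e| e.to_string())?;
        visitor.visit_i64(n)
    }
}

/// Visitor that only accepts integers.
struct I64Visitor;

impl MiniVisitor for I64Visitor {
    type Value = i64;
    fn expecting(&self) -> &'static str { "an integer" }
    fn visit_i64(self, v: i64) -> Result<i64, String> { Ok(v) }
    // visit_str keeps its default body, so strings are rejected.
}

impl MiniDeserialize for i64 {
    fn deserialize<D: MiniDeserializer>(d: D) -> Result<Self, String> {
        d.deserialize_i64(I64Visitor)
    }
}

fn main() {
    println!("{:?}", <i64 as MiniDeserialize>::deserialize(TokenDeserializer { input: "42" }));
    println!("{:?}", TokenDeserializer { input: "7" }.deserialize_any(I64Visitor));
    println!("{:?}", TokenDeserializer { input: "hello" }.deserialize_any(I64Visitor));
}
```

In this sketch the error for "hello" under deserialize_any comes from the default visit_str, which is the toy equivalent of an unimplemented visit method being treated as an error.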
