Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for writing SExpression in code generation #107

Merged
merged 8 commits into from
May 9, 2024

Conversation

desaikd
Copy link
Contributor

@desaikd desaikd commented May 2, 2024

Issue #105:

Description of changes:

This PR works on adding support for writing SExpression in code generation.

Generates Code:

  • Generated code for given ISL test files in Java can be found here
  • Generated code for given ISL test files in Rust can be found here

List of changes:

  • Adds changes to support sequence types with type constraint
    • Modifies AbstractDataType::Sequence to store sequence type (sexp or
      list) and element type
    • Adds element_type() and sequence_type() methods on
      AbstractDataType to get element type and sequence type details
    • Renames verify_abstract_data_type_consistency to
      verify_and_update_abstract_data_type
    • verify_and_update_abstract_data_type modified to check for special
      cases when type and element cosntraints occur together
    • JavaLanguage and RustLanguage updated target_type to return
      generic type for List and SExp
    • Adds wrapper_class in JavaLanguage to be sued within ArrayList
  • Adds result template to generate SerdeResult in Rust
    • Adds template to generate SerdeResult in Rust
    • Changes current tests to use SerdeResult
    • Changes generator to render SerdeReuslt in output module
  • Adds template file changes for SExp support
  • Adds new test files for SExp support
    • Adds new fields that uses SExpression in Ion files
    • Adds bad Ion files to verify SExpression and List are properly
      serialized-deserialized
    • Adds ISL files with Sexpression field changes
    • Adds changes to Java setter tests

Tests

Modifies roundtrip tests for given ISL files and input Ion values to verify SExpression support.


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

* Modifies `AbstractDataType::Sequence` to store sequence type (sexp or
list) and element type
* Adds `element_type()` and `sequence_type()` methods on
`AbstractDataType` to get element type and sequence type details
* Renames `verify_abstract_data_type_consistency` to
`verify_and_update_abstract_data_type`
* `verify_and_update_abstract_data_type` modified to check for special
cases when `type` and `element` cosntraints occur together
* `JavaLanguage` and `RustLanguage` updated `target_type` to return
generic type for `List` and `SExp`
* Adds `wrapper_class` in `JavaLanguage` to be sued within `ArrayList`
* Adds template to generate `SerdeResult` in Rust
* Changes current tests to use `SerdeResult`
* Changes generator to render `SerdeReuslt` in output module
* Modifies class and struct template to add check for sequence type in
write_to API
* Modifies class and struct template to add check for sequence type in
read_from API
* Modifies struct template to use the new `SerdeResult` in `read_from`
and `write_to` API
* Adds new fields that uses SExpression in Ion files
* Adds bad Ion files to verify SExpression and List are properly
serialized-deserialized
* Adds ISL files with Sexpression field changes
* Adds changes to Java setter tests
@desaikd desaikd requested a review from popematt May 2, 2024 21:53
@@ -0,0 +1,9 @@
// nested struct with mismatched sequence type
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(🗺️ PR tour) Adds extra fields in input files to test SExpression support.

@@ -0,0 +1,32 @@
/// Represents serde result
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(🗺️ PR tour) Adds new SerdeResult which will be generated for Rust code generation and used by read_from and write_to APIs.

@@ -21,8 +21,8 @@ mod tests {
}

#[test_resources("../../input/good/struct_with_fields/**/*.ion")]
fn roundtrip_good_test_generated_code_structs_with_fields(file_name: &str) -> IonResult<()> {
let ion_string = fs::read_to_string(file_name)?;
fn roundtrip_good_test_generated_code_structs_with_fields(file_name: &str) -> SerdeResult<()> {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(🗺️ PR tour) This file includes SerdeResult usage changes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per offline discussion, SerdeResult will be a type that is generated alongside all the other code. This will guarantee that it's always available in codebases using codegen and does not require packaging a library per target language. It also means that we can evolve it more easily over time.

@@ -317,9 +317,9 @@ fn test_code_generation_in_rust(
assert!(contents.contains(expected_accessor));
}
// verify that it generates read-write APIs
assert!(contents.contains("pub fn read_from(reader: &mut Reader) -> IonResult<Self> {"));
assert!(contents.contains("pub fn read_from(reader: &mut Reader) -> SerdeResult<Self> {"));
Copy link
Contributor Author

@desaikd desaikd May 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(🗺️ PR tour) This file includes SerdeResult usage changes.

Comment on lines 56 to 59
Sequence {
element_type: String,
sequence_type: String,
},
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(🗺️ PR tour) Changes Sequence variant to include information of sequence_type ("list" or "sexp") and element_type.

@@ -75,14 +79,30 @@ pub enum AbstractDataType {
Structure(bool),
}

impl AbstractDataType {
pub fn element_type(&self) -> Option<&String> {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(🗺️ PR tour) Adds element_type and sequence_type APIs to get Sequence variant information from AbstractDataType. These methods are used in verify_and_update_abstract_data_type which verifies that the abstract data type to be generated doesn't result into different type due to given ISL constraints.

let rendered = tera.render("import.templ", &Context::new()).unwrap();
let rendered_import = tera.render("import.templ", &Context::new()).unwrap();
// Render the SerdeResult that is used in generated read-write APIs
let rendered_result = tera.render("result.templ", &Context::new()).unwrap();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(🗺️ PR tour) This statement renders SerdeResult for Rust code generation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this might not be a good idea. If we need a SerdeResult of some sort, it should probably be defined in a "ion-codegen-runtime" crate.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copied from another comment for convenient reference:

Per offline discussion, SerdeResult will be a type that is generated alongside all the other code. This will guarantee that it's always available in codebases using codegen and does not require packaging a library per target language. It also means that we can evolve it more easily over time.

@@ -386,14 +423,95 @@ impl<'a, L: Language + 'static> CodeGenerator<'a, L> {
/// ```
/// For the above schema, both `fields` and `type` constraints map to different abstract data types
/// respectively Struct(with given fields `source` and `destination`) and Value(with a single field that has String data type).
fn verify_abstract_data_type_consistency(
fn verify_and_update_abstract_data_type(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(🗺️ PR tour) This method has changes for when type and element constraints are used together. This method updates AbstractDataType::Sequence { element_type, sequence_type } to contain element_type as per element constraint and sequence_type as per type constraint.

Comment on lines 87 to 95
{% if abstract_data_type["Sequence"].sequence_type == "list" %}
if(reader.getType() != IonType.LIST) {
throw new IonException("Expected list, found" + reader.getType() + "while reading {{ target_kind_name }}.");
}
{% elif abstract_data_type["Sequence"].sequence_type == "sexp" %}
if(reader.getType() != IonType.SEXP) {
throw new IonException("Expected sexpression, found" + reader.getType() + "while reading {{ target_kind_name }}.");
}
{% endif %}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(🗺️ PR tour) Adds exception statements for when sequence_type(list or sexp) doesn't match while reading Ion data.

Comment on lines 112 to 116
{% if abstract_data_type["Sequence"].sequence_type == "list" %}
writer.step_in(IonType::List)?;
{% else %}
writer.step_in(IonType::SExp)?;
{% endif %}
Copy link
Contributor Author

@desaikd desaikd May 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(🗺️ PR tour) Adds write statements based on list or sexp.

@desaikd desaikd requested a review from nirosys May 2, 2024 22:12
StructWithFields s = new StructWithFields("hello", 12, new AnonymousType2(a), 10e2);
StructWithFields s = new StructWithFields("hello", 12, new AnonymousType3(a), 10e2);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is exactly why we need #103

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will be working on this in the next PR :)

code-gen-projects/schema/nested_struct.isl Outdated Show resolved Hide resolved
let rendered = tera.render("import.templ", &Context::new()).unwrap();
let rendered_import = tera.render("import.templ", &Context::new()).unwrap();
// Render the SerdeResult that is used in generated read-write APIs
let rendered_result = tera.render("result.templ", &Context::new()).unwrap();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this might not be a good idea. If we need a SerdeResult of some sort, it should probably be defined in a "ion-codegen-runtime" crate.

src/bin/ion/commands/beta/generate/generator.rs Outdated Show resolved Hide resolved
Comment on lines 4 to 10
/// Represents an error found during code generation
#[derive(Debug)]
pub enum SerdeError {
IonError { source: IonError },
SerializationError { description: String },
DeserializationError { description: String },
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this should be included in the template. If you ever try to compose code that was generated from different models, you'll have two different SerdeErrors with the same name, but you'll have to write code to be able to convert form one to the other.

Can you explain the rationale for not using IonError and IonResult?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main reason is to distinguish between IonError (while reading Ion or writing Ion) and validation error based on ISL requirements. Another is IonFailure is non-public and hence we can't create IonError or IonResult from another crate. But I do understand your point here. There can be few options here and if you have better ideas let me know:

  1. We can make ValidationError(SerdeError) part of ion-schema-rust?
  2. Make IonFailure public and use it for validation as well?
  3. Use expect statements for all validation errors?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Option 1 is possible, but not great. We should not require users to take a dependency on ion-schema-rust just to use the generated code.

Option 2 is reasonable, though I'd consider limiting the scope of what gets made public. Right now, if you have an Element that is an Ion int, and you call expect_float(), you will get a DecodingError. It's really a sort of type constraint error. Perhaps DecodingError is inappropriately overloaded, but using it is a possibility that doesn't introduce meaning other than the meaning it already has.

I think Option 3 is right out. Bad input is to be expected and should be easily recoverable; calling expect() can panic, which is not as easy to recover as an Err.

I think a 4th option is to add a new (possibly feature-gated) Result/Error type in ion-rust that is specific for this case and has a public constructor.

Finally, we could create a new crate called ion-codegen-runtime-support (or something like that) with the error type and any other common functionality, but this seems a little heavyweight to me.

@zslayton do you have any thoughts on this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My hope was to have IonError mean that ion-rust produced an error, and not something in a layer higher up the stack. Making IonFailure public means committing to an API, but I'm hopeful that we can get more helpful errors (requiring storing extra context) soon.

Right now, if you have an Element that is an Ion int, and you call expect_float(), you will get a DecodingError. It's really a sort of type constraint error.

This is a good observation. I could be convinced that IonError should have a ConstraintError variant; however, I'd still want ISL validation errors to result in a distinct type.


Per our offline conversation, for the time being we're going to generate a ValidationError (name tbd) that's used by all of the types produced in the same codegen run. This covers the 80% use case, and someone using code generated from two or more disparate schemas can still unify them into a common type as they bubble up with ? by implementing From<ValidationError> for UnifiedError. We should probably add an option to give ValidationError a custom name.

Copy link
Contributor Author

@desaikd desaikd May 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used SerdeError as the name here because read_from and write_to APIs could result into either validation error or IonError. And SerdeError sounds like a good name to incorporate both of these errors. Although I can change the variant names of SerdeError enum to point to the specific error that is IonError and ValidationError. What do you guys think? @zslayton @popematt

src/bin/ion/commands/beta/generate/utils.rs Outdated Show resolved Hide resolved
src/bin/ion/commands/beta/generate/utils.rs Outdated Show resolved Hide resolved
* Removes extra tab space
* Uses `Option<String>` to represent if `element_type` or
`sequence_type` is not present for AbtsractDataType
* Adds a check in code gen to return error when any field in
`tera_fields` have `None` value in `value_type`
* Adds tests for `type` and `element` constraint absence
* Modifies tests to include `type` constraint along with `element`
constraint
@@ -21,8 +21,8 @@ mod tests {
}

#[test_resources("../../input/good/struct_with_fields/**/*.ion")]
fn roundtrip_good_test_generated_code_structs_with_fields(file_name: &str) -> IonResult<()> {
let ion_string = fs::read_to_string(file_name)?;
fn roundtrip_good_test_generated_code_structs_with_fields(file_name: &str) -> SerdeResult<()> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per offline discussion, SerdeResult will be a type that is generated alongside all the other code. This will guarantee that it's always available in codebases using codegen and does not require packaging a library per target language. It also means that we can evolve it more easily over time.

src/bin/ion/commands/beta/generate/context.rs Show resolved Hide resolved
src/bin/ion/commands/beta/generate/context.rs Show resolved Hide resolved
let rendered = tera.render("import.templ", &Context::new()).unwrap();
let rendered_import = tera.render("import.templ", &Context::new()).unwrap();
// Render the SerdeResult that is used in generated read-write APIs
let rendered_result = tera.render("result.templ", &Context::new()).unwrap();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copied from another comment for convenient reference:

Per offline discussion, SerdeResult will be a type that is generated alongside all the other code. This will guarantee that it's always available in codebases using codegen and does not require packaging a library per target language. It also means that we can evolve it more easily over time.

src/bin/ion/commands/beta/generate/generator.rs Outdated Show resolved Hide resolved
Comment on lines 4 to 10
/// Represents an error found during code generation
#[derive(Debug)]
pub enum SerdeError {
IonError { source: IonError },
SerializationError { description: String },
DeserializationError { description: String },
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My hope was to have IonError mean that ion-rust produced an error, and not something in a layer higher up the stack. Making IonFailure public means committing to an API, but I'm hopeful that we can get more helpful errors (requiring storing extra context) soon.

Right now, if you have an Element that is an Ion int, and you call expect_float(), you will get a DecodingError. It's really a sort of type constraint error.

This is a good observation. I could be convinced that IonError should have a ConstraintError variant; however, I'd still want ISL validation errors to result in a distinct type.


Per our offline conversation, for the time being we're going to generate a ValidationError (name tbd) that's used by all of the types produced in the same codegen run. This covers the 80% use case, and someone using code generated from two or more disparate schemas can still unify them into a common type as they bubble up with ? by implementing From<ValidationError> for UnifiedError. We should probably add an option to give ValidationError a custom name.

* Minor nit changes for spacing and compact representation
* Changes `SerdeError` enum to include `ValidationError` and removes
unused `SerializationError` and `DeserializationError`
@desaikd desaikd merged commit 7f5a92c into amazon-ion:master May 9, 2024
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants