Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: commit function #297

Merged
merged 3 commits into from
Jul 11, 2024
Merged

Conversation

aner-starkware
Copy link
Contributor

@aner-starkware aner-starkware commented Jul 10, 2024

This change is Reviewable

Copy link

Benchmark movements:
full_committer_flow performance regressed!
full_committer_flow time: [29.652 ms 29.727 ms 29.817 ms]
change: [+2.5601% +2.8884% +3.2441%] (p = 0.00 < 0.05)
Performance has regressed.
Found 6 outliers among 100 measurements (6.00%)
3 (3.00%) high mild
3 (3.00%) high severe

Copy link

Benchmark movements:
full_committer_flow performance regressed!
full_committer_flow time: [29.403 ms 29.454 ms 29.507 ms]
change: [+2.3108% +2.5806% +2.8388%] (p = 0.00 < 0.05)
Performance has regressed.

@codecov-commenter
Copy link

codecov-commenter commented Jul 10, 2024

Codecov Report

Attention: Patch coverage is 0% with 3 lines in your changes missing coverage. Please review.

Project coverage is 70.69%. Comparing base (adf31b2) to head (a67d9ad).
Report is 1 commits behind head on main.

Files Patch % Lines
crates/committer_cli/src/main.rs 0.00% 2 Missing ⚠️
crates/committer_cli/src/commands.rs 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #297      +/-   ##
==========================================
+ Coverage   70.65%   70.69%   +0.03%     
==========================================
  Files          38       38              
  Lines        2021     2020       -1     
  Branches     2021     2020       -1     
==========================================
  Hits         1428     1428              
+ Misses        524      523       -1     
  Partials       69       69              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Contributor Author

@aner-starkware aner-starkware left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Python side:
https://reviewable.io/reviews/starkware-industries/starkware/35461#-

Reviewable status: 0 of 4 files reviewed, all discussions resolved (waiting on @TzahiTaub and @yoavGrs)

Copy link
Contributor

@TzahiTaub TzahiTaub left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 4 of 4 files at r1, all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @aner-starkware and @yoavGrs)


crates/committer_cli/src/commands.rs line 10 at r1 (raw file):

};

pub async fn commit(input: Input<ConfigImpl>, output_path: String) {

I don't know if it makes more sense for this to be Input or RawInputso I'm just raising the question @dorimedini-starkware

Code quote:

commit(input: Input<ConfigImpl>

crates/committer_cli/src/tests/regression_tests.rs line 50 at r1 (raw file):

        ))
    }
}

IIUC the purpose of this wrapper, can't you add something like
#[cfg_attr(test, derive(Deserialize))] above the Input declaration and remove this struct?

Code quote:

struct CommitterInput(Input<ConfigImpl>);

impl<'de> Deserialize<'de> for CommitterInput {
    fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
    where
        D: Deserializer<'de>,
    {
        Ok(Self(
            parse_input(&String::deserialize(deserializer)?).unwrap(),
        ))
    }
}

Copy link
Collaborator

@dorimedini-starkware dorimedini-starkware left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @aner-starkware, @TzahiTaub, and @yoavGrs)


crates/committer_cli/benches/committer_bench.rs line 69 at r1 (raw file):

    criterion.bench_function("full_committer_flow", |benchmark| {
        benchmark.iter(|| {
            // TODO(Aner, 10/07/2024): replace functions with running main.

WDYM? you want to open a subprocess that runs the committer_cli?

Code quote:

// TODO(Aner, 10/07/2024): replace functions with running main.

crates/committer_cli/src/commands.rs line 10 at r1 (raw file):

Previously, TzahiTaub (Tzahi) wrote…

I don't know if it makes more sense for this to be Input or RawInputso I'm just raising the question @dorimedini-starkware

I think we should separate deserialization of input from the actual business logic, so I'm in favor of Input


crates/committer_cli/src/tests/regression_tests.rs line 50 at r1 (raw file):

Previously, TzahiTaub (Tzahi) wrote…

IIUC the purpose of this wrapper, can't you add something like
#[cfg_attr(test, derive(Deserialize))] above the Input declaration and remove this struct?

I also don't think you need a wrapper; just impl<'de, T: Config> Deserialize<'de> for Input<T> { ... }

Copy link
Contributor

@TzahiTaub TzahiTaub left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @dorimedini-starkware and @yoavGrs)


crates/committer_cli/benches/committer_bench.rs line 69 at r1 (raw file):

Previously, dorimedini-starkware wrote…

WDYM? you want to open a subprocess that runs the committer_cli?

I think we discussed about calling the main function of the committer_cli directly.

Copy link
Collaborator

@dorimedini-starkware dorimedini-starkware left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @TzahiTaub and @yoavGrs)


crates/committer_cli/benches/committer_bench.rs line 69 at r1 (raw file):

Previously, TzahiTaub (Tzahi) wrote…

I think we discussed about calling the main function of the committer_cli directly.

I don't think that's good practice; it's the entry point of the binary (only argument parsing and logger initialization). the contents of this main is only this:

            let input_string = read_from_stdin();
            commit(&input_string, output_path).await;

... so if you want to include input deserialization, just extract these two lines into a function, and call it

Copy link
Collaborator

@dorimedini-starkware dorimedini-starkware left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @TzahiTaub and @yoavGrs)


crates/committer_cli/benches/committer_bench.rs line 69 at r1 (raw file):

Previously, dorimedini-starkware wrote…

I don't think that's good practice; it's the entry point of the binary (only argument parsing and logger initialization). the contents of this main is only this:

            let input_string = read_from_stdin();
            commit(&input_string, output_path).await;

... so if you want to include input deserialization, just extract these two lines into a function, and call it

... but since you're not using stdin, I see no reason to do even that

@aner-starkware
Copy link
Contributor Author

crates/committer_cli/benches/committer_bench.rs line 69 at r1 (raw file):

Previously, dorimedini-starkware wrote…

... but since you're not using stdin, I see no reason to do even that

I think the logic is that:

  1. If someone changes main (e.g., using a different commit according some flag), these changes are automatically transferred to this benchmarking, they need to be written manually.
  2. We're not benchmarking exactly the same thing. For example, we don't time reading from std_in (we assume that read_from_stdin() is cheap, but we might be missing something like when we wrote to file wrong).

Copy link
Collaborator

@dorimedini-starkware dorimedini-starkware left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @aner-starkware, @TzahiTaub, and @yoavGrs)


crates/committer_cli/benches/committer_bench.rs line 69 at r1 (raw file):

Previously, aner-starkware wrote…

I think the logic is that:

  1. If someone changes main (e.g., using a different commit according some flag), these changes are automatically transferred to this benchmarking, they need to be written manually.
  2. We're not benchmarking exactly the same thing. For example, we don't time reading from std_in (we assume that read_from_stdin() is cheap, but we might be missing something like when we wrote to file wrong).

if you want to invoke the CLI binary (i.e. subprocess?), then using main is good.
because of (1) and (2), this is probably a good idea :)
but no reason to call the main function from within rust IMO, unless you want to pass arguments (which is the point of the main function).

@aner-starkware
Copy link
Contributor Author

crates/committer_cli/src/tests/regression_tests.rs line 50 at r1 (raw file):

Previously, dorimedini-starkware wrote…

I also don't think you need a wrapper; just impl<'de, T: Config> Deserialize<'de> for Input<T> { ... }

This cannot be done without a larger refactoring, because Input is in a different crate (committer) than the function that deserializes it (parse_input in committer_cli).

@aner-starkware
Copy link
Contributor Author

crates/committer_cli/benches/committer_bench.rs line 69 at r1 (raw file):

Previously, dorimedini-starkware wrote…

if you want to invoke the CLI binary (i.e. subprocess?), then using main is good.
because of (1) and (2), this is probably a good idea :)
but no reason to call the main function from within rust IMO, unless you want to pass arguments (which is the point of the main function).

I don't understand

Copy link
Collaborator

@dorimedini-starkware dorimedini-starkware left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @TzahiTaub and @yoavGrs)


crates/committer_cli/src/tests/regression_tests.rs line 50 at r1 (raw file):

Previously, aner-starkware wrote…

This cannot be done without a larger refactoring, because Input is in a different crate (committer) than the function that deserializes it (parse_input in committer_cli).

ah... right, cannot impl T for S if both T and S are defined outside the current crate.
your solution (wrapping Input in a type defined within the crate) seems like the good one

Copy link
Collaborator

@dorimedini-starkware dorimedini-starkware left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @aner-starkware, @TzahiTaub, and @yoavGrs)


crates/committer_cli/benches/committer_bench.rs line 69 at r1 (raw file):

Previously, aner-starkware wrote…

I don't understand

I agree with you that we should run main here.
however, you shouldn't "call main" directly from rust code, you should be building the release binary and running it.
if this is impossible, then I don't think we can do better than the current solution: main may change but it's logic is mainly for argument parsing anyway, which is only relevant if you run the binary...
tl;dr: I would keep the code as-is for now, and in subsequent work see if the binary itself can be run from the context of the benchmark

@aner-starkware
Copy link
Contributor Author

crates/committer_cli/src/tests/regression_tests.rs line 50 at r1 (raw file):

Previously, dorimedini-starkware wrote…

ah... right, cannot impl T for S if both T and S are defined outside the current crate.
your solution (wrapping Input in a type defined within the crate) seems like the good one

It would be good if it weren't for the annoying .0 that needs to be added...

Copy link
Collaborator

@dorimedini-starkware dorimedini-starkware left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @TzahiTaub and @yoavGrs)


crates/committer_cli/src/tests/regression_tests.rs line 50 at r1 (raw file):

Previously, aner-starkware wrote…

It would be good if it weren't for the annoying .0 that needs to be added...

if you add derive_more::Deref this issue should go away usually

Copy link

Benchmark movements:
tree_computation_flow performance regressed!
tree_computation_flow time: [34.246 ms 34.634 ms 35.103 ms]
change: [+2.4263% +3.6971% +4.9697%] (p = 0.00 < 0.05)
Performance has regressed.
Found 12 outliers among 100 measurements (12.00%)
2 (2.00%) high mild
10 (10.00%) high severe

full_committer_flow performance regressed!
full_committer_flow time: [29.009 ms 29.047 ms 29.090 ms]
change: [+1.7006% +1.9061% +2.1102%] (p = 0.00 < 0.05)
Performance has regressed.
Found 2 outliers among 100 measurements (2.00%)
1 (1.00%) high mild
1 (1.00%) high severe

@aner-starkware
Copy link
Contributor Author

crates/committer_cli/src/tests/regression_tests.rs line 50 at r1 (raw file):

Previously, dorimedini-starkware wrote…

if you add derive_more::Deref this issue should go away usually

How do you add it? I couldn't figure it out. But I implemented Deref directly, is it OK?

Copy link

Benchmark movements:
tree_computation_flow performance regressed!
tree_computation_flow time: [34.246 ms 34.646 ms 35.130 ms]
change: [+1.4884% +3.0377% +4.4971%] (p = 0.00 < 0.05)
Performance has regressed.
Found 12 outliers among 100 measurements (12.00%)
2 (2.00%) high mild
10 (10.00%) high severe

full_committer_flow performance regressed!
full_committer_flow time: [29.068 ms 29.100 ms 29.135 ms]
change: [+1.2945% +1.5670% +1.7892%] (p = 0.00 < 0.05)
Performance has regressed.
Found 3 outliers among 100 measurements (3.00%)
2 (2.00%) high mild
1 (1.00%) high severe

Copy link
Collaborator

@dorimedini-starkware dorimedini-starkware left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed all commit messages.
Reviewable status: 2 of 4 files reviewed, 2 unresolved discussions (waiting on @aner-starkware, @TzahiTaub, and @yoavGrs)


crates/committer_cli/src/tests/regression_tests.rs line 50 at r1 (raw file):

Previously, aner-starkware wrote…

How do you add it? I couldn't figure it out. But I implemented Deref directly, is it OK?

yes, but what I suggested doesn't work...? non-blocking


crates/committer_cli/src/tests/regression_tests.rs line 34 at r2 (raw file):

        &self.0
    }
}

this doesn't work?

Suggestion:

#[derive(derive_more::Deref)]
struct FactMap(Map<String, Value>);

Copy link
Contributor Author

@aner-starkware aner-starkware left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 2 of 4 files reviewed, 2 unresolved discussions (waiting on @dorimedini-starkware, @TzahiTaub, and @yoavGrs)


crates/committer_cli/src/tests/regression_tests.rs line 50 at r1 (raw file):

Previously, dorimedini-starkware wrote…

yes, but what I suggested doesn't work...? non-blocking

Done.


crates/committer_cli/src/tests/regression_tests.rs line 34 at r2 (raw file):

Previously, dorimedini-starkware wrote…

this doesn't work?

Works, thanks :)

Copy link
Contributor

@TzahiTaub TzahiTaub left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 2 files at r4.
Reviewable status: 3 of 4 files reviewed, all discussions resolved (waiting on @yoavGrs)


crates/committer_cli/benches/committer_bench.rs line 69 at r1 (raw file):

Previously, dorimedini-starkware wrote…

I agree with you that we should run main here.
however, you shouldn't "call main" directly from rust code, you should be building the release binary and running it.
if this is impossible, then I don't think we can do better than the current solution: main may change but it's logic is mainly for argument parsing anyway, which is only relevant if you run the binary...
tl;dr: I would keep the code as-is for now, and in subsequent work see if the binary itself can be run from the context of the benchmark

Let's try to replace it as in the TODO in another PR.

@aner-starkware aner-starkware added this pull request to the merge queue Jul 11, 2024
Merged via the queue into main with commit f9d223a Jul 11, 2024
13 of 14 checks passed
@aner-starkware aner-starkware deleted the aner/refactor_commit_function branch July 13, 2024 11:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants