RFC: a convention for error handling #25
Comments
Just food for thought, what if the safety was baked into the type? E.g. have a |
Thinking a bit further into it, you could still have the three methods by doing something along the lines of
impl Checked for Matrix {
    fn get(...) {
        // ...
    }
}
impl Unchecked for UncheckedMatrix {
    fn get(...) {
        // ...
    }
}
fn get<T: Checked>(m: T, c: usize, r: usize) -> ... {
    match m.get(c, r) {
        Ok(v) => v,
        Err(_) => panic!(),
    }
}
Obviously that's only a rough thought but I think it might make a much cleaner API. |
Does this mean that a higher level function would either work with a safe matrix or an unsafe matrix only? I was thinking in the lines of linear algebra code where most of the time, one would use the safe matrix API (like elementary row/column operations), while occasionally getting dirty with the unsafe API for efficiency purposes. |
Hi, I also am not sure how I would use the matrix type based API. Seems
|
Rust slices have both get_unchecked() and get() methods (https://doc.rust-lang.org/std/primitive.slice.html#method.get_unchecked). The proposed design is not very different, but rather than two versions of get, three versions are being suggested here. The panic version is meant primarily to make writing scripty code easier. |
TL;DR:
Reasoning:
- get_checked: indices are input from untrusted source, like file or a user.
- get: every other case. So I expect my indices to be correct, but I want
- get_unchecked: this is only for optimization purposes, so should be used
About other functions: get and set are potentially cheap enough that bounds
I've had a different idea btw. We could associate to a matrix two new
Daniel
On Aug 10, 2015 4:46 PM, "Shailesh Kumar" [email protected] wrote:
|
I was actually thinking the opposite, that the two types would have exactly the same API and only the implementations would differ. Consider the following workflow:
// let's say, for example, that matrix_from_file() is guaranteed to return a 20x20 matrix or larger
let m: Matrix<f64> = matrix_from_file("filename");
// this element is not guaranteed to be there
if let Ok(element) = m.get(50, 50) {
    println!("m[50, 50] = {}", element);
}
// we can guarantee that this is in bounds so we start using UncheckedMatrix
let m = m as UncheckedMatrix;
let mut i = 0.0;
for j in 0..20 {
    i += m.get(j, j);
}
I think it would make a cleaner API but actually the more of this I write out the more error prone it seems for the user. I just always feel like the type system isn't being used properly when I have to write things like |
These two functions called get do not have the same return type, hence the On Tue, Aug 11, 2015 at 2:04 PM, Sean Marshallsay [email protected]
Daniel Vainsencher |
Of course. Sorry, I completely missed that (it's been a long week already). |
No problem. I wonder if it's possible to flesh out the shape-borrowing get<C: SafeColIndex, R: SafeRowIndex>(col: C, row: R) -> ... (doesn't need On Tue, Aug 11, 2015 at 3:52 PM, Sean Marshallsay [email protected]
Daniel Vainsencher |
Sure, but what I would love is if we can arrange that users cannot create Something like let c = m.min_col() // has type SafeColIndex, so can be used in unchecked On Tue, Aug 11, 2015 at 4:52 PM, Sean Marshallsay [email protected]
Daniel Vainsencher |
Just to clarify on one point. The RFC doesn't intend to suggest that three versions of a method should be considered for each method. Typical development would go like this.
So in summary:
The objective of this RFC is to develop a guideline for the development process. |
So looks like we agree that the unsafe version should be a rarity. I think that if most methods should be optiony/resulty, we should strive For example, I would make the get have "column/row out of bounds" error. Daniel On Tue, Aug 11, 2015 at 8:01 PM, Shailesh Kumar [email protected]
Daniel Vainsencher |
Please look at |
Ah, looks pretty good! Already had the pleasure to encounter On Wed, Aug 12, 2015 at 10:55 PM, Shailesh Kumar [email protected]
Daniel Vainsencher |
This RFC proposes a convention for structuring methods in SciRust
that can cater to the conflicting needs of efficiency, ease of use
and effective error handling.
For the impatient:
Detailed discussion
The audience of SciRust can be divided into
two usage scenarios.
- Script-style usage, where the aim is to quickly
  do some numerical experiment, get the results and analyze them.
- Serious software development, where larger software
  would be built on top of fundamental building blocks provided
  by SciRust (these may be other modules shipped in SciRust itself).
While the first usage scenario is important for getting new users hooked
to the library, the second usage scenario is also important for justifying
why Rust should be used for scientific software development compared
to other scientific computing platforms.
In the context of the two usage scenarios, the design of SciRust has three conflicting goals:
- Ease of use
- Efficiency
- Well managed error handling
While ease of use is important for script style usage,
efficiency and well managed error handling are important
for serious software development on top of core components
provided by SciRust.
We will consider the example of a get(r,c) method
on a matrix object to discuss these conflicting goals.
Please note that get is just a representative method
for this discussion. The design ideas can be applied in
many different parts of SciRust once accepted.
If get is being called in a loop, usually the code
around it can ensure that the conditions for accessing
data within the boundary of the matrix are met correctly.
Thus, bounds checking within the implementation of get
is just extra overhead.
While omitting the check is good for writing efficient software,
it can lead to a number of memory related bugs and goes
against the fundamental philosophy of Rust (Safety first).
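As a rough illustration of this trade-off, the sketch below assumes a hypothetical column-major Matrix (not SciRust's actual type) and sums the main diagonal through an unchecked getter. The names Matrix, filled, get_unchecked and trace are illustrative assumptions; only the slice method get_unchecked it calls comes from the Rust standard library.

```rust
/// Hypothetical dense matrix stored column by column (an assumption for
/// this sketch, not SciRust's actual type).
pub struct Matrix {
    rows: usize,
    cols: usize,
    data: Vec<f64>,
}

impl Matrix {
    /// Builds a `rows x cols` matrix with every element set to `value`.
    pub fn filled(rows: usize, cols: usize, value: f64) -> Matrix {
        Matrix { rows, cols, data: vec![value; rows * cols] }
    }

    /// Reads element (r, c) without any bounds check.
    ///
    /// # Safety
    /// The caller must guarantee `r < self.rows` and `c < self.cols`.
    pub unsafe fn get_unchecked(&self, r: usize, c: usize) -> f64 {
        // SAFETY: r < rows and c < cols imply c * rows + r < data.len().
        unsafe { *self.data.get_unchecked(c * self.rows + r) }
    }

    /// Sums the main diagonal. The loop bound is established once, so a
    /// bounds check on every access inside the loop would be redundant.
    pub fn trace(&self) -> f64 {
        let n = self.rows.min(self.cols);
        let mut sum = 0.0;
        for i in 0..n {
            // SAFETY: i < n, and n <= rows and n <= cols.
            sum += unsafe { self.get_unchecked(i, i) };
        }
        sum
    }
}
```

The responsibility for correctness sits entirely in the SAFETY comments: if the loop bound were wrong, the compiler would not catch it, which is the memory-safety risk described above.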
There are actually two different options for error handling:
- Returning either Option<T> or Result<T, Error>.
- Using the panic mechanism.
Option<T> or Result<T, Error> provides the users a
fine grained control over what to do when an error occurs.
This is certainly the Rusty way of doing things. At the
same time, both of these return types make the user code
more complicated. One has to add extra calls to .unwrap()
even if one is sure that the function is not going to fail.
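To make both sides of this concrete, here is a minimal sketch assuming a hypothetical Matrix with a Result-returning get_checked and a hypothetical MatrixError enum (neither is taken from SciRust). One caller uses the error to decide what to do; the other knows its index is valid and still has to unwrap.

```rust
/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum MatrixError {
    RowOutOfBounds,
    ColOutOfBounds,
}

/// Hypothetical column-major matrix (an assumption, not SciRust's type).
pub struct Matrix {
    rows: usize,
    cols: usize,
    data: Vec<f64>,
}

impl Matrix {
    /// Wraps `data`, which must hold exactly `rows * cols` values.
    pub fn new(rows: usize, cols: usize, data: Vec<f64>) -> Matrix {
        assert_eq!(data.len(), rows * cols);
        Matrix { rows, cols, data }
    }

    /// Returns element (r, c), or an error naming the offending index.
    pub fn get_checked(&self, r: usize, c: usize) -> Result<f64, MatrixError> {
        if r >= self.rows {
            return Err(MatrixError::RowOutOfBounds);
        }
        if c >= self.cols {
            return Err(MatrixError::ColOutOfBounds);
        }
        Ok(self.data[c * self.rows + r])
    }
}

/// Indices from an untrusted source: the caller decides how to react.
pub fn describe(m: &Matrix, r: usize, c: usize) -> String {
    match m.get_checked(r, c) {
        Ok(v) => format!("m[{}, {}] = {}", r, c, v),
        Err(e) => format!("cannot read m[{}, {}]: {:?}", r, c, e),
    }
}

/// The caller assumes a non-empty matrix, so (0, 0) is known to be
/// valid, yet the extra `.unwrap()` is still required.
pub fn first_element(m: &Matrix) -> f64 {
    m.get_checked(0, 0).unwrap()
}
```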
Users of scientific computing tend to prefer an environment
where they can get more work done with less effort. This is
one reason for the success of specialized environments like
MATLAB. Open source environments like Python (NumPy, SciPy)
try to achieve something similar.
While SciRust doesn't intend to compete at the level of
simplicity provided by MATLAB/Python environments, it does
intend to make an extra effort wherever possible to address
the ease of use goal.
In this context, the return type of a getter should
be just the value type T. This can be achieved
safely by using a panic if the access boundary
conditions are not met.
The discussion above suggests up to 3 possible ways of
implementing methods like get.
- An unchecked (and unsafe) version,
  where the calling code is responsible for ensuring that
  the necessary requirements for correct execution of the
  method are being met.
- A checked version returning either
  Option<T> or Result<T, Error>, which can be used for professional
  software development where the calling code has full control
  over error handling.
- A panicking version, which provides
  an API which is simpler to use for writing short scientific
  computing scripts.
Proposed convention
We propose that a method for which these variations
need to be supported should follow the convention defined below:
- A method_unchecked version should provide the basic implementation
  of the method. This should assume that necessary conditions
  for successful execution of the method are already being
  ensured by the calling code. The unchecked version of the method
  MUST be marked unsafe. This ensures that the calling code
  knows that it is responsible for ensuring the right conditions
  for calling the unchecked method.
- A method_checked version should be implemented on top of
  a method_unchecked method. The checked version should
  check for all the requirements for calling the method safely.
  The return type should be either Option<T> or
  Result<T, Error>. In case the required conditions for
  calling the method are not met, a None or Error
  should be returned. Once the required conditions are met,
  method_unchecked should be called to get the result,
  which would be wrapped inside Option or Result.
- A method version should be built on top of the method_checked
  version. It should simply attempt to unwrap
  the value returned by method_checked and return it as T.
  If method_checked returns an error or None, this version
  should panic.
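The layering described above can be sketched as follows. This is a minimal illustration under stated assumptions: Matrix, MatrixError and the column-major layout are hypothetical and not taken from SciRust's code; only the method naming follows the proposed convention.

```rust
/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum MatrixError {
    IndexOutOfBounds,
}

/// Hypothetical column-major matrix (an assumption, not SciRust's type).
pub struct Matrix {
    rows: usize,
    cols: usize,
    data: Vec<f64>,
}

impl Matrix {
    /// Wraps `data`, which must hold exactly `rows * cols` values.
    pub fn new(rows: usize, cols: usize, data: Vec<f64>) -> Matrix {
        assert_eq!(data.len(), rows * cols);
        Matrix { rows, cols, data }
    }

    /// 1. Basic implementation: no checks, so it is marked `unsafe`.
    ///
    /// # Safety
    /// The caller must guarantee `r < self.rows` and `c < self.cols`.
    pub unsafe fn get_unchecked(&self, r: usize, c: usize) -> f64 {
        // SAFETY: the caller's guarantee keeps the index inside `data`.
        unsafe { *self.data.get_unchecked(c * self.rows + r) }
    }

    /// 2. Checked version: validates the indices, then delegates to
    /// `get_unchecked` and wraps the value in a `Result`.
    pub fn get_checked(&self, r: usize, c: usize) -> Result<f64, MatrixError> {
        if r >= self.rows || c >= self.cols {
            return Err(MatrixError::IndexOutOfBounds);
        }
        // SAFETY: both indices were validated just above.
        Ok(unsafe { self.get_unchecked(r, c) })
    }

    /// 3. Convenience version: built on `get_checked`, returns the plain
    /// value and panics if the checked version reports an error.
    pub fn get(&self, r: usize, c: usize) -> f64 {
        self.get_checked(r, c).expect("matrix index out of bounds")
    }
}
```

Only get_unchecked touches the data directly, so the unsafe code stays in one place and the other two versions are thin wrappers around it.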
The first two versions are suitable for professional development,
where most of the time we need a safe API while at times
we need an unsafe API for efficient implementation.
The third version is suitable for the script style usage scenario.
The convention has been illustrated in the three versions of
get at the beginning of this document.
API bloat
This convention is expected to lead to some API bloat,
but if the convention is followed properly across the library,
then it should be easy to follow (both from the perspective
of users of the library and from the perspective of developers
of the library).