-
Notifications
You must be signed in to change notification settings - Fork 25
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Quick sort and related Arrays utilities #58
base: master
Are you sure you want to change the base?
Conversation
The latter doesn’t work well in CI here because the /verificationLogger options cause extra, unpredictable output lines
src/Comparison.dfy
Outdated
cmp(t0, t1) | ||
} | ||
|
||
predicate Complete??(t0: T, t1: T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not a big fan of the ??
suffix. We may want to engage in some bikeshedding about style here before we introduce it. Based on my experience even Valid?
below is not common, Valid
is much more common and is even baked into the language through {:autocontracts}
.
I'm not necessarily to using ?
in user-defined constructs, I'd just like to agree on concrete guidance in the style guide on it. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay I see where the idea came from at least - you almost want to overload Complete?
to accept either a pair or a set.
Could this be CompleteOnPair
perhaps?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, the ??
was due to have two layers. Happy with calling them Complete
and Complete'
(the OCaml style) or Complete1
(the Lisp style) or even Complete_
or Complete2
. CompletePair
is long.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, let's go with removing all of the single ?
suffixes (except for Le?
and Ge?
above), and replacing all of the double ??
suffixes with a '
instead.
So in this case Complete??
will become Complete'
, and Complete?
will become Complete
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So naturally I've focussed most of my reviewing capacity on arguing about question marks. :) I do actually think it's a really important point for readability though, so I'd love to nail that down first before I look more deeply at the proofs.
reads arr | ||
ensures t in arr[lo..hi] | ||
{ | ||
arr[lo] // TODO |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should either implement this or drop the function and just inline this choice of pivot above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's implement it — it's not hard and it's an important part of making the algorithm safe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As per offline conversation we decided not to block on this, but in that case we should add a comment on QuickSort that it doesn't yet have the intended O(n log n) worst case runtime yet (which is what @cpitclaudel was referring to as "safe").
src/Collections/Arrays/Arrays.dfy
Outdated
} | ||
} | ||
|
||
lemma Sortable_Slice(arr: array<T>, lo: int, hi: int, lo': int, hi': int) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lemma Sortable_Slice(arr: array<T>, lo: int, hi: int, lo': int, hi': int) | |
lemma SortableSlice(arr: array<T>, lo: int, hi: int, lo': int, hi': int) |
Only instance of an underscore I see here and it really sticks out.
Also consider SortableSubslice
instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SortableSlice
or SortableSubslice
both makes sense putting Subslice describes it better.
import opened Comparison | ||
import opened Wrappers | ||
|
||
trait Sorter<T> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This definitely needs an example in examples
to demonstrate using it. I assume you have to not only provide a concrete class that implements Sorter<T>
and fills in Cmp
, but perhaps even has to invoke a lemma or two from the trait in order to prove that it works.
Is it possible to also provide a PredicateSorter
class that is constructed around a (T, T) -> Cmp
function value? Perhaps it could even be a partial function if you also provide a ghost set of all values you intend to sort with it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to clarify, I'd strongly prefer to have an example, but the PredicateSorter
idea can wait for a future PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh yikes, I went to write an example just to play around with it and immediately got the A class may only extend a trait in the same module, unless the parent trait is annotated with {:termination false}
error. That means this is currently useless unless we add {:termination false}
.
Unfortunately we should treat that as a hard blocker given there's already been resistance to putting {:termination false}
on any standard library code: #22 And especially given the discovery of dafny-lang/dafny#2500 I'm not inclined to push back on that.
We can think about whether there's a good alternative for this code that doesn't use traits, but it might be better to stick with this if it's good UX and at least wait for the solution to dafny-lang/dafny#2500.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's not right: the proper API to use is PredicateSorter
, which is a regular class, defined right below Sorter
. That doesn't require :termination false
.
Original code here, with examples: https://github.com/dafny-lang/compiler-bootstrap/blob/4f616822209828e48cf63d3da66ee1c010f689d4/src/Utils/Lib.Sort.dfy#L396-L420
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @cpitclaudel, I see the issue now is this PR doesn't include the PredicateSorter
class. We should just add that as well.
const Le? := this != Gt | ||
const Ge? := this != Lt |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I definitely like this use of ?
at least, since it's a very natural extension of the built-in Lt?/Eq?/Gt?
discriminator syntax.
Complete?(ts) && /* Antisymmetric?(ts) && */ Transitive?(ts) | ||
} | ||
|
||
predicate Sorted(sq: seq<T>) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why wouldn't this have a ?
as well then?
I'll be happiest if we come up with a simple rule for where ?
belongs. I definitely don't think it should be every predicate, especially since changing Valid
to Valid?
will be very disruptive for arguably not a lot of benefit. It looks like the pattern here is if the predicate is "about" the receiver rather than the arguments, but that feels very fuzzy to me, and I don't find it improves readability personally.
Perhaps to start, the straw-person proposal could be to only use it in symbols that aren't used like function calls, such as the Le?
and Ge?
constants above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See my other comment.
src/Comparison.dfy
Outdated
cmp(t0, t1) | ||
} | ||
|
||
predicate Complete??(t0: T, t1: T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay I see where the idea came from at least - you almost want to overload Complete?
to accept either a pair or a set.
Could this be CompleteOnPair
perhaps?
requires 0 <= lo <= hi <= arr.Length | ||
reads arr | ||
{ | ||
multiset(arr[lo..hi]) == old(multiset(arr[lo..hi])) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I remember adding this to almost every every function method in MergeSort.dfy
will need to replace that with shuffled
instead.
Co-authored-by: Robin Salkeld <salkeldr@amazon.com>
import opened Comparison | ||
import opened Wrappers | ||
|
||
trait Sorter<T> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to clarify, I'd strongly prefer to have an example, but the PredicateSorter
idea can wait for a future PR.
src/Comparison.dfy
Outdated
cmp(t0, t1) | ||
} | ||
|
||
predicate Complete??(t0: T, t1: T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, let's go with removing all of the single ?
suffixes (except for Le?
and Ge?
above), and replacing all of the double ??
suffixes with a '
instead.
So in this case Complete??
will become Complete'
, and Complete?
will become Complete
.
src/Comparison.dfy
Outdated
{} | ||
|
||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
} | |
} |
src/Comparison.dfy
Outdated
} | ||
|
||
predicate {:opaque} Valid?(ts: set<T>) { | ||
Complete?(ts) && /* Antisymmetric?(ts) && */ Transitive?(ts) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Complete?(ts) && /* Antisymmetric?(ts) && */ Transitive?(ts) | |
Complete?(ts) && */ Transitive?(ts) |
Just following the general rule of not checking in commented out code. We could put a comment saying something like "we want to include Antisymmetric(ts)
as well but it isn't necessary and makes verification timeout" (for example, I don't actually know in this case).
} | ||
|
||
predicate Complete??(t0: T, t1: T) { | ||
cmp(t0, t1) == cmp(t1, t0).Flip() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It threw me a bit that I didn't recognize this isn't one of the standard properties of total orderings, until I realized that "Transitive" really meant "cmp(...).Le? is a transitive binary relation" and similarly for "Reflexive". Is "complete" a standard term for this property of this kind of comparator?
I expect at the point where we try to connect Relations
to this version, we'll have a lemma that proves that C.Valid(s) ==> Relations.TotalOrdering((t1, t2) => C.cmp(t1, t2).Le?)
src/Collections/Arrays/Arrays.dfy
Outdated
ensures Shuffled(arr, lo, hi) | ||
{} | ||
|
||
twostate predicate SameElements(arr: array<T>, lo: int, hi: int) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider UnchangedExceptSliceShuffled
instead.
reads arr | ||
ensures t in arr[lo..hi] | ||
{ | ||
arr[lo] // TODO |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As per offline conversation we decided not to block on this, but in that case we should add a comment on QuickSort that it doesn't yet have the intended O(n log n) worst case runtime yet (which is what @cpitclaudel was referring to as "safe").
Complete?(ts) && /* Antisymmetric?(ts) && */ Transitive?(ts) | ||
} | ||
|
||
predicate Sorted(sq: seq<T>) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See my other comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm afraid I've realized this won't work as is and can't be made to work safely with current Dafny limitations - see my first last comment. We'll have to pause on this for a while.
cmp(t0, t1) | ||
} | ||
|
||
predicate CompleteonPair(t0: T, t1: T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
predicate CompleteonPair(t0: T, t1: T) { | |
predicate Complete'(t0: T, t1: T) { |
CompleteonPair was my earlier not-as-good suggestion, let's do Complete'
instead :)
|
||
function method Cmp(t0: T, t1: T): Cmp | ||
|
||
twostate predicate UnchangedSlice(arr: array<T>, lo: int, hi: int) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just food for thought, or a potential learning exercise: It might make all of this more readable if we had an actual ArraySlice
datatype to pass around containing these values. It's a great example of the classic Design Pattern of introducing a type to hold collections of values that are always passed around together. Something like:
datatype ArraySlice_<T> = ArraySlice_(arr: array<T>, lo: int, hi: int) {
predicate Valid() {
0 <= lo <= hi <= arr.Length
}
twostate predicate Unchanged()
requires Valid()
reads arr
{
arr[lo..hi] == old(arr[lo..hi])
}
// etc.
}
type ArraySlice<T> = a: ArraySlice_<T> | a.Valid() witness *
AFAICT this would be useful even if we only used it in ghost code to avoid any runtime cost (and even then it's possible we could optimize the wrapper away in the future, just as we now do for single-field datatypes).
import opened Comparison | ||
import opened Wrappers | ||
|
||
trait Sorter<T> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh yikes, I went to write an example just to play around with it and immediately got the A class may only extend a trait in the same module, unless the parent trait is annotated with {:termination false}
error. That means this is currently useless unless we add {:termination false}
.
Unfortunately we should treat that as a hard blocker given there's already been resistance to putting {:termination false}
on any standard library code: #22 And especially given the discovery of dafny-lang/dafny#2500 I'm not inclined to push back on that.
We can think about whether there's a good alternative for this code that doesn't use traits, but it might be better to stick with this if it's good UX and at least wait for the solution to dafny-lang/dafny#2500.
Fix #54
By submitting this pull request, I confirm that my contribution is made under the terms of the MIT license.