Unformatted version of **ai05s/ai05-0044-1.txt version 1.4**

Other versions for file**ai05s/ai05-0044-1.txt**

Other versions for file

!standard A.18(4/2) 08-05-15 AI05-0044-1/04

!standard A.18.2(231/2)

!standard A.18.3(145/2)

!standard A.18.6(56/2)

!standard A.18.8(66/2)

!standard A.18.9(79/2)

!standard A.18.16(5/2)

!standard A.18.16(9/2)

!class binding interpretation 07-04-02

!status ARG Approved 7-0-2 06-11-10

!status work item 07-04-02

!status received 07-01-24

!priority Medium

!difficulty Medium

!qualifier Omission

!subject Equivalence and equality in containers

!standard A.18.2(231/2)

!standard A.18.3(145/2)

!standard A.18.6(56/2)

!standard A.18.8(66/2)

!standard A.18.9(79/2)

!standard A.18.16(5/2)

!standard A.18.16(9/2)

!class binding interpretation 07-04-02

!status ARG Approved 7-0-2 06-11-10

!status work item 07-04-02

!status received 07-01-24

!priority Medium

!difficulty Medium

!qualifier Omission

!subject Equivalence and equality in containers

!summary

The notion of "strict weak ordering" is defined. All "<" formal subprograms are
required to have a strict weak ordering.

"=" formal functions for the Set containers are required to return False for non-equivalent
elements.

!question

The equivalence relationship of ordered containers is assumed to be transitive. However,
that does not follow from the requirements on "<" and the definition of equivalence.
Consider the following function:

This function meets the requirements on "<". But the equivalence relationship derived from
it is not transitive. This could cause many bizarre effects in a container.

Similar issues arise for "<" as passed to various sorting routines.
If "<" allows incomparable items, it's impossible to understand what the meaning of
"smallest first" is.

A related issue is that the requirements on "=" for sets are not related to equivalence. That
means that "=" on a set must take O(N**2), as checking only equivalent elements would be making
an unwarranted assumption.

!recommendation

!wording

Define "strict weak ordering" after A.18(4/2):

When a formal operator "<" is used to provide an ordering for a container,
it is generally required to define a strict weak ordering. An operator "<"
defines a *strict weak ordering* if it is irreflexive,
asymmetric, transitive, and in addition, if x < y for any values x and y,
then for any other value z, (x < z) or (z < y).

Replace the second sentence of A.18.2(231/2) by
It should define a strict weak ordering relationship (see A.18); it should not modify
Container.

Make similar changes to A.18.3(145/2), A.18.6(56/2), A.18.9(79/2), A.18.16(5/2),
and A.18.16(9/2).

Add "If the actual function for the generic formal function "=" returns True for any pair
of non-equivalent elements, then the behavior of the container function "=" is unspecified."
as a separate paragraph after A.18.8(66/2) and A.18.9(79/2).

!discussion

The existing wording requires a "<" actual function to be irreflexive, asymetric,
and transitive. The formal definitions of these terms are:
- irreflexive: X < X is False for all X.
- asymmetric: if X < Y is True, then Y < X is False.
- transitive: if X < Y is True and Y < Z is True then X < Z is True.

Note that the latter two say nothing about the case where X < Y is False. Thus
it is OK to have both X < Y = False and Y < X = False and still be asymeteric. Indeed, this
is the definition of equivalence, so it is clearly fine. What is not OK is for there to be
a value which is equivalent to two other values which are not equivalent (that is,
that have "<" returning True).

That is, we also need transitively of equivalence, or more precisely.

if x < y is True, then for all z, (x < z) or (z < y) is True.

!corrigendum A.18(4/2)

Insert after the paragraph:

If the advice suggests that the complexity should be less than *O*(f(N)),
then for any arbitrarily small positive real D, there should exist a positive
integer M such that for all N > M, t(N)/f(N) < D.

the new paragraph:

When a formal operator "<" is used to provide an ordering for a container,
it is generally required to define a strict weak ordering. An operator "<"
defines a *strict weak ordering* if it is irreflexive,
asymmetric, transitive, and in addition, if *x* < *y* for
any values *x* and *y*, then for all other values *z*,
(*x* < *z*) or (*z* < *y*).

!corrigendum A.18.2(231/2)

Replace the paragraph:

The actual function for the generic formal function "<" of Generic_Sorting is
expected to return the same value each time it is called with a particular pair
of element values. It should define a strict ordering relationship, that is, be
irreflexive, asymmetric, and transitive; it should not modify Container. If the
actual for "<" behaves in some other manner, the behavior of the subprograms of
Generic_Sorting are unspecified. How many times the subprograms of Generic_Sorting
call "<" is unspecified.

by:

The actual function for the generic formal function "<" of Generic_Sorting is
expected to return the same value each time it is called with a particular pair
of element values. It should define a strict weak ordering relationship (see A.18);
it should not modify Container. If the
actual for "<" behaves in some other manner, the behavior of the subprograms of
Generic_Sorting are unspecified. How many times the subprograms of Generic_Sorting
call "<" is unspecified.

!corrigendum A.18.3(145/2)

Replace the paragraph:

The actual function for the generic formal function "<" of Generic_Sorting is
expected to return the same value each time it is called with a particular pair
of element values. It should define a strict ordering relationship, that is,
be irreflexive, asymmetric, and transitive; it should not modify Container. If
the actual for "<" behaves in some other manner, the behavior of the subprograms
of Generic_Sorting are unspecified. How many times the subprograms of Generic_Sorting
call "<" is unspecified.

by:

The actual function for the generic formal function "<" of Generic_Sorting is
expected to return the same value each time it is called with a particular pair
of element values. It should define a strict weak ordering relationship (see A.18);
it should not modify Container. If
the actual for "<" behaves in some other manner, the behavior of the subprograms
of Generic_Sorting are unspecified. How many times the subprograms of Generic_Sorting
call "<" is unspecified.

!corrigendum A.18.6(56/2)

Replace the paragraph:

The actual function for the generic formal function "<" on Key_Type values is
expected to return the same value each time it is called with a particular pair
of key values. It should define a strict ordering relationship, that is, be
irreflexive, asymmetric, and transitive. If the actual for "<" behaves in
some other manner, the behavior of this package is unspecified. Which
subprograms of this package call "<" and how many times they call it, is unspecified.

by:

The actual function for the generic formal function "<" on Key_Type values is
expected to return the same value each time it is called with a particular pair
of key values. It should define a strict weak ordering relationship (see A.18).
If the actual for "<" behaves in
some other manner, the behavior of this package is unspecified. Which
subprograms of this package call "<" and how many times they call it, is unspecified.

!corrigendum A.18.8(66/2)

Insert after the paragraph:

The actual function for the generic formal function Equivalent_Elements is
expected to return the same value each time it is called with a particular
pair of Element values. It should define an equivalence relationship, that
is, be reflexive, symmetric, and transitive. If the actual for
Equivalent_Elements behaves in some other manner, the behavior of this package
is unspecified. Which subprograms of this package call Equivalent_Elements, and
how many times they call it, is unspecified.

the new paragraph:

If the actual function for the generic formal function "=" returns True for any pair
of non-equivalent elements, then the behavior of the container function "="
is unspecified.

!corrigendum A.18.9(79/2)

Replace the paragraph:

The actual function for the generic formal function "<" on Element_Type values
is expected to return the same value each time it is called with a particular
pair of key values. It should define a strict ordering relationship, that is,
be irreflexive, asymmetric, and transitive. If the actual for "<" behaves in
some other manner, the behavior of this package is unspecified. Which
subprograms of this package call "<" and how many times they call it, is
unspecified.

by:

The actual function for the generic formal function "<" on Element_Type values
is expected to return the same value each time it is called with a particular
pair of key values. It should define a strict weak ordering relationship (see A.18).
If the actual for "<" behaves in
some other manner, the behavior of this package is unspecified. Which
subprograms of this package call "<" and how many times they call it, is
unspecified.

If the actual function for the generic formal function "=" returns True for any pair
of non-equivalent elements, then the behavior of the container function "="
is unspecified.

!corrigendum A.18.16(5/2)

Replace the paragraph:

The actual function for the generic formal function "<" of Generic_Array_Sort is
expected to return the same value each time it is called with a particular pair
of element values. It should define a strict ordering relationship, that is,
be irreflexive, asymmetric, and transitive; it should not modify Container.
If the actual for "<" behaves in some other manner, the behavior of the instance
of Generic_Array_Sort is unspecified. How many times Generic_Array_Sort calls
"<" is unspecified.

by:

The actual function for the generic formal function "<" of Generic_Array_Sort is
expected to return the same value each time it is called with a particular pair
of element values. It should define a strict weak ordering relationship (see A.18);
it should not modify Container.
If the actual for "<" behaves in some other manner, the behavior of the instance
of Generic_Array_Sort is unspecified. How many times Generic_Array_Sort calls
"<" is unspecified.

!corrigendum A.18.16(9/2)

Replace the paragraph:

The actual function for the generic formal function "<" of Generic_Constrained_Array_Sort
is expected to return the same value each time it is called with a particular pair
of element values. It should define a strict ordering relationship, that is,
be irreflexive, asymmetric, and transitive; it should not modify Container.
If the actual for "<" behaves in some other manner, the behavior of the instance
of Generic_Constrained_Array_Sort is unspecified. How many times
Generic_Constrained_Array_Sort calls "<" is unspecified.

by:

The actual function for the generic formal function "<" of Generic_Constrained_Array_Sort
is expected to return the same value each time it is called with a particular pair
of element values. It should define a strict weak ordering relationship (see A.18);
it should not modify Container.
If the actual for "<" behaves in some other manner, the behavior of the instance
of Generic_Constrained_Array_Sort is unspecified. How many times
Generic_Constrained_Array_Sort calls "<" is unspecified.

!ACATS test

This just defines more generic actuals to provide unspecified behavior, and unspecified
behavior is not testable.

!appendix

From: Niklas Holsti Date: Wednesday, January 24, 2007 8:11 AM !topic Equivalent keys/elements in Ordered Maps and Sets !reference Ada 2005 RMA.18.6(55/2) !reference Ada 2005 RMA.18.9(78/2) !from Niklas Holsti 07-01-24 !discussion The generic container packages for Maps and Sets deal with a concept called "equivalent keys" or "equivalent elements". This equivalence is called an "equivalence relation" in A.18.4(5/2) for Maps and A.18.7(5/2) for Sets. As I understand it, and as Wikipedia defines it, an equivalence relation is a reflexive, symmetric and transitive relation: any key or element X is equivalent to itself; if X is equivalent to Y then Y is equivalent to X; and if X and Y are equivalent and Y and Z are equivalent then X and Z are equivalent. (Wikipedia: http://en.wikipedia.org/wiki/Equivalence_relation) The concept of equivalent keys or elements is defined differently for Hashed and Ordered containers. The question discussed here arises only for Ordered containers. The question is whether the "equivalence" concept as defined for Ordered containers is certain to be transitive and thus whether this "equivalence" really is an equivalence relation. If it is not transitive then some container operations can give strange results. For Ordered_Maps (or Ordered_Sets) the RM defines two keys (or elements) X and Y to be "equivalent" if they are incomparable, that is, if X < Y is False and Y < X is False, using the actual comparison operator associated with the generic formal "<" operator at instantiation time. This is in A.18.6(55/2) and A.18.9(78/2), the paragraphs referenced in the headers above. For clarity and brevity I will denote this "equivalence" by the "~" symbol. Thus the RM definition is: (X ~ Y) if and only if (X < Y is False and Y < X is False) The properties of the "~" relation thus depend on the properties of the actual comparison operator associated with the formal "<" operator. I will use the symbol "<" for this actual comparison operator too, as there seems no risk of confusion between the two. The properties required or expected of "<" are defined in A.18.6(56/2) for Ordered_Maps and in A.18.9(79/2) for Ordered_Sets. In both cases it is expected to be "a strict ordering relationship, that is, to be irreflexive, asymmetric, and transitive". I understand this as follows, using definitions from Wikipedia: - irreflexive: X < X is False for all X. http://en.wikipedia.org/wiki/Irreflexive - asymmetric: if X < Y is True, then Y < X is False. http://en.wikipedia.org/wiki/Asymmetric_relation (This is the "stronger" definition, the "weaker" being only "not symmetric". This form implies that "<" is irreflexive, making the requirements on "<" partly redundant.) - transitive: if X < Y is True and Y < Z is True then X < Z is True. http://en.wikipedia.org/wiki/Transitive_relation What do these properties of "<" imply for "~"? - The "~" relation is obviously symmetric because "and" is commutative. This does not depend on any properties of "<". - The "~" relation is reflexive because "<" is irreflexive. - The "~" relation is *not* necessarily transitive. The last claim is supported by this example: type T1 is (X, Y, Z); function "<" (Left, Right : T1) return Boolean is -- This function implements the ordering: -- X < Z -- No other pairs are comparable. begin return Left = X and Right = Z; end "<"; This function "<" is a strict ordering relation for T1. But the "~" relation induced by this "<" is not transitive: we have X ~ Y (because both X < Y and Y < X are False) and Y ~ Z (because both Y < Z and Z < Y are False) but not X ~ Z (because X < Z is True). If the above reasoning is correct the conclusion is that the current RM requirements on the actual "<" operator for an instantiation of Ordered_Maps or Ordered_Keys are not strong enough to ensure that the RM-defined "equivalence" concept for keys or elements is an equivalence relation. This has three negative effects: 1. It confuses the reader to call something an "equivalence" when it isn't an equivalence relation (because it is not transitive). 2. As the effects of operations on Maps and Sets are defined in terms of this "equivalence", the operations may give strange (unexpected) results. 3. The definitions of some operations become ambiguous. For an example of negative effect (2), strange results, consider the function Equivalent_Sets when Ordered_Sets is instantiated with the following element type and "<" operator: type T2 is (W, X, Y, Z); function "<" (Left, Right : T2) return Boolean is -- This function implements the ordering: -- W < X < Z -- W < Y -- No other pairs are comparable. begin case Left is when W => return Right /= W; when X => return Right = Z; when Y => return False; when Z => return False; end case; end "<"; The "equivalence" relation "~" derived from this "<" relation contains exactly the following pairs (giving True) and no others (all other pairs give False): X ~ X, X ~ Y Y ~ Y, Y ~ X, Y ~ Z Z ~ Z, Z ~ Y W ~ W. Now consider two objects Left and Right of the type Set from this instance of Ordered_Sets. Let the Left set be {X, Z} and the Right set be {W, Y}. Note that X ~ Z is False, so the Left set is "legal" with respect to A.18.6(5/2), and likewise for the Right set. Let's apply the RM definition A.18.6(22/2) to see what Equivalent_Sets (Left, Right) should return: - Left and Right are not the same set object so we do not return True at once. - Left and Right have the same number of elements (2) so we do not return False at once. - Now we must check each element of Left and look for an "equivalent" element in Right: - Check X from Left: we have X ~ Y and Y is in Right. - Check Z from Left: we have Z ~ Y and Y is in Right. Thus Equivalent_Sets (Left, Right) returns True. However, Right contains the element W, which is not "equivalent" to any element in Left. I feel that this makes it strange to consider Left and Right to be "equivalent sets". Moreover, it means that Equivalent_Sets (Right, Left) will return False, showing that Equivalent_Sets is not symmetric in this case, another strange result. These strange results are possible because two different elements of the Left set can be "equivalent" to the *same* element of the Right set, leaving one or more elements of the Right set unpaired with any "equivalent" elements of the Left set. If the "~" relation is transitive, these strange results cannot happen, which is easily proved as follows. If there are two different elements L1 and L2 of Left and an element R of Right such that L1 ~ R and L2 ~ R then by the symmetry and transitivity of "~" we must have L1 ~ L2 which is impossible because no Set can contain two or more "equivalent" elements according to A.18.6(5/2). For an example of negative effect (3), ambiguous definitions, consider the definition of the "=" operator on Ordered_Maps.Map in A.18.4(22/2) and (23/2). This definition implicitly assumes that each key in the Left map is "equivalent" to at most one key in the Right map. This assumption is false if "equivalence" of keys is not transitive; then there may be many "equivalent" keys in the Right map that could be matched against the same key in the Left map, making this definition of "=" ambiguous. To avoid these problem I think that the RM paragraphs that specify the properties expected of the actual "<" operators should be extended to say that "equivalence of keys/elements" must be (is expected to be) transitive. Thus, in A.18.6(56/2) the following sentence should be inserted after the sentence that ends with "transitive": "It should behave in such a manner that equivalence of keys is a transitive relation between keys." In A18.9(79/2) the following sentence should be inserted after the sentence that ends with "transitive": "It should behave in such a manner that equivalence of elements is a transitive relation between elements." This question does not arise for Hashed_Maps or Hashed_Sets because the equivalence of keys or elements is there defined directly by the formal functions Equivalent_Keys and Equivalent_Elements for which the RM specifically states that transitivity is expected, as well as reflexivity and symmetry. **************************************************************** From: Tucker Taft Date: Wednesday, January 24, 2007 1:51 PM This seems like a valid point, though it seems unfortunate to have to impose requirements on the "equivalence relation" rather than directly on "<". I'm surprised there isn't a more directly stated property of "<" that would ensure that equiv(X,Y) == "not X<Y and not Y<X" is transitive. Perhaps something like requiring that "<" satisfies the requirement that "X < Y and not Z < Y implies X < Z". **************************************************************** From: Matthew Heaney Date: Wednesday, January 24, 2007 2:44 PM As I mentioned in my CLA posts, the requisite term is that "<" must define a "strict weak ordering", which implies that equivalence is transitive (it would be a "partial ordering" otherwise). The SGI STL page has more info: <http://www.sgi.com/tech/stl/StrictWeakOrdering.html> **************************************************************** From: Adam Beneschan Date: Wednesday, January 24, 2007 3:19 PM > Perhaps something like requiring that "<" satisfies > the requirement that "X < Y and not Z < Y implies X < Z". Well, the definition of "transitive" for equivalence relationships is that (X equiv Y) and (Y equiv Z) implies (X equiv Z) so, using the definition of "equivalence" from the RM, this becomes not (X < Y) and not (Y < X) and not (Y < Z) and not (Z < Y) implies not (X < Z) and not (Z < X) which is ugly, but I don't know that it's possible to come up with a shorter version just in terms of "<". If you want to introduce a new term "comparable", defined as "A and B are comparable if either A<B is true or B<A is true", then the above is equivalent to: "If two elements X and Z are comparable, then every other element is comparable to either X or Z." Which, since "comparable" just means the same as "not equivalent", this is really just a fancy way of restating the definition of transitivity for an equivalence relationship, then stating it as a contrapositive. But maybe it would be easier to understand and flow better than talking about equivalence relationships. The reason I don't like it is that I'm used to thinking about the term "comparable" in the context of a partial ordering, where two non-comparable elements are just too different to be compared---not in the context of a strict weak ordering where two non-comparable elements are *equivalent*. Also, I should mention that I learned all this set theory stuff some time ago, and I remember what an equivalence relation, equivalence classes, partial orderings, and total orderings are, plus some other related concepts; and I've never come across the term "strict weak ordering" before this discussion. So while my own personal experience isn't necessarily an indicator of anything, I do think we should consider how much the term is "common knowledge" before just dumping it in the RM with no explanation. **************************************************************** From: Tucker Taft Date: Wednesday, January 24, 2007 3:52 PM I would still prefer a requirement that does not require the circuitous route of defining equivalence and then requiring it to be transitive. Did you consider my proposal of requiring that: (X<Y and not Z<Y) implies X<Z I wonder whether I could use that and the other requirements to prove that the equivalence relation is transitive. Let's see: The above is equivalent to: not (X<Y and not Z<Y) or X<Z which is equivalent to: not X<Y or Z<Y or X<Z which can be rearranged to be: (X < Z or Z < Y) or not X<Y which is equivalent to: (not X<Z and not Z<Y) implies not X<Y We can swap X and Y in the above and produce: (not Y<Z and not Z<X) implies not Y<X If we combine these two and regroup we get (not X<Z and not Z<X) and (not Z<Y and not Y<Z) implies (not X<Y and not Y<X) which is our desired transitivity rule. QED! Now I suppose this might be overspecifying, but the rule I have suggested seems something that any "reasonable" "<" operator ought to satisfy. In words it might be phrased that if X and Y are ordered with respect to one another, then Z must be ordered with at least one of them. A reexpression of the requirement that more nearly matches these words would be: For any X,Y,Z in domain of "<": X < Y implies (X < Z or Z < Y) Comments? **************************************************************** From: Adam Beneschan Date: Wednesday, January 24, 2007 4:46 PM ... > If we combine these two and regroup we get > > (not X<Z and not Z<X) and (not Z<Y and not Y<Z) > implies (not X<Y and not Y<X) > > which is our desired transitivity rule. QED! > > Now I suppose this might be overspecifying, Actually, I don't think so---it appears that the rule you formed is equivalent to the transitivity rule. I wrote a small program (in Ada, of course) that assumed there were three possible elements, A, B, and C, and tested all 64 combinations of A<B, A<C, B<A, etc., eliminating those that didn't meet the asymmetry and transitivity requirements; that left 19 combinations. In all those cases, the statements (X equiv Y) and (Y equiv Z) implies (X equiv Z), for all X, Y, Z and (X < Y) and (not Z < Y) implies (X < Z), for all X, Y, Z produced the same Boolean result. (It's nice to be able to work with Boolean variables, because then the number of cases is small enough that I can just try them all instead of having to mess around with Boolean algebra and risk making a mistake.) > but the > rule I have suggested seems something that any "reasonable" > "<" operator ought to satisfy. In words it might > be phrased that if X and Y are ordered with respect > to one another, then Z must be ordered with at least > one of them. Sounds vaguely familiar :-) > A reexpression of the requirement that > more nearly matches these words would be: > > For any X,Y,Z in domain of "<": X < Y implies (X < Z or Z < Y) Yep, I added that to my program, and it's equivalent to the other two. I kind of like this last formulation since it's more symmetrical than the first one, which IMHO would make it easier to understand and apply. **************************************************************** From: Randy Brukardt Date: Wednesday, January 24, 2007 5:15 PM > Also, I should mention that I learned all this set theory stuff some > time ago, and I remember what an equivalence relation, equivalence > classes, partial orderings, and total orderings are, plus some other > related concepts; and I've never come across the term "strict weak > ordering" before this discussion. So while my own personal experience > isn't necessarily an indicator of anything, I do think we should > consider how much the term is "common knowledge" before just dumping > it in the RM with no explanation. On c.l.a., somone claimed that the term is not commonly used. In any case, our definition of mathematical terms is given in by the document defined in 1.3 of the Standard. I tried the online version of it (www.mathworld.com), and did not find any definition of "strict weak ordering". That means (presuming I didn't screw up in some way) that we can't use that term in the RM, as it is not defined by our reference. (Or we'd have to define it ourselves.) One of Tucker's fixes looks good enough for this, so I don't think we need to bother with a definition of "strict weak ordering". (Aside: it would sure be nice if we could specify such things in the generic contract, rather than a mess of words. But I don't have any good idea as how to do that.) **************************************************************** From: Simon Wright Date: Wednesday, January 24, 2007 5:40 PM > The SGI STL page has more info: > > <http://www.sgi.com/tech/stl/StrictWeakOrdering.html> Also at http://en.wikipedia.org/wiki/Strict_weak_ordering -- the Properties section is interesting, very close to Tucker's suggestions. I think that the 4th bullet is the point at which Nicholas's bizarre "<" fails to meet the test. **************************************************************** From: Pascal Leroy Date: Thursday, January 25, 2007 3:26 AM Niklas said: > The question discussed here arises only > for Ordered containers. The question is whether the "equivalence" > concept as defined for Ordered containers is certain to be transitive > and thus whether this "equivalence" really is an equivalence > relation. Interesting. The requirements on the formal operators "=" and "<" were nailed down fairly late in the game, and I suppose that we didn't do our homework properly. In a private email date June 1st, 2005, I wrote "we could say that the relation "not (K1 < K2) and not (K2 < K1)" has to be an equivalence" but apparently we didn't opt for that formulation (I suppose that we thought that "strict ordering" was sufficient). For an implementation based on a binary tree to work the "equivalence" really has to be an equivalence relation, otherwise the result that you get depends on the order of insertions (because it depends on the representative that you choose to store in a node). Tuck replied: > A reexpression of the requirement that > more nearly matches these words would be: > > For any X,Y,Z in domain of "<": X < Y implies (X < Z or Z < Y) As Simon noticed this exactly matches the notion of "transitivity of equivalence" as defined by the Wikipedia article on strict weak ordering, so it does look like the way to go. Randy commented: > In any case, our definition of mathematical terms is given in > by the document defined in 1.3 of the Standard. I tried the > online version of it (www.mathworld.com), and did not find > any definition of "strict weak ordering". That means > (presuming I didn't screw up in some way) that we can't use > that term in the RM, as it is not defined by our reference. > (Or we'd have to define it ourselves.) Well, in practice we define the term "strict ordering" because we write "it should define a strict ordering relationship, that is, be irreflexive, asymmetric, and transitive". But we rely on our reference document to define "irreflexive", "asymmetric", and "transitive". In A.18(3-4) we give a definition of the big-O notation. It would be sensible to append there a definition of "strict weak ordering" and use that term whenever we describe the requirements on "<", with a cross-reference "see A.18". **************************************************************** From: Christoph Grein Date: Thursday, January 25, 2007 6:47 AM I'm not sure where to put the following definition, but I think it should be included in the RM. A <strict weak ordering> has the following properties. For all x, y and z: For all x, it is not the case that x < x (irreflexivity). For all x /= y, if x < y then it is not the case that y < x (antisymmetry). For all x, y, and z, if x < y and y < z then x < z (transitivity). For all x, y, and z, if x is incomparable with y, and y is incomparable with z, then x is incomparable with z (transitivity of equivalence). **************************************************************** From: Niklas Holsti Date: Monday, January 29, 2007 3:40 PM >>A reexpression of the requirement that >>more nearly matches these words would be: >> >> For any X,Y,Z in domain of "<": X < Y implies (X < Z or Z < Y) > > > As Simon noticed this exactly matches the notion of "transitivity of > equivalence" as defined by the Wikipedia article on strict weak ordering, > so it does look like the way to go. Expanding the area of discussion a bit, should this requirement for "weak" ordering, or transivity of equivalence, also apply to the generic formal "<" operators of Ada.Containers.Vectors.Generic_Sorting, Ada.Containers.Doubly_Linked_Lists.Generic_Sorting and the generic array sorting procedures in A.18.16? I think it should. I tried to understand what the RM means by "sorted smallest first" in these places, but my attempts at writing this in terms of "<" between adjacent elements suggest that incomparability should be transitive here, too. Moreover, googling "Quicksort strict weak" locates several pages that require "strict weak ordering" for Quicksort. Whatever the answer, I suggest that the RM should define what is meant by "sorted smallest first". It is not obvious, if the definition can use only the "<" operator and the container can contain incomparable elements. **************************************************************** From: Tucker Taft Date: Monday, January 29, 2007 5:36 PM Yes, I think you are right that the additional requirement on "<" should apply in these additional places. Presumably the definition of being sorted smallest first is that for the final "Sorted" array: for I,J indices into Sorted: I < J implies not(Sorted(J) < Sorted(I)) **************************************************************** From: Niklas Holsti Date: Tuesday, January 30, 2007 2:54 AM > Yes, I think you are right that the additional > requirement on "<" should apply in these additional > places. Good. > Presumably the definition of being sorted > smallest first is that for the final "Sorted" array: > > for I,J indices into Sorted: > > I < J implies not(Sorted(J) < Sorted(I)) Yes, that is one possible formulation; it describes the Sorted array as "globally non-decreasing". An equivalent one, which I prefer because it has no negation and because it is closer to the words "smallest first" is for I,J indices into Sorted: Sorted(I) < Sorted(J) implies I < J If "<"-equivalence (incomparability) is transitive, it is enough to compare adjacent elements and check that Sorted(I+1) < Sorted(I) is False for all indices I in First .. Last-1. This form describes the Sorted array as "locally non-decreasing". Using this form in the function Is_Sorted gives O(N) complexity, as one would expect for that function. **************************************************************** From: Tucker Taft Date: Tuesday, January 30, 2007 1:36 PM > An equivalent one, which I prefer because it has no negation and because > it is closer to the words "smallest first" is > > for I,J indices into Sorted: > > Sorted(I) < Sorted(J) implies I < J That does seem more natural, and would seem to be a worthwhile addition to the manual as a definition of "smallest first." **************************************************************** From: Niklas Holsti Date: Wednesday, January 24, 2007 2:00 PM !topic Formal "=" for Element_Type in Set containers !reference Ada 2005 RMA.18.4(3/2) !reference Ada 2005 RMA.18.7(3/2) !from Niklas Holsti 07-01-24 !discussion All generic packages in Ada.Containers for Maps and Sets take a generic formal "=" operator for the Element_Type. The actual "=" operator for Element_Type, supplied for the instantiation, is expected to be deterministic, reflexive and symmetric. This is stated in A.18.4(3/2) for Maps and in A.18.7(3/2) for Sets. The Map packages have no other generic formal functions taking Element_Type values, thus the question to be discussed here does not arise for Maps. The Set packages have other generic formal functions taking Element_Type parameters: Equivalent_Elements and Hash (for Hashed_Sets) and "<" (for Ordered_Sets). The question I want to raise here is the puzzling fact that the RM apparently requires and expects no relationship between the "=" operator on Element_Type and the functions Equivalent_Elements, Hash and "<". For example, in the case of two elements E1 and E2 in an Ordered_Set it would be very natural to require that if E1 "=" E2 then we cannot also have E1 "<" E2 or E2 "<" E1 (in other words E1 should be "equivalent" to E2, using the definition of equivalent elements given in A.18.9(78/2)). But I can find no RM text in A.18 that explicitly says this. Apart from the RM paragraphs referenced above, I found only the following mentions of the "=" operator on Element_Type for Sets: - In A.18.7(20/2), which says that "=" on Set objects uses the formal "=" on Element_Type. (The Annotated RM says that this is the only place where "=" for Element_Type should be used.) - In A.18.9(80/2), which says that elements should not be changed so that the "=" operator gives a different result (except by operations in the Sets package itself). Neither of these paragraphs specifies any relationship between the "=" operator on Element_Type and the other formal functions with Element_Type parameters. The "=" operator on Sets is defined in A.18.7(20/2). This definition requires each element of the Left set to be compared for "=" with the elements of the Right set, using the formal "=" operator on Element_Type. If the RM really puts no constraints on the behaviour of "=" on Element_Type with respect to the functions Equivalent_Elements, Hash and "<" on Element_Type, two problems arise with this definition of "=" on Sets: 1. I don't see how the "=" operator on Sets could be implemented in less than O(N**2) time, because the implementation could not use the Hashed or Ordered structure to limit the number of Element_Type pairs from Left and Right that must be compared with "=". 2. The "=" operator on Sets could return True for two sets Left and Right although the Right set contains elements that are not "=" to any elements of the Left set. This is because the definition is asymmetric (it "traverses" Left and "searches" Right), so two different elements of Left could be "=" to the same element of Right, leaving some elements of Right unmatched with elements of Left. I think (and discussion on comp.lang.ada supports it) that the definition of "=" on Sets expects the implementation to compare only "equivalent" elements of the Left and Right Sets. If this "equivalence" is transitive (see my earlier comment on the topic "Equivalent keys/elements in Ordered Sets and Maps") this means that it is enough to compare each element of Left with "=" to exactly one element of Right (or, for Hashed_Sets, to the elements in one bucket of Right). This makes the "=" operator on Sets take O(N) time, as desired. As a parenthesis, from discussion on comp.lang.ada it occurs to me that the RM authors may have intended to specify relationships between the "=" operator on Element_Type and the other formal functions with Element_Type parameters *indirectly*, through the properties expected of these other formal functions, as follows: For Hashed_Sets A.18.8(66/2) says that the actual Equivalent_Sets function is expected to be reflexive. According to Wikipedia, a (mathematical) relation "~" is reflexive if E ~ E for all E, thus on the Ada level I understand this RM paragraph to expect only that Equivalent_Sets (E, E) is True for all elements E. Obviously this does not involve the "=" operator on Element_Type. However, a *mathematically* equivalent definition of a reflexive relation "~" is that E1 = E2 implies E1 ~ E2, for all E1 and E2. Perhaps RM A.18.8(66/2) intends a similar formulation on the Ada level, that is: for any two elements E1 and E2, if E1 = E2 (using the formal "=" operator) then Equivalent_Elements (E1, E2) is expected to be True. This would define a natural relationship between the actual "=" for Element_Type and the actual function for Equivalent_Elements. The two mathematical definitions of "reflexive relation" are logically equivalent because mathematics is referentially transparent (I hope I am using this term more or less correctly): if E1 = E2 then E1 can be substituted for E2 in any expression that uses E2 (as a free variable) with no change in the value of the expression. The two Ada-level definitions of the reflexivity of Equivalent_Elements are not logically equivalent because Ada expressions are not referentially transparent, neither with respect to the predefined "=" operators (for example, E1 = E2 does not imply E1'Address = E2'Address) nor with respect to user-defined "=" operators. To continue in the same way, for Ordered_Sets A.18.9(79/2) says that the formal "<" function on Element_Type is expected to be irreflexive. Following the normal mathematical definition this means on the Ada-level that E < E is False for all E. As for Hashed_Sets a mathematically equivalent reformulation can translate this into the Ada-level expectation that E1 < E2 implies that E1 = E2 is False, using the formal "<" and "=" operators. Again this would define a natural relationship between the actual "=" and "<" operators for Element_Type. The RM authors may not have intended these alternative Ada-level formulations of "reflexive" and "irreflexive" properties, so I will close this parenthesis here and proceed to my recommendation. To enforce or clarify the expected relationship between "=" on Element_Type and the other formal functions taking Element_Type parameters, I suggest to add a sentence in A.18.7 saying that the actual function for the generic function "=" on Element_Type is expected to return True only when its Left and Right parameters are equivalent elements in the equivalence relation introduced in paragraph A.18.7(5/2). In other words, if E1 = E2 is True then E1 and E2 are expected to be equivalent. For Hashed_Sets this means that if E1 = E2 is True then so is Equivalent_Elements (E1, E2), which (by existing rules) means that E1 and E2 have the same Hash value. For Ordered_Sets this means that if E1 = E2 is True then E1 < E2 is False and E2 < E1 is also False. It may be better to reorder the paragraphs in this area so that the new sentence can be put in the same paragraph with the sentences in the current paragraph (3/2) that define the other expected properties of the "=" function. The equivalence relation should be introduced first, followed by the expected properties of "=". **************************************************************** From: Pascal Leroy Date: Thursday, January 25, 2007 4:27 AM > Neither of these paragraphs specifies any relationship > between the "=" operator on Element_Type and the other formal functions with > Element_Type parameters. At the time we realized that we needed to be very precise in specifying the behavior of the formal functions that define the equivalence (Hash and "<") because they are essential to the integrity of the container. On the other hand, we thought that, since the formal "=" operator is just used for "=" of sets and maps, it does not play any role in the integrity of the container, and that you get what you get. The requirements that we imposed on "=" only reflect the fact that we didn't want to constrain how many times the implementation calls it and with what parameters. > 1. I don't see how the "=" operator on Sets could be > implemented in less > than O(N**2) time I agree that this is not acceptable. > 2. The "=" operator on Sets could return True for two sets Left and > Right although the Right set contains elements that are not > "=" to any > elements of the Left set. This is because the definition is > asymmetric I have never been in love with this asymmetric definition. I suppose that we could fix the definition, but that would leave us with problem (1) above. At any rate, the definition makes it clear that the intent was "scan one set, do a lookup in the other, compare the elements". > the RM authors may have intended to specify relationships between the > "=" operator on Element_Type and the other formal functions with > Element_Type parameters *indirectly* No, that was not the intent, we missed problem (1) above, but we were not confused that the formal "=" operator is very different from mathematical equality. > I suggest to add a sentence in A.18.7 saying that the actual > function for the generic function "=" on Element_Type is expected to > return True only when its Left and Right parameters are equivalent > elements in the equivalence I agree that this is the best fix. Note however that "=" on sets is fraught with difficulties because its result will depend on the representative of each equivalence class that is actually stored in the set. If you ever insert two equivalent elements in a given set, you are treading on thin ice. Two sets that only differ because they were created in different orders (but with the same elements) are unlikely to compare equal. Heck, if an implementation internally introduces some amount of randomization (perhaps due to storage management) two sets created in the same order might even compare unequal. So this strange function can be trusted when it return True, not when it returns False. Hmm. **************************************************************** From: Matthew Heaney Date: Thursday, January 25, 2007 12:21 PM >> 1. I don't see how the "=" operator on Sets could be >> implemented in less >> than O(N**2) time > > I agree that this is not acceptable. O(n) is the worst that "=" could ever do. In practice you'll do better since when equality for elements compares false you terminate the iteration immediately. > Note however that "=" on sets is fraught with difficulties because its > result will depend on the representative of each equivalence class that is > actually stored in the set. If you ever insert two equivalent elements in > a given set, you are treading on thin ice. But the second insertion will be rejected (since an equivalent element is already in the set), so there is no issue. **************************************************************** From: Matthew Heaney Date: Thursday, January 25, 2007 1:18 PM > The Set packages have other generic formal functions taking Element_Type > parameters: Equivalent_Elements and Hash (for Hashed_Sets) and "<" (for > Ordered_Sets). The question I want to raise here is the puzzling fact > that the RM apparently requires and expects no relationship between the > "=" operator on Element_Type and the functions Equivalent_Elements, Hash > and "<". No, this is incorrect. There certainly *is* an expectation that equality implies equivalence (even if this is not stated explicitly). > For example, in the case of two elements E1 and E2 in an Ordered_Set it > would be very natural to require that if E1 "=" E2 then we cannot also > have E1 "<" E2 or E2 "<" E1 (in other words E1 should be "equivalent" to > E2, using the definition of equivalent elements given in A.18.9(78/2)). > But I can find no RM text in A.18 that explicitly says this. Equality for sets works by iterating over the left set, and searching for each element in the right set. If no equivalent element is found then the operation terminates immediately with False as the result. If the element is found (meaning that the elements are equivalent), then the elements are compared for equality. If element equality returns False, then the operation terminates immediately with the result False. Otherwise it continues the iteration until you run out of elements. Equality for sets certainly does NOT work by performing a linear search for the element. That would be a truly useless operation. (For the implementation of an ordered set, no search is necessary since you can simply have a pair of cursors to walk both internal trees simultaneously.) > Apart from the RM paragraphs referenced above, I found only the > following mentions of the "=" operator on Element_Type for Sets: ... > Neither of these paragraphs specifies any relationship between the "=" > operator on Element_Type and the other formal functions with > Element_Type parameters. Equality should imply equivalence. If you define equality differently then you might not get the result you want. > 1. I don't see how the "=" operator on Sets could be implemented in less > than O(N**2) time, because the implementation could not use the Hashed > or Ordered structure to limit the number of Element_Type pairs from Left > and Right that must be compared with "=". No, this is all wrong. A non-linear search is first performed (meaning that the elements are compared for equivalence using the canonical mechanisms), and then a test for equality is performed. You're probably confused because RM05 A.18.7 (20/2) says: "Otherwise, for each element E in Left, the function returns False if an element equal to E (using the generic formal equality operator) is not present in Right." But this does NOT mean that there's a linear search for an equal item. It DOES mean that there's a non-linear search for an equivalent item, followed by a test for equality. > 2. The "=" operator on Sets could return True for two sets Left and > Right although the Right set contains elements that are not "=" to any > elements of the Left set. This is because the definition is asymmetric > (it "traverses" Left and "searches" Right), so two different elements of > Left could be "=" to the same element of Right, leaving some elements of > Right unmatched with elements of Left. You're confused because you seem to think that element equality is used to implement the search. But that's wrong: element equality is only used *after* the search (which finds an equivalent item). > However, a *mathematically* equivalent definition of a reflexive > relation "~" is that E1 = E2 implies E1 ~ E2, for all E1 and E2. Perhaps > RM A.18.8(66/2) intends a similar formulation on the Ada level, that is: > for any two elements E1 and E2, if E1 = E2 (using the formal "=" > operator) then Equivalent_Elements (E1, E2) is expected to be True. This > would define a natural relationship between the actual "=" for > Element_Type and the actual function for Equivalent_Elements. Yes, that's correct. If element equality returns True, then Equivalent_Elements should return True. Of course, if you define element equality differently, then equality on sets might not return the value you want. If you want set equality to imply set equivalence, then you have to define the generic actual for element equality such that it only returns True when the elements are equivalent. > To enforce or clarify the expected relationship between "=" on > Element_Type and the other formal functions taking Element_Type > parameters, I suggest to add a sentence in A.18.7 saying that the actual > function for the generic function "=" on Element_Type is expected to > return True only when its Left and Right parameters are equivalent > elements in the equivalence relation introduced in paragraph > A.18.7(5/2). In other words, if E1 = E2 is True then E1 and E2 are > expected to be equivalent. Why is this necessary? If element equality does not imply equivalence then set equality will not imply set equivalence. What's the problem? **************************************************************** From: Tucker Taft Date: Thursday, January 25, 2007 5:46 PM > Equality for sets works by iterating over the left set, and searching > for each element in the right set. If no equivalent element is found > then the operation terminates immediately with False as the result. That isn't what the RM says in A.18.7(20). It makes no mention of equivalence. If we want the above to be a legitimate implementation, then we need to include a mention of equivalence in A.18.7(20) or add a requirement that "=" on elements implies equivalence or the result of "=" on sets is unspecified. > Equality should imply equivalence. If you define equality differently > then you might not get the result you want. But the RM has to say that somewhere. >> ... >> To enforce or clarify the expected relationship between "=" on >> Element_Type and the other formal functions taking Element_Type >> parameters, I suggest to add a sentence in A.18.7 saying that the >> actual function for the generic function "=" on Element_Type is >> expected to return True only when its Left and Right parameters are >> equivalent elements in the equivalence relation introduced in >> paragraph A.18.7(5/2). In other words, if E1 = E2 is True then E1 and >> E2 are expected to be equivalent. > > Why is this necessary? If element equality does not imply equivalence > then set equality will not imply set equivalence. What's the problem? The problem is that the definition of "=" on sets does not mention equivalence, nor is there any mention of the relationship between "=" on elements and equivalence, so you as an implementor can't use equivalence to implement set "=". If you want to use equivalence, then the RM needs to be augmented with words that authorize that implementation. So I agree with the original comment by Niklas, that we need to add something to the RM wording in this section. **************************************************************** From: Matthew Heaney Date: Thursday, January 25, 2007 9:13 PM > > Equality for sets works by iterating over the left set, and searching > > for each element in the right set. If no equivalent element is found > > then the operation terminates immediately with False as the result. > > That isn't what the RM says in A.18.7(20). It makes no > mention of equivalence. If we want the above to be a > legitimate implementation, then we need to include a mention > of equivalence in A.18.7(20) or add a requirement that "=" on > elements implies equivalence or the result of "=" > on sets is unspecified. OK, I'm certainly in favor of the RM requiring (or otherwise mentioning) that element equality must imply element equivalence. The explicit requirement would allow me to put some assertions in GNAT to verify the implication. (In GNAT if you compile a container instantiation with assertions enabled, you get a bunch of extra checks. That's how I detect dangling cursors, for example.) > > Equality should imply equivalence. If you define equality differently > > then you might not get the result you want. > > But the RM has to say that somewhere. OK. > > Why is this necessary? If element equality does not imply equivalence > > then set equality will not imply set equivalence. What's the problem? > > The problem is that the definition of "=" on sets does not > mention equivalence, nor is there any mention of the > relationship between "=" on elements and equivalence, so you > as an implementor can't use equivalence to implement set "=". OK. **************************************************************** From: Pascal Leroy Date: Friday, January 26, 2007 1:17 AM > >> 1. I don't see how the "=" operator on Sets could be > >> implemented in less > >> than O(N**2) time > > > > I agree that this is not acceptable. > > O(n) is the worst that "=" could ever do. Duh? For an ordered set I don't see how you can do better than O(n*Log(n)) since the search is going to require O(Log(n)) comparisons anyway. > In practice you'll do better > since when equality for elements compares false you terminate the > iteration immediately. Matt, this is irrelevant to the asymptotic complexity. It doesn't help that in some cases you'll stop after the first comparison. If the sets compare equal surely you'll have to check all the elements. And if they don't, on average you'll have to check n/2 elements. > > Note however that "=" on sets is fraught with difficulties because its > > result will depend on the representative of each equivalence class > > that is actually stored in the set. If you ever insert two equivalent > > elements in a given set, you are treading on thin ice. > > But the second insertion will be rejected (since an equivalent element > is already in the set), so there is no issue. Not if you call Include or the 4-parameter Insert to add the element to the set. The scenario is this: you have a collection of elements, and you add them to two sets, perhaps using Include, in different order. Chances are that the resulting sets will not compare equal, even though they come from the same elements. (They will be equivalent, though.) Because of this counter-intuitive semantics, I believe that "=" on sets is going to lead to bugs in user code. Most people won't realize that they want to use Equivalent_Sets, not "=", to get a reliable answer. The "=" operator as defined is junk and should not have been included in the package. Sigh. **************************************************************** From: Matthew Heaney Date: Friday, January 26, 2007 7:57 AM > Duh? For an ordered set I don't see how you can do better than > O(n*Log(n)) since the search is going to require O(Log(n)) > comparisons anyway. Yes, you're right. I was wearing my implementation hat, since for an ordered set you can simply do an pairwise comparison; no O(log(n)) search is necessary. > > In practice you'll do better > > since when equality for elements compares false you terminate the > > iteration immediately. > > Matt, this is irrelevant to the asymptotic complexity. It > doesn't help that in some cases you'll stop after the first > comparison. If the sets compare equal surely you'll have to > check all the elements. And if they don't, on average you'll > have to check n/2 elements. Right. > > But the second insertion will be rejected (since an equivalent element > > is already in the set), so there is no issue. > > Not if you call Include or the 4-parameter Insert to add the > element to the set. I still don't see this. The 4-param Insert doesn't change anything, it simply reports whether the insertion succeeds using a Boolean instead of an exception. > The scenario is this: you have a collection of elements, and > you add them to two sets, perhaps using Include, in different > order. Chances are that the resulting sets will not compare > equal, even though they come from the same elements. (They > will be equivalent, though.) But wouldn't this only be true for a multi-set (and only if you don't specify where the new item goes relative to existing equivalent items)? In our case (a unique-element set), if an equivalent item is already in the set, then the insert will fail. (Unless you're talking about Include, but then the new item would replace what's there, so the same element will still be in both sets.) > Because of this counter-intuitive semantics, I believe that > "=" on sets is going to lead to bugs in user code. Most > people won't realize that they want to use Equivalent_Sets, > not "=", to get a reliable answer. The "=" > operator as defined is junk and should not have been included > in the package. Sigh. I still don't understand this. Nor do I understand how you could get rid of set equality if the set type is non-limited. **************************************************************** From: Pascal Leroy Date: Friday, January 26, 2007 8:31 AM > I still don't see this. The 4-param Insert doesn't change > anything, it simply reports whether the insertion succeeds > using a Boolean instead of an exception. > > But wouldn't this only be true for a multi-set (and only if > you don't specify where the new item goes relative to > existing equivalent items)? In our case (a unique-element > set), if an equivalent item is already in the set, then the > insert will fail. (Unless you're talking about Include, but > then the new item would replace what's there, so the same > element will still be in both sets.) An example might help: consider a set of strings, and assume that the "<" is case-insensitive lexicographic comparison (so equivalence is case-insensitive equality). Equality of strings is, of course, case-sensitive. Now consider the insertions: Include (S1, "Ada"); Include (S1, "ADA"); Include (S2, "ADA"); Include (S2, "Ada"); After these calls you have S1={"ADA"} and S2={"Ada"} so S1 and S2 are equivalent (fine) but not equal. Surely the semantics is well-defined, but I find it extremely counter-intuitive, given that the two sets were created from the same string literals. In real life it might not be obvious that the insertions happened in different orders, and anyone depending on equality of sets might get some surprising results. (To make life more interesting, imagine that the insertions of "Ada" and the insertions of "ADA" happen on different tasks, with proper synchronization, but in a non-deterministic order.) Of course, the same problem may happen with the 4-parameter Insert (except that in this case the first insertion, not the last one, wins). > Nor do I understand how you > could get rid of set equality if the set type is non-limited. This is water under the bridge now, but I think that "=" should have been a mere renaming of Equivalent_Sets. **************************************************************** From: Tucker Taft Date: Friday, January 26, 2007 10:05 AM > This is water under the bridge now, but I think that "=" should have been > a mere renaming of Equivalent_Sets. Your points above are valid, but I don't agree with your conclusion. If two sets are equal, then I would hope there are essentially indistinguishable. But these are clearly distinguishable, because if you iterate through the two sets, you get different things. If we wanted to achieve your implied goal that, if you insert the same strings, but in different orders, you end up with the same result, we would need to provide another formal function that would allow the implementation to compare two equivalent elements and decide which one to put in the set. But we clearly don't have that. So I think the distinction between Equivalent_Elements and element "=" is directly analogous to the Equivalent_Sets and set "=", and so I think (once we make it clear that equal-element should imply equivalent-element), that we have the right semantics here. **************************************************************** From: Matthew Heaney Date: Friday, January 26, 2007 10:22 PM > An example might help: consider a set of strings, and assume that the "<" > is case-insensitive lexicographic comparison (so equivalence is > case-insensitive equality). Equality of strings is, of course, > case-sensitive. Yes, the predefined equality operator for type String is case sensitive, but if you have a case insensitive less-than operator then you have to match it with a case insensitive equality operator in the instantiation (if case-insensitive set equality is what you want). > Now consider the insertions: > > Include (S1, "Ada"); > Include (S1, "ADA"); > Include (S2, "ADA"); > Include (S2, "Ada"); > > After these calls you have S1={"ADA"} and S2={"Ada"} so S1 and S2 are > equivalent (fine) but not equal. But it depends on the generic actual equality operator for element type. The fact that it's type String doesn't make any difference. > Surely the semantics is well-defined, > but I find it extremely counter-intuitive, given that the two sets were > created from the same string literals. But if the generic actual equality operator is case sensitive, then the sets are going to compare unequal. Why do you think that that's counter-intuitive? If you want the sets to compare equal, then you have to provide a generic actual equality operator that's case insensitive. In GNAT, I augmented then container library with the following functions: Ada.Strings.Less_Case_Insensitive Ada.Strings.Equal_Case_Insensitive Ada.Strings.Hash_Case_Insensitive If you want string sets to compare equal, then you simply have to say: with Ada.Containers.Indefinite_Ordered_Sets; with Ada.Containers.Less_Case_Insensitive; with Ada.Containers.Equal_Case_Insensitive; package String_Sets is new Ada.Containers.Indefinite_Ordered_Sets (String, "<" => Ada.Containers.Less_Case_Insensitive, "=" => Ada.Containers.Equal_Case_Insensitive); and now all is well. > In real life it might not be > obvious that the insertions happened in different orders, and anyone > depending on equality of sets might get some surprising results. If the general actual equality operator is case sensitive, then there is no surprise: the sets compare unequal. If this would cause surprise, then you solve that problem by supplying a case insensitive element equality operator too. This issue is exactly why we have a generic formal element equality operator. It's up to the developer to supply a generic actual that does not cause surprise (where "surprise" is defined by him). >> Nor do I understand how you >> could get rid of set equality if the set type is non-limited. > > This is water under the bridge now, but I think that "=" should have been > a mere renaming of Equivalent_Sets. I disagree. I think we made exactly the right decision by declaring a generic formal equality operator for the element type, and leaving it up to the developer to pass in (or not) whatever generic actual makes sense for his application. **************************************************************** From: Pascal Leroy Date: Monday, January 29, 2007 2:26 AM > If you want string sets to compare equal, then you simply have to say: > > package String_Sets is > new Ada.Containers.Indefinite_Ordered_Sets > (String, > "<" => Ada.Containers.Less_Case_Insensitive, > "=" => Ada.Containers.Equal_Case_Insensitive); If you believe that the "=" and "<" should be consistent, then: 1 - Why the hell do we require that the user pass in "=" when we can deduce equivalence from the given "<"? 2 - This is a requirement on the user, and it should be spelled out in the RM. We agreed earlier that equality should imply equivalence. My understanding is that you are now saying that equality should be the same as equivalence, otherwise you'll have dependencies on the order of insertion. **************************************************************** From: Pascal Leroy Date: Monday, January 29, 2007 2:32 AM > If two sets are equal, then I would hope there > are essentially indistinguishable. But these are clearly > distinguishable, because if you iterate through the two sets, > you get different things. Yes, you can always distinguish sets through iterations, but that doesn't bother me too much, since you are in some sense "dissecting" the set. In general, the operations of this package that deal with entire sets (not doing any dissection) rely on element equivalence, not element equality. This makes it possible to preserve the natural mathematical invariants, for instance the fact that the intersection is always a subset of the union. On the other hand, when you start using "=", all bets are off regarding these mathematical invariants. The fact that two sets are equal, for instance, doesn't mean that their symmetric difference is empty. I think this is treacherous. > If we wanted to achieve your > implied goal that, if you insert the same strings, but in > different orders, you end up with the same result, we would > need to provide another formal function that would allow the > implementation to compare two equivalent elements and decide > which one to put in the set. But we clearly don't have that. That's not what I am trying to achieve. It would not be very practical, because deciding how to canonicalize the set could be hard. But I think that, as a set-theoretical operation, "=" is badly broken. > So I think the distinction between Equivalent_Elements and > element "=" is directly analogous to the Equivalent_Sets and > set "=", and so I think (once we make it clear that > equal-element should imply equivalent-element), that we have > the right semantics here. I guess that discussions in the abstract don't go anywhere. I would like to see a practical example where the distinction between "=" and Equivalent_Sets is actually useful. We are obviously not going to change the specs of these packages now. But I would like to have a note warning the user that, if element equality and element equivalence are not the same, the result of set equality will in general depend on the sequence of calls used to create each set. **************************************************************** From: Matthew Heaney Date: Monday, January 29, 2007 9:21 AM > If you believe that the "=" and "<" should be consistent, then: They should be consistent if you want set equality to imply set equivalence. > 1 - Why the hell do we require that the user pass in "=" when we can > deduce equivalence from the given "<"? I don't understand this. You can deduce equivalence from "<", but you cannot deduce equality, so you have to pass in equality too. > 2 - This is a requirement on the user, and it should be spelled out in the > RM. The only requirement on the user is that if they want set equality to imply set equivalence, then the user is required to pass in an equality operator such that element equality implies element equivalence. > We agreed earlier that equality should imply equivalence. Yes, but only if set equality is to imply set equivalence. The user decides whether this is true. (And if he decides in the affirmative, then he has the obligation to pass in a suitable element equality operator.) > My > understanding is that you are now saying that equality should be the same > as equivalence, otherwise you'll have dependencies on the order of > insertion. No, that's not what I'm saying at all. I'm saying that if set equality is to imply set equivalence, then element equality must imply element equivalence. That's very different from saying that "equality should be the same as equivalence." **************************************************************** From: Niklas Holsti Date: Monday, January 29, 2007 12:39 PM >> 2 - This is a requirement on the user, and it should be spelled out in >> the RM. > > > The only requirement on the user is that if they want set equality to > imply set equivalence, then the user is required to pass in an equality > operator such that element equality implies element equivalence. More than that: if element equality does not imply element equivalence, the fast algorithm to test equality of sets is wrong because it compares only equivalent elements; an O(N**2) algorithm would be needed. **************************************************************** From: Tucker Taft Date: Monday, January 29, 2007 1:37 PM Before we get too confused by the philosophical discussion, I think we all now agree that we *do* need to make a substantive change in the RM wording to state that: if element "=" does *not* imply element equivalence then the result of set-wise "=" is not specified. This is required to allow implementations to compare only equivalent elements to determine the result of set-wise "=". **************************************************************** From: Pascal Leroy Date: Tuesday, July 17, 2007 2:58 AM The last bit of wording in this AI [version /01 - ED] has: Add "If the actual for the generic formal function "=" does not return True for any pair of equivalent elements, then the behavior of the container function "=" is unspecified." to A.18.8(65.2) and A.18.9(79/2). This is wrong. As explained in the email thread of the AI, the requirement ought to be that equality implies equivalence, not the other way around. Think of a set of strings: equality could be normal string equality, and equivalence could be based on case-insensitive comparison. This is fine, as it makes it possible to implement set equality in linear time, but it fails the above rule. The wording should be changed to: Add "If the actual for the generic formal function "=" returns True for any pair of non-equivalent elements, then the behavior of the container function "=" is unspecified." to A.18.8(65.2) and A.18.9(79/2). ****************************************************************

Questions? Ask the ACAA Technical Agent