# A strange behavior in passing from list to set

I find it a little puzzling that while the following code:

cand = ["A","B","C","D"]
show(cand)
len(cand)


displays the natural order A, B, C, D, this little variation:

Scand = Set(cand)  # The capital letter is important
show(Scand)


changes the order to B, D, A, C.

This is a problem because when one later asks for Arrangements, they are not listed in the natural order, which is confusing given the large number of cases.


Sets don't have an intrinsic order, and the order in which they are printed is essentially random. If the order is important, you should probably use a list or a tuple.

( 2020-12-21 18:40:42 +0200 )

A question about the comment in the code that says "the capital letter is important".

In what way is that the case?

( 2020-12-21 23:24:18 +0200 )


Sets ensure uniqueness. Lists, strings and tuples preserve order.

Sometimes both uniqueness and order matter.

One can either carefully enter a list, string or tuple, checking uniqueness and order by hand...

... or use a set to enforce uniqueness and sorted to ensure order.

Below we review a number of variations.

First define a special-purpose printing function, to save space when printing arrangements of characters (try list(A) for any of the examples below and compare with the output of xprint(A)):

sage: xprint = lambda A: print(f"{len(A)}: {' '.join(''.join(a) for a in A)}")


Use a string for cand, trusting ourselves to enter it in order and without repetition:

sage: cand = 'ABCD'
sage: A = Arrangements(cand, 2)
sage: print(cand); print(A); xprint(A)
ABCD
Arrangements of the set ['A', 'B', 'C', 'D'] of length 2
12: AB AC AD BA BC BD CA CB CD DA DB DC


The same A would be obtained starting from a list version of cand, built in any of these ways:

sage: cand = ['A', 'B', 'C', 'D']
sage: cand = [x for x in 'ABCD']
sage: cand = list('ABCD')
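For readers without a Sage session at hand, the same twelve arrangements can be reproduced in plain Python. This is only a sketch: it assumes `itertools.permutations` as a stand-in for `Arrangements`, and `format_arrangements` is a hypothetical helper that mirrors `xprint` but returns the string instead of printing it.

```python
from itertools import permutations

def format_arrangements(arrangements):
    """Return 'count: AB AC ...' for an iterable of character tuples."""
    items = [''.join(a) for a in arrangements]
    return f"{len(items)}: {' '.join(items)}"

# The three list constructions shown above all give the same list.
cand_literal = ['A', 'B', 'C', 'D']
cand_comprehension = [x for x in 'ABCD']
cand_from_string = list('ABCD')
same_lists = cand_literal == cand_comprehension == cand_from_string

# permutations(cand, 2) plays the role of Arrangements(cand, 2) here (assumption).
line = format_arrangements(permutations(cand_literal, 2))
print(same_lists)
print(line)
```

Because the input list is already in order and free of repetitions, the pairs come out in the same order as in the Sage session.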


Using a string or list without checking uniqueness gives trouble:

sage: cand = 'ABCDA'
sage: A = Arrangements(cand, 2)
sage: print(cand); print(A); xprint(A)
ABCDA
Arrangements of the multi-set ['A', 'B', 'C', 'D', 'A'] of length 2
13: AA AB AC AD BA BC BD CA CB CD DA DB DC

sage: cand = 'ABDACADABCA'
sage: A = Arrangements(cand, 2)
sage: print(cand); print(A); xprint(A)
ABDACADABCA
Arrangements of the multi-set ['A', 'B', 'D', 'A', 'C', 'A', 'D', 'A', 'B', 'C', 'A'] of length 2
16: AA AB AD AC BA BB BD BC DA DB DD DC CA CB CD CC
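The effect of repeated letters can also be seen in plain Python. In this sketch `itertools.permutations` is again an assumed stand-in: unlike `Arrangements`, it treats equal letters at different positions as distinct items, so repeated results have to be removed by hand to recover the count of 13.

```python
from itertools import permutations

cand = 'ABCDA'  # 'A' appears twice

# permutations works on positions, so the two 'A's count as different items.
raw = [''.join(p) for p in permutations(cand, 2)]

# Drop repeated results while keeping first-appearance order.
distinct = list(dict.fromkeys(raw))

print(len(raw), len(distinct))
print(' '.join(distinct))
```

The extra arrangement compared with the duplicate-free case is `AA`, which only exists because `A` is available twice.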


Using a set for cand forces uniqueness but can cost order:

sage: cand = Set('ABDACADABDA')
sage: A = Arrangements(cand, 2)
sage: print(cand); print(A); xprint(A)
{'B', 'C', 'D', 'A'}
Arrangements of the set ['B', 'C', 'D', 'A'] of length 2
12: BC BD BA CB CD CA DB DC DA AB AC AD


Sorting the above out-of-order arrangements after the fact:

sage: B = sorted(A)
sage: print(B); xprint(B)
[['A', 'B'], ['A', 'C'], ['A', 'D'], ['B', 'A'], ['B', 'C'], ['B', 'D'],
['C', 'A'], ['C', 'B'], ['C', 'D'], ['D', 'A'], ['D', 'B'], ['D', 'C']]
12: AB AC AD BA BC BD CA CB CD DA DB DC


Here we lost some structure: while A was a set of arrangements, with dedicated methods:

sage: A.cardinality()
12
sage: A.an_element()
['B', 'C']
sage: A.random_element()  # random
['A', 'D']
sage: A.random_element()  # random
['C', 'B']


this is no longer the case for B which is only a list now, so that its string representation is less informative, and none of the above methods are available for it.

One of the final two versions below might combine all the qualities you seek.

Use both Set and sorted for cand, to get order and uniqueness:

sage: cand = sorted(Set('ABDACADABDA'))
sage: A = Arrangements(cand, 2)
sage: print(cand); print(A); xprint(A)
['A', 'B', 'C', 'D']
Arrangements of the set ['A', 'B', 'C', 'D'] of length 2
12: AB AC AD BA BC BD CA CB CD DA DB DC


Keep cand a set but use a sorted version of it to compute arrangements:

sage: cand = Set('ABDACADABDA')
sage: A = Arrangements(sorted(cand), 2)
sage: print(cand); print(A); xprint(A)
{'B', 'C', 'D', 'A'}
Arrangements of the set ['A', 'B', 'C', 'D'] of length 2
12: AB AC AD BA BC BD CA CB CD DA DB DC
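If the order of first appearance matters more than alphabetical order, one plain-Python option is to deduplicate with `dict.fromkeys`, which keeps insertion order in Python 3.7 and later. This is not part of the variations above; it is a sketch under the assumption that "natural order" could also mean "order as typed".

```python
letters = 'ABDACADABDA'

# Alphabetical order: deduplicate with a set, then sort.
alphabetical = sorted(set(letters))

# First-appearance order: dict keys are unique and keep insertion order.
first_seen = list(dict.fromkeys(letters))

print(alphabetical)
print(first_seen)
```

Either list has no repetitions, so it could be passed to Arrangements in place of cand.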


Thanks for thr very usefull answer but this was a simple interrogation.

( 2020-12-21 22:06:09 +0200 )

You're welcome. I was not sure exactly what the question was.

So I went for variations on the theme, with some hopefully useful tips and tricks along the way.

I did not comment on why sets don't preserve order, or why sets don't use a "more reasonable" order such as alphabetic order for sets of strings, which maybe was implicitly part of the question...

The reason is that in addition to uniqueness, sets care about efficiency of checking membership, adding elements, taking union and intersection...

Internally, I think set elements are stored according to their hash values, which is why the order in which they are displayed looks arbitrary.
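A small plain-Python sketch of this trade-off, assuming Python's built-in `set` behaves like Sage's `Set` in this respect: equality and membership ignore order, and `sorted` is what makes the display reproducible.

```python
a = set('ABCD')
b = set('DCBA')

same_set = a == b              # insertion order plays no role in equality
has_c = 'C' in a               # membership is a hash lookup, not a scan of positions
union = sorted(a | set('EA'))  # sort only at the point where order is needed

print(same_set, has_c, union)
```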

( 2020-12-21 23:16:49 +0200 )

NB. I fixed some typos in the question, would you fix these two in your comment:

• thr -> the
• usefull -> useful
( 2020-12-21 23:21:46 +0200 )