A strange behavior in passing from list to set

asked 2020-12-21 16:53:01 +0100 by Cyrille

updated 2020-12-21 23:20:53 +0100 by slelievre

I find it a little puzzling that while the following code:

cand = ["A","B","C","D"]
show(cand)
len(cand)

displays the natural order A, B, C, D, this little variation:

Scand = Set(cand)  # The capital letter matters
show(Scand)

changes the order to B, D, A, C.

This is a problem because when one later asks for Arrangements, they are not listed in the natural order, which is confusing given the large number of cases.


Comments

Sets don't have an intrinsic order, and the order in which they are printed is essentially random. If the order is important, you should probably use a list or a tuple.

John Palmieri (2020-12-21 18:40:42 +0100)

A question about the comment in the code stating that the capital letter matters ("la majuscule est importante").

In what way is that the case?

slelievre (2020-12-21 23:24:18 +0100)

1 Answer

answered 2020-12-21 20:10:45 +0100 by slelievre

Sets ensure uniqueness. Lists, strings and tuples preserve order.

Sometimes both uniqueness and order matter.

One can either carefully enter a list, string or tuple, checking uniqueness and order oneself...

... or use a set to enforce uniqueness and sorted to ensure order.
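As a minimal sketch of that second option in plain Python (assuming the built-in set as a stand-in for Sage's Set, which is not used here), removing repeats and then sorting looks like this:

```python
# Plain-Python stand-in: the built-in set plays the role of Sage's Set here.
letters = "ABDACADABDA"

unique = set(letters)     # uniqueness only; iteration order is not guaranteed
ordered = sorted(unique)  # sorted() returns a list in alphabetical order

print(ordered)  # ['A', 'B', 'C', 'D']
```

Note that sorted always returns a list, so the result is no longer a set.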

Below we review a number of variations.

First define a special-purpose printing function, to save space when printing arrangements of characters (try list(A) for any of the examples below and compare with the output of xprint(A)):

sage: xprint = lambda A: print(f"{len(A)}: {' '.join(''.join(a) for a in A)}")

Use a string for cand, trusting ourselves to enter it in order and without repetition:

sage: cand = 'ABCD'
sage: A = Arrangements(cand, 2)
sage: print(cand); print(A); xprint(A)
ABCD
Arrangements of the set ['A', 'B', 'C', 'D'] of length 2
12: AB AC AD BA BC BD CA CB CD DA DB DC
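For readers without Sage at hand, the same listing can be cross-checked in plain Python. This is only a sketch: it assumes itertools.permutations as a stand-in for Arrangements (it is not what Sage uses internally, as far as I know), and xprint is rewritten as a regular function.

```python
from itertools import permutations

def xprint(A):
    """Print the number of arrangements, then each one joined into a short string."""
    print(f"{len(A)}: {' '.join(''.join(a) for a in A)}")

cand = "ABCD"
# permutations(cand, 2) yields ordered pairs taken from distinct positions,
# following the order in which the letters appear in cand.
A = list(permutations(cand, 2))
xprint(A)  # 12: AB AC AD BA BC BD CA CB CD DA DB DC
```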

The same A would be obtained starting from a list version of cand, built in any of these ways:

sage: cand = ['A', 'B', 'C', 'D']
sage: cand = [x for x in 'ABCD']
sage: cand = list('ABCD')

Using a string but forgetting to check uniqueness gives trouble:

sage: cand = 'ABCDA'
sage: A = Arrangements(cand, 2)
sage: print(cand); print(A); xprint(A)
ABCDA
Arrangements of the multi-set ['A', 'B', 'C', 'D', 'A'] of length 2
13: AA AB AC AD BA BC BD CA CB CD DA DB DC

sage: cand = 'ABDACADABCA'
sage: A = Arrangements(cand, 2)
sage: print(cand); print(A); xprint(A)
ABDACADABCA
Arrangements of the multi-set ['A', 'B', 'D', 'A', 'C', 'A', 'D', 'A', 'B', 'C', 'A'] of length 2
16: AA AB AD AC BA BB BD BC DA DB DD DC CA CB CD CC
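The counts 13 and 16 can be double-checked with a small plain-Python sketch. The helper name distinct_arrangements is hypothetical, and itertools.permutations is again only a stand-in for Arrangements. The order it produces may differ from Sage's, so only the counts are compared.

```python
from itertools import permutations

def distinct_arrangements(cand, k):
    """Distinct length-k arrangements of the characters of cand, repeats allowed
    only as often as a character occurs in cand."""
    # permutations() works on positions, so repeated letters give duplicate
    # results; dict.fromkeys drops the duplicates.
    return list(dict.fromkeys("".join(p) for p in permutations(cand, k)))

few_repeats = distinct_arrangements("ABCDA", 2)
many_repeats = distinct_arrangements("ABDACADABCA", 2)
print(len(few_repeats), len(many_repeats))  # 13 16
```

With a single repeated A, only AA is added to the 12 arrangements of distinct letters. Once every letter occurs at least twice, all 16 ordered pairs appear.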

Using a set for cand forces uniqueness but can cost order:

sage: cand = Set('ABDACADABDA')
sage: A = Arrangements(cand, 2)
sage: print(cand); print(A); xprint(A)
{'B', 'C', 'D', 'A'}
Arrangements of the set ['B', 'C', 'D', 'A'] of length 2
12: BC BD BA CB CD CA DB DC DA AB AC AD

Sorting the above out-of-order arrangements after the fact:

sage: B = sorted(A)
sage: print(B); xprint(B)
[['A', 'B'], ['A', 'C'], ['A', 'D'], ['B', 'A'], ['B', 'C'], ['B', 'D'],
 ['C', 'A'], ['C', 'B'], ['C', 'D'], ['D', 'A'], ['D', 'B'], ['D', 'C']]
12: AB AC AD BA BC BD CA CB CD DA DB DC

Here we lost some structure: while A was a set of arrangements, with dedicated methods:

sage: A.cardinality()
12
sage: A.an_element()
['B', 'C']
sage: A.random_element()  # random
['A', 'D']
sage: A.random_element()  # random
['C', 'B']

this is no longer the case for B which is only a list now, so that its string representation is less informative, and none of the above methods are available for it.

One of the final two versions below might combine all the qualities you seek.

Use both Set and sorted for cand, to get order and uniqueness:

sage: cand = sorted(Set('ABDACADABDA'))
sage: A = Arrangements(cand, 2)
sage: print(cand); print(A); xprint(A)
['A', 'B', 'C', 'D']
Arrangements of the set ['A', 'B', 'C', 'D'] of length 2
12: AB AC AD BA BC BD CA CB CD DA DB DC

Keep cand a set but use a sorted version of it to compute arrangements:

sage: cand = Set('ABDACADABDA')
sage: A = Arrangements(sorted(cand), 2)
sage: print(cand); print(A); xprint(A)
{'B', 'C', 'D', 'A'}
Arrangements of the set ['A', 'B', 'C', 'D'] of length 2
12: AB AC AD BA BC BD CA CB CD DA DB DC
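The "deduplicate, then sort, then arrange" recipe can also be wrapped in a small helper. Here is a plain-Python sketch, where the name sorted_arrangements is hypothetical and itertools.permutations stands in for Arrangements:

```python
from itertools import permutations

def sorted_arrangements(cand, k):
    """Length-k arrangements of the distinct items of cand, in sorted order."""
    items = sorted(set(cand))  # set() removes repeats, sorted() fixes the order
    return ["".join(p) for p in permutations(items, k)]

A = sorted_arrangements("ABDACADABDA", 2)
print(f"{len(A)}: {' '.join(A)}")
# 12: AB AC AD BA BC BD CA CB CD DA DB DC
```

Unlike the Sage object, the result here is a plain list, so methods such as cardinality or random_element are not available on it.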

Comments

Thanks for thr very usefull answer but this was a simple interrogation.

Cyrille (2020-12-21 22:06:09 +0100)

You're welcome. I was not sure exactly what the question was.

So I went for variations on the theme, with some hopefully useful tips and tricks along the way.

I did not comment on why sets don't preserve order, or why sets don't use a "more reasonable" order such as alphabetic order for sets of strings, which maybe was implicitly part of the question...

The reason is that in addition to uniqueness, sets care about efficiency of checking membership, adding elements, taking union and intersection...

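Internally, I think set elements are stored in a hash table, so the order in which they come out depends on their hashes rather than on insertion or alphabetical order.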

slelievre (2020-12-21 23:16:49 +0100)
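To illustrate the point about hashing, here is a small sketch with the plain Python set (an assumption: Sage's Set may behave differently in its details). It shows that membership tests need no ordering, and two ways to get a stable order back.

```python
cand = set("ABDACADABDA")

# Membership tests rely on hashing, so they do not depend on any ordering.
print("C" in cand, "Z" in cand)  # True False

# Iteration order of a set is an implementation detail. For strings it can
# even change between Python runs, because string hashing is randomized
# unless PYTHONHASHSEED is fixed. Sort when a stable order is needed.
print(sorted(cand))  # ['A', 'B', 'C', 'D']

# If the order of first appearance is wanted instead, dict keys keep
# insertion order (guaranteed since Python 3.7).
first_seen = list(dict.fromkeys("ABDACADABDA"))
print(first_seen)  # ['A', 'B', 'D', 'C']
```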

NB. I fixed some typos in the question, would you fix these two in your comment:

  • thr -> the
  • usefull -> useful
slelievre (2020-12-21 23:21:46 +0100)
