Ask Your Question
0

Comparing lists

asked 14 years ago

This is probably too basic. If so, I apologize (where should I look for questions like this, if that is the case?). (And I'm not sure I chose the right tag either...)

How do I check all the members of a list are contained in another list? Specifically, I'm interested in lists that consist of (prime) numbers. Perhaps there is a more efficient method in this case?

Preview: (hide)

1 Answer

Sort by » oldest newest most voted
1

answered 14 years ago

mvngu gravatar image

updated 14 years ago

You can use a brute-force search by defining your own custom function. This option doesn't assume that elements in your list are unique. Your lists can contain duplicate elements if you want.

sage: def is_sublist(shortlist, longlist):
....:     for e in shortlist:
....:         if not (e in longlist):
....:             return False
....:     return True
....: 
sage: L = [2, 5, 23]
sage: P = primes_first_n(20); P
[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71]
sage: is_sublist(L, P)
True
sage: L + [23]
[2, 5, 23, 23]
sage: is_sublist(L + [23], P)
True
sage: L.append(next_prime(P[-1])); L
[2, 5, 23, 73]
sage: is_sublist(L, P)
False
sage: is_sublist(L + [23], P)
False

Alternatively, you can use the built-in functions itertools.imap and all. The function itertools.imap is efficient when your lists are large, e.g. having hundreds or even hundreds of thousands of elements. This second option doesn't care if your lists have duplicate elements.

sage: import itertools
sage: L = [2, 5, 23]
sage: P = primes_first_n(20); P
[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71]
sage: L + [23]
[2, 5, 23, 23]
sage: all(itertools.imap(lambda x: x in P, L))
True
sage: all(itertools.imap(lambda x: x in P, L + [23]))
True
sage: L.append(next_prime(P[-1])); L
[2, 5, 23, 73]
sage: all(itertools.imap(lambda x: x in P, L))
False
sage: all(itertools.imap(lambda x: x in P, L + [23]))
False

Or, as Mitesh Patel said, you could use set. This third approach assumes that the elements in each list are unique, i.e. each list doesn't contain duplicate elements.

sage: L = [2, 5, 23]
sage: P = set(primes_first_n(20))
sage: set(L)
set([2, 5, 23])
sage: set(L).issubset(P)
True
sage: set(L + [23])
set([2, 5, 23])
sage: set(L + [23]).issubset(P)
True
sage: L.append(111); L
[2, 5, 23, 111]
sage: set(L)
set([2, 111, 5, 23])
sage: set(L + [111])
set([2, 111, 5, 23])
sage: set(L + [111]).issubset(P)
False
sage: set(L).issubset(P)
False
Preview: (hide)
link

Comments

FYI, I find sage: A = range(10^4) sage: B = range(1,10^4,2) sage: all(x in A for x in B) True to be as fast or faster than any of the option above. Perhaps there is a case that I'm missing where imap is faster, though.

Jason Bandlow gravatar imageJason Bandlow ( 14 years ago )

Thank you! I suspected there was something like issubset, but didn't about about itertools.imap or all.

Andres Caicedo gravatar imageAndres Caicedo ( 14 years ago )

Your Answer

Please start posting anonymously - your entry will be published after you log in or create a new account.

Add Answer

Question Tools

Stats

Asked: 14 years ago

Seen: 2,193 times

Last updated: Sep 15 '10