Sets
A set is, of course, a datatype that can contain only unique entries, in no particular order. Contrast this with a list, which can contain as many references to a single object as you want, as in this example:
Python 2.3a1 (#1, Jan 15 2003, 22:10:49)
[GCC 2.95.3 20010315 (SuSE)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> class A: pass
...
>>> a = A()
>>> a
<__main__.A instance at 0x401c612c>
>>> l = [a, a, a]
>>> l
[<__main__.A instance at 0x401c612c>, <__main__.A instance at 0x401c612c>, <__main__.A instance at 0x401c612c>]
You can't do that with sets, as you can see here:
>>> import sets
>>> s = sets.Set([a, a])
>>> s
Set([<__main__.A instance at 0x401c612c>])
Sets exist in two varieties: mutable (sets.Set) and immutable (sets.ImmutableSet). You can add and remove entries from mutable sets; the contents of an immutable set are fixed upon creation. Sets can contain only immutable (strictly speaking, hashable) objects. To make it possible to nest sets within sets, some magic is implemented: a mutable set that is added to another set is automatically converted to an immutable equivalent. However, this magic means that you should not nest mutable sets in sets in a multithreaded environment.
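The difference between the two varieties, and the hashability requirement, can be sketched roughly as follows. Note the assumption: this sketch uses the built-in set and frozenset types of later Python versions as stand-ins for sets.Set and sets.ImmutableSet, so that it runs without the sets module. The built-in types do not perform the automatic conversion described above, so nesting a mutable set fails outright.

```python
# Assumption: built-in set/frozenset (later Python versions) stand in
# for sets.Set / sets.ImmutableSet from the interim sets module.

# A mutable set accepts additions and removals.
mutable = set(['a', 'b'])
mutable.add('c')
mutable.remove('a')

# An immutable set has no mutating methods at all.
frozen = frozenset(['a', 'b'])
can_add_to_frozen = hasattr(frozen, 'add')

# Elements must be hashable, so a list is rejected as a set entry.
try:
    set([['x', 'y']])
    list_rejected = False
except TypeError:
    list_rejected = True

# Immutable sets are hashable and can therefore be nested in a set.
nested = set([frozen, frozenset(['c'])])

# A mutable built-in set is not hashable; unlike sets.Set, it is not
# converted automatically, so nesting it raises TypeError.
try:
    set([mutable])
    mutable_rejected = False
except TypeError:
    mutable_rejected = True
```

The variable names (mutable, frozen, nested and the flags) are purely illustrative.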
Sets come in really handy when you need to find the intersection of two collections of data. Traditionally, you'd store your data in lists and loop over them to determine what's in both lists and what's in only one or the other. With sets this becomes a single operation, and it works even if your lists contain duplicate data:
>>> a = 'a'
>>> b = 'b'
>>> c = 'c'
>>> d = 'd'
>>> l = [a, a, b]
>>> l2 = [b, c, d]
>>> sets.Set(l).intersection(sets.Set(l2))
Set(['b'])
Or, you can create a union of your two lists:
>>> sets.Set(l).union(sets.Set(l2))
Set(['a', 'c', 'b', 'd'])
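For comparison, here is a rough sketch of the traditional loop next to the set operations, including the "only in one list or the other" cases that the sessions above do not show. It assumes the built-in set type of later Python versions in place of sets.Set, so that it runs without the sets module; the method names used here (intersection, difference, symmetric_difference, union) exist on both. The result names are made up for illustration.

```python
# Assumption: the built-in set type stands in for sets.Set.
l = ['a', 'a', 'b']
l2 = ['b', 'c', 'd']

# Traditional approach: loop over one list, test membership in the
# other, and guard against duplicates by hand.
in_both_loop = []
for item in l:
    if item in l2 and item not in in_both_loop:
        in_both_loop.append(item)

# Set approach: duplicates vanish on construction, and each question
# is answered by a single method call.
s = set(l)
s2 = set(l2)
in_both = s.intersection(s2)                 # entries in both lists
only_in_l = s.difference(s2)                 # entries only in l
only_in_l2 = s2.difference(s)                # entries only in l2
in_exactly_one = s.symmetric_difference(s2)  # entries in one list but not both
in_either = s.union(s2)                      # entries in at least one list
```

Because sets are unordered, compare results with other sets rather than relying on the order in which the entries are printed.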
The current sets module is meant as an interim solution: the eventual goal is to provide sets as a new built-in datatype.