New interesting data structures in Python 3


Python 3’s uptake is rising sharply these days, so I think it is a good time to take a look at some data structures that Python 3 offers but that are not available in Python 2.

We will take a look at typing.NamedTuple, types.MappingProxyType and types.SimpleNamespace, all of which are new to Python 3.

typing.NamedTuple

typing.NamedTuple is a supercharged version of the venerable collections.namedtuple and while it was added in Python 3.5, it really came into its own in Python 3.6.

In comparison to collections.namedtuple, typing.NamedTuple gives you (Python >= 3.6):

  • nicer syntax

  • methods and docstrings written directly in the class body

  • type annotations

  • default values (Python >= 3.6.1)

  • equally fast

See an illustrative typing.NamedTuple example below:

>>> from typing import NamedTuple

>>> class Student(NamedTuple):
...     name: str
...     address: str
...     age: int
...     sex: str
...
>>> tommy = Student(name='Tommy Johnson', address='Main street', age=22, sex='M')
>>> tommy
    Student(name='Tommy Johnson', address='Main street', age=22, sex='M')

I like the class-based syntax compared to the old function-based syntax, and find this much more readable.
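For comparison, here is a minimal sketch of the same record written with the old function-based syntax next to the class-based one. The name StudentOld is made up for this illustration; note that the function-based form has no place to state the field types.

```python
from collections import namedtuple
from typing import NamedTuple, get_type_hints

# Function-based syntax: field names only, no type annotations.
StudentOld = namedtuple('StudentOld', ['name', 'address', 'age', 'sex'])


# Class-based syntax: one annotated field per line.
class Student(NamedTuple):
    name: str
    address: str
    age: int
    sex: str


old = StudentOld(name='Tommy Johnson', address='Main street', age=22, sex='M')
new = Student(name='Tommy Johnson', address='Main street', age=22, sex='M')

# Both are plain tuples underneath, so they compare equal field by field.
print(old == new)                     # True
print(Student._fields)                # ('name', 'address', 'age', 'sex')
print(get_type_hints(Student)['age'])  # <class 'int'>
```

The annotations are not enforced at runtime; they are there for readers and for static type checkers.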

The Student class is a subclass of tuple, so it can be handled like any normal tuple:

>>> isinstance(tommy, tuple)
    True
>>> tommy[0]
    'Tommy Johnson'

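A more advanced example uses default values (note: default values require Python >= 3.6.1). The default has to be declared in the NamedTuple class body itself: redeclaring sex in a subclass of Student would only create an ordinary class attribute, and the constructor would still require the sex argument.

>>> class MaleStudent(NamedTuple):
...     name: str
...     address: str
...     age: int
...     sex: str = 'M'  # default value, requires Python >= 3.6.1
...
>>> MaleStudent(name='Tommy Johnson', address='Main street', age=22)
    MaleStudent(name='Tommy Johnson', address='Main street', age=22, sex='M')  # note that sex defaults to 'M'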

In short, this modern version of namedtuples is just super-nice, and will no doubt become the standard namedtuple variation in the future.
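As a small sketch of what the class-based syntax makes convenient, a method can be written directly in the class body, and the usual namedtuple helpers such as _replace and _asdict keep working. The is_adult method and its threshold of 18 are assumptions made up for this example.

```python
from typing import NamedTuple


class Student(NamedTuple):
    name: str
    address: str
    age: int
    sex: str

    def is_adult(self) -> bool:
        # Hypothetical helper, not part of the original example.
        return self.age >= 18


tommy = Student(name='Tommy Johnson', address='Main street', age=22, sex='M')

# Instances are immutable; _replace returns a modified copy instead.
older = tommy._replace(age=23)

print(tommy.is_adult())         # True
print(tommy.age, older.age)     # 22 23
print(tommy._asdict()['name'])  # Tommy Johnson
```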

See the docs for further details.

types.MappingProxyType

types.MappingProxyType is used as a read-only dict and was added in Python 3.3.

That types.MappingProxyType is read-only means that it can’t be directly manipulated; if users want to make changes, they have to deliberately make a copy and make changes to that copy. This is perfect if you’re handing a dict-like structure over to a data consumer and you want to ensure that the data consumer does not unintentionally change the original data. This is often extremely useful, as data consumers changing passed-in data structures lead to very obscure bugs that are difficult to track down.

A types.MappingProxyType example:

>>> from types import MappingProxyType
>>> data = {'a': 1, 'b': 2}
>>> read_only = MappingProxyType(data)
>>> del read_only['a']
TypeError: 'mappingproxy' object does not support item deletion
>>> read_only['a'] = 3
TypeError: 'mappingproxy' object does not support item assignment

Note that the example shows that the read_only object cannot be directly changed.

So, if you want to deliver data dicts to different functions or threads and want to ensure that a function is not changing data that is also used by another function, you can just deliver a MappingProxyType object to all functions, rather than the original dict, and the data dict now cannot be changed unintentionally. An example illustrates this usage of MappingProxyType:

>>> def my_func(in_dict):
...     ...  # lots of code
...     in_dict['a'] *= 10  # oops, a bug, this will change the sent-in dict
...
>>> # in some function/thread:
>>> my_func(data)
>>> data
{'a': 10, 'b': 2}  # oops, note that data['a'] has now changed as a side effect of calling my_func

If you send in a mappingproxy to my_func instead, however, attempts to change the dict will result in an error:

>>> my_func(MappingProxyType(data))
TypeError: 'mappingproxy' object does not support item assignment

We now see that we have to correct the code in my_func to first copy in_dict and then alter the copied dict to avoid this error. This feature of mappingproxy is great, as it helps us avoid a whole class of difficult-to-find bugs.
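One possible correction of my_func is sketched below. It assumes that a shallow copy is enough, which holds here because the values are plain integers; returning the modified copy is a choice made for this illustration.

```python
from types import MappingProxyType


def my_func(in_dict):
    # Copy first: dict() accepts any mapping, including a mappingproxy.
    local = dict(in_dict)
    local['a'] *= 10  # only the private copy is changed
    return local


data = {'a': 1, 'b': 2}
result = my_func(MappingProxyType(data))

print(result)  # {'a': 10, 'b': 2}
print(data)    # {'a': 1, 'b': 2}
```

If the values were themselves mutable (lists, nested dicts), the copy would still share them with the original, and copy.deepcopy would be the safer choice.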

Note though that while read_only is read-only, it is not immutable, so if you change data, read_only will change too:

>>> data['a'] = 3
>>> data['c'] = 4
>>> read_only  # changed!
mappingproxy({'a': 3, 'b': 2, 'c': 4})

We see that read_only is actually a view of the underlying dict, and is not an independent object. This is something to be aware of. See the docs for further details.
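If a frozen snapshot is wanted rather than a live view, one approach is to wrap a private copy of the dict, so that nobody else holds a reference to the underlying mapping. This is a sketch; the names view and snapshot are made up for the example.

```python
from types import MappingProxyType

data = {'a': 1, 'b': 2}

view = MappingProxyType(data)            # live view of data
snapshot = MappingProxyType(dict(data))  # view of a private shallow copy

data['c'] = 4

print('c' in view)      # True: the view follows the underlying dict
print('c' in snapshot)  # False: the copy was taken before the change
```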

types.SimpleNamespace

types.SimpleNamespace is a simple class that provides attribute access to its namespace, as well as a meaningful repr. It was added in Python 3.3.

>>> from types import SimpleNamespace

>>> data = SimpleNamespace(a=1, b=2)
>>> data
namespace(a=1, b=2)
>>> data.c = 3
>>> data
namespace(a=1, b=2, c=3)

In short, types.SimpleNamespace is just an ultra-simple class, allowing you to set, change and delete attributes while it also provides a nice repr output string.
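The sketch below shows changing and deleting attributes, plus the round trip between a dict and a namespace. The settings dict and its keys are made up for this example, and the keys need to be valid identifiers for attribute access to work.

```python
from types import SimpleNamespace

settings = {'host': 'localhost', 'port': 8080}  # hypothetical data

ns = SimpleNamespace(**settings)  # dict -> namespace
ns.port = 9090                    # change an attribute
del ns.host                       # delete an attribute

print(ns)        # namespace(port=9090)
print(vars(ns))  # {'port': 9090}  (namespace -> dict)

# Two namespaces compare equal when their attributes are equal.
print(SimpleNamespace(a=1) == SimpleNamespace(a=1))  # True
```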

I sometimes use this as an easier-to-read-and-write alternative to dict. More and more though, I subclass it to get the flexible instantiation and repr output for free:

>>> import random

>>> class DataBag(SimpleNamespace):
...     def choice(self):
...         items = self.__dict__.items()
...         return random.choice(tuple(items))
...
>>> data_bag = DataBag(a=1, b=2)
>>> data_bag
DataBag(a=1, b=2)
>>> data_bag.choice()  # random, so this may also return ('a', 1)
('b', 2)

This subclassing of types.SimpleNamespace is not really revolutionary, but it can save a few lines of code in some very common cases, which is nice. See the docs for details.

