Formatting strings in two steps with Python

This note may be a bit of an extreme case of how to format strings with Python, but it is nonetheless useful for understanding some of the inner workings of string formatting.

Because of some projects at work, I needed to handle string formatting in two steps. Let's say I have a string like this:

var = '{val1}_{val2}.dat'

The thing is that I want to format the string using only val1 and leave the part concerning val2 intact, because it will be passed down to some other code. Basically, I want this:

var.format(val1=123)

to return this:

'123_{val2}.dat'

But if we try, we get a KeyError because val2 is missing. This suggests that Python looks up each field name in a mapping built from the arguments, and if we can supply our own mapping, we can avoid the error and return the unformatted part of the string.
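As a quick check of the failure described above, here is a minimal sketch that catches the KeyError and inspects which field was missing. The names template and missing are only illustrative.

```python
template = '{val1}_{val2}.dat'

try:
    # Only val1 is supplied, so looking up val2 fails.
    template.format(val1=123)
except KeyError as error:
    # The exception carries the name of the field that was not found.
    missing = error.args[0]
    print(f"missing field: {missing}")
```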

It takes a bit of black-magic googling, but the answer is actually in the docs. We can use str.format_map, which takes the mapping directly; we just need a dictionary subclass that handles the missing key appropriately:

class FormatDict(dict):
    def __missing__(self, key):
        return '{' + str(key) + '}'

If we try again, now it works and we get the expected outcome:

>>> var.format_map(FormatDict(val1=123))
'123_{val2}.dat'
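To make the two steps explicit, here is a small sketch of the whole round trip. The value 'run' and the names partial and final are made up for illustration; the second step stands in for whatever downstream code receives the string.

```python
class FormatDict(dict):
    def __missing__(self, key):
        # Re-emit the placeholder for any key we do not know yet.
        return '{' + str(key) + '}'


var = '{val1}_{val2}.dat'

# Step 1: fill in val1 only; val2 survives as a placeholder.
partial = var.format_map(FormatDict(val1=123))

# Step 2: some other code fills in val2 with plain str.format.
final = partial.format(val2='run')

print(partial)
print(final)
```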

However, we can go one step further. What happens if we actually specify a format for val2, let's say:

>>> var = '{val1}_{val2:04}.dat'
>>> var.format_map(FormatDict(val1=123))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: '=' alignment not allowed in string format specifier
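The traceback above comes from applying the spec 04 to the plain string returned by __missing__. The sketch below reproduces that step in isolation with the built-in format function. The helper apply_spec is hypothetical, and the outcome for strings depends on the Python version (older versions raise, while Python 3.10 and later accept the spec), so the code handles both cases rather than assuming one.

```python
def apply_spec(value, spec):
    """Apply a format spec to a value, reporting a ValueError as text."""
    try:
        return format(value, spec)
    except ValueError as error:
        return f"ValueError: {error}"


# A number accepts zero-padding to width 4.
number_result = apply_spec(7, '04')

# The replacement string may be rejected or returned without the spec,
# depending on the Python version. Either way, ':04' is gone.
string_result = apply_spec('{val2}', '04')

print(number_result)
print(string_result)
```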

So, we need a workaround. A good starting point to look for answers is PEP 3101, where the .format notation is introduced. The error arises because the spec 04 is applied to the plain string returned by __missing__ (Python 3.10 and later no longer raise here, but the spec is still not preserved). Under the hood, each field value has its __format__ method called with the format spec. So, we want the missing keys to return an object with our own __format__ method:

class FormatPlaceholder:
    def __init__(self, key):
        self.key = key

    def __format__(self, spec):
        result = self.key
        if spec:
            result += ":" + spec
        return "{" + result + "}"

class FormatDict(dict):
    def __missing__(self, key):
        return FormatPlaceholder(key)

In this case, if the key is missing, a FormatPlaceholder is instantiated with the missing key and its __format__ method is called. Then, we simply append the specification to the result if one is provided. Now it works:

>>> var.format_map(FormatDict(val1=123))
'123_{val2:04}.dat'
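Putting the pieces together, here is a self-contained sketch of both steps with a format spec. The value 7 and the names partial and final are illustrative only.

```python
class FormatPlaceholder:
    def __init__(self, key):
        self.key = key

    def __format__(self, spec):
        # Rebuild the original field, including the spec if there was one.
        result = self.key
        if spec:
            result += ":" + spec
        return "{" + result + "}"


class FormatDict(dict):
    def __missing__(self, key):
        return FormatPlaceholder(key)


var = '{val1}_{val2:04}.dat'

# Step 1: val1 is filled in, val2 keeps its spec.
partial = var.format_map(FormatDict(val1=123))

# Step 2: the spec is finally applied to a real value.
final = partial.format(val2=7)

print(partial)
print(final)
```

Note that this sketch only preserves the format spec. Conversions such as {val2!r} or attribute lookups such as {val2.attr} on a missing key are not reconstructed.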

If you want to see this pattern in the real world, check my project for one use. To see how the __format__ method can be used on custom objects, you can check, for example, what Pint does to format quantities including their units.
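As a rough idea of what __format__ on a custom object can look like, here is a minimal sketch of a value with a unit. The Length class is hypothetical and is not how Pint implements it; it only shows the spec being forwarded to the number.

```python
class Length:
    """Hypothetical value-with-unit class, for illustration only."""

    def __init__(self, value, unit):
        self.value = value
        self.unit = unit

    def __format__(self, spec):
        # Apply the spec to the number only, then append the unit.
        return format(self.value, spec) + " " + self.unit


text = 'Distance: {d:.2f}'.format(d=Length(3.14159, 'm'))
print(text)
```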


Aquiles Carattino
© 2021 Aquiles Carattino
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License