Formatting strings in two steps with Python
This note may be a somewhat extreme case of string formatting in Python, but it is nonetheless useful for understanding some of the inner workings of string formatting.
Because of some projects at work, I needed to be able to handle string formatting in two steps. Let's say I have a string like this:
var = '{val1}_{val2}.dat'
The thing is that I want to format the string using only val1, leaving the part concerning val2 intact, because that will be passed down to some other code. Basically, I want this:
var.format(val1=123)
to return this:
'123_{val2}.dat'
But if we try, we get a KeyError because val2 is missing. This means that Python looks the field names up in a dictionary, and if we could somehow get in between, we could skip the error and return the unformatted part of the string.
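To make the failure concrete, here is a minimal sketch that reproduces it. The name `template` is only used for illustration; the exception argument carries the name of the field that could not be found.

```python
template = '{val1}_{val2}.dat'

try:
    # Only val1 is supplied, so the lookup of val2 fails.
    template.format(val1=123)
except KeyError as err:
    # The first argument of the KeyError is the missing field name.
    missing = err.args[0]
    print(f'missing key: {missing!r}')
```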
It takes a bit of black-magic googling, but the answer is actually in the docs. We can pass our own mapping to format_map; we just need to make it look like a dictionary that handles the missing key appropriately:
class FormatDict(dict):
    def __missing__(self, key):
        return '{' + str(key) + '}'
If we try again, now it works and we get the expected outcome:
>>> var.format_map(FormatDict(val1=123))
'123_{val2}.dat'
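As a sketch of the full two-step flow, the partially formatted string can later be completed with an ordinary call to format. The names `template`, `first_pass` and `second_pass` are only illustrative.

```python
class FormatDict(dict):
    def __missing__(self, key):
        # Put the placeholder back for any key we do not know yet.
        return '{' + str(key) + '}'


template = '{val1}_{val2}.dat'

# Step 1: fill in val1 only; val2 survives as a placeholder.
first_pass = template.format_map(FormatDict(val1=123))

# Step 2: some other code fills in the rest with a regular format call.
second_pass = first_pass.format(val2='abc')

print(first_pass)
print(second_pass)
```

Note that this relies on format_map, which uses the mapping object directly. Calling `template.format(**FormatDict(val1=123))` would copy the keys into a plain dictionary, so `__missing__` would never be consulted.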
However, we can go one step further. What happens if we actually specify a format for val2, let's say:
>>> var = '{val1}_{val2:04}.dat'
>>> var.format_map(FormatDict(val1=123))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: '=' alignment not allowed in string format specifier
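So, we need a workaround. The error appears because __missing__ returns a plain string, and the 04 specification is then applied to that string, where zero padding is not valid. (As far as I can tell, Python 3.10 and later no longer raise here, but the 04 is still silently dropped from the result.) A good starting point to look for answers is PEP 3101, where the .format notation is introduced. Note that, under the hood, the format specification of each field is passed to the __format__ method of the corresponding value. So, we want to define our own __format__ method, only for the keys that are missing: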
class FormatPlaceholder:
    def __init__(self, key):
        self.key = key

    def __format__(self, spec):
        result = self.key
        if spec:
            result += ":" + spec
        return "{" + result + "}"

class FormatDict(dict):
    def __missing__(self, key):
        return FormatPlaceholder(key)
In this case, if the key is missing, a FormatPlaceholder is instantiated with the missing key and its __format__ method is called. Then, we simply append the specification to the result in case it is provided. Now we get it to work:
>>> var.format_map(FormatDict(val1=123))
'123_{val2:04}.dat'
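Because the specification survives the first pass, the second step can still apply it. The following sketch runs both steps end to end; `template`, `first_pass` and `second_pass` are illustrative names, and the value 7 is an arbitrary example.

```python
class FormatPlaceholder:
    def __init__(self, key):
        self.key = key

    def __format__(self, spec):
        # Rebuild the original field, including the spec if there was one.
        result = self.key
        if spec:
            result += ':' + spec
        return '{' + result + '}'


class FormatDict(dict):
    def __missing__(self, key):
        return FormatPlaceholder(key)


template = '{val1}_{val2:04}.dat'

# Step 1: val2 is unknown, so its field is written back unchanged.
first_pass = template.format_map(FormatDict(val1=123))

# Step 2: the preserved 04 spec now zero-pads the integer.
second_pass = first_pass.format(val2=7)

print(first_pass)
print(second_pass)
```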
If you want to see this pattern in the real world, check one use of it in my project. To see how the __format__ method can be used on custom objects, you can check, for example, what Pint does to format quantities including their units.
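As a rough idea of what a custom __format__ can look like on your own objects, here is a small sketch. The Length class is hypothetical and is not how Pint implements it; it only shows the specification being applied to the number while the unit is appended afterwards.

```python
class Length:
    """Hypothetical value-with-unit type, for illustration only."""

    def __init__(self, value, unit):
        self.value = value
        self.unit = unit

    def __format__(self, spec):
        # Apply the spec to the numeric part only, then append the unit.
        return format(self.value, spec) + ' ' + self.unit


distance = Length(3.14159, 'm')

rounded = '{:.2f}'.format(distance)
plain = '{}'.format(distance)

print(rounded)
print(plain)
```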