Formatting strings in two steps with Python
This note covers a somewhat extreme case of string formatting in Python, but it is a useful way to understand some of the inner workings of string formatting.
Because of some projects at work, I needed to be able to handle string formatting in two steps. Let's say I have a string like this:
var = '{val1}_{val2}.dat'
The thing is that I want to format the string using only val1, leaving the part concerning val2 intact, because that will be passed down to some other code. Basically, I want this:
var.format(val1=123)
to return this:
'123_{val2}.dat'
But if we try, we get a KeyError because val2 is missing. This suggests that Python looks the field names up in a dictionary, and if we could somehow get in between, we could skip the error and return the unformatted part of the string.
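To make the failure concrete, here is a minimal sketch that catches the error and inspects it. The variable name missing is only for illustration.

```python
var = '{val1}_{val2}.dat'

try:
    var.format(val1=123)
except KeyError as err:
    # The exception carries the name of the field that could not be found.
    missing = err.args[0]
    print(f"missing key: {missing}")
```

The lookup fails on the first field that has no matching keyword argument, which is the hook we want to intercept.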
It takes a bit of black-magic googling, but the answer is actually in the docs. We can pass a custom mapping to format_map; we just need to make it look like a dictionary that handles the missing key appropriately:
class FormatDict(dict):
    def __missing__(self, key):
        return '{' + str(key) + '}'
If we try again, this time with format_map instead of format, it works and we get the expected outcome:
>>> var.format_map(FormatDict(val1=123))
'123_{val2}.dat'
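As a sketch of the whole two-step flow, the partially formatted string can later be completed with a plain format call. The names template, step_one and step_two are made up for this example. It also shows why format_map is needed: unpacking with ** copies the entries into an ordinary dict, so __missing__ is never consulted.

```python
class FormatDict(dict):
    def __missing__(self, key):
        # Put the placeholder back so a later step can fill it in.
        return '{' + str(key) + '}'


template = '{val1}_{val2}.dat'

# Step one: fill in only val1.
step_one = template.format_map(FormatDict(val1=123))

# Step two: some other code fills in the rest.
step_two = step_one.format(val2='abc')

# Unpacking the mapping loses the custom __missing__ behaviour.
try:
    template.format(**FormatDict(val1=123))
    unpacking_works = True
except KeyError:
    unpacking_works = False

print(step_one, step_two, unpacking_works)
```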
However, we can go one step further. What happens if we actually specify a format for val2, let's say:
>>> var = '{val1}_{val2:04}.dat'
>>> var.format_map(FormatDict(val1=123))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: '=' alignment not allowed in string format specifier
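The root cause is that the format specification is applied to whatever __missing__ returns, which here is the plain string '{val2}'. The sketch below shows this with the spec 04d instead of 04. That swap is my own choice: as far as I can tell, how a string handles a bare 04 depends on the Python version (newer releases seem to pad the string rather than raise), whereas a string rejects the d code on every version. Either way, the specification does not survive into the output.

```python
class FormatDict(dict):
    def __missing__(self, key):
        return '{' + str(key) + '}'


var = '{val1}_{val2:04d}.dat'

try:
    result = var.format_map(FormatDict(val1=123))
    error = None
except ValueError as err:
    # The string '{val2}' was asked to format itself as an integer.
    result = None
    error = str(err)

print(result, error)
```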
So we need a workaround. A good starting point to look for answers is PEP 3101, where the .format notation is introduced. Note that, under the hood, each field's value is formatted by calling its __format__ method with the format specification (the part after the colon). In the failing example, that value is the string '{val2}' returned by __missing__, and the string rejected the 04 specification. So we want to define our own __format__ method, used only for the keys that are missing:
class FormatPlaceholder:
    def __init__(self, key):
        self.key = key

    def __format__(self, spec):
        result = self.key
        if spec:
            result += ":" + spec
        return "{" + result + "}"


class FormatDict(dict):
    def __missing__(self, key):
        return FormatPlaceholder(key)
In this case, if the key is missing, a FormatPlaceholder is instantiated with the missing key and its __format__ method is called with the specification. We then simply append the specification to the key, if one is provided, and wrap the result in braces. Now it works:
>>> var.format_map(FormatDict(val1=123))
'123_{val2:04}.dat'
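Putting the pieces together, a minimal sketch of both steps with a format specification looks like this. The names template, partial and final are only for illustration.

```python
class FormatPlaceholder:
    def __init__(self, key):
        self.key = key

    def __format__(self, spec):
        # Rebuild the original field, including its spec if there was one.
        result = self.key
        if spec:
            result += ':' + spec
        return '{' + result + '}'


class FormatDict(dict):
    def __missing__(self, key):
        return FormatPlaceholder(key)


template = '{val1}_{val2:04}.dat'

# Step one: val2 is missing, so its field is reproduced untouched.
partial = template.format_map(FormatDict(val1=123))

# Step two: the preserved spec is applied to the real value.
final = partial.format(val2=7)

print(partial, final)
```

This only covers plain fields and format specifications. Fields that use attribute access or indexing, such as {val2.attr} or {val2[0]}, would raise on the placeholder object, and a conversion such as {val2!r} is applied before __format__ is called, so the placeholder would not be reproduced.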
If you want to see this pattern in the real world, check one use in my project. To see how the __format__ method can be used on custom objects, you can check, for example, what Pint does to format quantities including their units.
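As a rough idea of what a custom __format__ can look like on an object that carries a unit, here is a small sketch. The Length class is hypothetical and is not Pint's actual API; it simply hands the specification to the numeric value and appends the unit.

```python
class Length:
    """Hypothetical value-with-unit type, for illustration only."""

    def __init__(self, value, unit):
        self.value = value
        self.unit = unit

    def __format__(self, spec):
        # Delegate the spec to the number, then append the unit.
        return format(self.value, spec) + ' ' + self.unit


distance = Length(3.14159, 'm')

text = f'{distance:.2f}'
plain = '{}'.format(distance)

print(text)
print(plain)
```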