EP 182 · Invertible Parsing · Mar 21, 2022 ·Members

Video #182: Invertible Parsing: Map


Episode: Video #182 Date: Mar 21, 2022 Access: Members Only 🔒 URL: https://www.pointfree.co/episodes/ep182-invertible-parsing-map


Description

Our parser-printer library is looking incredible, but there’s a glaring problem that we have not yet addressed. We haven’t been able to make one of our favorite operations, map, printer-friendly. The types simply do not line up. This week we will finally address this shortcoming.

Video

Cloudflare Stream video ID: 39784a790580fd3480ffbcfafbe0fe8e Local file: video_182_invertible-parsing-map.mp4 *(download with --video 182)*

References

Transcript

0:05

So we are now able to parse and print on more general types of input, which means we do not need to sacrifice performance in order to unify parsing and printing.

0:19

This is looking incredible, but there’s still one glaring problem that we have not yet addressed. Earlier we came across some innocent-looking code that simply mapped on a parser to transform its output, and we found that we couldn’t make that operation printer-friendly. The types simply did not line up.

0:37

Well, we are finally ready to tackle this seemingly simple problem. Turns out it’s not quite so simple, and to solve it we really have to contort our minds in some uncomfortable ways. It truly gets at the heart of what it means for printing to be the inverse of parsing.

Map and parser-printers

0:57

Recall that when we first constructed the field parser we had a .map operation tacked on in order to transform the parsed substring into a string since that’s what the User struct expects:

let fieldString = OneOf {
  Parse {
    "\""
    Prefix { $0 != "\"" }
    "\""
  }
  Prefix { $0 != "," }
}
.map { String($0) }

1:14

Doing these mapping transformations helps us massage the parser outputs into something more friendly. We wouldn’t want to stick Substring types in our models, and so chaining on .map helps us get more friendly types out of the parser such as strings.

1:30

This becomes even more important when dealing with UTF8Views because we would definitely not want to expose our model types to those weird details:

let fieldString = OneOf {
  Parse {
    "\"".utf8
    Prefix { $0 != .init(ascii: "\"") }
    "\"".utf8
  }
  Prefix { $0 != .init(ascii: ",") }
}
.map { String(Substring($0)) }

1:45

But the moment we did this we lost printing capabilities:

fieldString.print

Referencing instance method ‘print(_:to:)’ on ‘Parsers.ZipVOV’ requires that ‘Parsers.Map’ conform to ‘Printer’

1:51

And that’s because Parsers.Map, which is the parser that is returned under the hood from the .map operation, does not and cannot conform to the Printer protocol:

extension Parsers.Map: Printer where Upstream: Printer {
  func print(_ output: Output, to input: inout Upstream.Input) throws {
    self.upstream.print(<#Upstream.Output#>, to: &<#Upstream.Input#>)
  }
}

2:07

These types simply do not line up. We don’t have an Upstream.Output to feed to upstream.print. We do have a transform function, but it can only transform Upstream.Output into NewOutput, and we need the opposite direction.

2:38

Honestly it shouldn’t be too surprising that we can’t implement this. After all, parser-printers are bidirectional operations. They can parse inputs into outputs and they can print outputs into inputs.

2:48

It seems to be asking too much to be able to transform the output of a parser-printer given only a single transformation from output to a new output. It may sound like we just have a “feeling” about the impossibility of making this into a printer, but it is actually intimately related to the concept of “positive” and “negative” position of function arguments, which is a topic we dove deep into in our episode on contravariance .

3:18

In that episode we defined the concept of “positive” position of a type parameter in a function signature as being on the right side of a function arrow, and “negative” position as being on the left side of the arrow. This is a rough description, and there’s more nuance to it than just that, but with that definition we can see that the Output type parameter is in both positive and negative position for the parser-printer signature, after de-sugaring the inout syntax:

parse: (Input) throws -> (Input, Output)
print: (Output, Input) throws -> Input

4:25

Our episode further showed that one can only define “map”-like operations on type parameters that are in positive position, and there’s a corresponding “contramap” operation on type parameters in negative position.
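To make the positive/negative distinction concrete, here is a minimal sketch (our own illustrative types, not from the episode): a type whose parameter sits purely in positive position supports map, while one whose parameter sits purely in negative position supports contramap.

```swift
// A is in positive position (right of the arrow), so Producer supports map.
struct Producer<A> {
  let run: () -> A

  func map<B>(_ f: @escaping (A) -> B) -> Producer<B> {
    Producer<B> { f(self.run()) }
  }
}

// A is in negative position (left of the arrow), so Consumer supports contramap.
struct Consumer<A> {
  let run: (A) -> Void

  func contramap<B>(_ f: @escaping (B) -> A) -> Consumer<B> {
    Consumer<B> { self.run(f($0)) }
  }
}
```

Because a parser-printer’s Output appears in both positions at once, neither operation alone suffices, which is exactly the tension being worked out here.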

4:41

When type parameters appear on both the left and right side of a function arrow things are not so simple. We cannot simply transform one of the type parameters with a single function, but rather you need two functions for transforming in both directions.

4:55

So, what we want is a .map operation defined on parser-printers that takes transformations going in both directions and returns a newly transformed parser-printer.

5:05

Let’s just copy and paste the current .map operation to figure out what exactly needs to change:

extension Parser {
  func map<NewOutput>(
    _ transform: @escaping (Output) -> NewOutput
  ) -> Parsers.Map<Self, NewOutput> {
    .init(upstream: self, transform: transform)
  }
}

5:27

First thing we will change is to make this map transformation only work on printers. Its signature is going to look quite different from the map we are already familiar with, so let’s further constrain our extension to be only for when the parser is also a printer:

extension Parser where Self: Printer {
}

5:38

As we said a moment ago, this single transform from Output to NewOutput is not enough to transform a parser-printer into another parser-printer. We need some way of reversing the effects of the transform. Let’s call it untransform for lack of a better word right now:

extension Parser where Self: Printer {
  func map<NewOutput>(
    transform: @escaping (Output) -> NewOutput,
    untransform: @escaping (NewOutput) -> Output
  ) -> Parsers.Map<Self, NewOutput> {
    .init(upstream: self, transform: transform)
  }
}

6:05

Then using this information we need to return some kind of new type that conforms to the Parser and Printer protocols. The other version of map returned something called Parsers.Map , but we will create a new type for this.

6:23

Again, for lack of a better word, let’s just call this InvertibleMap. It will be generic over the parser being transformed, as well as the new output we are transforming into:

struct InvertibleMap<Upstream: Parser & Printer, NewOutput>: Parser, Printer {
}

7:00

Now one thing that’s strange here is we have two spellings for parser-printer conformances. We have Parser, Printer and Parser & Printer . We could maybe clean up this mismatch by employing the same trick the Swift standard library employs for Codable .

7:29

If we jump to its definition we will see that Codable is simply a type alias to the composition of Decodable and Encodable : public typealias Codable = Decodable & Encodable

7:35

So maybe we could do the same: typealias ParserPrinter = Parser & Printer

7:44

And now we have a succinct, consistent means of labeling parser-printers:

struct InvertibleMap<
  Upstream: ParserPrinter,
  NewOutput
>: ParserPrinter {
}

7:52

Next we need to hold onto the upstream parser-printer we are transforming, as well as the transform functions:

struct InvertibleMap<
  Upstream: ParserPrinter,
  NewOutput
>: ParserPrinter {
  public let upstream: Upstream
  public let transform: (Upstream.Output) -> NewOutput
  public let untransform: (NewOutput) -> Upstream.Output
}

8:15

And somehow we need to parse the upstream’s input into the new output, and print a new output into the upstream’s input:

func parse(_ input: inout Upstream.Input) throws -> NewOutput {
  <#code#>
}

func print(_ output: NewOutput, to input: inout Upstream.Input) throws {
  <#code#>
}

8:27

Let’s take these on one at a time, starting with the parse method. We have an Upstream.Input value, which is exactly what the upstream’s parse method takes, so let’s start by running that:

func parse(_ input: inout Upstream.Input) throws -> NewOutput {
  try self.upstream.parse(&input)
}

8:41

Invoking the parse method returns an Upstream.Output value, but what we really need to return is a NewOutput value, which is different. Luckily we have a way to convert Upstream.Output values into NewOutput values, and that’s the transform function:

func parse(_ input: inout Upstream.Input) throws -> NewOutput {
  try self.transform(self.upstream.parse(&input))
}

8:52

That gets the parse method compiling, so now we turn to the print method. Here we need to somehow print a NewOutput value into an Upstream.Input value. The upstream’s print method doesn’t deal with NewOutput values. It deals with Upstream.Output values:

func print(_ output: NewOutput, to input: inout Upstream.Input) throws {
  self.upstream.print(<#Upstream.Output#>, to: &<#Upstream.Input#>)
}

9:10

Lucky for us we have a way of transforming NewOutput values back into Upstream.Output values, and that’s the untransform function:

func print(_ output: NewOutput, to input: inout Upstream.Input) throws {
  try self.upstream.print(self.untransform(output), to: &input)
}

9:23

And this completes the implementation of InvertibleMap. It represents a parser combinator that can transform any parser-printer into a new one of a different output type, as long as you provide transformations that can convert in both directions between the output types.

9:43

It’s pretty amazing to see the bodies of the parse and print endpoints side-by-side:

try self.transform(self.upstream.parse(&input))
…
try self.upstream.print(self.untransform(output), to: &input)

9:49

They are incredibly symmetric. To parse we simply run the parser and then transform the output, and to print we simply untransform the output and then print. This makes it clear why we really need transformations going in both directions. If we only had one direction this would be impossible.

10:07

We can now return this parser from our new map operation that takes two transformations:

extension Parser where Self: Printer {
  func map<NewOutput>(
    transform: @escaping (Output) -> NewOutput,
    untransform: @escaping (NewOutput) -> Output
  ) -> Parsers.InvertibleMap<Self, NewOutput> {
    .init(
      upstream: self,
      transform: transform,
      untransform: untransform
    )
  }
}

10:21

This implements the operator, and now everything is compiling.

10:26

It’s worth mentioning why we think it’s OK to call this operation “map” even though it doesn’t look anything like the map we know and love. We did a deep dive into the map operation in the early days of Point-Free where we showed that many seemingly different types support a map operation. Everyone is probably familiar with arrays having map, and perhaps even optionals, but also results have map, dictionaries have map, asynchronous values have map, random number generators have map, and of course parsers have map.

10:57

In that episode we distilled the essence of map as having the following fundamental shape:

map: ((A) -> B) -> (F<A>) -> F<B>

As well as satisfying some basic properties.

11:18

So, map is something that can lift functions from A to B to functions from F<A> to F<B>. For example, given a function from Int to String, map can turn that into a function from an array of Ints to an array of Strings. Or, given a function from String to Bool, map can turn that into a function from a result of a String to a result of a Bool. And it goes on and on for many, many types.

map: ((A) -> B) -> (F<A>) -> F<B>
map: ((A) -> B) -> (Array<A>) -> Array<B>
map: ((A) -> B) -> (Optional<A>) -> Optional<B>
map: ((A) -> B) -> (Result<A, _>) -> Result<B, _>
map: ((A) -> B) -> (Dictionary<_, A>) -> Dictionary<_, B>
…

11:59

Now, although we said the fundamental shape is something that can lift functions from A to B into functions from F<A> into F<B>, that is actually unnecessarily restrictive. The important thing here isn’t that we literally have a function from A to B, but rather that we have some process for turning As into Bs.

12:19

Of course, functions are the most prototypical process for turning As into Bs, but there are others. In the past we have used this reasoning to justify the pullback operation on reducers, where we consider key paths and case paths to be a process for turning Roots into Values.

12:41

The “process” in this reversible map operation isn’t a simple function, but rather a pair of functions. In fact, let’s give this pair of functions its own dedicated name:

struct Conversion<A, B> {
  let transform: (A) -> B
  let untransform: (B) -> A
}

13:08

The naming for the functions held inside Conversion is a little strange. Not only are the names quite long, but “untransform” isn’t even a real English word. Perhaps better would be to call the properties apply and unapply:

struct Conversion<A, B> {
  let apply: (A) -> B
  let unapply: (B) -> A
}

This reads a bit nicer.

13:22

And then we can map a parser-printer with a conversion like so:

extension Parser where Self: Printer {
  …
  func map<NewOutput>(
    _ conversion: Conversion<Output, NewOutput>
  ) -> Parsers.InvertibleMap<Self, NewOutput> {
    .init(
      upstream: self,
      transform: conversion.apply,
      untransform: conversion.unapply
    )
  }
}

13:45

By giving the pair of functions a dedicated name we can now more clearly see why we think it’s OK to call this a map transformation. Given a conversion from A to B we can turn a parser-printer of As into a parser-printer of Bs:

map: (Conversion<A, B>) -> (ParserPrinter<_, A>) -> ParserPrinter<_, B>

14:19

Now this looks quite similar to all the other maps we know and love from Swift. The only difference is that we use conversions as the process for turning As into Bs rather than plain functions.

14:32

We’ve seen this pattern before. When discussing ways to transform reducers in the Composable Architecture we defined a pullback operation that used key paths and case paths to do their work. For example, to transform a reducer that works on local state to a reducer that works on global state, you can do so via a key path from global state to local state:

pullback: (WritableKeyPath<A, B>) -> (Reducer<B, _, _>) -> Reducer<A, _, _>

15:05

And similarly, to transform a reducer that works on local actions to one that works on global actions you can use a case path from global actions to local actions:

pullback: (CasePath<A, B>) -> (Reducer<_, B, _>) -> Reducer<_, A, _>

15:18

Technically these transformations are going in the opposite direction, but the principle still stands.

Mapping our parser-printers

15:33

OK, that was a lot of theory to explain the map operation without actually making use of it yet.

15:52

We can use it in much the same way that we did for the map operation on parsers, but now we just need to supply a conversion.

16:05

For example, our field parser currently produces a Substring as its output because we didn’t know how to map it to turn that Substring into a regular String:

let field = OneOf {
  Parse {
    "\""
    Prefix { $0 != "\"" }
    "\""
  }
  Prefix { $0 != "," }
}

16:24

Now we know we just need to map with a conversion so that we can describe how to simultaneously transform the parsing and printing:

let field = OneOf {
  Parse {
    "\""
    Prefix { $0 != "\"" }
    "\""
  }
  Prefix { $0 != "," }
}
.map(
  .init(
    apply: <#(Substring) -> String#>,
    unapply: <#(String) -> Substring#>
  )
)

16:30

Now we just need to supply transformations for turning Substrings into Strings and vice versa:

let field = OneOf {
  Parse {
    "\""
    Prefix { $0 != "\"" }
    "\""
  }
  Prefix { $0 != "," }
}
.map(
  .init(
    apply: { String($0) },
    unapply: { Substring($0) }
  )
)

16:54

And just like that we finally have a parser-printer that outputs Strings rather than Substrings:

input = ""
try field.print("Blob, Esq." as String, to: &input)
input
try field.parse(&input) as String

17:11

Even better, this conversion from Substring to String is going to be very common, so we can actually bundle it up into a reusable conversion that lives as a static on the Conversion type:

extension Conversion where A == Substring, B == String {
  static let string = Self(
    apply: String.init,
    unapply: Substring.init
  )
}

17:40

And now we can simplify the field parser to just:

let field = OneOf {
  …
}
.map(.string)

17:50

The next parser we really wanted to be able to map on was the user parser. Right now it just produces a tuple as output, but ideally we could map that tuple into a User struct value.

18:03

If we try applying the map operation at the end of the user parser we will find we need to supply two transformations: one that can turn the tuple into some new output, and one that can turn that new output back into a tuple:

let user = ParsePrint {
  Int.parser()
  Skip {
    ","
    zeroOrOneSpace
  }
  field
  Skip {
    ","
    zeroOrOneSpace
  }
  Bool.parser()
}
.map(
  .init(
    apply: <#((Int, String, Bool)) -> NewOutput#>,
    unapply: <#(NewOutput) -> (Int, String, Bool)#>
  )
)

18:20

The apply direction is easy enough to fill in because that’s precisely the User’s initializer:

.map(
  .init(
    apply: User.init(id:name:admin:),
    unapply: <#(NewOutput) -> (Int, String, Bool)#>
  )
)

18:28

To go in the unapply direction we need to somehow turn a user back into a tuple of an integer, string and boolean. This is essentially the opposite of initializing a user from a tuple of data. We now need to destructure the user back into a raw tuple of data:

.map(
  .init(
    apply: User.init(id:name:admin:),
    unapply: { ($0.id, $0.name, $0.admin) }
  )
)

18:44

And amazingly we can now parse and print real users, not just tuples of user data:

input = ""
try user.print(
  User(id: 1, name: "Blob", admin: true),
  to: &input
)
input // "1,Blob,true"
try user.parse(&input) as User

18:54

Even better, we can also print arrays of users, and if any of those users have a name with a comma it will automatically be quoted:

input = ""
try users.print(
  [
    User(id: 1, name: "Blob", admin: true),
    User(id: 2, name: "Blob, Esq.", admin: true),
  ],
  to: &input
)
input // "1,Blob,true\n2,\"Blob, Esq.\",true"
try users.parse(&input) as [User]

19:13

This all looks really amazing. This user CSV parser-printer looks remarkably similar to what we had when it was only a parser. There are only two small changes we had to make to the original parser to make it printing-capable. First, we had to map with a substring-to-string conversion to massage the name into a string, and then we had to map with a tuple-to-User conversion to massage all the user data into a proper User struct value.

19:49

Let’s take another look at this latter one:

.map(
  .init(
    apply: User.init(id:name:admin:),
    unapply: { ($0.id, $0.name, $0.admin) }
  )
)

19:53

There are two things we can do to make this nicer. First, we can bundle this ad hoc conversion into a static helper like we did with the substring-to-string conversion. If we did it naively we might just literally copy and paste the conversion into a static:

extension Conversion where A == (Int, String, Bool), B == User {
  static let user = Self(
    apply: User.init(id:name:admin:),
    unapply: { ($0.id, $0.name, $0.admin) }
  )
}

20:19

And then we can map the user parser-printer into the User struct:

let user = ParsePrint {
  …
}
.map(.user)

20:24

But this really hasn’t accomplished much. We are just moving code around.

20:28

The real problem with this code is that it is incredibly verbose for something that should be very easy. Before we were considering printers we were able to simply map on a parser of tuples with a struct initializer to immediately bundle the data up into a proper Swift data type: .map(User.init)

20:43

But, if we want to preserve printing we need to provide transformations that go both ways: something to turn the tuple of integer, string and boolean into a User , and something to turn a User into a tuple of integer, string and boolean.

20:55

Fortunately for us Swift provides a very succinct way of providing this first piece of information. The automatically synthesized initializer for User is precisely a function that turns a tuple of data into a User : User.init as (Int, String, Bool) -> User

21:16

But not so fortunate for us Swift does not provide any nice way of going in the opposite direction: (User) -> (Int, String, Bool)

21:27

There is no easy way to “destructure” a user value into a tuple of the bare data that defines the value. This means every time we want to map a tuple of outputs produced by some parsers into a first-class Swift data type, we are going to have to create a conversion from scratch to do the value-to-tuple transformation. And that’s a pain because this is something we need to do a lot when creating parsers.

21:46

Well, luckily for us there’s a runtime magic trick we can perform to make this a lot easier. It isn’t perfect, and it would still be much better if Swift provided a first-class solution, but it will greatly reduce boilerplate, and make creating parser-printers a much nicer experience.

22:01

The trick begins by realizing that there is a way to coerce a tuple of data into a struct data type value without using an initializer. There’s a function called unsafeBitCast that allows you to convert one type to another type as long as they have the same underlying memory layout. A tuple of an integer, string and boolean and a User struct are very different things as far as the type system is concerned, but their raw data is the same. A User struct is just a wrapper around a tuple of data so that the type system can distinguish it from other types with identical data.

22:33

To show how this works, we can take a tuple of an integer, string and boolean and instantly convert it to a User struct value: unsafeBitCast((1, "Blob", true), to: User.self)

22:45

And we can go in the opposite direction: unsafeBitCast( User(id: 1, name: "Blob", admin: true), to: (Int, String, Bool).self )

23:01

As the name suggests, this function is not safe. It can easily lead to crashes, and works completely outside the purview of the compiler and type system. If we make a change to the tuple we will immediately get a crash:

unsafeBitCast(
  User(id: 1, name: "Blob", admin: true),
  to: (Int, String, Bool, Int).self
)

Fatal error: Can’t unsafeBitCast between types of different sizes

23:17

Or if we changed the fields of the User struct we would also get a crash.

23:31

This function operates so fully outside the purview of the compiler that it even allows you to construct values of types that have a completely private interface. For example, we could have a struct with a private initializer:

struct Private {
  private let value: Int
  private init(value: Int) {
    self.value = value
  }
}

23:46

And it is not possible to construct this:

Private(value: 10)

‘Private’ initializer is inaccessible due to ‘private’ protection level

23:49

But, it is possible to bitcast an integer to the type, thus making it possible to construct: unsafeBitCast(1, to: Private.self)

24:21

Of course this works now, but if someone came along and changed the memory layout of the struct by adding or removing a field, then this unsafeBitCast would instantly start crashing:

struct Private {
  let value: Int
  private let other = 1
  private init(value: Int) {
    self.value = value
  }
}

unsafeBitCast(1, to: Private.self)

Fatal error: Can’t unsafeBitCast between types of different sizes

24:36

And the compiler couldn’t do anything to help you find that problem early on.

24:44

This is why having a type like Never in Swift is useful. It is provably an uninhabited type, which means there are no values whatsoever in the type. We couldn’t coerce a value to be of type Never if we wanted, and the compiler knows this. In programming languages that can’t express uninhabited types you are forced to use structs or classes with private initializers, and just hope there are no tricks for secretly instantiating a value. But you will never have the compiler proving it definitively for you.
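As a small aside, the compiler’s knowledge that Never is uninhabited can be demonstrated directly (this is our own illustrative snippet, not from the episode):

```swift
// Never has no cases, so no value of it can ever exist. The compiler
// knows this: a switch over Never is exhaustive with zero cases, which
// lets us write the classic "absurd" function returning any type at all.
func absurd<A>(_ never: Never) -> A {
  switch never {}
}
```

No trick, bit cast or otherwise, can produce the Never value this function would need to be called with, and the compiler can prove it.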

25:17

Now, although this unsafeBitCast function is dangerous to use, there are ways to make it slightly safer than if we just used it willy-nilly in our code. To see how, let’s first update our static user conversion to use unsafeBitCast under the hood, even though it was easy enough to destructure the user into a tuple:

extension Conversion where A == (Int, String, Bool), B == User {
  static let user = Self(
    apply: User.init(id:name:admin:),
    unapply: { unsafeBitCast($0, to: (Int, String, Bool).self) }
  )
}

25:43

This compiles and the users parser-printer works exactly as it did before. We haven’t accomplished much because this is still very ad-hoc code concerned just with the User struct. But we can start generalizing it a bit.

25:55

For one thing, the user type we initialize is determined by the B of the conversion, and the tuple type that we pass to unsafeBitCast is already determined by the A type of the conversion, so we can simplify things by spelling it out in those terms:

extension Conversion where A == (Int, String, Bool), B == User {
  static let user = Self(
    apply: B.init,
    unapply: { unsafeBitCast($0, to: A.self) }
  )
}

26:11

This sets up a prototype of how we can greatly genericize the process of converting a struct into a tuple and back. Let’s copy and paste this conversion, remove the constraints to make it fully generic, and rename the property to `struct` (escaped in backticks since struct is a reserved word) to signify that it is used for creating structs from tuple data:

extension Conversion {
  static let `struct` = Self(
    apply: B.init,
    unapply: { unsafeBitCast($0, to: A.self) }
  )
}

26:31

Amazingly this would already compile if it wasn’t for the User initializer we are using. Turns out, this is the exact piece of information we need to hand to the conversion so that it can do its job. The User initializer is a function that turns a tuple into a User value, and that information alone is enough for all the types to be figured out:

extension Conversion {
  static func `struct`(_ `init`: @escaping (A) -> B) -> Self {
    .init(
      apply: `init`,
      unapply: { unsafeBitCast($0, to: A.self) }
    )
  }
}

27:08

With this conversion we can now simplify mapping our user parser to bundle the tuple of data we extract into a User struct: let user = ParsePrint { … } .map(.struct(User.init))

27:20

Even better, we could create a new overloaded initializer of the Parse entry point that takes a conversion. Currently we have an overloaded initializer that takes a simple, one-directional transformation function, but in order to preserve printing functionality we need a full conversion. We can just copy and paste the existing initializer and make a few small tweaks:

extension Parse {
  init<Upstream, NewOutput>(
    _ conversion: Conversion<Upstream.Output, NewOutput>,
    @ParserBuilder with build: () -> Upstream
  ) where Parsers == Parsing.Parsers.InvertibleMap<Upstream, NewOutput> {
    self.init { build().map(conversion) }
  }
}

28:28

And now our user parser-printer looks almost identical to what it was when it was just a parser: let user = ParsePrint(.struct(User.init)) { … }

28:49

The only difference is that we have to use this explicitly .struct conversion to keep us in the printing world.

29:04

Now, it is worth reiterating that this .struct conversion is quite dangerous to use. It is only meant to be used with the default synthesized initializer for a struct that maps a tuple of data directly to the underlying storage of the struct. Any changes to the signature of the initializer can cause a crash or other subtle bugs, including leaving out fields that you provide defaults for or even re-ordering the fields.

29:26

For example, suppose we had a Person struct that just holds a first and last name: struct Person { let firstName, lastName: String }

29:33

We can cook up a parser-printer to turn two space-separated strings into a Person value:

let person = Parse(.struct(Person.init)) {
  Prefix { $0 != " " }.map(.string)
  " "
  Prefix { $0 != " " }.map(.string)
}

30:03

And we can give this a spin by parsing an input and then printing it back:

input = "Blob McBlob"
let p = try person.parse(&input)
input // ""
try person.print(p, to: &input)
input // "Blob McBlob"

30:29

Nothing too surprising here. The printing string matches exactly what we started with.

30:32

But something surprising does happen if we provide a custom initializer that swaps the first and last name arguments:

struct Person {
  let firstName, lastName: String

  init(lastName: String, firstName: String) {
    self.firstName = firstName
    self.lastName = lastName
  }
}

30:49

This looks innocent enough, but it secretly introduces a subtle bug into our parser-printer. If we run the round-tripping code again we will see that the printed string is “McBlob Blob” even though the original input was “Blob McBlob”:

input = "Blob McBlob"
let p = try person.parse(&input)
input // ""
try person.print(p, to: &input)
input // "McBlob Blob"

31:02

If we print out the intermediate Person value we will see that parsing got things mixed up: firstName: "McBlob" lastName: "Blob"

31:11

It put the last name in the first field and the first name in the last. And then, because printing just takes the struct and destructures it into a tuple, we print with the names flipped.

31:21

So, small changes to the initializer of structs can break the round-tripping guarantees that we demand of our parser-printers. But there are even worse things that can happen. Suppose we add a new field to the Person struct, but we don’t update the initializer and instead set a default value:

struct Person {
  let firstName, lastName: String
  let bio: String

  init(lastName: String, firstName: String) {
    self.firstName = firstName
    self.lastName = lastName
    self.bio = ""
  }
}

31:40

Now our round-tripping code crashes:

input = "Blob McBlob"
let p = try person.parse(&input)
print(p)
try person.print(p, to: &input)
input

Fatal error: Can’t unsafeBitCast between types of different sizes

31:45

The crash happens on the printing side because the first thing it tries to do is convert the Person value into a (String, String) tuple, since that’s what the Person initializer takes. But the Person struct now holds three strings, and it is not possible to bit cast between tuples of different sizes:

unsafeBitCast(("A", "B"), to: (String, String, String).self)

Fatal error: Can’t unsafeBitCast between types of different sizes

31:59

So, this is the danger of using this .struct conversion to hide away some of the boilerplate we encountered when constructing the conversion from scratch. As long as you are using the automatically synthesized initializer for your struct data types you should be OK. And there are some checks we can make before bit casting to prevent crashes.

32:17

For example, we could check that A’s and B’s memory layouts match before bit casting. This would catch the crash we just had because a 2-tuple has a different memory layout than the Person type does:

MemoryLayout<(String, String)>.size // 32
MemoryLayout<Person>.size // 48
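A minimal sketch of that safeguard might look like the following (our own code; the checks in the final library may differ). It compares layouts up front and throws instead of trapping:

```swift
// A hypothetical error thrown when a bit cast would be unsafe.
struct BitCastError: Error {}

// A checked variant of unsafeBitCast: verify that the two types have the
// same size, stride and alignment before casting, and throw otherwise.
// Note this cannot catch every mismatch (e.g. reordered fields of the
// same type), so it reduces, rather than eliminates, the danger.
func checkedBitCast<A, B>(_ value: A, to type: B.Type) throws -> B {
  guard
    MemoryLayout<A>.size == MemoryLayout<B>.size,
    MemoryLayout<A>.stride == MemoryLayout<B>.stride,
    MemoryLayout<A>.alignment == MemoryLayout<B>.alignment
  else { throw BitCastError() }
  return unsafeBitCast(value, to: B.self)
}
```

With a guard like this, the Person example above would throw a catchable error rather than crash the process.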

32:49

We aren’t going to dive deep into those topics right now, but suffice it to say that this .struct conversion can be made safer, and it will be safer in the final open-sourced version of this library.

32:58

We have now fully converted our users parser to be a parser-printer. At the end of the day, there are only two small changes that had to be made to upgrade the parser to a parser-printer. We had to change the .map on the field parser to use a conversion, and we had to change the ParsePrint initializer for the User parser to also use a conversion:

let field = OneOf {
  Parse {
    "\""
    Prefix { $0 != "\"" }
    "\""
  }
  Prefix { $0 != "," }
}
// .map(String.init)
.map(.string)

let zeroOrOneSpace = OneOf {
  " "
  ""
}

// let user = Parse(User.init) {
let user = ParsePrint(.struct(User.init)) {
  Int.parser()
  Skip {
    ","
    zeroOrOneSpace
  }
  field
  Skip {
    ","
    zeroOrOneSpace
  }
  Bool.parser()
}

let users = Many {
  user
} separator: {
  "\n"
} terminator: {
  End()
}

33:19

We needed those two small changes so that we could bidirectionally describe how to transform between the two different types of output we were dealing with. Once that was done we instantly got the ability to parse and print arrays of User struct values, and there is just one single package of logic and functionality that accomplishes all of this. The moment we update some logic it will be simultaneously updated for both the parser and printer. You no longer need to explicitly remember to go update the printer after updating the parser or vice versa.

Printing finesse

33:48

So, this is looking pretty amazing, but there are two small improvements we want to make before moving on. In the ad-hoc User printer we made a while ago we put in extra logic so that we always print a single space after each comma even though the parser is capable of consuming zero or one spaces after each comma. Further, in previous episodes when dealing with the user parser we had upgraded the admin boolean flag to be a full Role enum that described multiple roles: guest, admin and member. Let’s see what it takes to implement those features in our parser-printer.

34:31

Let’s start with preferring to print a space after each comma. Currently we can see that indeed no space is being printed: input = "" try users.print( [ .init(id: 1, name: "Blob", admin: true), .init(id: 2, name: "Blob, Esq.", admin: true), ], to: &input ) input // "1,Blob,true\n2,"Blob, Esq.",true"

34:54

Our parser can clearly handle zero or one spaces, and does so using this little OneOf parser: let zeroOrOneSpace = OneOf { " " "" }

35:06

Remember that the parsers listed in a OneOf should go from most specific to least specific. That is, the first parser should succeed on fewer inputs than the second. In this case, the single space string " " acting as a parser will succeed on any string with a leading space. Whereas the empty string "" acting as a parser will succeed on every string.
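The ordering matters because OneOf commits to the first parser that succeeds. A tiny standalone sketch (using a hand-rolled literal matcher, not the library's API) shows what goes wrong if the always-succeeding empty string comes first:

```swift
// Try each literal in order and consume the first one that matches.
func parseFirstMatch(_ input: inout Substring, literals: [String]) -> String? {
  for literal in literals where input.hasPrefix(literal) {
    input.removeFirst(literal.count)
    return literal
  }
  return nil
}

// Most specific first: the leading space is consumed.
var good: Substring = " a"
_ = parseFirstMatch(&good, literals: [" ", ""])  // good is now "a"

// Least specific first: "" matches immediately and nothing is consumed.
var bad: Substring = " a"
_ = parseFirstMatch(&bad, literals: ["", " "])  // bad is still " a"
```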

35:24

So, as a parser, zeroOrOneSpace will first try to parse a single space from the input, and if that fails it will parse an empty string, which always succeeds. But, as a printer, zeroOrOneSpace goes in the reverse direction, first trying to print an empty string, and if that fails it will print a single space. The only problem is that printing an empty string always succeeds, and that is why no space is printed after the comma.

35:54

Turns out that sometimes we just want to override the printing behavior of a parser so that we can force a preference for how we want things printed. We can even theorize how we might want this syntax to look like. What if we could tack on a .printing operator to any parser-printer in order to override its print functionality: let zeroOrOneSpace = OneOf { " " "" } .printing(" ")

36:19

This allows us to continue parsing zero or one spaces, but printing will always be just one space.

36:27

Let’s see what it takes to implement this operator. We can start by getting a signature in place. It will be a method on the Parser protocol that takes the input we want to print, and it will somehow return a new parser-printer: extension Parser { func printing(_ input: Input) -> <#???#> { <#???#> } }

36:47

Like all operators in the library, we need to create a new type that conforms to Parser and Printer protocols that implements the actual logic for the operator. We will call this type Printing and will nest it in the Parsers namespace: extension Parsers { struct Printing: ParserPrinter { } }

37:16

It will be generic over the upstream parser that it is overriding and we will hold onto that parser as well as the input that we want to always print: extension Parsers { struct Printing<Upstream: Parser>: ParserPrinter { let upstream: Upstream let input: Upstream.Input } }

37:55

And with this defined, we can finish implementing the printing method: extension Parser { func printing(_ input: Input) -> Parsers.Printing<Self> { .init(upstream: self, input: input) } }

38:24

Implementing the parse method on this new parser-printer is as simple as calling out to the upstream parser: func parse(_ input: inout Upstream.Input) throws -> Upstream.Output { try self.upstream.parse(&input) }

38:33

Printing, however, is a little trickier: func print(_ output: Upstream.Output, to input: inout Upstream.Input) { <#???#> }

38:36

We know nothing about Upstream.Input so there doesn’t seem to be any way to somehow “append” self.input to the inout input we have access to. Sounds like we can’t implement this in full generality, but we can if we constrain this to only work with AppendableCollection s, which covers the majority of our use cases: struct Printing<Upstream: Parser>: ParserPrinter where Upstream.Input: AppendableCollection { let upstream: Upstream let input: Upstream.Input func parse(_ input: inout Upstream.Input) throws -> Upstream.Output { try self.upstream.parse(&input) } func print( _ output: Upstream.Output, to input: inout Upstream.Input ) { input.append(contentsOf: self.input) } }
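To see the shape of this combinator in isolation, here is a self-contained sketch over string input with a minimal ParserPrinter protocol of our own. The protocol and type names mirror the episode but are simplified stand-ins, not the library's real definitions:

```swift
protocol ParserPrinter {
  associatedtype Output
  func parse(_ input: inout Substring) throws -> Output
  func print(_ output: Output, to input: inout String) throws
}

// Parses zero or one leading spaces; on its own it prints nothing.
struct ZeroOrOneSpace: ParserPrinter {
  func parse(_ input: inout Substring) throws -> Void {
    if input.first == " " { input.removeFirst() }
  }
  func print(_ output: Void, to input: inout String) throws {}
}

// Keeps the upstream's parsing behavior but always prints a fixed string.
struct Printing<Upstream: ParserPrinter>: ParserPrinter {
  let upstream: Upstream
  let literal: String
  func parse(_ input: inout Substring) throws -> Upstream.Output {
    try self.upstream.parse(&input)
  }
  func print(_ output: Upstream.Output, to input: inout String) throws {
    input.append(self.literal)
  }
}

let zeroOrOneSpace = Printing(upstream: ZeroOrOneSpace(), literal: " ")
```

Parsing still tolerates zero or one spaces, but printing now always emits exactly one.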

39:42

And now the zeroOrOneSpace parser-printer with preferred printing compiles.

39:52

And incredibly printing an array of users now adds a single space after each comma: input = "" try users.print( [ .init(id: 1, name: "Blob", admin: true), .init(id: 2, name: "Blob, Esq.", admin: true), ], to: &input ) input // "1, Blob, true\n2, "Blob, Esq.", true"

40:02

Again we want to point out how amazing it is that we have parsing and printing packaged up into a single entity. There is only one place to make tweaks to the logic of our parser-printers.

40:18

Let’s look at the final missing piece to our user parser-printer, which is introducing a proper Role enum to describe 3 choices: an admin, guest or member: enum Role { case admin, guest, member } struct User { var id: Int var name: String var admin: Role }

40:42

This breaks a lot of things because we are parsing booleans where we should be parsing roles. Constructing a role parser is quite straightforward. It can be done with a OneOf and a parser for each case of the enum: let role = OneOf { "admin".map { Role.admin } "guest".map { Role.guest } "member".map { Role.member } }

41:11

However, this role parser is not a printer: role.print Referencing instance method ‘print(_:to:)’ on ‘OneOf’ requires that ‘Parsers.OneOf3<Parsers.Map<String, Role>, Parsers.Map<String, Role>, Parsers.Map<String, Role>>’ conform to ’Printer’

41:17

This is just because OneOf3 does not conform to the Printer protocol. We already made OneOf2 conform, but we haven’t yet taken care of OneOf3 . Let’s do that real quick, remembering that we need to try the printers in reverse order, from least specific to most specific: extension Parsers.OneOf3: Printer where P0: Printer, P1: Printer, P2: Printer { func print(_ output: P0.Output, to input: inout P0.Input) throws { let original = input do { try self.p2.print(output, to: &input) } catch { input = original do { try self.p1.print(output, to: &input) } catch { input = original try self.p0.print(output, to: &input) } } } }

42:25

But the role parser is still not a printer: Referencing instance method ‘print(_:to:)’ on ‘Parsers.OneOf3’ requires that ‘Parsers.Map<String, Role>’ conform to ’Printer’

42:33

And this is because we are using the .map operation that takes a closure rather than a conversion. The map operation with a closure only describes a single direction of how to transform the output of the "admin" string parser, which is Void , to a Role .

42:53

But we need to go in the opposite direction too. We need to somehow turn a Role value back into a Void value. That sounds a little strange. It’s of course trivial to turn any value into a Void value because Void as a type only has a single value, so we can just return that: let role = OneOf { "admin".map(Conversion(apply: { Role.admin }, unapply: { _ in () })) "guest".map(Conversion(apply: { Role.guest }, unapply: { _ in () })) "member".map(Conversion(apply: { Role.member }, unapply: { _ in () })) }

43:32

And this now makes the role parser into a printer! role.print

43:39

This looks really strange. When the "admin" parser succeeds we convert its Void value to a Role.admin value, but when trying to print a role we are just always returning a Void value.

44:41

In fact, it does not work as we expect: input = "" try role.print(.admin, to: &input) input // "member"

44:58

Printing the Role.admin value appends the string “member” to the input for some reason. In fact, no matter what we try to print it will always append “member”: input = "" try role.print(.admin, to: &input) input // "member" input = "" try role.print(.guest, to: &input) input // "member" input = "" try role.print(.member, to: &input) input // "member"

45:10

The reason this is printing the wrong thing is because the unapply closures we are providing are not correct. They are completely logicless, always succeed, and always return Void . So when the OneOf tries each printer, starting with the last, it will immediately succeed and short circuit trying any other printers.
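We can see that short-circuiting in a standalone sketch, with plain closures standing in for the parser-printers and the bottom-up ordering modeled with reversed():

```swift
enum Role { case admin, guest, member }

// Each branch's unapply is logicless: it ignores the role and succeeds,
// mirroring the always-succeeding Void conversions above.
let branches: [(literal: String, unapply: (Role) -> Bool)] = [
  (literal: "admin", unapply: { _ in true }),
  (literal: "guest", unapply: { _ in true }),
  (literal: "member", unapply: { _ in true }),
]

// A OneOf printer tries its branches from the bottom up and keeps the
// first success, so the last-listed branch always wins here.
func printRole(_ role: Role) -> String? {
  for branch in branches.reversed() where branch.unapply(role) {
    return branch.literal
  }
  return nil
}
```

No matter which role we hand it, the "member" branch succeeds first.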

45:43

It looks like our definition of Conversion isn’t quite right. We are seeing here that sometimes when converting back and forth between inputs and outputs, we sometimes need a notion of failability built in.

46:03

So really a Conversion should be two throwing functions: struct Conversion<A, B> { let apply: (A) throws -> B let unapply: (B) throws -> A }

46:22

This will give conversions a chance to fail if there is no way to convert its data.
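As a quick standalone illustration of the throwing Conversion, here is a conversion that can fail in the apply direction (the intFromString value is our own example, not something from the library):

```swift
struct ConvertingError: Error {}

// Two throwing functions: each direction gets a chance to fail.
struct Conversion<A, B> {
  let apply: (A) throws -> B
  let unapply: (B) throws -> A
}

// Applying fails on non-numeric strings; unapplying always succeeds.
let intFromString = Conversion<String, Int>(
  apply: { string in
    guard let number = Int(string) else { throw ConvertingError() }
    return number
  },
  unapply: { number in String(number) }
)
```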

46:33

And now we have an opportunity to fail the printer. When trying to print a particular role, we could make sure it matches the role we are handling in the particular parser-printer, and if it does we return Void , and if it doesn’t we throw: struct ConvertingError: Error {} let role = OneOf { "admin".map(Conversion(apply: { Role.admin }, unapply: { guard $0 == .admin else { throw ConvertingError() } })) "guest".map(Conversion(apply: { Role.guest }, unapply: { guard $0 == .guest else { throw ConvertingError() } })) "member".map(Conversion(apply: { Role.member }, unapply: { guard $0 == .member else { throw ConvertingError() } })) }

47:45

And it does work: input = "" try role.print(.guest, to: &input) input // "guest" input = "" try role.print(.admin, to: &input) input // "admin" input = "" try role.print(.member, to: &input) input // "member"

48:08

But those conversions are pretty gross. This is such a common thing to do that perhaps we should bake it into its own conversion helper like we did with the substring-to-string conversion and the tuple-to-struct conversion. For lack of a better word we will call this the “exactly” conversion since it only works when the output we are printing exactly matches some other value: extension Conversion where A == Void, B: Equatable { static func exactly(_ output: B) -> Self { .init( apply: { output }, unapply: { guard $0 == output else { throw ConvertingError() } } ) } }
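Here is the exactly conversion as a standalone round trip, repeating the Conversion and Role definitions so the snippet is self-contained:

```swift
struct ConvertingError: Error {}

struct Conversion<A, B> {
  let apply: (A) throws -> B
  let unapply: (B) throws -> A
}

extension Conversion where A == Void, B: Equatable {
  // Applying always produces the fixed value; unapplying succeeds only
  // when handed exactly that value.
  static func exactly(_ output: B) -> Self {
    .init(
      apply: { _ in output },
      unapply: { value in
        guard value == output else { throw ConvertingError() }
      }
    )
  }
}

enum Role: Equatable { case admin, guest, member }
let admin = Conversion<Void, Role>.exactly(.admin)
```

Applying always yields .admin, and unapplying throws unless given .admin back.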

49:34

And now the role parser becomes very short: let role = OneOf { "admin".map(.exactly(Role.admin)) "guest".map(.exactly(Role.guest)) "member".map(.exactly(Role.member)) }

50:13

So, this certainly gets the job done, but honestly the exactly naming isn’t great. The previous syntax of just mapping the Void output of a parser directly to a value made a lot more sense: let role = OneOf { "admin".map { Role.admin } "guest".map { Role.guest } "member".map { Role.member } }

50:28

And indeed we don’t think conversions should really be necessary to describe these transformations. Although the Map parser cannot be a printer in general due to the lack of bidirectional transformations to aid in parsing and printing, it is true that Map can be a printer under certain constrained conditions.

51:00

In fact, it’s under the same constraints that we used to define the exactly conversion, and even its implementation directly mimics that of unapply : extension Parsers.Map: Printer where Upstream: Printer, Upstream.Output == Void, Output: Equatable { func print(_ output: Output, to input: inout Upstream.Input) throws { guard self.transform(()) == output else { throw PrintingError() } try self.upstream.print((), to: &input) } }
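A closure-based sketch of the same idea (our own mapPrinter helper, not the library's API): printing succeeds only when transforming the upstream's Void output reproduces the value being printed, and a OneOf-style loop tries the printers in reverse:

```swift
struct PrintingError: Error {}
enum Role: Equatable { case admin, guest, member }

// Builds a printer that emits `literal` only when the transform of the
// upstream's Void output matches the value being printed.
func mapPrinter<Output: Equatable>(
  literal: String,
  transform: @escaping () -> Output
) -> (Output, inout String) throws -> Void {
  { output, input in
    guard transform() == output else { throw PrintingError() }
    input.append(literal)
  }
}

let printers = [
  mapPrinter(literal: "admin") { Role.admin },
  mapPrinter(literal: "guest") { Role.guest },
  mapPrinter(literal: "member") { Role.member },
]

// Try printers from the bottom up, keeping the first that succeeds.
func printRole(_ role: Role) -> String? {
  for printer in printers.reversed() {
    var output = ""
    if (try? printer(role, &output)) != nil { return output }
  }
  return nil
}
```

Because each printer now throws on a mismatched role, the reverse-order search lands on the correct literal instead of short-circuiting at "member".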

52:10

With that Printer conformance defined we can now write the role parser exactly as we had before when only considering parsing: let role = OneOf { "admin".map { Role.admin } "guest".map { Role.guest } "member".map { Role.member } }

52:27

And it’s now magically a printer, and everything compiles and works as it did before, but now we can print user roles. Next time: bizarro printing

52:39

So we think this is absolutely incredible. The users parser has undergone only one single cosmetic change, that of using the .struct conversion instead of just the initializer, and almost as if by magic the parser has also become a printer. We can invoke the .print method to get a type-safe way of transforming an array of users back into a string, which can then be saved to disk or sent over the network to an API.

53:02

We want to stress that it can be a little mind trippy sometimes to figure out how to simultaneously parse and print. When considering printing we have to constantly be mindful of what it means to reverse the effects of parsing, and that can be a very subtle thing. For example, the Prefix parser consumes from the beginning of an input until a predicate fails, whereas the Prefix printer appends to the end of an input only if the entire value satisfies the predicate.

53:29

Another example was OneOf , where the OneOf parser tries a list of parsers from top-to-bottom, or most specific to least specific, and stops at the first successful one. Whereas the OneOf printer tries that list in reverse, from bottom-to-top, or least specific to most specific, and stops at the first successful one.

53:47

Even something as simple as mapping the output of a parser completely broke down when it came to printers. It is not enough to know how to transform an output into a new kind of output, you also need to know how to transform those new outputs back into the original outputs so that they can be printed. Stephen

54:02

Luckily, if you don’t care about printing, then you can just continue using the parser library as it exists today without ever thinking about printing. But, if you do care about printing, then all of these subtleties and complexities are problems that you would have even if you weren’t trying to create a unified parser-printer, it just may not be as obvious. Trying to create a parser-printer forces you to come face-to-face with the realities of how complex your domain is early since at every step of the way you have to prove that you can reverse the effects of parsing by printing. You are not allowed to just willy-nilly .map your parsers without a care in the world because that is not a printer-friendly thing to do. Each time you map to transform an output you have to be prepared to also supply a reverse transform to undo the effects for printing. And that can be a challenge to wrap your head around.

54:51

So, you might think after 5 episodes we would be done with printing, but that is not the case. We thought we were done, and in fact we had already recorded a nice outro episode that we should be transitioning to right now. But, either Stephen is taking invertibility too seriously by reversing the growth of his hair, or we had to re-shoot this episode due to some new information that came to light.

55:11

And indeed, the week we kicked off the invertible parsing series there were some really interesting discussions happening in the parsing library’s repo that made us realize there is still another subtlety when dealing with parser-printers. This was brought up by a Point-Free subscriber, David Peterson, who is using our library to build a parser for a specific documentation format. While constructing his parsers he came across some very simple and natural parsers that could not reasonably be made into printers.

55:40

Turns out, one of the fundamental operations of how we compose printers was still not quite right. There is one small tweak we can make that instantly unlocks the ability to turn his parsers into printers, and even fixes a few drawbacks some of the library’s parser-printers have. It’s honestly a little surprising to see just how subtle parser-printers can be, especially since we’ve been thinking about them for over 4 years now, and we’ve iterated on the concepts and APIs many, many times, but we still never uncovered this one issue.

56:11

But luckily these subtleties are mostly for the library to worry about, not the library user. By sweating the details of these tiny parser-printer combinators to make sure they plug together correctly we can allow people to build immensely complex parser-printers and be confident that what they have built is correct. And this is why we think a composable framework of parser-printers is far superior to writing ad-hoc parser-printers.

56:36

So, with that said, let’s explore this new subtlety of parser-printers and see how we can fix it.

References

Invertible syntax descriptions: Unifying parsing and pretty printing
Tillmann Rendel and Klaus Ostermann • Sep 30, 2010
Parsers and pretty-printers for a language are often quite similar, yet both are typically implemented separately, leading to redundancy and potential inconsistency. We propose a new interface of syntactic descriptions, with which both parser and pretty-printer can be described as a single program using this interface. Whether a syntactic description is used as a parser or as a pretty-printer is determined by the implementation of the interface. Syntactic descriptions enable programmers to describe the connection between concrete and abstract syntax once and for all, and use these descriptions for parsing or pretty-printing as needed. We also discuss the generalization of our programming technique towards an algebra of partial isomorphisms. This publication (from 2010!) was the initial inspiration for our parser-printer explorations, and a much less polished version of the code was employed on the Point-Free web site on day one of our launch!
https://www.informatik.uni-marburg.de/~rendel/unparse/

Unified Parsing and Printing with Prisms
Fraser Tweedale • Apr 29, 2016
Parsers and pretty printers are commonly defined as separate values, however, the same essential information about how the structured data is represented in a stream must exist in both values. This is therefore a violation of the DRY principle – usually quite an obvious one (a cursory glance at any corresponding FromJSON and ToJSON instances suffices to support this fact). Various methods of unifying parsers and printers have been proposed, most notably Invertible Syntax Descriptions due to Rendel and Ostermann (several Haskell implementations of this approach exist). Another approach to the parsing-printing problem using a construct known as a “prism” (a construct Point-Free viewers and library users may better know as a “case path”).
https://skillsmatter.com/skillscasts/16594-unified-parsing-and-printing-with-prisms

Downloads

Sample code: 0182-parser-printers-pt5