EP 129 · Parsing and Performance · Dec 14, 2020 · Members

Video #129: Parsing and Performance: Protocols


Episode: Video #129
Date: Dec 14, 2020
Access: Members Only 🔒
URL: https://www.pointfree.co/episodes/ep129-parsing-and-performance-protocols


Description

The performance gains we have made with the parser type have already been super impressive, but we can take things even further. We will explore the performance characteristics of closures using the time profiler and make some changes to how we define parsers to unlock even more speed.

Video

Cloudflare Stream video ID: 0676137a9f21d52221b65ba12ee62b43 Local file: video_129_parsing-and-performance-protocols.mp4 *(download with --video 129)*


Transcript

0:05

Pretty incredible. More than 10 times faster than the substring parser. This really does show that the performance gains to be had by working on UTF-8 can be truly substantial. This is the difference between being able to process more than 50 megabytes of logs per second and being able to process a measly 4 megabytes of logs per second.

0:26

So this is pretty huge. We have found that our parser type is so general and so composable that we are not only able to parse at the low UTF-8 level in order to get huge performance gains, but we can also fluidly move between abstraction levels allowing us to choose the right balance of correctness and speed. This is absolutely incredible, and honestly it’s not something we’ve really ever seen in other parsing libraries.

0:51

And as hard as it may be to believe, it gets even better. There is even one more change we can make to our parser library that unlocks the final level of performance, and this will bring the performance of our parsers within a very slim margin of hand-rolled, ad hoc, imperative parsers, which are widely believed to be the fastest way to write parsers even if they are a bit messy.

1:16

To see where this last performance gain could be hiding, let’s run one of our benchmarks in the time profiler instrument and see what it exposes to us.

Escaping closure overhead

1:30

Running our benchmark in the time profiler will show precisely what code paths are taking the longest. Let’s start by bringing back the marathon race suite:

```swift
Benchmark.main([
  // copyingSuite,
  // intParserSuite,
  raceSuite,
  // xcodeLogsSuite,
])
```

1:42

And let’s only run the UTF-8 benchmark in that suite:

```swift
let raceSuite = BenchmarkSuite(name: "Race") { suite in
  // suite.benchmark("Substring") {
  //   var input = upcomingRaces[...]
  //   precondition(races.run(&input)?.count == 4)
  // }

  suite.benchmark("UTF8") {
    var input = upcomingRaces[...].utf8
    precondition(racesUtf8.run(&input)?.count == 4)
  }
}
```

1:48

We’ll hit command+I in order to build and run in Instruments, and we will select the Time Profiler instrument.

2:00

Then we hit the red “start” button in the top left and that will run our benchmarks. When it finishes it will show our heaviest stack trace on the right, and clicking the last stack frame in that list shows us this.

2:14

This shows a very, very deep stack. In fact, it’s about 79 frames deep. Not all of those frames are related to parsing; roughly the first 17 frames come from the benchmarking library, and many of these frames can probably be further squashed and optimized away by Swift. Indeed, we think that all of the frames labeled “specialized closure in #1” get optimized away, and so the real stack size is actually a bit smaller.

2:56

However, this instrument reading is still good as a heuristic, and it is definitely forcing us to come face-to-face with a reality of our style of parser combinators that up until now we could just sweep under the rug. Our parser type is just a function, which means to create a parser we have to use escaping closures. And parser combinators are things that transform that function into a new function via more functions, and large, complex parsers are built out of piecing together more and more of these things. So a parser, such as the races parser, is really just a massively nested escaping closure under the hood.
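To make that concrete, here is a minimal sketch of the idea, with assumed, simplified definitions rather than the library’s exact code: every combinator wraps the previous parser’s run closure in a new escaping closure, so composition nests closures.

```swift
// Simplified parser type: just a function wrapped in a struct.
struct Parser<Input, Output> {
  let run: (inout Input) -> Output?

  // Each combinator produces a new closure that captures the old one,
  // so a big parser is a deeply nested chain of escaping closures.
  func map<NewOutput>(_ f: @escaping (Output) -> NewOutput) -> Parser<Input, NewOutput> {
    Parser<Input, NewOutput> { input in
      self.run(&input).map(f)
    }
  }
}

// A parser of a single digit character.
let digit = Parser<Substring, Character> { input in
  guard let first = input.first, first.isNumber else { return nil }
  input = input.dropFirst()
  return first
}

var input = "1a"[...]
// `digit.map { … }` is a closure capturing a closure capturing a closure…
precondition(digit.map { Int(String($0))! }.run(&input) == 1)
precondition(input == "a")
```

Every additional `map` (or `take`, `zip`, and so on) adds one more layer of closure nesting, which is exactly what the deep stack trace above was showing us.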

3:34

Now, we love functions on Point-Free. They’ve allowed us to build up a pretty impressive library, and creating parsers is super lightweight and even fun. But the reality is that using escaping closures comes with a small performance penalty. The Swift compiler often cannot inline them, and they are heap allocated rather than stack allocated, which is slower.

3:54

We can even see these performance costs in a benchmark. Let’s create a new file, Benchmark-Closures.swift, with a new suite:

```swift
// Benchmark-Closures.swift
import Benchmark

let closureSuite = BenchmarkSuite(name: "Closures") { suite in
}
```

4:20

We can create a massively nested, escaping closure with the following:

```swift
let f = {{{{{{{{{{ max(1, 100) }}}}}}}}}}
```

4:40

And we can write a benchmark for it:

```swift
suite.benchmark("Escaping") {
  precondition(f()()()()()()()()()() == 100)
}
```

5:01

And we’ll update main.swift to run this suite instead:

```swift
// main.swift
import Benchmark

Benchmark.main([
  // copyingSuite,
  // intParserSuite,
  // raceSuite,
  // xcodeLogsSuite,
  closureSuite,
])
```

5:07

And let’s compare that benchmark to one that just invokes the max function directly:

```swift
suite.benchmark("None") {
  precondition(max(1, 100) == 100)
}
```

5:16

And run:

```
running Closures: Escaping... done! (103.19 ms)
running Closures: None... done! (74.61 ms)

name                   time       std        iterations
-------------------------------------------------
Closures.Escaping   55.000 ns ± 462.34 %      1000000
Closures.None       26.000 ns ± 647.25 %      1000000
Program ended with exit code: 0
```

5:20

The numbers are really tiny, but we are definitely witnessing a difference in invoking a massively nested escaping closure versus just invoking a single function directly. And the more it becomes nested, the more these numbers will drift apart.

5:36

But even nested function calls do not have the same problems nested escaping closures do. We can create a massively nested function:

```swift
func g() -> Int {
  func g() -> Int {
    func g() -> Int {
      func g() -> Int {
        func g() -> Int {
          func g() -> Int {
            func g() -> Int {
              func g() -> Int {
                func g() -> Int {
                  func g() -> Int {
                    max(1, 100)
                  }
                  return g()
                }
                return g()
              }
              return g()
            }
            return g()
          }
          return g()
        }
        return g()
      }
      return g()
    }
    return g()
  }
  return g()
}
```

6:29

And benchmark it:

```swift
suite.benchmark("Function") {
  precondition(g() == 100)
}
```

```
running Closures: Escaping... done! (113.07 ms)
running Closures: Function... done! (75.09 ms)
running Closures: None... done! (78.96 ms)

name                   time       std        iterations
-------------------------------------------------
Closures.Escaping   57.000 ns ± 515.68 %      1000000
Closures.Function   29.000 ns ± 795.98 %      1000000
Closures.None       29.000 ns ± 204.76 %      1000000
Program ended with exit code: 0
```

6:39

So even though the nested function seems to be doing a lot more work it has the same performance as just invoking the max function directly. The Swift optimizer seems to inline all of those nested function calls.

7:02

And it’s not just function calls that can be optimized. If a closure is non-escaping, then it too can be optimized. Let’s benchmark it to see:

```swift
suite.benchmark("Non-escaping") {
  let h = {{{{{{{{{{ max(1, 100) }}}}}}}}}}
  precondition(h()()()()()()()()()() == 100)
}
```

```
running Closures: Escaping... done! (118.03 ms)
running Closures: Non-escaping... done! (72.45 ms)
running Closures: None... done! (80.45 ms)
running Closures: Function... done! (69.45 ms)

name                       time        std        iterations
------------------------------------------------------
Closures.Escaping       65.000 ns ±  469.74 %      1000000
Closures.Non-escaping   27.000 ns ±  407.70 %      1000000
Closures.None           26.000 ns ± 1122.40 %      1000000
Closures.Function       24.000 ns ±  909.54 %      1000000
Program ended with exit code: 0
```

7:43

This time the closure is non-escaping because it does not outlive the lifecycle of the benchmark closure, and because of that it was able to do its work more efficiently than the previous, escaping f .

7:58

So Swift treats escaping closures and functions differently, and escaping closures come with a small cost that functions do not incur, even when used in similar ways.
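A tiny, hypothetical illustration of that distinction: closure parameters are non-escaping by default, and only need the @escaping annotation once they can outlive the call, which is when the heap allocation kicks in.

```swift
// Non-escaping: the closure is only used during the call, so the
// compiler is free to stack-allocate and inline it.
func applyNow(_ f: () -> Int) -> Int {
  f()
}

var stored: [() -> Int] = []

// Escaping: the closure is appended to an array and outlives the call,
// so it must be annotated @escaping and is heap-allocated.
func applyLater(_ f: @escaping () -> Int) {
  stored.append(f)
}

precondition(applyNow { 42 } == 42)

applyLater { 42 }
precondition(stored[0]() == 42)
```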

8:09

Now, most of the time it’s not worth worrying about the performance penalty of using escaping closures. As we can see the times are super tiny, and these times only add up if your code paths are being called thousands or millions of times.

8:21

However, that’s exactly the situation we are in with parsers. If we need to parse megabytes or gigabytes of data we may need to invoke our massively nested escaping closures many thousands of times, and that could start to add up.

Eliminating overhead via protocols

8:37

Luckily there is an alternative to closures that is far easier for the compiler to optimize and inline. While Swift often cannot optimize or inline escaping closures, it can optimize functions and methods, and we can take advantage of this with another form of abstraction: protocols.

8:52

If you have been following Point-Free for any time at all you will know that we shy away from using protocols as the de facto way to form abstractions in Swift. We believe that protocols are often overused, and that there are simpler ways to form abstractions, such as using plain, concrete data types. We’ve covered this topic a lot in the past.

9:10

But there are times that protocols are appropriate, and Apple’s libraries and frameworks can show us some really good examples. The Swift standard library uses the Collection protocol to unify many disparate types, and not only are there lots of types that conform to the protocol but there are also lots of algorithms that are written against the Collection API. This makes the design of the protocol very powerful and far reaching.
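For example, a single extension written against the Collection API is instantly available to arrays, strings, ranges, and every other conforming type. This is a hypothetical algorithm to illustrate the point, not one from the standard library:

```swift
// Written once against the Collection protocol's requirements…
extension Collection {
  func secondElement() -> Element? {
    // dropFirst and first come from the protocol, not any concrete type.
    dropFirst().first
  }
}

// …and usable on every conforming type.
precondition([10, 20, 30].secondElement() == 20)
precondition("abc".secondElement() == "b")
precondition((1..<2).secondElement() == nil)
```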

9:50

The standard library also popularized a style of API in which each time you chain on a new operator you build up a bigger, more complex type. For example, if we .filter and .map a lazy collection like this:

```swift
let collection = [1, 2, 3]
  .lazy
  .filter { _ in true }
  .map { $0 + 1 }
```

10:28

We will see this value is of type:

```swift
LazyMapSequence<LazyFilterSequence<[Int]>.Elements, Int>
```

10:45

It has encoded the entire chain of operations directly into the type. It’s a LazyMapSequence of a LazyFilterSequence of an array of integers.

10:51

Having a static description of all of this information can be incredibly powerful. The library can provide small optimizations to specific combinations of chained operations in order to optimize how they are combined. For example, if we chained on another .map we would expect to get a LazyMapSequence of a LazyMapSequence of a LazyFilterSequence of an array of integers:

```swift
let collection = [1, 2, 3]
  .lazy
  .filter { _ in true }
  .map { $0 + 1 }
  .map { $0 + 1 }
```

11:16

But instead it stays the same type as before. The API has somehow fused the last two .map operations into a single .map . And that’s incredibly powerful.
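We can sketch how that kind of fusion is possible. The names below are assumed for illustration; this is not the standard library’s implementation. Because the operator returns a concrete generic type, a second map can simply compose the transform functions instead of nesting another type:

```swift
// A lazy "mapped sequence" that remembers its base and transform.
struct Mapped<Base: Sequence, Output> {
  let base: Base
  let transform: (Base.Element) -> Output

  // Fusion: chaining another map composes the functions rather than
  // wrapping Mapped in yet another Mapped, so the type stays flat.
  func map<NewOutput>(_ f: @escaping (Output) -> NewOutput) -> Mapped<Base, NewOutput> {
    Mapped<Base, NewOutput>(base: base) { f(self.transform($0)) }
  }

  func run() -> [Output] { base.map(transform) }
}

let once = Mapped(base: [1, 2, 3]) { $0 + 1 }
let twice = once.map { $0 * 10 }
// `twice` is still a single Mapped<[Int], Int>, not Mapped<Mapped<…>, Int>.
precondition(twice.run() == [20, 30, 40])
```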

11:29

And then more recently Apple introduced us to the Combine framework, which is built on similar concepts. It is organized around protocols that many types conform to, and method operators that return those types in order to build up more and more complex, nested static types.

11:55

For example, if we create a future and then filter it and map it like this:

```swift
import Combine

let publisher = Future<Int, Never> { $0(.success(1)) }
  .filter { _ in true }
  .map { $0 + 1 }
```

12:13

We’ll get a quite complex type out the other end:

```swift
let publisher: Publishers.Map<Publishers.Filter<Future<Int, Never>>, Int>
```

12:26

So this says that publisher is a map of a filter of a future. And if we tack on another .map:

```swift
let publisher = Future<Int, Never> { $0(.success(1)) }
  .filter { _ in true }
  .map { $0 + 1 }
  .map { $0 + 1 }
```

12:30

We’ll see that the type has not changed:

```swift
let publisher: Publishers.Map<
  Publishers.Filter<Future<Int, Never>>,
  Int
>
```

12:36

Under the hood, the Combine API has somehow fused those two operations together, theoretically making it more performant.

12:44

Similarly, in SwiftUI when we construct a view hierarchy we can see a shadow of that hierarchy represented directly in the static type:

```swift
import SwiftUI

let view = VStack {
  Text("Hi")
  HStack {
    Text("Go")
    Button("Now") {}
  }
  TextField("Enter password", text: .constant(""))
}
```

13:17

The type of view is this beast:

```swift
let view: VStack<TupleView<(Text, HStack<TupleView<(Text, Button<Text>)>>, TextField<Text>)>>
```

13:27

Which, if we add some newlines, looks very similar to the view hierarchy:

```swift
let view: VStack<
  TupleView<(
    Text,
    HStack<
      TupleView<(
        Text,
        Button<Text>
      )>
    >,
    TextField<Text>
  )>
> = VStack {
  Text("hi")
  HStack {
    Text("Go")
    Button("Now") {}
  }
  TextField("Enter...", text: .constant(""))
}
```

14:01

All three of these examples, collections, publishers, and SwiftUI views, benefit from retaining as much information as possible in the static type, which gives Swift many more opportunities for optimizing later.

A parser protocol

14:19

So, I think we are going to need to change the definition of the parser type again. We’ve refactored this type a number of times already, but this time, rather than changing the fundamental shape of the parser, we will instead merely translate the existing shape from one form of abstraction to another.

14:39

We’ll start with a protocol translation of the parser struct. Each generic converts over to an associated type, and we can capture the run property as a method requirement:

```swift
struct Parser<Input, Output> {
  let run: (inout Input) -> Output?
}

protocol ParserProtocol {
  associatedtype Input
  associatedtype Output

  func run(_ input: inout Input) -> Output?
}
```

15:15

Currently we are naming this ParserProtocol so that it does not conflict with the Parser struct we have, but when made into a proper library it should probably just be called Parser, just as Combine uses Publisher.

15:26

Before moving on, let’s group all of the parser protocol code we write in its own file.

15:41

To make a parser we need to create a whole new type that conforms to the protocol. It’s even quite mechanical to translate from one style of parser to the other. For example, the integer parser currently starts with an extension declaration:

```swift
extension Parser where Input == Substring.UTF8View, Output == Int {
```

16:03

That becomes a whole new type conforming to ParserProtocol . We can do this translation by copying and pasting the Parser.int definition and making a few small changes.

16:15

First, it is no longer correct to extend the Parser type. Instead we will create a brand new struct that conforms to ParserProtocol.

```swift
struct IntParser: ParserProtocol {
// extension Parser where Input == Substring.UTF8View, Output == Int {
```

16:29

If we ask Xcode to fill in some stubs, the first requirement is declaring the associated types.

```swift
struct IntParser: ParserProtocol {
  typealias Input = <#type#>
  typealias Output = <#type#>
```

16:39

Which we can explicitly declare with the same types as our constrained extension on the Parser type.

```swift
struct IntParser: ParserProtocol {
  typealias Input = Substring.UTF8View
  typealias Output = Int
}
```

16:46

Next, Xcode can stub in the run method:

```swift
func run(_ input: inout Substring.UTF8View) -> Int? {
  <#code#>
}
```

16:53

Which can contain all of the logic that is currently in the body of the static int parser, which we can copy and paste into the method:

```swift
func run(_ input: inout Substring.UTF8View) -> Int? {
  let original = input

  var isFirstCharacter = true
  let intPrefix = input.prefix { c in
    defer { isFirstCharacter = false }
    return (
      c == UTF8.CodeUnit(ascii: "-")
        || c == UTF8.CodeUnit(ascii: "+")
    ) && isFirstCharacter
      || (UTF8.CodeUnit(ascii: "0")...UTF8.CodeUnit(ascii: "9")).contains(c)
  }

  guard let match = Int(String(Substring(intPrefix)))
  else {
    input = original
    return nil
  }

  input.removeFirst(intPrefix.count)
  return match
}
```

17:06

So it’s a little bit noisier, but still basically has the exact same shape.
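As a quick sanity check, here is a usage sketch of this parser, repeating the protocol and IntParser definitions from above so the snippet stands alone. It consumes a leading integer from the UTF-8 view and leaves the rest of the input untouched, and a failed parse leaves the input unchanged:

```swift
// Definitions repeated from the transcript so this compiles on its own.
protocol ParserProtocol {
  associatedtype Input
  associatedtype Output
  func run(_ input: inout Input) -> Output?
}

struct IntParser: ParserProtocol {
  func run(_ input: inout Substring.UTF8View) -> Int? {
    let original = input
    var isFirstCharacter = true
    // Take an optional leading sign, then any run of ASCII digits.
    let intPrefix = input.prefix { c in
      defer { isFirstCharacter = false }
      return (c == UTF8.CodeUnit(ascii: "-") || c == UTF8.CodeUnit(ascii: "+"))
        && isFirstCharacter
        || (UTF8.CodeUnit(ascii: "0")...UTF8.CodeUnit(ascii: "9")).contains(c)
    }
    guard let match = Int(String(Substring(intPrefix))) else {
      input = original  // failure restores the input
      return nil
    }
    input.removeFirst(intPrefix.count)  // success consumes the prefix
    return match
  }
}

var input = "123abc"[...].utf8
precondition(IntParser().run(&input) == 123)
precondition(String(Substring(input)) == "abc")

var bad = "abc"[...].utf8
precondition(IntParser().run(&bad) == nil)
precondition(String(Substring(bad)) == "abc")
```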

17:28

Let’s try converting another simple parser. The first parser was a parser that would pluck out the first element of any collection. It begins with this extension declaration:

```swift
extension Parser
where
  Input: Collection,
  Input.SubSequence == Input,
  Output == Input.Element
{
```

17:41

To convert this we must introduce another new type, which we will call First in order to mimic the name of the current parser:

```swift
struct First: ParserProtocol {}
```

18:03

And we can have Xcode stub out some of the requirements, which starts with declaring the Input and Output types.

```swift
struct First: ParserProtocol {
  typealias Input = <#type#>
  typealias Output = <#type#>
}
```

18:12

If we attempt to move the Parser extension constraints to the type aliases we will find it does not work:

```swift
struct First: ParserProtocol {
  typealias Input = Collection where Input.SubSequence = SubSequence
  typealias Output = Input.Element
}
```

18:29

This parser is a little different from the integer parser in that it is generic. The int parser was constrained to a concrete input, Substring.UTF8View , and a concrete output Int , whereas this parser should be able to work on any collection, satisfying some constraints.

18:52

Instead, we need to introduce a generic that can be constrained to the Collection protocol. This generic type will represent the collection of input we want to parse the first element off of, so we can call it Input and drop the type alias:

```swift
struct First<Input>: ParserProtocol where Input: Collection {
  typealias Output = Input.Element
}
```

19:17

Further, we must constrain its SubSequence associated type to be equal to itself, which is what allows the parser to consume a little bit of its input using removeFirst and other mutating methods.

```swift
struct First<Input>: ParserProtocol
where Input: Collection, Input.SubSequence == Input {
  typealias Output = Input.Element
}
```

19:44

And finally we implement its one requirement, which has basically the same implementation as the current parser:

```swift
func run(_ input: inout Input) -> Input.Element? {
  guard !input.isEmpty else { return nil }
  return input.removeFirst()
}
```

20:00

Again, it’s a little bit noisier, but also has basically the same shape as before.
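Here is a usage sketch, repeating the First definition from above so the snippet stands alone. Because the struct is generic over any Collection whose SubSequence is itself, the very same parser works on substrings and array slices alike:

```swift
// Definitions repeated from the transcript so this compiles on its own.
protocol ParserProtocol {
  associatedtype Input
  associatedtype Output
  func run(_ input: inout Input) -> Output?
}

struct First<Input>: ParserProtocol
where Input: Collection, Input.SubSequence == Input {
  typealias Output = Input.Element
  func run(_ input: inout Input) -> Input.Element? {
    guard !input.isEmpty else { return nil }
    return input.removeFirst()  // consumes exactly one element
  }
}

// One generic parser, two very different input types.
var text = "hi"[...]
precondition(First<Substring>().run(&text) == "h")
precondition(text == "i")

var bytes = [1, 2, 3][...]
precondition(First<ArraySlice<Int>>().run(&bytes) == 1)
precondition(bytes == [2, 3])
```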

20:12

Let’s now consider a much more complicated parser: a parser combinator. Remember that combinators are functions that take parsers as input and return parsers as output.

20:24

One of the most powerful combinators we defined was the zip function, which allows you to run one parser after another, and only if both succeed do you get the outputs from each parser:

```swift
func zip<Input, Output1, Output2>(
  _ p1: Parser<Input, Output1>,
  _ p2: Parser<Input, Output2>
) -> Parser<Input, (Output1, Output2)> {
  .init { input -> (Output1, Output2)? in
    let original = input
    guard let output1 = p1.run(&input) else { return nil }
    guard let output2 = p2.run(&input) else {
      input = original
      return nil
    }
    return (output1, output2)
  }
}
```

20:39

To convert this operator to the new ParserProtocol world we will again copy and paste things over so that we can make a few small changes.

```swift
struct Zip: ParserProtocol {}
```

20:57

In the previous style of zip we had a function that took two parsers as arguments. In this new style we will have a struct that holds onto two parsers as instance variables:

```swift
let p1: ParserProtocol
let p2: ParserProtocol
```

21:21

But we can’t use ParserProtocol in this way; instead, we need to introduce generics on the struct that conform to this protocol:

```swift
struct Zip<Parser1, Parser2>: ParserProtocol
where Parser1: ParserProtocol, Parser2: ParserProtocol {
  let p1: Parser1
  let p2: Parser2
}
```

21:59

If we let Xcode fill in the protocol stubs for us, we see that we first need to describe the associated input and output types:

```swift
typealias Input = <#type#>
typealias Output = <#type#>
```

22:07

The input should be the same input as the parsers this struct holds. We can just pass along the first:

```swift
typealias Input = Parser1.Input
```

22:17

And constrain the type to ensure that both inputs are the same:

```swift
Parser1.Input == Parser2.Input
```

22:28

And the output should be a tuple of both of the parsers’ outputs:

```swift
typealias Output = (Parser1.Output, Parser2.Output)
```

22:43

And then finally we implement its requirement, whose body we can copy and paste in from the zip function:

```swift
func run(
  _ input: inout Parser1.Input
) -> (Parser1.Output, Parser2.Output)? {
  let original = input
  guard let output1 = p1.run(&input) else { return nil }
  guard let output2 = p2.run(&input) else {
    input = original
    return nil
  }
  return (output1, output2)
}
```

23:14

There’s one more thing we want to do. If we were to use the Zip parser right now it would look a lot like how we used to use the zip function:

```swift
// Zip(parser1, parser2)
```

23:28

The zip function is a very fundamental operation, but a few weeks ago we found a much friendlier incarnation of it that suits parsers in particular, and we called it take:

```swift
extension Parser {
  func take<NewOutput>(
    _ p: Parser<Input, NewOutput>
  ) -> Parser<Input, (Output, NewOutput)> {
    zip(self, p)
  }
}
```

23:41

This allowed us to fluently describe how we want to chain parsers together and take their output, and when coupled with a few overloads and a corresponding skip operator we could build some seriously complex parsers with little work.

24:02

We can still allow this by extending the ParserProtocol and providing a method that returns a Zip value:

```swift
extension ParserProtocol {
  func take<P: ParserProtocol>(_ other: P) -> Zip<Self, P>
  where P.Input == Input {
    Zip(p1: self, p2: other)
  }
}
```

24:59

This is in fact how the Combine framework is designed. You build custom publishers as types conforming to the Publisher protocol, and then you extend the Publisher protocol to define methods that return the explicit types.

Parser protocol benchmarks

25:14

And just like that we have three parsers already built, and they exemplify the three kinds of parsers we will most commonly come across:

25:23

There are the simple parsers that operate on a specific type of input, in which case all you need is a struct and an implementation of the run method.

25:34

Then there’s the more generic types of parsers, which causes the struct to be generic in order to express what kind of inputs that parser can handle.

25:43

And then there are parser combinators, which means your struct must hold onto other parsers so that they can be used later.

25:54

But before going any further let’s convince ourselves that this is a worthwhile thing to do. As long time viewers of Point-Free will know, we definitely do not reach for a protocol to solve most of our problems, and so if we are going to build our parser library around a protocol it better be worth it!

26:20

Let’s start by getting a new benchmark in place:

```swift
// Benchmark-Protocol.swift
import Benchmark

let protocolSuite = BenchmarkSuite(name: "Protocol") { suite in
}
```

26:33

And add it to our main.swift file:

```swift
Benchmark.main([
  // copyingSuite,
  // intParserSuite,
  // raceSuite,
  // xcodeLogsSuite,
  // closureSuite,
  protocolSuite,
])
```

26:39

So what should we benchmark? We can try out the two forms of parsing integers:

```swift
let string = "1234567890"

suite.benchmark("Parser.int") {
  var string = string[...].utf8
  precondition(Parser.int.run(&string) == 1234567890)
}

suite.benchmark("IntParser") {
  var input = string[...].utf8
  precondition(IntParser().run(&input) == 1234567890)
}
```

27:31

When we run this we see that the times are basically the same:

```
running Protocol: Parser.int... done! (191.45 ms)
running Protocol: IntParser... done! (195.79 ms)

name                      time       std        iterations
--------------------------------------------------
Protocol.Parser.int   128.000 ns ± 278.56 %      1000000
Protocol.IntParser    128.000 ns ± 303.70 %      1000000
Program ended with exit code: 0
```

27:55

And this is to be expected. Right now parsing with Parser.int or IntParser isn’t really that much different. The real differences should start to appear as we build up much more complicated parsers from simpler ones, which results in a heavily nested parsing closure.

28:20

So, let’s emulate that. Let’s parse a simple, albeit contrived, string:

```swift
suite.benchmark("Deeply nested: Parser") {
  var input = "1-2-3-4-5"[...].utf8
}
```

28:43

We can do this with a whole bunch of applications of the .int parser, the .first parser, and take:

```swift
let output = Parser<Substring.UTF8View, Int>.int
  .take(.first)
  .take(.int)
  .take(.first)
  .take(.int)
  .take(.first)
  .take(.int)
  .take(.first)
  .take(.int)
  .run(&input)
```

29:30

And let’s just have the precondition that this actually parsed something, meaning output is not nil and the input was fully consumed:

```swift
precondition(output != nil)
precondition(input.isEmpty)
```

29:44

We’ll do something similar for the ParserProtocol version of this code:

```swift
suite.benchmark("Deeply nested: ParserProtocol") {
  var input = "1-2-3-4-5"[...].utf8

  let output = IntParser()
    .take(First())
    .take(IntParser())
    .take(First())
    .take(IntParser())
    .take(First())
    .take(IntParser())
    .take(First())
    .take(IntParser())
    .run(&input)

  precondition(output != nil)
  precondition(input.isEmpty)
}
```

30:30

And when we run the benchmarks:

```
running Protocol: Parser.int... done! (214.08 ms)
running Protocol: IntParser... done! (189.56 ms)
running Protocol: Deeply nested: Parser... done! (438.24 ms)
running Protocol: Deeply nested: ParserProtocol... done! (516.47 ms)

name                                          time       std        iterations
-----------------------------------------------------------------------
Protocol.Parser.int                       143.000 ns ± 345.98 %      1000000
Protocol.IntParser                        122.000 ns ± 339.47 %      1000000
Protocol.Deeply nested: Parser            316.000 ns ± 205.47 %      1000000
Protocol.Deeply nested: ParserProtocol    405.000 ns ± 155.93 %      1000000
Program ended with exit code: 0
```

30:36

We sadly see that somehow the ParserProtocol version is slower than the Parser struct! That is definitely not what we want. It turns out that this benchmark is hiding something from us, and it is yet another lesson in how tricky benchmarks can be.

31:04

The first benchmark is secretly getting a speed boost that would not actually happen in real life code. To see the problem, let’s do something that seems innocuous enough. We are going to extract the parsers out of each benchmark and hold them in variables instead:

```swift
let parser1 = Parser<Substring.UTF8View, Int>.int
  .take(.first)
  .take(.int)
  .take(.first)
  .take(.int)
  .take(.first)
  .take(.int)
  .take(.first)
  .take(.int)

suite.benchmark("Deeply nested: Parser") {
  var input = "1-2-3-4-5"[...].utf8
  let output = parser1.run(&input)
  precondition(output != nil)
  precondition(input.isEmpty)
}
```

31:24

If anything, this should make the benchmarks even speedier, after all we are no longer doing the work to piece together the parser. The only thing being measured is the work done to run the parser.

31:33

To make the comparison fair, we will also extract out the IntParser version:

```swift
let parser2 = IntParser()
  .take(First())
  .take(IntParser())
  .take(First())
  .take(IntParser())
  .take(First())
  .take(IntParser())
  .take(First())
  .take(IntParser())

suite.benchmark("Deeply nested: ParserProtocol") {
  var input = "1-2-3-4-5"[...].utf8
  let output = parser2.run(&input)
  precondition(output != nil)
  precondition(input.isEmpty)
}
```

31:44

And run:

```
running Protocol: Parser.int... done! (201.94 ms)
running Protocol: IntParser... done! (199.72 ms)
running Protocol: Deeply nested: Parser... done! (726.02 ms)
running Protocol: Deeply nested: ParserProtocol... done! (481.37 ms)

name                                          time       std        iterations
-----------------------------------------------------------------------
Protocol.Parser.int                       125.000 ns ± 344.92 %      1000000
Protocol.IntParser                        128.000 ns ± 331.22 %      1000000
Protocol.Deeply nested: Parser            545.000 ns ± 175.47 %      1000000
Protocol.Deeply nested: ParserProtocol    346.000 ns ± 184.60 %      1000000
Program ended with exit code: 0
```

31:51

Yet somehow the closure-based parser has actually gotten much slower, while the protocol-based parser is just as fast as before. How could that be?

32:14

Well, it turns out that Swift can be pretty smart about optimizing escaping closures when it knows the closure doesn’t actually escape. In the previous version of the benchmark we were composing everything together, and then immediately hitting .run on it:

```swift
let output = Parser<Substring.UTF8View, Int>.int
  .take(.first)
  .take(.int)
  .take(.first)
  .take(.int)
  .take(.first)
  .take(.int)
  .take(.first)
  .take(.int)
  .run(&input)
```

32:41

This means that the deeply nested closure built in this parser never actually escaped. It was immediately run and then we were done with it. It appears that Swift can detect this situation and optimize the code to not incur the cost of an escaping closure.

32:56

However, in the current code we are holding onto the parser in a long-living variable, and so Swift must assume that it is indeed an escaping closure, and that comes with a pretty substantial performance penalty. But this is also far more realistic of how we would use parsers in real life. It’s more typical that we would build our large parsers out of smaller ones and then invoke them at a later time, rather than composing a large parser and running it all at the same time. So we feel that this benchmark now better reflects reality, and it further shows just how subtle and difficult it can be to get benchmarks right.

33:44

And now that the benchmarks are more realistic, we see that the new protocol style of parsing is pretty significantly faster than our Parser struct. About 1.5 times faster, and this is quite a simple example parser. As the parser gets more complex and more nested the time differences will grow.

34:02

Let’s also see the differences in our two styles of parsers in instruments. We’ll comment out everything except for the “Deeply nested: Parser” parser and hit command+I to open instruments, run the time profiler, and then comment out everything except for “Deeply nested: ParserProtocol” and do the same.

35:34

It is quite obvious that there is a pretty big difference here. The stack trace for the heaviest frame of the Parser type is a lot more nested than the heaviest frame for the parser protocol. This is because Swift was able to inline and specialize a lot of the code.

35:44

If you wanted to go even deeper into exploring the differences between these stack traces, you could insert a print statement at the deepest level of our parsers that prints the current thread’s stack size. If you were to do this you would find that the stack size of the Parser type is about twice that of the ParserProtocol. We have some exercises for exploring this that we encourage you to try.

36:20

We do want to mention that inlining code is not the be-all and end-all when it comes to performance. First, there are no guarantees about what code is automatically inlined, and even if you explicitly inline everything, that can come with its own costs. This is why benchmarking is so important: it lets you measure these differences. And benchmarking at both a micro and a macro scale is important, as we have done by measuring both smaller and larger parsers, because each scale can have its own performance characteristics.

Next time: comparing a complex parser

37:00

So at this point we could keep converting more and more parsers, but there aren’t too many big lessons to be had in doing that. We may learn a few things about API design and making the friendliest interface possible, but the conversion of all of our parsers to this new protocol is going to be mostly mechanical.

37:18

So, instead what we are going to do is ask “what’s the point?” Now surely we don’t need to justify why we are improving the performance of parsers. That’s just a net benefit to users of our parser library, and it’s even amazing that we were able to get these boosts in performance without substantially changing the way we approach parsing problems.

37:37

However, what does need to be justified is using our combinator-based parser library over other forms of parsing, especially when performance is a concern. We feel we have put together a convincing case for why you’d want to use parser combinators over other forms of parsing if composition, code reuse, and managing complexity are important to you, but we haven’t done such a good job of comparing the performance of the various styles of parsing. And if you are working on a problem where performance is crucial, then you may be willing to give up a little bit of the beauty of parser combinators if what you get out of it is speed.

38:16

So, can we honestly recommend people use the combinator-style of parsing instead of other methods that may be more performant?

38:25

Yes, yes we can. We are now going to demonstrate that all of the hard work we have done in our parser combinator library has brought us within a stone’s throw of the performance of more ad-hoc, hand-rolled parsers, and it’s quite surprising and impressive to see.

38:44

To explore this we are going to build yet another parser from scratch. There’s a parser that is ubiquitous in the field of parsing due to the simplicity of its description, yet it is subtly difficult to get right. There are many implementations of this parser in various styles, and so that makes it perfect for benchmarking our combinators against.

39:04

We will be parsing “comma separated values”, or CSV for short. It’s a very simple text format that allows you to describe tabular data. Let’s explore this…next time!

References

- Why Combine has so many Publisher types · Thomas Visser · Jul 4, 2019. A detailed article on the technique of “operator fusion” that Combine employs. https://www.thomasvisser.me/2019/07/04/combine-types/
- An operator fusion primer · Jasdev Singh · Apr 1, 2020. A detailed article on the technique of “operator fusion” that Combine employs. https://jasdev.me/fusion-primer
- swift-benchmark · Google · Mar 13, 2020. A Swift library for benchmarking code snippets, similar to google/benchmark. https://github.com/google/swift-benchmark
- UTF-8 · Michael Ilseman · Mar 20, 2019. Swift 5 made a fundamental change to the String API, making the preferred encoding UTF-8 instead of UTF-16. This brings many usability and performance improvements to Swift strings. https://swift.org/blog/utf8-string/
- Strings in Swift 4 · Ole Begemann · Nov 27, 2017. An excerpt from Advanced Swift that provides a deep discussion of the low-level representations of Swift strings. Although it pre-dates the transition of strings to UTF-8 in Swift 5, it is still a factually correct accounting of how to work with code units in strings. https://oleb.net/blog/2017/11/swift-4-strings/
- Improve performance of Collection.removeFirst(_:) where Self == SubSequence · Stephen Celis · Jul 28, 2020. While researching the string APIs for this episode we stumbled upon a massive inefficiency in how Swift implements removeFirst on certain collections. This PR fixes the problem and turns the method from an O(n) operation (where n is the length of the array) to an O(k) operation (where k is the number of elements being removed). https://github.com/apple/swift/pull/32451

Downloads

- Sample code: 0129-parsing-performance-pt3

Point-Free · A hub for advanced Swift programming. Brought to you by Brandon Williams and Stephen Celis.
© 2026 Point-Free, Inc. All rights are reserved for the videos and transcripts on this site. All other content is licensed under CC BY-NC-SA 4.0, and the underlying source code to run this site is licensed under the MIT License.