Video #125: Generalized Parsing: Part 2
Episode: Video #125 Date: Nov 16, 2020 Access: Members Only 🔒 URL: https://www.pointfree.co/episodes/ep125-generalized-parsing-part-2

Description
Now that we have generalized the parser type it’s time to parse things that aren’t just plain strings. In just a few lines of code the parser type can parse environment variable dictionaries and even become a fully capable URL router.
Video
Cloudflare Stream video ID: 0aaea77f907c3fdb21cba4b6fe3264fa Local file: video_125_generalized-parsing-part-2.mp4 *(download with --video 125)*
Transcript
— 31:54
Now although it is a little annoying that we have to sprinkle a little bit of extra syntax to let the compiler know exactly what type it is working with, it is also forcing a very good pattern on us. The prefix parser only works for collections whose SubSequence is equal to itself. Such collections are able to mutate themselves in an efficient manner because they operate on views of a shared storage rather than make needless copies all over the place.
— 0:29
This is why we are forced to provide array slices and substrings as input to these parsers, because it allows them to do their work in the most efficient way possible.

Parsing dictionaries
— 0:43
We’ve now got a pretty powerful abstraction for parsing in place. It not only can parse any kind of input into any kind of output, but also tries to force us to perform that parsing in an efficient manner by working on substrings and array slices.
— 1:02
But so far we haven’t really flexed our new muscles. Let’s parse a new format that represents a nebulous blob of data into something more structured. We’re going to start out slow. If we import Foundation we get access to a class called ProcessInfo, which allows us to inspect all the environment variables for the current process running our playground or application:

print(ProcessInfo.processInfo.environment)
// [...]
— 1:34
There’s a lot of stuff in this dictionary, and perhaps we want to extract out some very specific data from it. For example, the IPHONE_SIMULATOR_ROOT has this big path in it:

ProcessInfo.processInfo.environment["IPHONE_SIMULATOR_ROOT"]
// "/Applications/Xcode-12.2-beta.app/Contents/Developer/..."
— 1:47
And from that we can clearly see the path of the version of Xcode we are currently working with. What if we wanted to extract just that little bit of path from the environment? Well, we have the prefix(through:) parser, which is perfect for extracting the beginning of a string up to and including where a substring is recognized:

Parser.prefix(through: ".app")
— 2:27
And so perhaps we can just run that right on the IPHONE_SIMULATOR_ROOT key in the dictionary:

Parser.prefix(through: ".app")
  .run(ProcessInfo.processInfo.environment["IPHONE_SIMULATOR_ROOT"])
— 2:36
Well, first of all we can’t do this because subscripting into a dictionary returns an optional, and this parser wants to work on honest strings, so I guess we either need to do some if let dancing beforehand, or we can just force unwrap for right now:

Parser.prefix(through: ".app")
  .run(ProcessInfo.processInfo.environment["IPHONE_SIMULATOR_ROOT"]!)
— 2:51
But then even that doesn’t work because we have to further convert that string into a substring:

Parser.prefix(through: ".app")
  .run(ProcessInfo.processInfo.environment["IPHONE_SIMULATOR_ROOT"]![...])
// (
//   match: "/Applications/Xcode-beta.app",
//   rest: "/Contents/Developer/..."
// )
— 3:19
Now of course we would never want to force unwrap in production. We definitely would want to safely unwrap the value in the IPHONE_SIMULATOR_ROOT key, but the very act of doing that is kinda parser like, right? We want to try to extract the value out at that key, and if it succeeds run a parser on the value and consume the key.
— 3:38
Sounds like we can cook up a new parser combinator that parses [String: String] dictionaries into strings. The signature of such a combinator could look like this:

extension Parser where Input == [String: String] {
  static func key(
    _ key: String,
    _ parser: Parser<Substring, Output>
  ) -> Self {
  }
}
— 4:19
To implement this parser we just need to write out in code what we’ve discussed out loud. We will subscript into the dictionary to make sure we have a value at that key, and if not we fail:

extension Parser where Input == [String: String] {
  static func key(
    _ key: String,
    _ parser: Parser<Substring, Output>
  ) -> Self {
    return Self { dict in
      guard let value = dict[key]
      else { return nil }
    }
  }
}
— 4:49
Right now value is an immutable String, but we need it to be a mutable Substring in order to run our parser on it, so we can do that with this:

guard var value = dict[key]?[...]
else { return nil }
— 5:10
Once we’ve got the value we need to try to parse it, and if that fails we can bail out:

guard let output = parser.run(&value)
else { return nil }
— 5:23
And then finally, if all of that succeeds we can consume the value at that key and return the output:

dict[key] = value.isEmpty ? nil : String(value)
return output
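Assembled, the whole combinator looks like this. The sketch below is self-contained: it inlines a minimal stand-in for the Parser type from earlier in the series (a struct wrapping a `run` function) plus a small prefix(through:)-style Substring parser for the demo; those inlined definitions are stand-ins for this sketch, not the episode’s exact code.

```swift
import Foundation

// Minimal stand-in for the series' Parser type: `run` mutates the input,
// consuming what it recognizes, and returns the output, or nil on failure.
struct Parser<Input, Output> {
  let run: (inout Input) -> Output?
}

extension Parser where Input == [String: String] {
  // Runs `parser` on the value at `key`; on success, removes the key if the
  // value was fully consumed, otherwise stores the unconsumed remainder.
  static func key(
    _ key: String,
    _ parser: Parser<Substring, Output>
  ) -> Self {
    Self { dict in
      guard var value = dict[key]?[...] else { return nil }
      guard let output = parser.run(&value) else { return nil }
      dict[key] = value.isEmpty ? nil : String(value)
      return output
    }
  }
}

// A small Substring parser in the spirit of prefix(through:), for the demo.
extension Parser where Input == Substring, Output == Substring {
  static func prefix(through substring: Substring) -> Self {
    Self { input in
      guard let range = input.range(of: substring) else { return nil }
      let match = input[..<range.upperBound]
      input = input[range.upperBound...]
      return match
    }
  }
}

var env = ["IPHONE_SIMULATOR_ROOT": "/Applications/Xcode-beta.app/Contents/Developer"]
let xcodePath: Parser<[String: String], Substring> =
  .key("IPHONE_SIMULATOR_ROOT", .prefix(through: ".app"))
let match = xcodePath.run(&env)
// match == "/Applications/Xcode-beta.app"
// env["IPHONE_SIMULATOR_ROOT"] == "/Contents/Developer"
```

Note how a failed inner parse leaves the dictionary untouched, while a partial parse stores the leftover back into the dictionary, mirroring how the string parsers leave their unconsumed input behind.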
— 6:02
We can now use this new parser combinator to say that we want to parse the IPHONE_SIMULATOR_ROOT key and further parse everything in that value, if it exists, up through the ".app" substring:

let xcodePath = Parser.key(
  "IPHONE_SIMULATOR_ROOT",
  .prefix(through: ".app")
)
// Parser<[String: String], Substring>
— 7:18
This one little parser now has all of the guarding for an absent key baked into one package.
— 7:29
But even better, it can be combined with other parsers too. There’s another key in this dictionary that has the location of my home directory:

ProcessInfo.processInfo.environment["SIMULATOR_HOST_HOME"]
// "/Users/pointfreeco"
— 7:48
What if we wanted to parse the username from this path? To do that we would want to consume the literal "/Users/" and then consume everything until the end of the string. The latter parser, the one that just consumes everything, can be built using prefix(while:):

Parser<Substring, Void>.prefix("/Users/")
  .take(.prefix(while: { _ in true }))
— 8:45
And then we can pass this to the key parser so that it can operate on a dictionary:

let username = Parser.key(
  "SIMULATOR_HOST_HOME",
  Parser.prefix("/Users/")
    .take(.prefix(while: { _ in true }))
)
// Parser<[String: String], Substring>
— 9:33
And we can take it for a spin to see that it can pluck out our username and even fully consume the dictionary’s value at that key. Using prefix(while:) in this fashion is slightly strange, though. Maybe we can bundle up this functionality in a dedicated parser that reads better:

extension Parser where Input == Substring, Output == Substring {
  static var rest: Self {
    Self.prefix(while: { _ in true })
  }
}
— 10:16
With this defined we can update the username parser:

Parser.prefix("/Users/").take(.rest)
— 10:30
But using prefix(while:) is also less efficient than it could be. Instead of iterating over all of the substring’s characters, we could simply return the rest of the input and replace the substring with an empty one:

extension Parser where Input == Substring, Output == Substring {
  static var rest: Self {
    Self { input in
      let rest = input
      input = ""
      return rest
    }
  }
}
— 11:04
We now have an Xcode path parser and a username parser that both operate on string-string dictionaries, which means we can combine them into a single parser that bundles up and returns both values:

xcodePath.take(username)
// Parser<[String: String], (Substring, Substring)>
— 11:31
This is pretty powerful. At a very high level we are getting to describe how we will parse out data from a nebulous [String: String] dictionary, and we are still getting to make use of all the lower level string parsers we’ve developed previously.
— 11:44
But we can even improve this a bit. Right now the rest parser is needlessly restrictive. It’s not only substrings that can perform this operation. We’ll leave this improvement as an exercise for the viewer.
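One possible shape for that generalization, sketched here with a minimal stand-in Parser type: define rest for any collection that is a view into its own storage, i.e. whose SubSequence is the collection itself, which covers both Substring and ArraySlice.

```swift
// Minimal stand-in for the series' Parser type.
struct Parser<Input, Output> {
  let run: (inout Input) -> Output?
}

// `rest` works for any input that is a view into its own storage:
// collections whose SubSequence is the collection itself.
extension Parser
where Input: Collection, Input.SubSequence == Input, Output == Input {
  static var rest: Self {
    Self { input in
      let rest = input
      input = input[input.endIndex...]  // empty out the input
      return rest
    }
  }
}

var substring: Substring = "Hello"
var slice: ArraySlice<Int> = [1, 2, 3][...]
let restOfString = Parser<Substring, Substring>.rest.run(&substring)
let restOfSlice = Parser<ArraySlice<Int>, ArraySlice<Int>>.rest.run(&slice)
// restOfString == "Hello" and substring is now empty;
// restOfSlice == [1, 2, 3] and slice is now empty.
```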
— 12:01
So, this is getting pretty cool. We are able to parse much more exotic types than just simple strings, but at the same time we are still able to make use of all the string parsers we’ve been developing. We aren’t losing out on any reusability while simultaneously greatly expanding the world of things we can parse.

Parsing URL requests: defining a parsable type
— 12:18
But let’s push things further. Let’s parse a new data type that is probably something many of our viewers have had to deal with at one time or another: URL request parsing. We come face-to-face with this type of parsing as soon as we want to add deep linking to our application, because deep linking works by launching your app with a nebulous blob of a URL, and it’s your job to pick apart that request to figure out what it represents, and then route the user to a particular part of your application.
— 12:51
Now we aren’t going to develop the full theory of URL parsing right now because we plan on going super deep into that topic soon enough. But already, just with our simple generalized parser we can get pretty far in parsing URL requests into something much more structured.
— 13:14
First, we need to decide what it is exactly that we are going to parse. Do we want to try to parse a literal URLRequest from Foundation?

Parser<URLRequest, <#???#>>
— 13:33
That’s certainly possible, but URLRequest is maybe a little too nebulous for us to get a hold of; after all, it doesn’t even split out the individual query params in the request. It’s just one big string.
— 13:51
Foundation gives us something to make this a little nicer, and it’s called URLComponents, and it does a pretty good job of separating out the parts of a URL:

let url = URL(string: "https://www.pointfree.co/episodes/1?ref=twitter")!
let components = URLComponents(url: url, resolvingAgainstBaseURL: false)!
components.host       // "www.pointfree.co"
components.path       // "/episodes/1"
components.queryItems // [{name "ref", value "twitter"}]
— 14:05
This is a lot better, but one thing that’s a little annoying is that all of these properties are plain Strings, whereas our parsers prefer to work on Substring. That’s going to mean we have to constantly convert these properties to substrings, run the parser, and then convert back into strings to mutate the request. That will be inefficient and annoying.
— 14:45
So instead let’s cook up a first-class data type that will hold an idealized version of the request that is perfect for parsing. We will call it RequestData:

struct RequestData {
}
— 15:13
It will hold the path for the URL:

var path: String
— 15:17
But, since we are inventing our own type for this we can instead use a form that is very efficient for trying to consume small parts of the path. Instead of a string, which in practice consists of a bunch of path components separated by slashes, let’s just pre-split those components into an array:

var pathComponents: [String]
— 16:03
But, each of those path components is going to be operated on by our other parsers, and we know those parsers like substrings, so let’s just assume that we are starting with substrings:

var pathComponents: [Substring]
— 16:37
Further, while parsing these path components we are going to be consuming entire chunks of the path, and by using an array here it means that each time we consume a component we have to create a whole new array. Instead we can use an array slice so that instead of creating copies we are simply mutating a view into a subset of the array:

var pathComponents: ArraySlice<Substring>
— 17:33
Next we’ll hold a property for query params, where we will mimic how URLComponents represents query items by having an array of pairs:

var queryItems: [(name: String, value: Substring)]
— 18:10
We can’t use an ArraySlice for this because, unlike pathComponents, we won’t only consume from the front: we need to be able to remove a value from anywhere in the array.
— 18:34
The path components and query items are the two main things that URLComponents parses out of a URL for us, and for the purpose of deep linking that is probably all we care about. However, URLs have a lot more info associated with them that we may also want to parse, especially in other contexts such as routing on a web app.
— 18:54
This is something we will be talking about in depth in the future, but one of the first things a web server needs to do is accept the nebulous URL request that your browser sends it and figure out what action you are taking on the website. You could be doing a GET or a POST.

— 19:19
So let’s add a field to our RequestData to model the method that can be used for requesting, which is typically either a GET or a POST:

var method: String?

— 19:30
Two important things to note. We are using a plain String instead of Substring because we pretty much always will be parsing the whole thing at once; there’s really no need to incrementally consume the method. Second, it’s optional so that we can represent the idea of consuming it. Once we parse the method we should set it to nil so that trying to parse it again fails, because it would be user error to try to parse the method twice.
— 20:03

Next we have the headers sent with the request. The URLRequest type in Foundation models this as a [String: String] dictionary, and we will do the same, except we’ll use a Substring for the value since it may be the case that we want to further parse information from the header value, though it’s probably not super common:

var headers: [String: Substring]
— 20:26

And finally, requests have a body, which is extra data that can be attached to a request; most commonly this is for submitting data in a POST request:

var body: Data?

— 20:57
So now we have the RequestData type that we want to be able to parse:

struct RequestData {
  var body: Data?
  var headers: [String: Substring]
  var method: String?
  var pathComponents: ArraySlice<Substring>
  var queryItems: [(name: String, value: Substring)]
}
— 21:01

It is pretty straightforward to write a function that allows you to transform a URLRequest into one of these RequestData values, especially when using URLComponents, but we will leave that as an exercise for the viewer.

Parsing URL requests: defining parsers
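Before defining parsers against RequestData, here is one hedged sketch of that URLRequest-to-RequestData conversion exercise. The helper name `requestData(from:)` is hypothetical; it leans on URLComponents, and the RequestData shape is the one defined above.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking  // URLRequest lives here on Linux
#endif

struct RequestData {
  var body: Data?
  var headers: [String: Substring]
  var method: String?
  var pathComponents: ArraySlice<Substring>
  var queryItems: [(name: String, value: Substring)]
}

// One possible way to lower a URLRequest into the idealized RequestData.
func requestData(from request: URLRequest) -> RequestData? {
  guard
    let url = request.url,
    let components = URLComponents(url: url, resolvingAgainstBaseURL: false)
  else { return nil }

  return RequestData(
    body: request.httpBody,
    headers: (request.allHTTPHeaderFields ?? [:]).mapValues { $0[...] },
    method: request.httpMethod,
    pathComponents: url.path.split(separator: "/")[...],
    queryItems: (components.queryItems ?? []).map {
      (name: $0.name, value: ($0.value ?? "")[...])
    }
  )
}

let request = URLRequest(
  url: URL(string: "https://www.pointfree.co/episodes/1?time=120")!
)
let data = requestData(from: request)
// data?.pathComponents == ["episodes", "1"], with a "time"/"120" query item
```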
— 21:28

We want to incrementally parse data from this request so that we can recognize what the user is trying to do on our site. In practice what you would do is try to parse this value into a bunch of request formats that we understand, such as the home page, episode page, account page and more. And then you combine all of those parsers into one big one that can parse any request coming in and turn it into a first-class data value that can be handled by your application or web server.
— 22:09

For example, suppose we wanted to try to parse a RequestData value in order to recognize it as a request for an episode page on the Point-Free website. Such a URL would look something like this:

GET /episodes/42
— 22:25

To make things a little interesting, let’s also try to parse a query parameter that represents the time we should have the video jump to:

GET /episodes/42?time=120
— 22:37

We could start by saying that we expect such a request to be a GET.

— 24:14
Note that we are taking special care to make sure that different capitalizations of the method can be used. Although it is customary to represent an HTTP method in all caps, it is not required.
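The transcript’s code for this step isn’t shown above, but based on the description a case-insensitive method parser might look like this. This is a sketch that inlines minimal stand-ins for the Parser and RequestData types so it runs on its own.

```swift
import Foundation

// Minimal stand-in for the series' Parser type.
struct Parser<Input, Output> {
  let run: (inout Input) -> Output?
}

// Pared-down RequestData with just the fields this sketch needs.
struct RequestData {
  var method: String?
  var pathComponents: ArraySlice<Substring>
}

extension Parser where Input == RequestData, Output == Void {
  // Matches the request's method case-insensitively and consumes it by
  // setting it to nil, so trying to parse the method twice fails.
  static func method(_ method: String) -> Self {
    Self { input in
      guard
        let m = input.method,
        m.caseInsensitiveCompare(method) == .orderedSame
      else { return nil }
      input.method = nil
      return ()
    }
  }
}

var request = RequestData(method: "get", pathComponents: ["episodes", "1"])
let matched = Parser<RequestData, Void>.method("GET").run(&request)
// matched is non-nil, and request.method is now nil (consumed)
```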
— 24:53

So with that parser available we can start off our RequestData parser for an episode page by saying that we want to parse the GET method:

Parser.method("GET")
// Parser<RequestData, Void>

— 25:07
Next we can start parsing and consuming parts of the path components. To do this we can make a parser combinator such that when you hand it a parser of Substring s it will try to run that parser on the first path component of the request. There are a few tricky edge cases we also want to think about. For example, if the parser does not consume the entirety of the first path component then we should fail. We want this because if we were trying to parse the URL "foo/2" by running a "foo" parser followed by an int parser, then it would also recognize the URL "foo2/3" . So we want to be very explicit when parsing path components that you are meaning to consume the entire component before moving onto the next one.
— 26:14

We can do this in a few steps:

extension Parser where Input == RequestData {
  static func path(_ parser: Parser<Substring, Output>) -> Self {
    .init { input in
      guard var firstComponent = input.pathComponents.first
      else { return nil }
      guard let output = parser.run(&firstComponent)
      else { return nil }
      guard firstComponent.isEmpty
      else { return nil }
      input.pathComponents.removeFirst()
      return output
    }
  }
}
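To see the full-consumption rule in action, here’s a self-contained sketch. It inlines minimal stand-ins for Parser and RequestData, plus simple literal-prefix and integer parsers (hypothetical implementations, in the spirit of the series’ parsers), and shows that a "foo" path parser rejects a "foo2" component.

```swift
// Minimal stand-ins for this sketch.
struct Parser<Input, Output> {
  let run: (inout Input) -> Output?
}

struct RequestData {
  var pathComponents: ArraySlice<Substring>
}

extension Parser where Input == Substring {
  // Consumes a literal prefix from the front of the input.
  static func prefix(_ p: Substring) -> Parser<Substring, Void> {
    .init { input in
      guard input.hasPrefix(p) else { return nil }
      input.removeFirst(p.count)
      return ()
    }
  }
}

extension Parser where Input == Substring, Output == Int {
  // Consumes leading digits as an Int.
  static var int: Self {
    .init { input in
      let digits = input.prefix(while: \.isNumber)
      guard let n = Int(digits) else { return nil }
      input.removeFirst(digits.count)
      return n
    }
  }
}

extension Parser where Input == RequestData {
  // Runs `parser` on the first path component, requiring full consumption.
  static func path(_ parser: Parser<Substring, Output>) -> Self {
    .init { input in
      guard var first = input.pathComponents.first,
        let output = parser.run(&first),
        first.isEmpty
      else { return nil }
      input.pathComponents.removeFirst()
      return output
    }
  }
}

var good = RequestData(pathComponents: ["foo", "2"])
var bad = RequestData(pathComponents: ["foo2", "3"])
let foo = Parser<RequestData, Void>.path(.prefix("foo"))

// "foo" fully consumes the first component of `good`…
let okMatch = foo.run(&good)
// …but leaves "2" behind in `bad`'s first component, so parsing fails
// and `bad` is left untouched.
let badMatch = foo.run(&bad)

// And the next component of `good` parses as an Int.
let goodID = Parser<RequestData, Int>.path(.int).run(&good)
```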
— 29:24

And with this new parser combinator we can try parsing a little more of the request. We can first parse off the "episodes" component of the path, which returns a void value and so can be discarded:

// GET /episodes/42?time=120
Parser.method("GET")
  .skip(.path("episodes"))
// Parser<RequestData, Void>
— 29:57

And then we can parse off an integer from the path:

Parser.method("GET")
  .skip(.path("episodes"))
  .take(.path(.int))
// Parser<RequestData, Int>
— 30:24

Next let’s parse off the "time" query param that represents at what time we should jump to in the video. We need another custom parser combinator for this, and this time we will supply the combinator the name of the query param we want to pluck out, as well as a Substring parser for converting that query value into something more useful. Similarly to the .path combinator, we will require that the parser we run on the query value fully consume the value, otherwise it will count as an overall failure of the parser.
— 31:00

To do this we will first search the array of query items for the first one matching the name given to us, and then we can run the parser on that value:

extension Parser where Input == RequestData {
  static func query(
    name: String,
    _ parser: Parser<Substring, Output>
  ) -> Self {
    .init { input in
      guard let index = input.queryItems
        .firstIndex(where: { name == $0.0 })
      else { return nil }
      let original = input.queryItems[index].value
      guard let output = parser.run(&input.queryItems[index].value)
      else { return nil }
      guard input.queryItems[index].value.isEmpty
      else {
        input.queryItems[index].value = original
        return nil
      }
      input.queryItems.remove(at: index)
      return output
    }
  }
}
— 33:55

There are a few edge cases to consider, but it’s still mostly straightforward. And now we can express wanting to parse the time query param and processing it as an integer:

Parser.method("GET")
  .skip(.path("episodes"))
  .take(.path(.int))
  .take(.query(name: "time", .int))
// Parser<RequestData, (Int, Int)>

Optional parsers
— 34:21

We can also improve this a bit, because right now this parser will fail if the query param is absent. That’s probably not what we want. We probably want the parameter to be optional, where if it’s present we parse it and if it’s not we can just use nil.
— 34:49

To do this we can employ yet another parser combinator that is capable of promoting any parser to one that never actually fails, but instead if it can’t parse something successfully we just return nil. Such a parser would have the following shape:

optional: (Parser<Input, Output>) -> Parser<Input, Output?>
— 35:37

One way to accomplish this is to add a computed property on Parser that can promote any parser into one that returns optional outputs:

extension Parser {
  var optional: Parser<Input, Output?> {
    .init { input in
      .some(self.run(&input))
    }
  }
}

We are forcing the parser to always succeed by wrapping the result in an explicit Optional.some.
— 36:18

Then, with this property we can take the .query parser and make it optional:

Parser.method("GET")
  .skip(.path("episodes"))
  .take(.path(.int))
  .take(Parser.query(name: "time", .int).optional)
// Parser<RequestData, (Int, Int?)>
— 36:31

There are a few things not quite optimal about this. First, we can no longer use dot syntax to omit Parser from this expression because we are further chaining on .optional at the end. Currently Swift can’t make sense of something like this:

.take(.query(name: "time", .int).optional)
— 36:52
There is some work being done in the compiler to allow expressions like this in Swift, so hopefully soon even this will be possible.
— 37:02

But there’s one other thing that isn’t great, and that’s the fact that this code reads backwards. We say we want to parse a query, and then make it optional. What if we could flip the order so that the code reads as if we want to parse an optional query param?

.take(.optional(.query(name: "time", .int)))
— 37:20
That looks much nicer.
— 37:22

To accomplish this we need to make .optional a static function on Parser, which can be done like so:

extension Parser {
  static func optional(
    _ parser: Self
  ) -> Parser<Input, Output?> {
    .init { input in
      .some(parser.run(&input))
    }
  }
}
— 37:51

However, this is not compiling for some reason:

.take(.optional(.query(name: "time", .int)))

It’s subtle, but it has to do with how Swift is being forced to figure out what type of parser to put in front of .optional:
— 37:55

The way dot abbreviation works is that the type abbreviated must directly match the type returned from the function. If we “unabbreviate” the type we can get a bit more insight into what’s going wrong:

.take(Parser<RequestData, Int?>.optional(.query(name: "time", .int)))
— 38:27

However, the .optional combinator defined on that type returns a further optionalized Parser<RequestData, Int??>:

extension Parser /*<Input, Output>*/ {
  static func optional(
    _ parser: Self
  ) -> Parser<Input, Output?> {
    …
  }
}
— 38:35

So, in order for our expression to type check we somehow need to get .optional to be defined on parsers that return optional outputs:

extension Parser where Output == Optional<A> {
}
— 38:56

Swift cannot express this with that syntax, unfortunately, but we have discussed how to accomplish it in a slightly roundabout way a few times on Point-Free. It is done by introducing a generic to the optional function and then constraining the parser’s Output generic so that it is the optional of the new generic:

extension Parser {
  static func optional<A>(
    _ parser: Parser<Input, A>
  ) -> Self where Output == A? {
    …
  }
}
— 39:33

This now makes our code compile:

.take(Parser<RequestData, Int?>.optional(.query(name: "time", .int)))
— 39:36

And we can now take advantage of type inference to have Swift figure out the full type of the parser:

.take(.optional(.query(name: "time", .int)))
— 39:40

And now our request parser is looking super succinct:

Parser.method("GET")
  .skip(.path("episodes"))
  .take(.path(.int))
  .take(.optional(.query(name: "time", .int)))
// Parser<RequestData, (Int, Int?)>
— 40:05

Let’s take it for a spin. Here’s a request that should parse just fine:

let episode = Parser.method("GET")
  .skip(.path("episodes"))
  .take(.path(.int))
  .take(.optional(.query(name: "time", .int)))

let request = RequestData(
  body: nil,
  headers: ["User-Agent": "Safari"],
  method: "GET",
  pathComponents: ["episodes", "1"],
  queryItems: [(name: "time", value: "120")]
)

episode.run(request)
// (match: (1, 120), rest: {nil, [...], nil, ArraySlice([]), []})
— 40:29
We can see that it correctly plucked out the 1 for the episode number, and 120 for the time query param. And if we remove the query param it still parses successfully too.
— 40:29

If we drop the query parameter, parsing still succeeds:

let request = RequestData(
  body: nil,
  headers: ["User-Agent": "Safari"],
  method: "GET",
  pathComponents: ["episodes", "1"],
  queryItems: []
)

episode.run(request)
// (match: (1, nil), rest: {nil, [...], nil, ArraySlice([]), []})
— 41:12

But if we add a typo to the path component by spelling “episodes” as “episode”, parsing will fail:

let request = RequestData(
  body: nil,
  headers: ["User-Agent": "Safari"],
  method: "GET",
  pathComponents: ["episode", "1"],
  queryItems: [(name: "time", value: "120")]
)

episode.run(request)
// (
//   match: nil,
//   rest: {nil, […], method "GET", ArraySlice([…]), […]}
// )
— 41:20

However, what if we added another path component, like say if an episode page had a comments section:

// GET /episodes/1/comments
let request = RequestData(
  body: nil,
  headers: ["User-Agent": "Safari"],
  method: "GET",
  pathComponents: ["episodes", "1", "comments"],
  queryItems: [(name: "time", value: "120")]
)
— 41:29
This is still parsing just fine, but now the rest has the extra path component left over. If we were to accept this as a valid parsing of this request then we would be unintentionally routing everyone that wants to go to the comments page to the main episode page. We need a way of telling our parser that not only have we parsed everything off the request that we expect, but also that we claim that nothing important is left over to parse.
— 42:14
What we want is yet another parser that can enforce the rule that we have consumed all of the important stuff from the request, in particular we want to make sure that all of the path components have been consumed. In the future we might want to make it even stricter, like requiring that even the method is parsed, but for now let’s just start with path components.
— 42:41

We will call this parser end, and it will confirm that all the path components have been consumed, and if that’s not the case it will fail. Further, if it passes that check then it will consume everything in the request so that no other parser can process this request:

extension Parser where Input == RequestData, Output == Void {
  static var end: Self {
    Self { input in
      guard
        input.method == nil,
        input.pathComponents.isEmpty
      else { return nil }
      input.body = nil
      input.headers = [:]
      input.queryItems = []
      return ()
    }
  }
}
— 44:02

We can write this more simply by resetting the entire input at once (which compiles if RequestData’s fields are given default values, so that .init() produces a fully consumed request):

extension Parser where Input == RequestData, Output == Void {
  static var end: Self {
    Self { input in
      guard input.pathComponents.isEmpty
      else { return nil }
      input = .init()
      return ()
    }
  }
}
— 44:18

If we tack this parser onto the end of our request parser we will now see that it fails, because the "comments" path component is left over:

episode.run(request)
— 44:29

If we change the request back to its original state we will see that it succeeds again:

pathComponents: ["episodes", "1"],
— 44:35

And now that we know how to parse a single request, let’s beef it up so that we can parse a whole bunch of types of requests at once. We can start by creating an enum that lists out all of the different routes we want to recognize:

enum Route {
  // GET /episodes/:int?time=:int
  case episode(id: Int, time: Int?)

  // GET /episodes/:int/comments
  case episodeComments(id: Int)
}
— 45:20

To parse a RequestData into one of these routes we can combine a bunch of parsers together that each try to parse the request into one of these cases:

let router = Parser.oneOf(
  Parser.method("GET")
    .skip(.path("episodes"))
    .take(.path(.int))
    .take(.optional(.query(name: "time", .int)))
    .skip(.end)
    .map(Route.episode(id:time:)),

  Parser.method("GET")
    .skip(.path("episodes"))
    .take(.path(.int))
    .skip(.path("comments"))
    .skip(.end)
    .map(Route.episodeComments(id:))
)
// Parser<RequestData, Route>
— 47:03

And then we can run the router on an episode request:

router.run(request).match
// Route.episode(id: 1, time: 120)
— 47:37

And we can test the comments route, as well:

pathComponents: ["episodes", "1", "comments"]
…
router.run(request).match
// Route.episodeComments(id: 1)
— 48:25

So we now have a working router that we could use on the server to route requests, or in an application to deep link. And because it’s built out of composable pieces, we’re even free to break them up and reuse components:

let episodeSection = Parser
  .skip(Parser.method("GET"))
  .skip(.path("episodes"))
  .take(.path(.int))

let episode = episodeSection
  .take(.optional(.query(name: "time", .int)))
  .skip(.end)
  .map(Route.episode(id:time:))

let episodeComments = episodeSection
  .skip(.path("comments"))
  .skip(.end)
  .map(Route.episodeComments(id:))

let router = Parser.oneOf(
  episode,
  episodeComments
)
— 49:13

This is pretty impressive. We added just 70 additional lines of parsers to our base library and have unlocked the ability to very succinctly and expressively parse incoming requests so that we can route them to different parts of our application or web site. There are entire libraries out there devoted to this functionality, and yet here we have discovered it to be just a small corollary to having a powerful, generalized parsing library available to us.
— 49:39

The only thing missing from this routing micro-library is a few more combinators for parsing the headers and body of the request, as well as a way to turn a nebulous URLRequest value into one of these RequestData values, but we will leave both of those things as exercises for the viewer.

Next time: what’s the point?
— 49:55
So I think this is pretty incredible. We have massively generalized our parsing library, all of the parsers we previously wrote still compile and run just like before, but we can now perform parsing tasks on all new types of input that would have previously been impossible.
— 50:15
But as cool as all of that is, we still want to ask the all-important question that we ask at the end of every series of episodes on Point-Free: what’s the point? Because although we have generalized parsing we have also made it a little more complex. Not only do we have to think a bit harder when it comes to writing a general parser and have to have a bit of knowledge of generics and the Collection protocol, but we also sometimes have to give the compiler some extra hints in order for it to figure out the types.
— 50:50
So, is it worth this extra complexity? Instead of generalizing parsers should we have spent a little more time creating more robust parsers that perhaps could have handled the complexities of parsing a raw URLRequest rather than inventing the RequestData type and trying to parse it?
— 51:06

And we of course think it’s absolutely worth this extra complexity. Generalizing the parser signature has allowed us to parse all new types of input, but that’s only the beginning. The very act of generalizing has opened up all new possibilities that were previously impossible to see. For example:
— 51:27
With zero changes to our core parser type we can create a new parser operator that allows us to incrementally parse a stream of data coming in from an outside source, such as standard input, and even incrementally stream output to an outside source, such as a file, standard output, or anything. This can dramatically improve the performance of parsers that need to work on large data sets for which it is unreasonable to bring a large chunk of data into memory and parse it into a large array of data for processing.
— 52:00
So that’s incredible, but it gets better. By generalizing we can now see an all new form of composition that sits right next to our beloved map , zip and flatMap operators. This operator has the same shape that we have discovered on Point-Free time and time again, and it will be instrumental in allowing us to take a parser that works on a small piece of data and transform it into a parser that works on a larger, more complex piece of data.
— 52:24
And if that weren’t enough, things get even better. This new form of composition turns out to be the key to unlock a new tier of performance in our parsers. We can increase the performance of some of our parsers by as much as 5-10x with minimal changes to the parsers themselves, which makes their performance competitive with hand-rolled parsers and even beats the performance of Apple’s parser helpers, such as the Scanner type.
— 52:51
These are some really big claims we are making. We are saying that by simply generalizing the input type of our parsers we can unlock the ability to stream input into our parsers, uncover new forms of composition, and immediately improve the performance of our parsers, basically for free.
— 53:11
So, let’s demonstrate these amazing feats, starting with streaming. As we mentioned before, it can be very inefficient to parse a large set of data for two reasons: first, we will need to bring the entire input string into memory, which you wouldn’t want to do if you are parsing tens or hundreds of megabytes of data, and second you will need to process the whole string at once and produce a huge piece of output data.
— 53:37

So we not only want to be able to efficiently stream the input data into our parser to do work incrementally, but we may also want to stream its output somewhere, such as standard out or a file. Let’s tackle each of these problems separately.

Downloads

Sample code: 0125-generalized-parsing-pt2