Video #124: Generalized Parsing: Part 1

Episode: Video #124 Date: Nov 9, 2020 Access: Members Only 🔒 URL: https://www.pointfree.co/episodes/ep124-generalized-parsing-part-1

Description

The parser type we built so far is highly tuned to work on strings, but there are many things out in the world we’d want to parse, not just strings. It’s time to massively generalize parsing so that it can parse any kind of input into any kind of output.


Transcript

— 0:05

Over the last few weeks we have done the work to get everybody up to speed on functional parsing, which is a topic we covered in depth over two years ago and we’ve been wanting to pick up again. And last week we covered something completely new, which is how to change the signature and definition of zip so that it is a little more friendly for parsing because zip on parsers is a little different than zip on other types we’ve encountered it on.

— 0:30

Today we are continuing with parsing, but we are going to unlock a whole new level of functionality. We are going to massively generalize parsing so that it can parse any kind of input into any kind of output. So far all of the parsers we have discussed have been highly tuned to work on strings and substrings. However, there are many things out in the world we’d want to parse, not just strings, and generalizing will allow us to tackle all new problems.

— 1:02

But, even better than opening up new worlds of things we can parse, by generalizing we will also open whole new worlds of composition and we will get access to substantial performance gains, both of which were impossible to see before generalizing.

— 1:21

So, this is incredibly powerful, but it is going to take us time to get there. So let’s start by understanding why it is we’d want to generalize parsing in the first place.

Why generalize?

— 1:31

Right now our parsers are great for parsing strings, but there are lots of things out in the world we’d want to be able to parse. Anytime we come across a nebulous blob of data that we want to turn into something more structured, parsing will be helpful.

— 1:54

For example, there are times that you can find yourself with a big ole String-to-String dictionary of data, like say the environment your app is running in: ProcessInfo.processInfo.environment // [...]

— 2:30

What if you want to transform this big blob of strings into something more structured? And even more importantly, what if some of the strings themselves held unstructured data from which you would like to extract something meaningful? For example, the IPHONE_SIMULATOR_ROOT environment variable holds a path to the simulator runtime: ProcessInfo.processInfo.environment["IPHONE_SIMULATOR_ROOT"] // "/Applications/Xcode-beta.app/Contents/Developer/Platforms/iPhoneOS.platform/Library/Developer/CoreSimulator/Profiles/Runtimes/iOS.simruntime/Contents/Resources/RuntimeRoot"

— 3:01

And what if you wanted to further parse that file path to extract out the Xcode being currently run: "/Applications/Xcode-12-beta.app"

— 3:30

There’s another key that holds what kind of simulator is being used for the playground right now: ProcessInfo.processInfo.environment["SIMULATOR_DEVICE_NAME"] // "iPad Pro (9.7-inch)"

— 3:44

We could also further parse this value to figure out if the device is an iPhone, iPad, TV, and what size of device is being used.

— 3:54

You could of course do this all manually by doing a bunch of if lets to look for all the keys you are interested in, and then manually split on slashes and look for various patterns to find the info you want, but that can be really messy. We have seen over and over that parsers allow us to break down large, intimidating problems into smaller, more reasonable problems, and it would help a great deal for something like parsing information from environment variables.

— 4:26

There are even times that we have a type that seemingly is well-structured, but we’d like to transform it into something even more structured. For example, the URLComponents API that Apple gives us in the Foundation framework can be seen as a kind of parsing in its own right. It allows you to parse a raw string into something that extracts out important parts of the URL, such as its path, host, scheme and query params: let components = URLComponents( string: "https://www.pointfree.co/episodes/1?time=120" ) components?.host // "www.pointfree.co" components?.path // "/episodes/1" components?.queryItems // [{name "time", value "120"}]

— 6:22

Now that’s already super handy because URLComponents has done a lot of the heavy lifting when it comes to trying to figure out how to extract out these pieces of information from an unstructured string. But there is still some unstructured data in here. For example, queryItems is an array of URLQueryItems, which is just a struct with two string fields: name and value.

— 6:43

So really what we have here is an array of tuples of strings: // [(name: String, value: String)]

— 7:01

And this is not very well-structured. There could be very important information we want to extract from this value, and not only do we need to search for the name in the array to find our data, but then we probably need to parse the value to get something useful out of it, such as parsing an integer for this time param, or the string “true” or “false” into a proper boolean.
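To make the manual version of this concrete, here is a rough sketch of searching the query items by name and then parsing the value into an integer. The helper name `intQueryValue(named:in:)` is hypothetical and only for illustration; it uses nothing beyond Foundation’s URLComponents and URLQueryItem:

```swift
import Foundation

// Hypothetical helper: find a query item by name, then parse its value as an Int.
func intQueryValue(named name: String, in components: URLComponents) -> Int? {
  guard
    // First we have to search the array for the name we care about...
    let item = components.queryItems?.first(where: { $0.name == name }),
    // ...then unwrap the optional string value...
    let value = item.value
  else { return nil }
  // ...and only then can we parse the string into something useful.
  return Int(value)
}

let episodeComponents = URLComponents(
  string: "https://www.pointfree.co/episodes/1?time=120"
)!
let time = intQueryValue(named: "time", in: episodeComponents)
```

Every new param we care about needs another round of this search-unwrap-parse dance, which is the kind of repetition parsers are good at removing.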

— 7:31

Another example is parsing user input for command line tools. When you type in a command to terminal the tool must first figure out what you are trying to do with the command, which consists of parsing out any flags and parameters amongst other things. For example, you can create a new executable Swift package with this command: $ swift package init --name=MyPackage --type=executable

— 8:04

But in order for SPM to figure out what you want to do it must parse out that you are trying to initialize a package, and then figure out the name and type of the package. Sometimes CLI tools are provided arguments in key/value pairs like the above, and other times you customize the tool by providing simple flags, such as: $ swift build -v $ swift build --verbose

— 8:44

So, this is definitely a parsing problem, and if we approach it naively we may be tempted to try to incrementally parse this as a string. However, that is quite complicated because flags and arguments can be listed in any order. For example, if you want to build your package with verbose on and the sanitizer on you can do: $ swift build --verbose --sanitize

— 9:18

or: swift build --sanitize --verbose

— 9:28

So parsing a string in a left-to-right, incremental fashion is going to be quite difficult. Turns out, it’s much easier to break down the parts of the command into an array, and then incrementally parse that. In fact, Swift ships with an API that does just that: CommandLine.arguments

— 9:45

This turns the command line string into an array of arguments. In fact, if we run it right in this playground we will see that this very playground is being run as a command line tool: [ "/Users/point-free/Library/Developer/XCPGDevices/7791186F-129C-4B2D-A0DF-2B685D338A84/data/Containers/Bundle/Application/F6013A8B-FED5-4A87-B7E7-D2BA154AE029/GeneralizedParsing-17305-7.app/GeneralizedParsing", "-DVTDeviceRunExecutableOptionREPLMode", "1" ]

— 10:00

There is a binary deep in this folder structure, and it is being run with a -DVTDeviceRunExecutableOptionREPLMode argument that has a value of 1.

— 11:03

So, writing CLI tools starts with a parsing problem, and so we would hope that our parsing library would be general enough to work on arrays of strings so that it could attack this problem.
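As a rough, hand-rolled sketch of why the array form is easier, here is a loop that consumes one argument at a time from an array slice, so the order of the flags stops mattering. The `BuildOptions` type and `parseBuildOptions` function are hypothetical and are not how SwiftPM actually does it:

```swift
// Hypothetical options type, for illustration only.
struct BuildOptions: Equatable {
  var verbose = false
  var sanitize = false
}

// Consumes arguments from the front of a slice until none are left.
// Returns nil as soon as it sees an argument it does not recognize.
func parseBuildOptions(_ arguments: [String]) -> BuildOptions? {
  var options = BuildOptions()
  var rest = arguments[...]
  while let argument = rest.popFirst() {
    switch argument {
    case "-v", "--verbose":
      options.verbose = true
    case "--sanitize":
      options.sanitize = true
    default:
      return nil
    }
  }
  return options
}

let verboseFirst = parseBuildOptions(["--verbose", "--sanitize"])
let sanitizeFirst = parseBuildOptions(["--sanitize", "--verbose"])
```

Both orderings produce the same value, which would be much harder to guarantee when scanning one big string from left to right.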

— 11:25

And this is only just the beginning. Once you allow yourselves to think of parsing as a process that turns unstructured data into structured data you will start to see it everywhere.

Generalizing the library

— 11:37

So let’s start the process of generalizing the Parser type. It’s going to require introducing a new generic to the type because we now want to be able to parse any kind of input: // struct Parser<Output> { // let run: (inout Substring) -> Output? // } struct Parser<Input, Output> { let run: (inout Input) -> Output? }

— 12:07

The parser still functions basically the same. Its job is to inspect the input, try to extract an Output from the input somehow, and possibly consume some of the input by mutating it.
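As a minimal sketch of what the extra generic buys us, here is the generalized type driving a parser whose input is an array slice of strings rather than a substring. The `firstArgument` parser and the sample arguments are hypothetical, invented only to show the shape of things:

```swift
struct Parser<Input, Output> {
  let run: (inout Input) -> Output?
}

// Hypothetical parser: consumes and returns the first element of the slice,
// or fails with nil (consuming nothing) when the slice is empty.
let firstArgument = Parser<ArraySlice<String>, String> { input in
  input.popFirst()
}

var arguments: ArraySlice<String> = ["build", "--verbose"]
let command = firstArgument.run(&arguments)
// `command` holds the parsed output and `arguments` holds whatever is left.
```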

— 12:19

This is of course going to break quite a few things, so let’s fix each of them to understand how we need to change our thinking when it comes to constructing parsers. To make it so that we are in compiling order we will comment everything out in the playground and slowly uncomment things as we fix them.

— 12:35

The first error we have has to do with this convenience method we created to make it easier to run parsers in our playground: extension Parser { func run(_ str: String) -> (match: Output?, rest: Substring) { var str = str[...] let match = self.run(&str) return (match, str) } }

— 12:47

This is handy because it allows you to feed a non- inout value into run, and then returns the parsed output along with the rest of the substring that is left to parse.

— 13:03

The easiest fix for this is to simply constrain the Input generic to be a Substring, and then it would just work exactly as before: extension Parser where Input == Substring { func run(_ str: String) -> (match: Output?, rest: Substring) { var str = str[...] let match = self.run(&str) return (match, str) } }

— 13:16

But we can make this much more generic. There’s no need to tie its functionality to substrings, we can make it work with all inputs: extension Parser { func run(_ input: Input) -> (match: Output?, rest: Input) { var input = input let match = self.run(&input) return (match, input) } }

— 13:38

This is more general than what we had before, but it’s also slightly different. Now when invoking this API for string parsing we will have to explicitly pass in a substring by using the ellipsis subscript: input[...]

— 13:55

It’s just a small detail to keep in mind.
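Here is a small sketch of that detail in action. The `run` helper is the one from above; the `firstCharacter` parser is a hypothetical stand-in so that the example is self-contained:

```swift
struct Parser<Input, Output> {
  let run: (inout Input) -> Output?
}

extension Parser {
  // Non-inout convenience: returns the match along with the remaining input.
  func run(_ input: Input) -> (match: Output?, rest: Input) {
    var input = input
    let match = self.run(&input)
    return (match, input)
  }
}

// Hypothetical parser: consumes the first character of a substring.
let firstCharacter = Parser<Substring, Character> { input in
  input.popFirst()
}

// A string literal is inferred to be a Substring, so this just works.
let fromLiteral = firstCharacter.run("Hello")

// A concrete String value must be converted explicitly.
let greeting = "Hello"
let fromString = firstCharacter.run(greeting[...])
```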

— 13:57

The next error we have is in our int parser, which tries to parse an integer from the front of a string: extension Parser where Output == Int { static let int = Self { input in … } }

— 14:02

This no longer works because this is saying we are capable of parsing an integer off of any kind of input, not just strings. We need to further constrain the Input generic of this extension to make it clear that it is only meant to be used with Substrings: extension Parser where Input == Substring, Output == Int { static let int = Self { input in … } }

— 14:17

And now the compiler is happy with this bit of code. Let’s make sure this parser still works. To do that let’s comment out all the code after this parser so that we can be in building order, and then we can take the parser out for a spin: Parser.int.run("123 Hello") // (match 123, rest " Hello")

— 14:36

The next parser we have to fix is the double parser, and all we need to do for it is constrain its input to only work on Substring : extension Parser where Input == Substring, Output == Double { static let double = Self { input in … } }

— 14:51

And with that change we can make use of it: Parser.double.run("123.4 Hello") // (match 123.4, rest " Hello")

— 15:02

And similarly the char parser can be fixed by constraining the input: extension Parser where Input == Substring, Output == Character { static let char = Self { input in … } }

— 15:14

Next we have the always and never parsers, which represent parsers that always succeed with a value or always fail. They may seem like strange parsers, but they are really handy, and because they are already so generic we don’t even need to do anything to fix this: extension Parser { static func always(_ output: Output) -> Self { Self { _ in output } } static var never: Self { Self { _ in nil } } }

— 15:33

Next we have our trio of compositional operators, map, zip and flatMap, which give us the fundamental building blocks for breaking down large, complex parsers into smaller, simpler ones. Their implementations don’t need to change at all, we just need to be explicit with the Input generic in their signatures: extension Parser { func map<NewOutput>( _ f: @escaping (Output) -> NewOutput ) -> Parser<Input, NewOutput> { … } } extension Parser { func flatMap<NewOutput>( _ f: @escaping (Output) -> Parser<Input, NewOutput> ) -> Parser<Input, NewOutput> { … } } func zip<Input, Output1, Output2>( _ p1: Parser<Input, Output1>, _ p2: Parser<Input, Output2> ) -> Parser<Input, (Output1, Output2)> { … }
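The bodies are elided here, so the following is only one plausible way map and zip could be written against the generalized type, under the assumption that zip restores the original input when the second parser fails. The `firstInt` parser is hypothetical:

```swift
struct Parser<Input, Output> {
  let run: (inout Input) -> Output?
}

extension Parser {
  // Transforms the output of a successful parse; the input handling is untouched.
  func map<NewOutput>(
    _ f: @escaping (Output) -> NewOutput
  ) -> Parser<Input, NewOutput> {
    Parser<Input, NewOutput> { input in
      self.run(&input).map(f)
    }
  }
}

// Runs two parsers in sequence on the same input (assumed backtracking behavior).
func zip<Input, Output1, Output2>(
  _ p1: Parser<Input, Output1>,
  _ p2: Parser<Input, Output2>
) -> Parser<Input, (Output1, Output2)> {
  Parser<Input, (Output1, Output2)> { input in
    let original = input
    guard let output1 = p1.run(&input) else { return nil }
    guard let output2 = p2.run(&input) else {
      input = original  // undo whatever p1 consumed
      return nil
    }
    return (output1, output2)
  }
}

// Hypothetical parser over array slices, to show nothing here is string-specific.
let firstInt = Parser<ArraySlice<Int>, Int> { input in input.popFirst() }
let sumOfFirstTwo = zip(firstInt, firstInt).map { pair in pair.0 + pair.1 }

var numbers: ArraySlice<Int> = [1, 2, 3]
let sum = sumOfFirstTwo.run(&numbers)
```

Nothing in either body mentions Substring, which is why only the signatures needed to change.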

— 16:14

Then we have a few more parser combinator helpers that help express very common scenarios, such as running many parsers one after another and stopping once one succeeds. This combinator doesn’t even need any changes to compile: extension Parser { static func oneOf(_ ps: [Self]) -> Self { … } static func oneOf(_ ps: Self...) -> Self { … } }

— 16:32

Next we have the zeroOrMore parser combinator. It allows you to run a parser on an input many times until it fails, and then collects all of the outputs into an array. To get this compiling we mostly only need to update the signature to include the new Input generic: extension Parser { func zeroOrMore( separatedBy separator: Parser<Input, Void> = "" ) -> Parser<Input, [Output]> { … } }

— 16:53

But there’s also one other thing we need to fix. We default the separatedBy argument as "", which is one of those literal parsers derived from the ExpressibleByStringLiteral conformance. When acting on substrings this parser just always succeeds and doesn’t consume anything because the empty string can be thought of as the prefix to every string in existence. Well, we can no longer lean on that kind of intuition since now Input can be anything, not just strings. So instead we can default this argument to a parser that always succeeds without consuming anything and returns void: extension Parser { func zeroOrMore( separatedBy separator: Parser<Input, Void> = .always(()) ) -> Parser<Input, [Output]> { … } }
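Since the body of zeroOrMore is elided, here is a sketch of one way it could be written with the new default, assuming it backtracks to the input left after the last successful match. The `even` parser is hypothetical and works on array slices of integers, where a "" separator would make no sense:

```swift
struct Parser<Input, Output> {
  let run: (inout Input) -> Output?
}

extension Parser {
  static func always(_ output: Output) -> Self {
    Self { _ in output }
  }

  // Assumed implementation: run `self` repeatedly, with `separator` in between,
  // and collect every output until one of the two fails.
  func zeroOrMore(
    separatedBy separator: Parser<Input, Void> = .always(())
  ) -> Parser<Input, [Output]> {
    Parser<Input, [Output]> { input in
      var rest = input
      var matches: [Output] = []
      while let match = self.run(&input) {
        rest = input
        matches.append(match)
        if separator.run(&input) == nil {
          return matches
        }
      }
      input = rest  // drop anything consumed after the last successful match
      return matches
    }
  }
}

// Hypothetical parser: consumes the first element only if it is even.
let even = Parser<ArraySlice<Int>, Int> { input in
  guard let first = input.first, first % 2 == 0 else { return nil }
  input.removeFirst()
  return first
}

var values: ArraySlice<Int> = [2, 4, 5, 6]
let evens = even.zeroOrMore().run(&values)
```

Note that a parser which succeeds without consuming anything would loop forever here, so this sketch assumes the repeated parser always makes progress.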

— 17:37

Next we had a family of parsers whose sole purpose was to parse substrings off the front of strings. There was one for parsing an exact, literal string from the beginning, and to get it compiling we need to constrain its input to be substrings: extension Parser where Input == Substring, Output == Void { static func prefix(_ p: String) -> Self { … } }

— 18:02

Then we had one for parsing a substring from the front of a string until a predicate on characters was satisfied: extension Parser where Input == Substring, Output == Substring { static func prefix(while p: @escaping (Character) -> Bool) -> Self { … } }

And then there was a parser for consuming the front of a string until a substring was found, and we had two versions depending on whether or not you want to further consume the substring passed in: extension Parser where Input == Substring, Output == Substring { static func prefix(upTo substring: Substring) -> Self { … } static func prefix(through substring: Substring) -> Self { … } }

— 18:21

Next we conformed our parser to ExpressibleByStringLiteral so that we could use literal strings to represent the parser that simply parses that exact string from the beginning of the input. This helped us clear up a lot of noise in our parsers. To get this compiling we just have to add Input constraints just like we have been doing for the other parsers: extension Parser: ExpressibleByUnicodeScalarLiteral where Input == Substring, Output == Void { … } extension Parser: ExpressibleByExtendedGraphemeClusterLiteral where Input == Substring, Output == Void { … } extension Parser: ExpressibleByStringLiteral where Input == Substring, Output == Void { … }

— 18:51

So that was the core of our parser library that we covered long ago, but then last week we showed how to improve the ergonomics of the zip function. We saw that we could define a version of zip as a method on the Parser type with a friendly name for the domain of parsing. We defined a few of these methods, and they were called skip and take to signify that sometimes we want to run parsers and then discard their output and other times we want to keep their output.

— 19:16

So let’s quickly update their signatures to play nicely with our new Input generic: extension Parser { func skip<OtherOutput>(_ p: Parser<Input, OtherOutput>) -> Self { zip(self, p).map { a, _ in a } } func take<NewOutput>( _ p: Parser<Input, NewOutput> ) -> Parser<Input, (Output, NewOutput)> { zip(self, p) } func take<Output1, Output2, Output3>( _ p: Parser<Input, Output3> ) -> Parser<Input, (Output1, Output2, Output3)> where Output == (Output1, Output2) { zip(self, p).map { ab, c in (ab.0, ab.1, c) } } static func skip(_ p: Self) -> Parser<Input, Void> { p.map { _ in () } } func take<NewOutput>( _ p: Parser<Input, NewOutput> ) -> Parser<Input, NewOutput> where Output == Void { zip(self, p).map { _, a in a } } }

Generalizing our parsers

— 19:32

Phew! That was a lot, but we have now generalized the core of the little parsing library we have been building. This little library is only about 250 lines of code, and we can parse some seriously complex things. Previously we have cooked up little parsers for parsing temperatures, money with currency symbols, geographic coordinates, custom string formats that hold the data for a list of marathon races, and then most recently we actually parsed the logs that Xcode spits out when running tests so that we could print them out in a nicely formatted style.

— 20:01

Turns out, most of those parsers already compile. For example, the temperature parsers, where we first parse an integer, and then a degree sign with an F compiles just fine: let temperature = Parser.int.skip("°F") temperature.run("100°F") temperature.run("-100°F")

— 20:14

When wanting to parse geographic coordinates we first came up with parsers that tried to detect the North/South/East/West identifiers and then turn them into signs that can be used to multiply the coordinates: let northSouth = Parser.char.flatMap { $0 == "N" ? .always(1.0) : $0 == "S" ? .always(-1) : .never } let eastWest = Parser.char.flatMap { $0 == "E" ? .always(1.0) : $0 == "W" ? .always(-1) : .never }

— 20:26

These too compile just fine.

— 20:31

Then we were able to combine a bunch of these parsers together to parse off a latitude and longitude coordinate separately, and these still compile: let latitude = Parser.double .skip("° ") .take(northSouth) .map(*) let longitude = Parser.double .skip("° ") .take(eastWest) .map(*)

— 20:45

And then we pieced it all together to parse a full geographic coordinate: struct Coordinate { let latitude: Double let longitude: Double } let zeroOrMoreSpaces = Parser.prefix(" ").zeroOrMore() let coord = latitude .skip(",") .skip(zeroOrMoreSpaces) .take(longitude) .map(Coordinate.init)

— 20:54

We even beefed it up a little bit more by allowing us to parse any number of spaces after the comma, not just a single space.

— 21:01

It’s great to see that all of this parser code still works just fine even now that Parser has been generalized.

— 21:06

Next we created some parsers for parsing currency symbols and money. enum Currency { case eur, gbp, usd } let currency = Parser.oneOf( Parser.prefix("€").map { Currency.eur }, Parser.prefix("£").map { .gbp }, Parser.prefix("$").map { .usd } ) struct Money { let currency: Currency let value: Double } // "$100" let money = zip(currency, .double) .map(Money.init(currency:value:)) money.run("$100") money.run("£100") money.run("€100")

— 21:13

All of this code still compiles just fine too.

— 21:16

Then we tried using all of these little parsers to parse a really big string of marathon races into an array of first class Race data types: let upcomingRaces = """ … """ struct Race { let location: String let entranceFee: Money let path: [Coordinate] }

To do this we incrementally parsed pieces from the string to extract out a race, and then used the zeroOrMore combinator to parse out a bunch of races: let locationName = Parser.prefix(while: { $0 != "," }) let race = locationName.map(String.init) .skip(",") .skip(zeroOrMoreSpaces) .take(money) .skip("\n") .take(coord.zeroOrMore(separatedBy: "\n")) .map(Race.init(location:entranceFee:path:)) let races = race.zeroOrMore(separatedBy: "\n---\n")

— 21:30

This all compiles just fine, but then to run our parser on the upcomingRaces string we must first convert the input into a substring: races.run(upcomingRaces[...])

— 21:52

The reason we haven’t had to do this when running any of our parsers above is because we were running them on string literals, and Substring is ExpressibleByStringLiteral . Here we are passing a concrete string value, and so we have to explicitly turn it into a substring.

— 22:11

And finally, we created a parser that could parse a big blob of Xcode test logs into an array of well structured data types that pluck out the essential information from the logs, such as failure messages, testing durations, and more: let logs = """ … """ enum TestResult { case failed( failureMessage: Substring, file: Substring, line: Int, testName: Substring, time: TimeInterval ) case passed(testName: Substring, time: TimeInterval) }

— 22:28

Amazing, all of the parsers constructed to parse this super complex string format compile just fine with no changes whatsoever: let testCaseFinishedLine = Parser .skip(.prefix(through: " (")) .take(.double) .skip(" seconds).\n") let testCaseStartedLine = Parser .skip(.prefix(upTo: "Test Case '-[")) .take(.prefix(through: "\n")) .map { line in line.split(separator: " ")[3].dropLast(2) } let fileName = Parser .skip("/") .take(.prefix(through: ".swift")) .flatMap { path in path.split(separator: "/").last.map(Parser.always) ?? .never } let testCaseBody = fileName .skip(":") .take(.int) .skip(.prefix(through: "] : ")) .take(Parser.prefix(upTo: "Test Case '-[").map { $0.dropLast() }) let testFailed = testCaseStartedLine .take(testCaseBody) .take(testCaseFinishedLine) .map { testName, bodyData, time in TestResult.failed( failureMessage: bodyData.2, file: bodyData.0, line: bodyData.1, testName: testName, time: time ) } let testPassed = testCaseStartedLine .take(testCaseFinishedLine) .map(TestResult.passed(testName:time:)) let testResult = Parser.oneOf(testFailed, testPassed) let testResults = testResult.zeroOrMore()

— 22:30

The only thing we have to change is when running the parser we have to convert the logs to a substring: testResults.run(logs[...])

Taking advantage of generalized parsers

— 22:52

So that completes the work needed to get all of our parsers working with the new generalized library. It wasn’t much at all, everything basically worked already.

— 23:04

But now that we have all of that upfront work done let’s start to flex the new muscles we have gained from generalizing the library.

— 23:12

Generalizing prefix

— 23:32

Let’s start by taking some of the parsers that work only on substrings and generalizing them to work on a much wider set of types. For example, this prefix parser works only on Substring inputs: extension Parser where Input == Substring, Output == Void { static func prefix(_ p: String) -> Self { Self { input in guard input.hasPrefix(p) else { return nil } input.removeFirst(p.count) return () } } }

— 23:44

However, this parser can be made to work on a much wider set of types than just Substring. If we look at the implementation we see that it only really needs access to hasPrefix, removeFirst and count. So maybe there is some higher-level abstraction that we could write this algorithm against so that we get not only the implementation for Substring but also other types.

— 24:12

The essence of this operation is that we have some input value and we want to see if it begins with some subset. These kinds of operations naturally correspond to collections, which have the concept of subsequences that allow you to focus on just a subset of a value. So perhaps we can generalize this operation to operate on collections: extension Parser where Input: Collection, Output == Void { static func prefix(_ p: String) -> Self { … } }

— 24:38

This breaks a few things. First, it turns out that hasPrefix doesn’t work on all collections, but instead is something defined only for StringProtocol , which means it only works on String and Substring . However, there is a starts(with:) method on sequences that can give us the same functionality, but more generally: guard input.starts(with: p) else { return nil }

— 25:08

But right now p is still hard-coded to a concrete String and we want to generalize it to the input’s SubSequence type: static func prefix(_ p: Input.SubSequence) -> Self {

— 25:32

And now we get the following error: Referencing instance method ‘starts(with:)’ on ‘Sequence’ requires that ‘Input.Element’ conform to ‘Equatable’

— 25:47

So we need to further make sure that the input collection’s Element type is equatable: extension Parser where Input: Collection, Input.Element: Equatable, Output == Void { … }

— 25:54

And now finally Swift seems ok with the guard statement.

— 26:00

The next error we have is on the line where we perform the removeFirst : Referencing instance method ‘removeFirst’ on ‘Collection’ requires the types ‘Input’ and ‘Input.SubSequence’ be equivalent

— 26:09

In fact, this is even spelled out in the documentation of removeFirst , which we can find in the RangeReplaceableCollection docs: Note Available when Self is SubSequence .

— 26:29

When we typically encounter mutating methods like removeFirst on arrays, they come from the MutableCollection and RangeReplaceableCollection protocols, but it turns out that Collection has some of them too, but only when working with a subsequence. This is because subsequences like Substring can be “mutated” by moving around the start and end indices. In this case, removeFirst will mutate the subsequence’s startIndex by advancing it.
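A tiny sketch of this behavior on a Substring, using a made-up sample string: removing from the front only moves the slice’s start index forward, while the original string is left untouched.

```swift
let original = "Hello, world"

// A Substring is a view into the storage of `original`.
var slice = original[...]

// "Hello, " is 7 characters, so this advances the slice's startIndex by 7.
slice.removeFirst(7)

// The slice's indices are still indices into `original`, so we can measure
// how far its start has moved from the start of the original string.
let offset = original.distance(from: original.startIndex, to: slice.startIndex)
```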

— 26:57

So we need to further constrain our parser: extension Parser where Input: Collection, Input.SubSequence == Input, Input.Element: Equatable, Output == Void { … }

— 27:34

That’s quite a bit, but it’s just the cost of doing business with the Collection protocol. It allows us to be a lot more generic with what kinds of types we can parse, but in order to do so we must be somewhat familiar with how collections work in Swift. Almost always when writing a parser against a general collection you are going to want to constrain its SubSequence to be Self since that gives you access to many of the mutating methods we need to consume bits of the input.
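Putting the pieces from this section together, the fully constrained prefix parser could look roughly like the following. The extension is assembled from the steps above (starts(with:), removeFirst and the three constraints); the sample inputs are made up for illustration:

```swift
struct Parser<Input, Output> {
  let run: (inout Input) -> Output?
}

extension Parser
where
  Input: Collection,
  Input.SubSequence == Input,
  Input.Element: Equatable,
  Output == Void
{
  static func prefix(_ p: Input.SubSequence) -> Self {
    Self { input in
      // starts(with:) needs Element: Equatable.
      guard input.starts(with: p) else { return nil }
      // removeFirst(_:) on Collection needs SubSequence == Self.
      input.removeFirst(p.count)
      return ()
    }
  }
}

// The same definition works on array slices...
var numbers: ArraySlice<Int> = [1, 2, 3, 4]
let matchedNumbers = Parser<ArraySlice<Int>, Void>.prefix([1, 2]).run(&numbers)

// ...and on substrings.
var text: Substring = "Hello, world"
let matchedText = Parser<Substring, Void>.prefix("Hello, ").run(&text)
```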

— 28:09

Generalizing that parser has broken our string literal conformance, but all we need to do to fix it is cast the string literal to a substring so that we can use our prefix combinator: extension Parser: ExpressibleByStringLiteral where Input == Substring, Output == Void { init(stringLiteral value: String) { self = .prefix(value[...]) } }

— 28:40

There are a couple more compiler errors below where we call Parser.prefix: let zeroOrMoreSpaces = Parser.prefix(" ").zeroOrMore() … let currency = Parser.oneOf( Parser.prefix("€").map { Currency.eur }, Parser.prefix("£").map { .gbp }, Parser.prefix("$").map { .usd } ) Both fail with the same error: Generic parameter ‘Input’ could not be inferred

— 28:40

It seems like the compiler is having trouble figuring out which Input type to use. We’re not exactly sure why Swift can’t infer it here, but we can make things explicit to get things building again: let zeroOrMoreSpaces = Parser<Substring, Void>.prefix(" ").zeroOrMore() … let currency = Parser<Substring, Currency>.oneOf( Parser.prefix("€").map { Currency.eur }, Parser.prefix("£").map { .gbp }, Parser.prefix("$").map { .usd } )

— 29:23

Now everything is back to compiling, but we can now parse prefixes from types other than strings. For example, say we had an array of integers, and we want to consume the first few elements but only if they match something specific. We might be tempted to do something like this: Parser.prefix([1, 2]).run([1, 2, 3, 4, 5, 6])

— 29:55

And we would hope that returns something like: (match: (), rest: [3, 4, 5, 6])

— 30:02

However, as we saw before, there doesn’t seem to be enough type information for the compiler to figure out what is going on. This is going to be a little unfortunate fact of life when dealing with generalized parsers. Because things are so general, it will be far easier for us to write code that is technically ambiguous to the Swift compiler, and so we may need to give little hints. Here we believe the ambiguity is coming from the fact that many types can conform to the ExpressibleByArrayLiteral protocol and we are using array literals here. One thing to nudge it in the right direction would be to explicitly run the parser on an array slice: Parser.prefix([1, 2]).run([1, 2, 3, 4, 5, 6][...]) // (match: (), rest: [3, 4, 5, 6])

— 30:51

Or to be explicit that we are dealing with a parser that operates on array slices: Parser<ArraySlice<Int>, Void>.prefix([1, 2]).run([1, 2, 3, 4, 5, 6]) // (match: (), rest: [3, 4, 5, 6])

— 31:08

And now it’s doing what we expect it to. Parser<ArraySlice<Int>, Void>.prefix([2, 3]).run([1, 2, 3, 4, 5, 6]) // (match: nil, rest: [1, 2, 3, 4, 5, 6])

— 31:24

If we changed it to something that we would expect to fail: Parser.prefix([1, 2]).run([2, 3, 4, 5, 6][...]) // (match: nil, rest: [2, 3, 4, 5, 6])

— 31:32

We get nil as a result and still have the whole array left to be parsed.

— 31:37

So it’s pretty incredible that we have this single, generic definition of prefix now that can work on strings, arrays, and any other collection type.

— 31:54

Now although it is a little annoying that we have to sprinkle a little bit of extra syntax to let the compiler know exactly what type it is working with, it is also forcing a very good pattern on us. The prefix parser only works for collections whose SubSequence is equal to itself. Such collections are able to mutate themselves in an efficient manner because they operate on views of a shared storage rather than make needless copies all over the place.

— 32:18

This is why we are forced to provide array slices and substrings as input to these parsers, because it allows them to do their work in the most efficient way possible.

Next time: parsing dictionaries and more

— 32:32

We’ve now got a pretty powerful abstraction for parsing in place. It not only can parse any kind of input into any kind of output, but also tries to force us to perform that parsing in an efficient manner by working on substrings and array slices.

— 32:51

But so far we haven’t really flexed our new muscles. Let’s parse a new format that represents a nebulous blob of data into something more structured. We’re going to start out slow. If we import Foundation we get access to a class called ProcessInfo which allows us to inspect all the environment variables for the current process running our playground or application.