Video #317: SQL Builders: Advanced Selects

Episode: Video #317 Date: Mar 17, 2025 Access: Members Only 🔒 URL: https://www.pointfree.co/episodes/ep317-sql-builders-advanced-selects

Description

We now have a type-safe syntax for generating SELECT statements using key paths to the columns we want to select, but while this syntax is nice and what many existing libraries use, we can do better. Let’s introduce a more advanced syntax that leverages variadic generics and supports more complex query expressions.

Video

Cloudflare Stream video ID: 703f78672cdb44476cd1dfd3f5992bac Local file: video_317_sql-builders-advanced-selects.mp4 *(download with --video 317)*

References

Transcript

— 0:05

We now have a statically checked, type-safe way to select columns from a table in Swift. We simply use the select static method on our table type, then provide a variadic list of key paths of the columns we want to select, and we will magically get a valid SQL statement written for us that can query the database. Stephen

— 0:25

However, we can make things so, so much better. While the form of select we have built so far looks nice, and it’s how most SQL libraries out there handle things, but it unfortunately makes a lot of SQL functionality impossible to express. It turns out that

SELECT 0:42

Let’s see just how complex

SELECT 0:54

A very powerful feature of SQL is its ability to aggregate column data across many rows. One of the simplest is the count function. We can select count(*) from a table to count the number of rows in the table: SELECT count(*) FROM "reminders"; ┌──────────┐ │ count(*) │ ├──────────┤ │ 5 │ └──────────┘

SELECT 1:11

Or, if we want to count the number non-

NULL 1:24

Currently there are only 3 reminders that have a priority set, and so that is why this count is 3 instead of 5.

NULL 1:28

You can even provide an option to the count function to count only distinct entries: SELECT count(DISTINCT "isCompleted") FROM "reminders"; ┌───────────────────────────────┐ │ count(DISTINCT "isCompleted") │ ├───────────────────────────────┤ │ 2 │ └───────────────────────────────┘

NULL 1:41

This returns 2 because each reminder has either a 0 or 1 for its “ isCompleted ” column, and so there are only two distinct values across all reminders.

NULL 1:48

Further, you can aggregate multiple things at once in a query. For example, you can count the number of rows and the average priority across all reminders in one single query: SELECT count(*), avg("priority") FROM "reminders"; ┌──────────┬─────────────────┐ │ count(*) │ avg("priority") │ ├──────────┼─────────────────┤ │ 5 │ 2.0 │ └──────────┴─────────────────┘

NULL 2:08

There’s another aggregate function, group_concat , which will concatenate all the values in a column together as a string, with the possibility of including a delimiter too:

SELECT 2:29

You are even allowed to form expressions inside the column you are selecting. For example, suppose we wanted to select all reminders as well as a boolean that determines if the reminder is considered “high priority”, mean its priority column is 3. We can do so like this: sqlite> SELECT *, priority = 3 FROM reminders; ┌────┬─────────────────────┬─────────────┬──────────┬──────────────┐ │ id │ name │ isCompleted │ priority │ priority = 3 │ ├────┼─────────────────────┼─────────────┼──────────┼──────────────┤ │ 1 │ Groceries │ 0 │ │ │ │ 2 │ Get haircut │ 0 │ │ │ │ 3 │ Take a walk │ 1 │ 1 │ 0 │ │ 4 │ Buy concert tickets │ 1 │ 2 │ 0 │ │ 5 │ Call accountant │ 0 │ 3 │ 1 │ └────┴─────────────────────┴─────────────┴──────────┴──────────────┘ Notice that “Take a walk” and “Buy concert tickets” is 0, whereas “Call accountant” is 1, which means it is high priority.

SELECT 2:50

The columns that are blank are because the priority for those rows is

NULL 3:09

If we want to force a NULL value to be false we can use the coalesce function: sqlite> SELECT *, coalesce(priority, 0) = 3 FROM reminders; ┌────┬─────────────────────┬─────────────┬──────────┬───────────────────────────┐ │ id │ name │ isCompleted │ priority │ coalesce(priority = 3, 0) │ ├────┼─────────────────────┼─────────────┼──────────┼───────────────────────────┤ │ 1 │ Groceries │ 0 │ │ 0 │ │ 2 │ Get haircut │ 0 │ │ 0 │ │ 3 │ Take a walk │ 1 │ 1 │ 0 │ │ 4 │ Buy concert tickets │ 1 │ 2 │ 0 │ │ 5 │ Call accountant │ 0 │ 3 │ 1 │ └────┴─────────────────────┴─────────────┴──────────┴───────────────────────────┘

NULL 3:27

These queries show just how powerful SQL can be, and we would like our query builder library to support all of this. But right now our key path syntax needlessly boxes us into just very simple selects: Reminder.select(\.id, \.title)

NULL 3:37

What would it look like to support select the count of the non-NULL ids? Something like this: Reminder.select(\.id.count, \.title) …where we somehow add a count property that we can chain into for the key path?

NULL 3:49

But then how would we support counting distinct columns, like in the case of priority : Reminder.select(\.id.count, \.priority.count(distinct: true))

NULL 3:59

That kind of syntax isn’t even possible in Swift as of version 6.1. And if these seem impossible, then what hope do we have at doing the priority transformation with coalesce : Reminder.select(coalesce(\.priority = 3, false))

NULL 4:17

This is a theoretical syntax that I don’t think is possible to create in Swift.

NULL 4:20

Let’s theorize a syntax that could support all of these use cases, and more, and then we will see what it takes to make it a reality. What if instead of selecting by a variadic list of key paths we could open up a trailing closure: Reminder.select { }

NULL 4:43

This closure would be handed a reference to the Columns of the reminders table: Reminder.select { $0 }

NULL 4:49

…which provides us with a static description of the schema of the table.

NULL 4:52

Then we could pluck the columns we want to select off of the Columns argument and bundle them up into a tuple: Reminder.select { ($0.id, $0.title, $0.priority) }

NULL 5:06

So far this seems like only a small change from the key path syntax: Reminder.select(\.id, \.title, \.priority)

NULL 5:11

In fact, the key path syntax is a little less noisy and shorter.

NULL 5:14

However, the power of having access to the Columns value and being able to return a tuple of expressions opens up all new worlds to us. For example, we could define a count helper on the column fields that allows counting non-

NULL 5:28

And because count() is just a simple method, we could allow it to have an argument to specify that we only want to count distinct values: Reminder.select { ($0.id.count(), $0.priority.count(distinct: true)) }

NULL 5:36

And of course we would be allowed to support other kinds of aggregate functions, such as avg and group_concat : Reminder.select { ($0.id.count(), $0.priority.avg(), $0.groupConcat()) }

NULL 5:47

Even the priority condition would be possible to represent: Reminder.select { ($0, ($0.priority == 3) ?? false) }

NULL 6:04

This version is particularly interesting because it is allowing us to representing selecting all of the columns from the table, which is what $0 means, and then further selecting a new column that represents if the reminder is high priority.

NULL 6:15

We can make all of these kinds of queries compile just fine and generate valid SQL statements, and it will make a whole world of

SELECT 6:23

There are two main things we need to do to support this kind of syntax. First of all, right now all the fields on the Columns type are just strings that describe the name of the column: struct Columns { let id = "id" let title = "title" let isCompleted = "isCompleted" let priority = "priority" }

SELECT 6:33

We use these strings to build up the query, and our select static function even takes key paths that specifically go into String : static func select(_ columns: KeyPath<Columns, String>...) -> Select {

SELECT 6:44

Well, we now have to beef up these properties a bit. They can’t just be simple strings if we want to support things like: $0.id.count() $0.priority.avg() $0.title.groupConcat() …and so on. The properties should be a custom type that we define so that we can define operators like this on it.

SELECT 7:09

And then second, we need a new form of abstraction that can unite many types of queries into a single interface. Right now we have the Select type that represents a SQL

SELECT 7:27

And we’d like to mix and match these smaller query fragments together to form a larger query fragment: Reminder.select { ($0.id.count(), $0.priority.avg(), Tag.count()) } /*

SELECT 7:40

With such an abstraction available to us we would be able to have the select trailing closure return a variadic tuple of those kinds of values, and that would give you infinite flexibility in selecting any columns you want, aggregating any columns, and even performing subqueries.

SELECT 7:54

So, let’s start tackling these two changes. We’ll start with the column properties in the Columns type. Right now they are just strings: struct Columns { let id = "id" let title = "title" let isCompleted = "isCompleted" let priority = "priority" }

SELECT 8:04

Let’s define a wrapper type that represents a table column: struct Column { var name: String }

SELECT 8:11

Right now it just holds onto the name of the column, but in the future it will hold onto more info.

SELECT 8:16

And then we will update our existing select function to require that key paths into this Column type be provided: static func select(_ columns: KeyPath<Columns, Column>...) -> Select {

SELECT 8:21

…which requires us to further dot-chain onto the key path with the name field of the Column type: Select( from: tableName, columns: columns.map { Self.columns[keyPath: $0].name } )

SELECT 8:27

This breaks tests because we need to update the Columns type inside Reminder to use Column instead of plain strings: struct Columns { let id = Column(name: "id") let title = Column(name: "title") let isCompleted = Column(name: "isCompleted") let priority = Column(name: "priority") }

SELECT 8:49

Now everything is building, and tests pass just as they did before. Query expressions

SELECT 8:55

OK, we have now taken care of one of the two changes we outlined a moment ago in order to support more advanced SQL syntax. We have created a concrete type to represent a column so that it can hold onto a string description of the column name since Swift does not make it possible to get a nice description of key paths. Brandon

SELECT 9:11

Let’s now move onto the second change we need to implement, which is to introduce a form of abstraction that can unify a few disparate types we have right now, such as the Select type and this new Column type, but there will also be a lot more concrete types that need to be dealt with in the future.

SELECT 9:28

Let’s take a look.

SELECT 9:31

In order to support more advanced queries, such as this: Reminder.select { ($0.id.count(), $0.priority.avg(), $0.groupConcat()) }

SELECT 9:46

…we need a single abstraction that unifies all of these operations.

SELECT 9:50

We will call this new form of abstraction QueryExpression : protocol QueryExpression { }

SELECT 9:58

Conformances of this protocol need to be able to construct a fragment of SQL that represent their expression: protocol QueryExpression { var queryString: String { get } }

SELECT 10:11

For example, Column can become a QueryExpression by simply returning its name: struct Column: QueryExpression { … var queryString: String { name } }

SELECT 10:22

And the Select type already conforms to QueryExpression : struct Select: QueryExpression { … } …because we previously defined queryString for building up the full

SELECT 10:43

This abstraction now unifies two concepts that were previously completely separate: Column and Select . And more types will be able to join the abstraction layer soon.

SELECT 10:55

But before doing that, let’s make our dream select syntax a reality: // Reminder.select { ($0.id, $0.title) }

SELECT 11:12

Such a select static method needs to take a closure as an argument, and that closure accepts Columns as an argument: static func select( _ columns: (Columns) -> ??? ) -> Select { }

SELECT 11:35

Further, this columns closure can return a tuple of any number of values, as long as each one conforms to QueryExpression .

SELECT 12:06

Swift supports variadic values, but only as arguments to functions. That is, we can accept any number of QueryExpression values as an argument to a function: _: any QueryExpression... …though we do have to erase the types.

SELECT 12:30

But we cannot use this syntax to represent us returning a tuple of any number of QueryExpression values: _ columns: (Columns) -> (any QueryExpression...) Variadic parameter cannot appear outside of a function parameter list

SELECT 13:09

However, as of version 5.9 of Swift, the language supports the idea of variadic lists of generic types in function signatures and types. We can introduce a variadic list of generics to our select function like so: static func select<each ResultColumn>(

SELECT 13:35

You can think of this syntax being a different spelling of something like this: static func select<ResultColumn...>(

SELECT 13:44

This syntax better unifies the concept of variadic generics with that of variadic arguments, but this syntax was ultimately turned down due to worries of parser ambiguities.

SELECT 13:57

Then, in order to specify that the columns closure is capable of returning a tuple containing a value from all of the generics specified, we do the following: _ columns: (Columns) -> (repeat each ResultColumn)

SELECT 14:23

The (repeat each ResultColumn) syntax represents a tuple that has one element for each generic specified.

SELECT 14:42

We can further constraint each of the generics for this function like so: static func select<each ResultColumn: QueryExpression>(

SELECT 14:49

Again, you can think of this syntax has representing something like this: _ columns: (Columns) -> (ResultColumn...) …in order to unify it with how we think about variadic arguments.

SELECT 15:02

This is the signature we need to implement now. We can even give it a spin before implementing by sticking in a fatalError : static func select<each ResultColumn: QueryExpression>( _ columns: (Columns) -> (repeat each ResultColumn) ) -> Select { fatalError() }

SELECT 15:13

And then over in a new test we can see that we are allowed to use the trailing closure syntax for select, and we can select as many columns as we want: @Test func fancySelect() { Reminder.select { ($0.id, $0.title) } }

SELECT 15:42

So, the dream syntax has been realized, we just need to implement the function.

SELECT 15:48

We know we need to return a Select value, so we can start there: return Select( columns: <#[String]#>, from: <#String#> )

SELECT 15:56

And we can fill in the from argument from the name of the table: return Select( columns: <#[String]#>, from: tableName )

SELECT 15:59

To build up the array of column names we need to somehow iterate over the elements of the tuple that is returned from the trailing closure. We can get that tuple by invoking the columns closure on the columns static value that is defined on the table type: let columns = columns(Self.columns)

SELECT 16:32

If we inspect the type of this we will see: let columns: (repeat each ResultColumn)

SELECT 16:40

This represents a tuple whose each element is a value in one of the variadic of generics specified by invoking the function.

SELECT 16:50

The question is: how do we iterate over the elements of this tuple? We can’t just access the elements of the tuple like we normally would: columns.0 columns.1 columms.2

SELECT 17:04

…because we don’t know the size of this tuple. It’s a variadically sized tuple that is determined by the caller of the select method, and so it may have only 2 elements, or 1, or 100, or even 0!

SELECT 17:20

But the error message Swift outputs for these expressions does give us a hint of what to do: Value pack expansion can only appear inside a function argument list, tuple element, or as the expression of a for-in loop

SELECT 17:33

It tells us that we can loop over the elements with a “for-in” loop. So, is it as simple as this? for column in columns { }

SELECT 17:46

Well, unfortunately no. A variadic tuple does not conform to Sequence , and so cannot be used in a “for-in” loop. But, the error message for this gives us the next hint of what to do: For-in loop requires ‘(repeat each ResultColumn)’ to conform to ’Sequence’

SELECT 17:59

It turns out you can iterate over a tuple as if it were a sequence if you just say repeat each columns : for column in repeat each columns { }

SELECT 18:06

We can then iterate over each element of the columns tuple, which we know is a QueryExpression , and then turn it into a query string, and the append that to an array: var columnStrings: [String] = [] for column in repeat each columns { columnStrings.append(column.queryString) }

SELECT 18:34

And now that is the value we can return from the select method: return Select(columns: columnStrings, from: tableName)

SELECT 18:40

That is all it takes, and we can now write a test that it behaves how we expect: @Test func fancySelect() { assertInlineSnapshot( of: Reminder.select { ($0.id, $0.title) }.queryString, as: .sql ) { """ SELECT id, title FROM reminders """ } } This test passes and shows that basic kinds of queries work just as they did before. But now we are in a position to construct much more advanced queries. Query functions

SELECT 19:28

This is pretty incredible. We have now been able to leverage a very powerful feature of Swift, that of variadic generics, also known as parameter packs, in order to allow a select statement to select any number of columns from our table. Stephen

SELECT 19:41

And technically we aren’t just limited selecting actual, concrete columns from the table. There is a lot more freedom baked into this tools because we are free to select any kind of query expression. But, we aren’t taking advantage of that power yet because we don’t yet have any other conformances to the QueryExpression protocol.

SELECT 19:57

So, let’s now see how we can select aggregate expressions in our queries by taking full advantage of the QueryExpression protocol.

SELECT 20:07

What if we were allowed to write a query that counted the non-

NULL 20:29

We would want this query to represent the following raw SQL statement: @Test func selectCount() { assertInlineSnapshot( // count(id) of: Reminder.select { $0.id.count() }, as: .sql ) }

NULL 20:33

And perhaps we could even supply an argument to count for counting distinct values: @Test func selectCountDistinct() { assertInlineSnapshot( // count(DISTINCT id) of: Reminder.select { $0.id.count(distinct: true) }, as: .sql ) }

NULL 20:45

And of course we would also want something similar for the avg aggregate function, and even be able to use them all together: @Test func selectAvg() { assertInlineSnapshot( // avg(priority) of: Reminder.select { $0.priority.avg() }, as: .sql ) }

NULL 21:19

We now have all the pieces in place to make these queries possible. We can extend the Column type to add a count and avg function: struct Column: QueryExpression { … func count(distinct: Bool = false) -> some QueryExpression { } func avg() -> some QueryExpression { } }

NULL 21:56

Each of these functions will return a concrete conformance to QueryExpression that represents the count and avg functions. We can define all new types for this and implement their queryString s in the appropriate way: struct CountFunction: QueryExpression { let isDistinct: Bool let column: Column var queryString: String { "count(\(isDistinct ? "DISTINCT " : "")\(column.queryString))" } } struct AvgFunction: QueryExpression { let column: Column var queryString: String { "avg(\(column.queryString))" } }

NULL 23:07

Then we can return these types from our functions: func count(distinct: Bool = false) -> some QueryExpression { CountFunction(isDistinct: distinct, column: self) } func avg() -> some QueryExpression { AvgFunction(column: self) }

NULL 23:28

And that is truly all it takes to get our theoretical syntax building, and the tests even pass once recorded.

NULL 23:54

And with just a little more work we could support a lot more SQLite functions. For example, group_concat is simply a matter of adding a method to Column : extension Column { func groupConcat() -> some QueryExpression { GroupConcatFunction(column: self) } }

NULL 24:38

…that then calls out to the dedicated GroupConcatFunction type that conforms to QueryExpression : struct GroupConcatFunction: QueryExpression { let column: Column var queryString: String { "group_concat(\(column.queryString))" } }

NULL 24:46

And we can test it: @Test func selectGroupConcat() { assertSnapshot( of: Reminder.select { $0.title.groupConcat() }, as: .sql ) { """ SELECT group_concat(title) FROM reminders """ } }

NULL 24:57

The group_concat function has an optional separator that can get passed to it, so we could add an argument that will let us customize this. struct GroupConcatFunction: QueryExpression { let separator: String? let column: Column var queryString: String { return """ group_concat(\ \(column.queryString)\\(separator.map { ", '\($0)'" } ?? "")\ ) """ } }

NULL 25:43

Now we shouldn’t be interpolating SQL directly with an input like this, as it allows for what is called SQL injection attacks, but we will address that in the future.

NULL 25:50

We just need to thread the separator through our helper: extension Column { func groupConcat(separator: String? = nil) -> some QueryExpression { GroupConcatFunction(separator: separator, column: self) } }

NULL 26:05

And we can get some test coverage. @Test func selectGroupConcatWithSeparator() { assertSnapshot( of: Reminder.select { $0.title.groupConcat(separator: " - ") }, as: .sql ) { """ SELECT group_concat(title, ' - ') FROM reminders """ } }

NULL 26:12

With all these helpers written we can finally test that more advanced query that we sketched out earlier. @Test func selectAdvanced() { assertSnapshot( of: Reminder.select { ( $0.id.count(distinct: true), $0.title.groupConcat(), $0.priority.avg() ) }, as: .sql ) { """ SELECT count(DISTINCT id), group_concat(title), avg(priority) FROM reminders """ } }

NULL 26:50

And that is pretty incredible. Already this query is more advanced than most query builder libraries out there. And the key to this power is that we did not become overly fixated on a key path syntax for selecting columns. While that syntax does seem nice, it prevents you from constructing more powerful queries, such as using aggregate functions, and soon we will see even more power. Next time: ordering

NULL 27:10

We now have the basics of a query builder library implemented, and it already is more powerful than probably any Swift SQL library out there. It is capable of expressing complex selections from a table, including columns, aggregates, and even subqueries. And best of all, we are able to leverage some of Swift’s most advanced features, such as key paths, macros and variadic generics, to provide a lot of type safety and static checks that give us a lot of confidence we are writing valid SQL statements.

NULL 27:35

But there is still a lot more functionality in a SQL

WHERE 27:59

But for right now we are going to concentrate on ordering. Once you have performed a select you will typically have a big list of rows, and you often want to order those rows in particular manner. SQL has great support for sorting, allowing you to sort by any number of fields in any direction, and you can even perform computations in the ordering, such as sorting a text column by ignoring the case of characters.

WHERE 28:30

Let’s first see what SQL has to offer when it comes to ordering, and then see how we can recreate all of it in our query builder…next time! References SQLite The SQLite home page https://www.sqlite.org Downloads Sample code 0317-sql-building-pt4 Point-Free A hub for advanced Swift programming. Brought to you by Brandon Williams and Stephen Celis . Content Become a member The Point-Free Way Beta previews Gifts Videos Collections Free clips Blog More About Us Community Slack Mastodon Twitter BlueSky GitHub Contact Us Privacy Policy © 2026 Point-Free, Inc. All rights are reserved for the videos and transcripts on this site. All other content is licensed under CC BY-NC-SA 4.0 , and the underlying source code to run this site is licensed under the MIT License .