Video #190: Concurrency's Past: Threads
Episode: Video #190 Date: May 23, 2022 Access: Members Only 🔒 URL: https://www.pointfree.co/episodes/ep190-concurrency-s-past-threads

Description
To better understand Swift’s concurrency tools, let’s first look to the past, starting with threads. Threads are a tool most developers don’t reach for these days, but they are important to understand, and the ways they solve problems reverberate even in today’s tools.
Video
Cloudflare Stream video ID: 94a37748e77365774d94c8c5ff446c7b Local file: video_190_concurrency-s-past-threads.mp4 *(download with --video 190)*
References
- Discussions
- Threading Programming Guide
- Introducing Swift Atomics
- 0190-concurrency-pt1
- Brandon Williams
- Stephen Celis
Transcript
— 0:05
Just a little over 9 months ago, Swift 5.5 was released and with it came an all new suite of tools for tackling concurrency. We very briefly touched upon some of these tools in past episodes, like when we explored the new .refreshable view modifier in SwiftUI for adding pull-to-refresh to any view.
— 0:22
That exploration was interesting, but also a little superficial. We are ready to dive much deeper into Swift concurrency, and the main reason we want to do this now is because we are also ready to start more deeply integrating Swift’s concurrency tools into the Composable Architecture. In order to do that in the best way possible we are going to need to understand a number of concurrency topics, many of which are quite subtle and tricky.
— 0:46
In this series of episodes we want to provide an introduction to most of the new tools that Swift has given us for modeling concurrency. The way we are going to do this is through a lens focused on the past. Apple platforms have a rich history of concurrency tools, from threads to dispatch queues, and most recently, Combine publishers. In order to understand why Swift’s modern concurrency APIs were designed the way they are, and what problems they try to fix, we must know how we got to this point. What did the previous tools excel at, and what were their faults? This will make it easier for us to learn how to use Swift’s new tools, understand why they sometimes don’t work the way we expect, and overall we will be more proficient at using the new APIs.

Thread basics
— 1:28
The most obvious place to start when thinking about concurrency is threads. However, if you got into Apple platform development after 2009, the year OS X Snow Leopard was released (iOS 4 followed in 2010), then you may have never written a single line of code that explicitly deals with them. That’s because in 2007 operation queues were introduced as an abstraction over threads, and then in 2009 Grand Central Dispatch was introduced, which made it even easier to write asynchronous code without thinking about threads. We’ll talk about operation queues and GCD later, but in order to appreciate how big of a step forward GCD was we need to see what life was like before it.
— 2:09
You might think that threads would be the simplest form of concurrency since it most closely matches what our computers are doing. After all, our computers have a certain number of cores, say 8, and so perhaps we have 8 threads that we can schedule work on and those threads can perform their work concurrently.
— 2:26
Well, already that is much too naive to be usable. In reality, the operating system manages dozens, or even hundreds, of “logical” threads that are mapped to the physical cores of the device. This allows the OS to hand out threads to anyone that wants one, and then it will take care of letting those threads perform work on the CPU, sometimes even allowing many threads to interleave their executions in short spurts of time. This is a complex process that we don’t have a lot of insight into, but concurrency APIs often expose various tools for tuning the priority of work and preventing race conditions when accessing data from multiple threads.
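We can peek at one side of that mapping ourselves. This is a small sketch using the real ProcessInfo API from Foundation; the exact count it prints will of course vary from machine to machine:

```swift
import Foundation

// The OS multiplexes many logical threads onto a small, fixed number of
// physical cores. ProcessInfo reports how many cores are available to us.
let cores = ProcessInfo.processInfo.activeProcessorCount
print("This machine can run \(cores) threads truly in parallel.")
```

Everything beyond that count is the operating system interleaving threads, not genuine parallelism.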
— 3:05
Let’s quickly see what it looks like to do concurrent programming in Swift by working directly with threads. To explore this we are going to start in a brand new SPM project for an executable so that we have a simple entry point to play around with APIs.
— 3:20
And let’s make sure the executable runs in release mode because we will be doing some benchmarking later in the episode, and you always want to benchmark with all compiler optimizations turned on.
— 3:30
There are a few ways you can create a thread on Apple platforms. First of all, it all lives in the Foundation framework, so let’s import that:

```swift
import Foundation
```
— 3:39
Then the easiest way to fire up a new thread to perform work on is to use the detachNewThread class method on Thread:

```swift
Thread.detachNewThread {
}
```
— 3:52
You heard that right. Thread is a class, and in fact it’s even meant to be subclassed. 😲
— 3:55
This is a pretty substantial departure from how later concurrency APIs are designed. Grand Central Dispatch does have some classes in its API, but they are not meant to be subclassed, and Swift’s new concurrency APIs are largely modeled on value types with a few exceptions, including a whole new kind of reference type for data isolation.
— 4:13
The Thread class is an abstraction layer on top of something known as “POSIX threads”, or “pthreads”. This is the parallel execution model used by iOS and macOS, as well as many other platforms. You can even create pthreads directly in Swift by calling out to C functions, but that can be a real pain to do and so that’s why Apple provides this Thread class.
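For the curious, here is roughly what that pain looks like. This is a macOS-only sketch (the imported pthread signatures differ slightly on Linux), and the start routine is a C function, so it cannot capture any Swift context:

```swift
import Foundation

// Creating a raw POSIX thread from Swift on Darwin. Thread's
// detachNewThread is a much friendlier wrapper around this dance.
var t: pthread_t?
pthread_create(
  &t,   // receives the new thread's handle
  nil,  // default thread attributes
  { _ in
    print("Hello from a pthread")
    return nil
  },
  nil   // argument passed to the start routine
)
Thread.sleep(forTimeInterval: 0.1)  // give the pthread a moment to run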
— 4:34
Now that we have our new thread we can start to perform some serious work in it. We don’t have anything important to compute at the moment, so to simulate some serious work we can just sleep the thread for a bit and then print the details on the thread:

```swift
Thread.detachNewThread {
  Thread.sleep(forTimeInterval: 1)
  print(Thread.current)
}
```

```
Program ended with exit code: 0
```
— 4:54
Running this doesn’t print anything because detaching the thread doesn’t hold up the current thread, and so we breeze right by that work and the executable finishes:

```swift
Thread.detachNewThread {
  Thread.sleep(forTimeInterval: 1)
  print(Thread.current)
}
print("!")
```

```
!
Program ended with exit code: 0
```
— 5:15
We can keep the executable running for a little more time by making the main thread sleep for a little longer:

```swift
import Foundation

Thread.detachNewThread {
  Thread.sleep(forTimeInterval: 1)
  print(Thread.current)
}
Thread.sleep(forTimeInterval: 1.1)
print("!")
```
— 5:26
Now when we run this we get information about the thread printed to the console:

```
<NSThread: 0x106404d00>{number = 2, name = (null)}
!
Program ended with exit code: 0
```
— 5:34
This is definitely a non-main thread we have spun up, since the main thread has the following description:

```swift
print(Thread.current)
Thread.sleep(forTimeInterval: 1.1)
```

```
<_NSMainThread: 0x10600aa90>{number = 1, name = main}
```
— 5:51
Note that Thread.current is some kind of magic global variable: its value changes depending on which thread you access it from. When printed from inside Thread.detachNewThread it logs something different than when printed from the global file scope. This is strange, but it’s something we are going to see over and over in each concurrency model.
— 6:11
We can get a little more insight into what is happening if we put a breakpoint after the thread is detached.
— 6:16
Now if we run and look at the debug navigator we will see we have 2 threads active:

```
Thread 1  Queue : com.apple.main-thread (serial)
#0 0x0000000100003ac0 in main at main.swift:8
#1 0x00000001000150f4 in start ()

Thread 2
#0 0x000000018946cebc in __semwait_signal ()
#1 0x0000000189377d88 in nanosleep ()
#2 0x000000018a4d623c in +[NSThread sleepForTimeInterval:] ()
#3 0x0000000100003b18 in closure #1 in at main.swift:4
#4 0x0000000100003ba8 in thunk for @escaping @callee_guaranteed () -> () ()
#5 0x000000018a61bdbc in __NSThread__block_start__ ()
#6 0x00000001001c1b40 in _pthread_start ()
```
— 6:24
One is the main thread, and the other is the thread we just created. We can even see from the stack trace that it is in the middle of sleep right now.
— 6:33
Already we can get our first glimpse into the complexities of working with concurrent code. Suppose we detach a whole bunch of threads, printing a unique number from each one:

```swift
Thread.detachNewThread {
  print("1", Thread.current)
}
Thread.detachNewThread {
  print("2", Thread.current)
}
Thread.detachNewThread {
  print("3", Thread.current)
}
Thread.detachNewThread {
  print("4", Thread.current)
}
Thread.detachNewThread {
  print("5", Thread.current)
}
```
— 6:55
What do we expect to be printed if we run this? Hopefully 1, 2, 3, 4, 5.
— 7:02
Yet when we run we get:

```
1 <NSThread: 0x100710880>{number = 2, name = (null)}
3 <NSThread: 0x101231e90>{number = 3, name = (null)}
5 <NSThread: 0x10150d1d0>{number = 4, name = (null)}
2 <NSThread: 0x10112a9e0>{number = 5, name = (null)}
4 <NSThread: 0x101606470>{number = 6, name = (null)}
```
— 7:06
And if we run again we get:

```
1 <NSThread: 0x106214f20>{number = 2, name = (null)}
3 <NSThread: 0x1007a6910>{number = 3, name = (null)}
2 <NSThread: 0x1007a6cf0>{number = 4, name = (null)}
4 <NSThread: 0x1063040e0>{number = 6, name = (null)}
5 <NSThread: 0x1007a6e60>{number = 5, name = (null)}
```
— 7:13
It turns out that threads are not guaranteed to be started in the order they are created. The process in which the operating system hands out execution time to threads is quite complex, and so we should try our hardest to not make any assumptions about the order in which things are executed. If we really do need one thread to be run immediately after another thread, we need to implement additional logic to coordinate that, and that’s something we will look more deeply into later.
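As a tiny illustration of what that extra coordination can look like (a sketch, not a recommended pattern), we can guarantee that "2" prints after "1" by only starting the second thread from inside the first one:

```swift
import Foundation

// Force an ordering by chaining: the second thread is only started
// once the first thread's work is complete.
let second = Thread {
  print("2", Thread.current)
}
let first = Thread {
  print("1", Thread.current)
  second.start()  // "2" cannot print before "1" has
}
first.start()

Thread.sleep(forTimeInterval: 0.5)  // keep the executable alive
```

This works, but it serializes the work entirely, which defeats the point of concurrency; better coordination tools are coming later.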
— 7:36
So, we are getting the first glimpse at how subtle multithreaded programming can be, but so far it’s quite straightforward. We can use Thread.detachNewThread to spin up a new thread, which is a context we can perform work on without holding up other threads. We can create as many of these threads as we want, theoretically. Our machines have a finite number of cores, but the operating system will do the tough work of figuring out which threads it allows to run and which it pauses. It will even allow many threads to interleave their computations so that one runs for a little while before being paused so that another can run, and then that one will be paused and the first will be resumed, and on and on.

Priority and cancellation
— 8:13
The Thread class also provides some really handy tools that make working with threads more versatile. For example, currently we fire off a thread with the detachNewThread function, and it just goes off and does its own thing, completely untethered to the rest of the application.
— 8:30
It’s possible to get a handle to the thread so that we can inspect and modify it. To do this we use an initializer on Thread that takes a block for executing its work:

```swift
let thread = Thread {
  Thread.sleep(forTimeInterval: 1)
  print(Thread.current)
}
```
— 9:17
This is a lazy operation. If we run the application we will see that the thread’s info is not printed like it was before. The thread does not start automatically, but rather it must be started explicitly:

```swift
thread.start()
```
— 9:32
Now it works as it did before.
— 9:40
Now that we have a handle to the thread we can do a few interesting things with it. For one, we can set its priority:

```swift
thread.threadPriority = 0.75
```
— 9:47
Priority is measured as a double between 0 and 1. This will signal to the operating system that this thread is low or high priority, and perhaps the OS will give the thread a little bit less or more time to execute relative to other threads, though there are no guarantees.
— 10:09
We can also use this thread handle to cancel the thread at any point in its execution. You do this by invoking the cancel method:

```swift
thread.start()
thread.cancel()
```
— 10:13
If we run this we will see that the new thread’s info is no longer printed to the console.
— 10:20
Now this is a little misleading. This makes us believe that thread.cancel() has the ability to halt the execution of the thread. That isn’t actually what is happening. By invoking cancel immediately after start it seems that the Thread class is smart enough to not even invoke the closure we provided.
— 10:35
If we wait a very small amount of time after starting the thread and before cancelling it, just enough for the thread to start execution, we will find that the full body of the thread closure is executed:

```swift
thread.start()
Thread.sleep(forTimeInterval: 0.01)
thread.cancel()
```
— 10:48
The thread’s info was printed to the console after a 1 second wait even though we had canceled the thread.
— 10:55
It turns out that thread cancellation does not work by simply halting the execution of the closure’s body. If it did there could be numerous problems. What if at the beginning of the thread you had opened a file, network or database connection, and at the end of the thread it was closed? If cancellation simply halted execution in the middle of the thread then you run the risk of opening a connection that isn’t properly closed. And that’s only scratching the surface. There are a lot of bad states we could be left in if we allow threads to be simply killed mid-execution.
— 11:25
This is why thread cancellation is handled in a more “cooperative” manner. By invoking the cancel() method on the thread we are requesting that the thread be cancelled, which flips a little piece of boolean state in the thread, and then you can use that flag in order to short circuit logic in your thread’s work.
— 11:35
So, in our thread’s body, after we perform the sleep we could check if the thread had been canceled so that we can short circuit any of the later work:

```swift
let thread = Thread {
  Thread.sleep(forTimeInterval: 1)
  guard !Thread.current.isCancelled else {
    print("Cancelled")
    return
  }
  print(Thread.current)
}
```
— 11:52
Now when we run the executable we will see it does not print the thread’s information to the console.
— 11:58
So this is how cooperative cancellation works with threads. If we have a bunch of intense work we need to perform on a thread, it would be a good idea to sprinkle in some isCancelled checks so that you can free up the thread as soon as possible. For example, if this thread represented a single unit of work in a larger web crawler application, where it first makes a network request to load a webpage’s data, then parses the data into some structured data type, and then extracts out certain pieces of information from the webpage to store in some database or index, it would be a good idea to check the isCancelled flag between each of those steps so that you don’t do additional work that is not necessary.
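That crawler idea can be sketched like so. Everything here is hypothetical (loadWebpage, parse, and index are stand-in functions, not real APIs); the point is just where the isCancelled checks go:

```swift
import Foundation

// Hypothetical stand-ins for the real crawler steps.
func loadWebpage(_ url: URL) -> Data { Data("<html></html>".utf8) }
func parse(_ data: Data) -> String { String(decoding: data, as: UTF8.self) }
func index(_ document: String) { /* write to a search index */ }

let crawler = Thread {
  let url = URL(string: "https://www.pointfree.co")!

  let data = loadWebpage(url)
  // Check for cancellation between each expensive step so a cancelled
  // crawl frees up the thread as soon as possible.
  guard !Thread.current.isCancelled else { return }

  let document = parse(data)
  guard !Thread.current.isCancelled else { return }

  index(document)
}
crawler.start()
```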
— 12:35
Note that even though we can participate in cooperative cancellation, the cancellation mechanism isn’t as deeply ingrained into the concurrency tools as we would hope. For example, the Thread.sleep method does not participate in cooperative cancellation, which means it waits the full time even if cancelled.
— 12:53
To see this let’s print the amount of time the sleep lasted:

```swift
let thread = Thread {
  let start = Date()
  defer { print("Finished in", Date().timeIntervalSince(start)) }
  Thread.sleep(forTimeInterval: 1)
  guard !Thread.current.isCancelled else {
    print("Cancelled")
    return
  }
  print(Thread.current)
}
```

```
Finished in 1.0012229681015015
```
— 13:02
Running this we will see that the thread slept for the full second we told it to. Ideally it might return control flow back to our closure as soon as it detected the thread was cancelled. So it seems that Thread.sleep doesn’t do any kind of polling or checking of the isCancelled flag, which is a bummer. We’re not sure why that is, but it may have been to avoid the overhead of any kind of polling on the CPU.
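If we wanted sleeping to react to cancellation, we could approximate it ourselves by sleeping in small slices and polling the flag between them. This cancellableSleep helper is our own sketch, not a Foundation API:

```swift
import Foundation

// Sleep for `interval` seconds, but wake up early if the current
// thread is cancelled, by polling `isCancelled` between short naps.
func cancellableSleep(
  forTimeInterval interval: TimeInterval,
  pollingEvery tick: TimeInterval = 0.01
) {
  let deadline = Date().addingTimeInterval(interval)
  while Date() < deadline {
    if Thread.current.isCancelled { return }
    Thread.sleep(forTimeInterval: min(tick, deadline.timeIntervalSinceNow))
  }
}
```

The trade-off is exactly the polling overhead we speculated about: the thread briefly wakes every hundredth of a second just to check a boolean.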
— 13:22
So priority and cancellation are two features of threads that show a glimpse of “cooperation.” You can tell a thread it has low priority so that other threads get more of a chance to execute their work on the CPU, and you can short-circuit your thread’s work when cancellation is detected, thus freeing up resources more quickly.

Thread dictionaries
— 13:40
There’s another interesting piece of functionality that Foundation’s Thread class has. Every thread has access to a seemingly global, ubiquitous piece of state that can be accessed from anywhere called a threadDictionary. That may sound scary, but it’s actually incredibly useful for passing values deep into a system without needing to explicitly pass an argument to every function, method, initializer and so on. Further, these dictionaries are isolated to the particular thread, and so are guaranteed to be safe to mutate even though they are seemingly global.
— 14:12
There are many uses of this concept, and we’ll actually have episodes in the near future applying a similar concept to both reducers in the Composable Architecture as well as parsers, but there’s another really common use case, and it involves server-side applications.
— 14:26
A server-side application can essentially be thought of as a function that accepts an incoming URL request and produces an outgoing URL response. The request consists of things like path, method, query params, headers, and body data, and the response consists of things like status code, headers and body data.
— 14:47
Now requests can come in at a very fast rate, perhaps even hundreds or thousands of requests per second if you have a very popular site. In order to service such a high number of requests one might reach for threads.
— 15:14
Perhaps each time a request comes in we detach a new thread so that we can do all the work necessary for that request, which may include making database requests, network requests, making an intense computation, and more:

```swift
func response(for request: URLRequest) -> HTTPURLResponse {
  // do the work to turn request into a response
  return .init()
}

Thread {
  response(for: .init(url: .init(string: "https://www.pointfree.co")!))
}
```
— 15:52
We don’t recommend literally implementing a web server in this manner, but this simplistic example will demonstrate the need for thread dictionaries.
— 16:00
In this situation, one often needs to associate some information to the request so that it is available throughout the entire lifecycle of the request. This data is unique to the request, and it should be easily accessible from any part of the code that is run in order to produce the server response.
— 16:18
For example, it is very common to sprinkle copious amounts of logging when implementing the logic for a website. At the very least you would log when a request starts and finishes, as well as how much time the request took. But you may also log every single network or database request that is made during the application’s lifecycle, as well as how much time those tasks took. You may also want to log a variety of events that are important to you, such as when a non-admin tries to access an admin page, when a network or database request fails, or when a piece of your application fails in a way that you don’t expect.
— 16:50
All of these logs are very important for you to understand the health of your website and debug any problems that occur. However, if you sprinkle in logs naively, you will end up with a pile of data that is difficult to make sense of.
— 17:02
For example, we can “tail” the logs for our pointfree.co website, which means we get a live view into every single log that is printed by our server code:

```
$ heroku logs --tail -a pointfreeco
```
— 17:17
This immediately starts to print out a stream of logs that are printed from our server code.
— 17:23
We can see requests of people visiting episode pages and blog pages, requests for RSS feeds, and more. This is a lot of information at our disposal. One thing we can do to filter the logs is pipe the result into grep in order to only show the logs for which a pattern matches, like if we wanted just the logs for requests to an episode page:

```
$ heroku logs --tail -a pointfreeco | grep "/episodes/"
```
— 17:42
So that’s nice, but still a bit of a jumbled mess.
— 17:46
What if we noticed in our logs that we logged an error that happened in a critical part of our application, like say subscribing. We would love to be able to see all the logs for that particular request that led up to that error. It might show what network requests or database requests were made, and overall just provide more data for us to figure out what went wrong.
— 18:09
This is what motivates the concept of a “request id”, which is a uniquely generated ID that is associated with each request that comes into the server. Then, every log made during that request is automatically prepended with the id. This means if we find a log showing a problem we can simply search our logs for that request id to find every log that led up to that problem.
— 18:34
So, we need this request id to be available throughout our entire server stack. We could of course pass a requestId: UUID to every single function, method, initializer, etc. in the entire application, but that would be very arduous. Also what if there’s more information we want to pass along, such as the start time of the request, or a logger? Will we need to update every single function, method, initializer to take these extra pieces of data anytime we need more ubiquitous state at our disposal?
— 19:07
This task is exactly what threadDictionary was made for. We can set a value in the dictionary when the request is received and the thread is created, and then it will be immediately available to all of our server code, and most importantly it will be completely siloed to just the thread we are working in. This means we could have many threads running in parallel that are working on separate requests, and each one is given its own thread dictionary with its own request id.
— 19:41
So, let’s try it out. Before starting the thread we will set the requestId on its thread dictionary so that it is available to all the code that runs for this request:

```swift
thread.threadDictionary["requestId"] = UUID()
thread.start()
```
— 20:02
And now we can access the requestId from anywhere in our application by just reaching into the current thread’s dictionary. For example, we could print the before and after of making a database and network request, which we will simulate right now as simple thread sleeps:

```swift
func response(for request: URLRequest) -> HTTPURLResponse {
  let requestId = Thread.current.threadDictionary["requestId"] as! UUID
  let start = Date()
  defer { print(requestId, "Finished in", Date().timeIntervalSince(start)) }

  print(requestId, "Making database query")
  Thread.sleep(forTimeInterval: 0.5)
  print(requestId, "Finished database query")

  print(requestId, "Making network request")
  Thread.sleep(forTimeInterval: 0.5)
  print(requestId, "Finished network request")

  return .init()
}
```

```
F26EFC5A-5AF0-4327-B164-C95D325AA698 Making database query
F26EFC5A-5AF0-4327-B164-C95D325AA698 Finished database query
F26EFC5A-5AF0-4327-B164-C95D325AA698 Making network request
F26EFC5A-5AF0-4327-B164-C95D325AA698 Finished network request
F26EFC5A-5AF0-4327-B164-C95D325AA698 Finished in 1.0103570222854614
```
— 22:18
And remember that the Thread.current.threadDictionary can be accessed from anywhere. Even if we call out to a function, which calls out to a method, which constructs some object, and then calls a method on that object, no matter how deep our code gets we can always pluck out the requestId from the current thread’s dictionary. In some sense it’s global but it’s siloed to just a single thread.
— 22:31
For example, we can refactor our response into some helpers that individually perform the work of running a database query and network request:

```swift
func makeDatabaseQuery() {
  let requestId = Thread.current.threadDictionary["requestId"] as! UUID
  print(requestId, "Making database query")
  Thread.sleep(forTimeInterval: 0.5)
  print(requestId, "Finished database query")
}

func makeNetworkRequest() {
  let requestId = Thread.current.threadDictionary["requestId"] as! UUID
  print(requestId, "Making network request")
  Thread.sleep(forTimeInterval: 0.5)
  print(requestId, "Finished network request")
}

func response(for request: URLRequest) -> HTTPURLResponse {
  let start = Date()
  let requestId = Thread.current.threadDictionary["requestId"] as! UUID
  defer { print(requestId, "Finished in", Date().timeIntervalSince(start)) }

  makeDatabaseQuery()
  makeNetworkRequest()

  return .init()
}
```
— 23:00
And when we run things, we’ll see that the request ids are constant across each helper.
— 23:08
So, this is looking pretty great. We’ve now seen that threads are a basic unit of concurrency on Apple platforms, that there are many ways of creating threads, and that you can customize thread behavior in a variety of ways, such as priority, cancellation and thread dictionaries.

Problems: coordination
— 23:22
However, there are some serious problems lurking in the shadows of our code.
— 23:26
Most obviously the thread’s storage is modeled as a dictionary which means we lose a lot of static guarantees and type safety that the compiler could be supplying. For example, we could accidentally mistype the name of the key or we could change the id from a UUID to some other type, both of which would be bugs but the compiler would not be able to help us.
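One small mitigation (our own sketch, not an API) is to funnel all access through a single typed helper, so the key string and the UUID cast live in exactly one place instead of being repeated at every call site:

```swift
import Foundation

// A single point of access for the request id, so the "requestId" key
// and the UUID cast can never drift apart across call sites.
enum CurrentRequest {
  static var id: UUID? {
    get { Thread.current.threadDictionary["requestId"] as? UUID }
    set { Thread.current.threadDictionary["requestId"] = newValue }
  }
}

CurrentRequest.id = UUID()
print(CurrentRequest.id as Any)
```

This doesn't make the dictionary itself type safe, but it shrinks the surface area where a typo or type change can silently become a bug.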
— 23:50
Next, there’s a very subtle problem with how we are using thread dictionaries here. Although it seems to be a great tool for passing data implicitly into deep parts of the application, it is unfortunately tied to a single thread. That may seem obvious, it’s a thread dictionary after all, but it is often the case we will want to spin up a new thread from an existing thread, and unfortunately the thread dictionary does not come along for the ride. So let’s see why that may be a problem.
— 24:21
For example, suppose that in our hypothetical server function above we wanted to perform the database query and network request in parallel and then join their results together once they both finish.
— 24:50
We might hope we could simply spin up two new threads and do all the work inside them:

```swift
func response(for request: URLRequest) -> HTTPURLResponse {
  let start = Date()
  let requestId = Thread.current.threadDictionary["requestId"] as! UUID

  let databaseQueryThread = Thread {
    makeDatabaseQuery()
  }
  databaseQueryThread.start()
  let networkRequestThread = Thread {
    makeNetworkRequest()
  }
  networkRequestThread.start()

  // TODO: join threads somehow

  print(requestId, "Completed in", Date().timeIntervalSince(start))
  return .init()
}
```
— 25:12
However, this crashes because the inner threads do not have access to the thread dictionary from the outer threads.
— 25:23
Instead we have to explicitly copy the dictionary over:

```swift
func response(for request: URLRequest) -> HTTPURLResponse {
  let start = Date()
  let requestId = Thread.current.threadDictionary["requestId"] as! UUID

  let databaseQueryThread = Thread {
    makeDatabaseQuery()
  }
  databaseQueryThread.threadDictionary.addEntries(
    from: Thread.current.threadDictionary as! [AnyHashable: Any]
  )
  databaseQueryThread.start()
  let networkRequestThread = Thread {
    makeNetworkRequest()
  }
  networkRequestThread.threadDictionary.addEntries(
    from: Thread.current.threadDictionary as! [AnyHashable: Any]
  )
  networkRequestThread.start()

  // TODO: join threads somehow

  print(requestId, "Completed in", Date().timeIntervalSince(start))
  return .init()
}
```
— 25:53
The reason for this behavior is that threads do not have the concept of “child” threads. That is, creating a new thread leads to a new isolated thread without inheriting anything from the thread from which it was created, such as priority, thread dictionary, etc.
— 26:09
We saw the same with cancellation, and we will later see that a parent-child relationship between asynchronous units of work can be incredibly powerful.
— 26:18
But this drawback is only the tip of the iceberg.
— 26:23
Let’s first consider this todo we have at the end of our response thread:

```swift
// TODO: join threads somehow
```
— 26:32
Presumably we want to wait for the database and network requests to finish, and then do something with the data that was obtained. However, right now we are firing off a new thread for each of those asynchronous pieces of work, and the outer thread just hums along without stopping. We need to somehow make the outer thread wait until the two other threads finish.
— 26:51
The Thread class does not provide nice tools to do this, and the only way we know how is to run a while loop with some short sleeps so that we can poll to see if the threads have finished:

```swift
func response(for request: URLRequest) -> HTTPURLResponse {
  let start = Date()
  let requestId = Thread.current.threadDictionary["requestId"] as! UUID

  let databaseQueryThread = Thread {
    makeDatabaseQuery()
  }
  databaseQueryThread.threadDictionary.addEntries(
    from: Thread.current.threadDictionary as! [AnyHashable: Any]
  )
  databaseQueryThread.start()
  let networkRequestThread = Thread {
    makeNetworkRequest()
  }
  networkRequestThread.threadDictionary.addEntries(
    from: Thread.current.threadDictionary as! [AnyHashable: Any]
  )
  networkRequestThread.start()

  while !databaseQueryThread.isFinished || !networkRequestThread.isFinished {
    Thread.sleep(forTimeInterval: 0.1)
  }

  print(requestId, "Completed in", Date().timeIntervalSince(start))
  return .init()
}
```
— 27:30
This is a huge bummer.
— 27:33
This code is, to put it nicely, indirect. We have to contort ourselves in some pretty strange ways to coordinate these two pieces of asynchronous work, and it looks nothing like how we would write this code if everything was synchronous. Over time we may get used to this style, but it will still never beat being able to just execute some functions, one after the other, and hiding all of these thread and coordination details.
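For completeness, Foundation does ship one lower-level tool that can block without polling: NSCondition. Here is a sketch of waiting on two units of work with it (GCD and Swift concurrency will give us much nicer versions of this later):

```swift
import Foundation

// Block the current thread until both worker threads check in, using a
// condition variable instead of a sleep-and-poll loop.
let condition = NSCondition()
var remainingWork = 2

for _ in 1...2 {
  Thread.detachNewThread {
    Thread.sleep(forTimeInterval: 0.1)  // simulate work
    condition.lock()
    remainingWork -= 1
    condition.signal()                  // wake the waiting thread
    condition.unlock()
  }
}

condition.lock()
while remainingWork > 0 { condition.wait() }
condition.unlock()
print("Both threads finished")
```

This avoids the wasted wake-ups of polling, but it is still a lot of manual ceremony for something as basic as "wait for these two things."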
— 27:53
Thread dictionaries are not the only thing that doesn’t trickle down to threads spawned from other threads. Neither does cancellation:

```swift
let thread = Thread {
  let start = Date()
  Thread.sleep(forTimeInterval: 1)
  print("Slept for", Date().timeIntervalSince(start))
  Thread.detachNewThread {
    print("Inner thread isCancelled", Thread.current.isCancelled)
  }
  guard !Thread.current.isCancelled else {
    print("Cancelled")
    return
  }
  print(Thread.current)
}
```
— 28:36
The inner thread does not get its isCancelled boolean flipped to true because it’s a whole new thread.
— 28:42
This may not be too surprising, after all it is a whole new thread, but practically speaking it can be really handy for cancellation to trickle down. We might have a thread doing a unit of work where one step could be split into two sub-steps that each could run in parallel on their own threads. It would be really nice if cancelling the outer thread could also cancel the two child threads created, but sadly it does not.

Problems: expensiveness
— 29:10
But things get even worse. Suppose we had a huge number of concurrent units of work that we wanted to run. For example, we could be building a web crawler that needs to load thousands of web pages so that we can index their content.
— 29:33
We can simulate this by detaching a whole bunch of new threads and performing some CPU-intensive work inside, like say an infinite loop:

```swift
let workCount = 1_000
for n in 0..<workCount {
  Thread.detachNewThread {
    print(n, Thread.current)
    // TODO: do serious work to load and index a webpage
    while true {}
  }
}
Thread.sleep(forTimeInterval: 3)
```
— 30:32
If we run the executable now we will see 1,000 thread descriptions printed to the console. This means that literally 1,000 threads were created from running this code. In fact, if we put a breakpoint after the for loop we will clearly see in the stack trace that there are 1,000 threads, each of which is in the middle of a while loop:

```
Thread 2
#0 0x0000000100004afc in closure #1 in thread() at main.swift
#1 0x0000000100004b38 in thunk for @escaping @callee_guaranteed () -> () ()
#2 0x000000019a4e7dbc in __NSThread__block_start__ ()
#3 0x00000001001c9b40 in _pthread_start ()

Thread 3
#0 0x0000000100004afc in closure #1 in thread() at main.swift
#1 0x0000000100004b38 in thunk for @escaping @callee_guaranteed () -> () ()
#2 0x000000019a4e7dbc in __NSThread__block_start__ ()
#3 0x00000001001c9b40 in _pthread_start ()

. . .

Thread 1001
#0 0x0000000100004b00 in closure #1 in thread() at main.swift
#1 0x0000000100004b38 in thunk for @escaping @callee_guaranteed () -> () ()
#2 0x000000019a4e7dbc in __NSThread__block_start__ ()
#3 0x00000001001c9b40 in _pthread_start ()
```
— 31:11
This is hugely problematic.
— 31:14
Threads are a heavyweight unit of computation. They represent a continuous stream of instructions that are run on a CPU core. By creating a thread we are telling the operating system that the work in the closure should be scheduled whenever possible and should even compete with the work being performed in other threads. Firing up 1,000 threads to load 1,000 webpages will create a huge amount of competition between threads, and the OS will struggle to give each thread a reasonable amount of time to do its work. Further, each thread is given its own call stack of frames, and on this machine the allowed stack size is half a megabyte:

po Thread.current.stackSize

524288
— 31:55
This means 1,000 threads could potentially take up to half a gig of memory just for the call stack.
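The arithmetic behind that claim can be checked directly. This is our own back-of-the-envelope sketch, assuming the 524,288-byte stack size reported above:

```swift
// Back-of-the-envelope check of the memory claim above, assuming the
// 524,288-byte per-thread stack size reported by Thread.current.stackSize.
let stackBytes = 524_288
let threadCount = 1_000
let totalMegabytes = stackBytes * threadCount / 1_024 / 1_024
print(totalMegabytes, "MB")  // 500 MB, i.e. roughly half a gig
```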
— 32:16
We could of course refactor our web crawler so that instead of spinning up 1,000 threads it spins up only 100 that each process 10 websites, but that is just moving complexity from the OS level to the application level. It would be far better if we didn’t have to think about these kinds of problems and could instead focus on the true problem at hand: loading 1,000 webpages.
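The refactoring just described can be sketched like this. This is our own illustration, not the episode's code, and the page-indexing work is left as a stub:

```swift
import Foundation

// A sketch of the workaround described above: detach 100 threads and
// have each one crawl a contiguous chunk of 10 pages. Note that the
// chunking logic is complexity we now carry at the application level.
let totalPages = 1_000
let crawlerThreads = 100
let pagesPerThread = totalPages / crawlerThreads  // 10 pages each

for t in 0..<crawlerThreads {
  Thread.detachNewThread {
    for page in (t * pagesPerThread)..<((t + 1) * pagesPerThread) {
      _ = page  // TODO: load and index webpage `page`
    }
  }
}
```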
— 32:38
And beyond the memory problem, creating this many threads, all competing for execution time on the CPU, is going to starve any other thread out there trying to do its own work. We can even see this in real, concrete terms.
— 32:49
Suppose that in addition to the 1,000 threads we created above, we also detached a thread to compute something intense, like the 50,000th prime number. We can do so in a naive way by first defining a function that is capable of checking if a number is prime:

func isPrime(_ p: Int) -> Bool {
  if p <= 1 { return false }
  if p <= 3 { return true }
  for i in 2...Int(sqrtf(Float(p))) {
    if p % i == 0 { return false }
  }
  return true
}
— 33:11
And then define a function that can find the nth prime, and will do so by printing it to the console along with benchmarking numbers:

func nthPrime(_ n: Int) {
  let start = Date()
  var primeCount = 0
  var prime = 2
  while primeCount < n {
    defer { prime += 1 }
    if isPrime(prime) {
      primeCount += 1
    }
  }
  print(
    "\(n)th prime", prime - 1,
    "time", Date().timeIntervalSince(start)
  )
}
— 33:19
And then we can detach a thread that computes the 50,000th prime:

Thread.detachNewThread {
  print("Starting prime thread")
  nthPrime(50_000)
}
— 33:30
Let’s first run this code without any competing threads to see how long it should take:

let workCount = 0

Starting prime thread
50000th prime 611953 time 0.025572625
— 33:47
So looks like it takes roughly 0.025 seconds to compute the 50,000th prime.
— 33:47
Now let’s crank up the thread count and see how long it takes:

let workCount = 1_000

Starting prime thread
50000th prime 611953 time 2.934731960296631
— 34:24
Wow, so over 100x slower now that we have a bunch of threads to compete with for time on the CPU.
— 34:34
The main problem we are seeing here is that threads are not meant for this kind of concurrency. You are not supposed to spin up an arbitrary number of them to perform work. The reason for this is that threads do not really allow for non-blocking work. You may have heard the term “non-blocking I/O” thrown around a lot when discussing concurrency, and what it refers to is the fact that often while performing asynchronous work there is a lot of downtime where the CPU doesn’t really need to do much work.
— 35:00
The simplest example of this is if we want to delay the execution of some work. We just want to wait for some time to pass, and then perform the work:

Thread.sleep(forTimeInterval: 5)
doWork()
— 35:11
The thread doesn’t need to actually do any work on the CPU during that waiting time. Another example is timers. If you want to run some code every 5 seconds, then the vast majority of time is spent just waiting around for time to pass:

while true {
  Thread.sleep(forTimeInterval: 5)
  // TODO: do work
}
— 35:45
Another common, but more subtle, example is when making a network request. We have to wait for the server to send us back some data:

let data = apiClient.request()
— 35:54
In fact, it can often take 300ms, 500ms, a full second or more from the moment we make the request to the moment we actually get a response back. During that time the CPU doesn’t have much to do.
— 36:10
Ideally, in all of these situations, we’d be able to give up our thread momentarily, which would allow other threads to do their work, such as our prime number computation. We should be able to do this while also being able to start execution on our thread again once we are ready, like say when we finally get a response back from the network.
— 36:33
However, that’s not really possible with threads. Because they map so closely to the concept of physical CPU cores they are intended to continuously run computation. They don’t really support the idea of being removed from consideration for execution temporarily. In fact, one of the worst things you can do with a thread is block it, such as when you use Thread.sleep. This keeps the thread alive, meaning it is taking up resources, but we aren’t doing anything useful with those resources.
— 37:00
In the case of wanting to simply wait for some time to pass it seems like it would be much smarter to somehow register a future date with the OS and ask the OS to notify us when that date is reached. Unfortunately that isn’t possible with only threads. You need more machinery at your disposal, like say a run loop that can continuously check if it’s time to execute scheduled work. If you did have that machinery at your disposal then it would be much more efficient than creating a thread and blocking it for a specified amount of time.
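That run-loop machinery can be sketched concretely. This is our own illustration, not the episode's code: rather than blocking a thread with Thread.sleep, we hand a Timer to the run loop and let the system wake us when the date arrives.

```swift
import Foundation

// Instead of blocking a thread with Thread.sleep, register a Timer
// with the current run loop and let the OS schedule the wake-up.
var didWork = false
let timer = Timer(timeInterval: 1, repeats: false) { _ in
  didWork = true
  print("time's up, doing work now")
}
RunLoop.current.add(timer, forMode: .default)

// run(until:) drives the run loop. The thread mostly idles here rather
// than burning CPU, and the timer's closure fires on schedule.
RunLoop.current.run(until: Date().addingTimeInterval(1.5))
```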
— 37:31
A popular way some people try to work around the thread explosion and starvation problems is to create what is known as “thread pools.” In this pattern a fixed number of threads are created and managed. Then code that wants a thread will not detach a new one, but rather will ask the pool for a thread. This allows the pool to have some smarts baked in to make sure too many threads are not created.
— 37:56
Theoretically this might look something like this:

let threadPool = ThreadPool(size: 10)
threadPool.requestThread { thread in
  Thread.sleep(forTimeInterval: 1)
  print(thread)
}
— 38:26
We create a pool of threads, which is probably just some class that manages a bunch of internal state and behavior. We ask the pool for a thread, and in order to do so we pass the pool a closure which will be executed once a thread is ready for us. This gives the pool the ability to determine when it is ready to hand off a thread to us. If it has one available it could give us one immediately. However, if they are all tied up we may need to wait until one is free before we can pass it along.
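A bare-bones version of that pool can be sketched. The ThreadPool and requestThread names come from the hypothetical API above; the implementation here is our own illustration, not a production pool: a fixed number of worker threads pull queued closures off a shared queue, sleeping on an NSCondition while the queue is empty.

```swift
import Foundation

// A minimal sketch of the hypothetical ThreadPool described above.
final class ThreadPool {
  private let condition = NSCondition()
  private var workQueue: [(Thread) -> Void] = []

  init(size: Int) {
    for _ in 0..<size {
      Thread.detachNewThread {
        // In this sketch the workers run for the life of the process.
        while true {
          self.condition.lock()
          // Wait until there is work; wait() releases the lock while
          // sleeping and reacquires it on wake.
          while self.workQueue.isEmpty { self.condition.wait() }
          let work = self.workQueue.removeFirst()
          self.condition.unlock()
          work(Thread.current)
        }
      }
    }
  }

  func requestThread(_ work: @escaping (Thread) -> Void) {
    condition.lock()
    workQueue.append(work)
    condition.signal()  // wake one sleeping worker
    condition.unlock()
  }
}

let pool = ThreadPool(size: 10)
pool.requestThread { thread in
  print(thread)
}
```

Note that if all 10 workers are busy, new work simply sits in the queue until one frees up, which is exactly the hand-off behavior described above.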
— 38:47
This can greatly help with thread starvation, where you have way too many threads competing to get work done on the CPU, but it does have some drawbacks. First off, thread pools don’t help at all with blocking work. We can still obtain a thread from the pool and then block it while we wait for something, like a timer or a network request. That means we are still going to prevent other work using the thread pool from getting time on the CPU, which isn’t great. Work not using the thread pool will have a better chance at getting scheduled since there are fewer threads to compete with, which is at least a small win.
— 39:21
Another drawback of a thread pool is that it is a local solution to a global problem. If we manage a thread pool for our application, then we can be sure we don’t accidentally spin up a bunch of threads, but our thread pool will not cooperate with any other code that we don’t control. Perhaps we have a 3rd party dependency that also manages a thread pool, or maybe some of Apple’s frameworks have thread pools, such as URLSession, Core Data, or who knows what else.
— 39:47
It would be possible for there to be dozens of thread pools out there completely independent of each other, and if each pool has 10 threads then we may still cumulatively have hundreds of threads out there.
— 39:57
Thread pools offer a glimpse into what is known as “cooperative concurrency.” It does not work out well for concurrency resources to be a Wild West land grab. If we act that way then we run the risk of starving the system and making it a worse experience for everyone. Instead, it can be better to have a layer of cooperativeness that allows resources to be shared. Unfortunately Foundation’s threading tools do not provide this for us out of the box; rather, it’s on us to create these tools, and because we cannot provide a global cooperative solution our tools aren’t going to be as good as they could be.

Problems: data races
— 40:33
Now, thread pools may half solve the problem of creating too many threads and starving resources, but they don’t help at all with the next huge problem we want to discuss, which is data races. So far we’ve had fun playing around with spinning up threads to see how the OS reacts down below, but we aren’t actually doing anything interesting in them.
— 40:54
As soon as we start doing real work we come face to face with race conditions. This is what happens when many threads try to access and mutate the same piece of data. Apple’s frameworks do provide tools for synchronizing access to data so that you can prevent problems, but the tools are not tightly integrated with the threading tools. You’re kind of left out in the cold when it comes to figuring out how to wield these tools properly.
— 41:15
Let’s quickly see how a data race can crop up in multithreaded code and see what tools we have to fix them.
— 41:38
Suppose we have a piece of mutable state that we want to read and mutate from multiple threads. We will package the state up in a class:

class Counter {
  var count = 0
}

let counter = Counter()
— 41:53
Then we will spin up a bunch of new threads, sleep them for a very brief amount of time, and then increment the counter:

for _ in 0..<1_000 {
  Thread.detachNewThread {
    Thread.sleep(forTimeInterval: 0.01)
    counter.count += 1
  }
}
— 42:00
Since we are spinning up 1,000 threads we would expect the count to be 1,000. But if we wait a short amount of time for the threads to do their work we will see we get something smaller:

Thread.sleep(forTimeInterval: 0.5)
print("count", counter.count)

count 987
— 42:19
And every time we run the executable we get something slightly different.
— 42:29
This is happening because mutating a field does not happen in one single atomic CPU instruction, but rather many instructions. Multiple threads are running those same instructions, and so the instructions will start to become interleaved, allowing for the possibility of one thread overwriting the results of another.
— 42:44
For a simplified model of this, the counter.count += 1 line can be thought of as being split into 3 steps: first we retrieve the current count, then we increment it, then we assign it:

for _ in 0..<1_000 {
  Thread.detachNewThread {
    Thread.sleep(forTimeInterval: 0.01)
    var count = counter.count
    count += 1
    counter.count = count
  }
}
— 43:00
If two threads are running these 3 lines concurrently, we might hope that they are executed in this order:

var count1 = counter.count // 0
count1 += 1                // 1
counter.count = count1     // 1

var count2 = counter.count // 1
count2 += 1                // 2
counter.count = count2     // 2
— 43:13
However, in reality they will be interleaved in unpredictable ways, such as this:

var count1 = counter.count // 0
var count2 = counter.count // 0
count1 += 1                // 1
count2 += 1                // 1
counter.count = count1     // 1
counter.count = count2     // 1
— 43:34
In this scenario we lost the opportunity to read the count after it was updated by the first thread. Instead both threads read the current count immediately, and so by the time all 6 steps are complete we have only set the count to 1 rather than 2.
— 43:47
This doesn’t always happen of course. In fact, it seems the vast majority of times it works as we expect, since we got a count of 987 out of 1,000. But that just shows how pernicious multithreading can be. It can work correctly 99% of the time, but ultimately it is completely wrong.
— 44:00
To fix this we need to synchronize access so that when we start mutating the field we can block all other threads from starting their mutations, and then only once we are done with our mutation will we allow other threads to enter.
— 44:12
This is done by introducing locks, which allow you to block all other threads from executing a portion of code until you unlock. The easiest way to do this is to introduce a lock variable to our Counter class, and implement a method for locking and unlocking around the mutation:

class Counter {
  let lock = NSLock()
  private(set) var count = 0

  func increment() {
    self.lock.lock()
    defer { self.lock.unlock() }
    self.count += 1
  }
}
— 45:02
And if we use this method rather than access the count variable directly:

for _ in 0..<1_000 {
  Thread.detachNewThread {
    Thread.sleep(forTimeInterval: 0.01)
    counter.increment()
  }
}
— 45:09
We will see that we now get a consistent 1,000 count every time we run the executable:

count 1000
— 45:14
So, it seems we’ve solved the data race. It is a little bit of a bummer that we have to create this little one-off method just to encapsulate the locking and unlocking.
— 45:21
Instead we could introduce a modify method that simply yields the counter to a closure, locking before and unlocking after:

func modify(work: (Counter) -> Void) {
  self.lock.lock()
  defer { self.lock.unlock() }
  work(self)
}
— 45:44
Then we can increment in a thread-safe manner like so:

counter.modify { $0.count += 1 }
— 46:11
Running this we still get a count of 1,000.
— 46:19
Still, it would be really nice if we could do something as simple as:

counter.count += 1
— 46:27
And still be thread-safe.
— 46:29
Perhaps we can hook into the get and set of the count and perform our locking there? We could do this by introducing some private, underscored state that is the true count of the counter, and then exposing a computed property that does the locking and unlocking under the hood:

private var _count = 0
var count: Int {
  get {
    self.lock.lock()
    defer { self.lock.unlock() }
    return self._count
  }
  set {
    self.lock.lock()
    defer { self.lock.unlock() }
    self._count = newValue
  }
}
— 47:02
But if we go back to counter.count += 1 we will see that we do not count up to 1,000:

count 539
— 47:11
In fact, it’s even worse now.
— 47:26
This is happening because we are only locking the getting and setting portion of the mutation, but not the entire transaction. There are still opportunities for multiple threads to interleave these commands.
— 47:45
We can slightly improve the situation by leveraging a not-yet-public feature of Swift known as _read and _modify accessors:

var count: Int {
  _read { }
  _modify { }
}
— 48:16
They are like get and set, but do extra work to collapse the 3-step mutation process we mentioned above into a single step. We can use them like so:

var count: Int {
  _read {
    self.lock.lock()
    defer { self.lock.unlock() }
    yield self._count
  }
  _modify {
    self.lock.lock()
    defer { self.lock.unlock() }
    yield &self._count
  }
}
— 48:25
And when we run this we get a proper count of 1,000. This is because yielding in _modify allows us to directly mutate the count in one go, rather than doing the multi-step process of retrieving the value, transforming the value, and then plugging the value back in. This means our lock encapsulates the entire transaction:

counter.count += 1
// lock.lock()
// counter.count = counter.count + 1
// lock.unlock()
— 48:45
However, there are still problems with this. If we further reference counter.count in the process of mutating counter.count:

counter.count += 1 + counter.count / 100
— 49:01
We lose the ability to lock this entire transaction. This single line roughly corresponds to the following 6 lines:

counter.count += 1 + counter.count / 100
// lock.lock()
// let count = counter.count
// lock.unlock()
// lock.lock()
// counter.count = 1 + count / 100
// lock.unlock()
— 49:37
The fact that the lock does not cover the first line means it’s possible for threads to interleave. We can see this concretely by running the executable a few times and seeing that we get a different value each time.
— 50:06
However, if we go back to the modify method style we will find that repeated runs of the executable output the exact same number:

counter.modify {
  $0.count += 1 + $0.count / 100
}
— 50:34
So even using _read and _modify cannot fix this synchronization problem. It simply is not possible to lock property mutations in this style, which is why we need to either create one-off methods for mutating state or leverage the modify method.
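The one-off modify method can be generalized into a reusable type. Here is a sketch of our own, not from the episode: a box that only exposes its value through a closure run while an NSLock is held, so compound mutations become a single atomic transaction.

```swift
import Foundation

// A generic lock-owning box generalizing the modify pattern above.
// The wrapped value is only reachable inside modify, so reads and
// compound writes are serialized by the same lock.
final class Locked<Value> {
  private let lock = NSLock()
  private var value: Value

  init(_ value: Value) { self.value = value }

  func modify<Result>(_ work: (inout Value) -> Result) -> Result {
    lock.lock()
    defer { lock.unlock() }
    return work(&value)
  }
}

let lockedCount = Locked(0)
for _ in 0..<1_000 {
  Thread.detachNewThread {
    // The entire compound expression runs under the lock.
    lockedCount.modify { $0 += 1 + $0 / 100 }
  }
}
Thread.sleep(forTimeInterval: 1)
print("count", lockedCount.modify { $0 })
```

Because every thread applies the same function under the lock, the final value is deterministic across runs, exactly the behavior the modify method gave us above.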
— 50:48
This goes to show just how tricky multithreading and data races can be. What seems reasonable can often lead to incorrect results. The main problem with locks is that they are fully decoupled from the concurrency tool we are using, which in this case is threads. Ideally the locking mechanism would have intimate knowledge of how we are running multiple units of work at once in order to guarantee synchronization. This is what Swift’s new concurrency tools provide for us, but before we can get to that there are a few more things to discuss.
— 51:19
So, Apple’s Thread class was the primary abstraction people used on Apple’s platforms to unlock asynchrony and concurrency back in the day. It comes with some interesting features, such as priority, cancellation, thread dictionaries and more, but it also falls short in many ways:
— 51:35
Threads don’t support the notion of child threads, so things like priority, cancellation, and thread dictionaries don’t trickle down to threads created from other threads.
— 51:45
It’s easy to accidentally explode the number of threads being used.
— 51:49
It’s hard to coordinate between threads.
— 51:52
Threaded code looks very different from unthreaded code.
— 51:55
And the tools for synchronizing between threads are crude.

Next time: concurrency’s present
— 51:59
Now there’s a good chance that most of our viewers have never used threads directly in their codebase because ever since macOS Leopard, released 15 years ago, Apple has built abstractions on top of threads to help fix a lot of the problems we just uncovered. This includes operation queues, Grand Central Dispatch and even Combine. Let’s take a look at how those technologies improved upon threads, and see where they fall short.
— 52:28
Let’s start with operation queues, which were introduced in macOS Leopard and the first iOS SDK for iOS 2.0. We are only going to briefly talk about operation queues because they never gained as much popularity as GCD or Combine, but it’s still interesting to see how they tried to solve some of Thread ’s problems.
— 52:48
So let’s dive in…next time!

References

Threading Programming Guide
Apple
Threads are one of several technologies that make it possible to execute multiple code paths concurrently inside a single application. Although newer technologies such as operation objects and Grand Central Dispatch (GCD) provide a more modern and efficient infrastructure for implementing concurrency, OS X and iOS also provide interfaces for creating and managing threads. This document provides an introduction to the thread packages available in OS X and shows you how to use them. This document also describes the relevant technologies provided to support threading and the synchronization of multithreaded code inside your application.
https://developer.apple.com/library/archive/documentation/Cocoa/Conceptual/Multithreading/Introduction/Introduction.html#//apple_ref/doc/uid/10000057i

Introducing Swift Atomics
Karoy Lorentey • Oct 1, 2020
I’m delighted to announce Swift Atomics, a new open source package that enables direct use of low-level atomic operations in Swift code. The goal of this library is to enable intrepid systems programmers to start building synchronization constructs (such as concurrent data structures) directly in Swift.
https://www.swift.org/blog/swift-atomics/

Downloads

Sample code: 0190-concurrency-pt1