```swift
import Foundation
import PlaygroundSupport

PlaygroundPage.current.needsIndefiniteExecution = true

class Demo {
    func getAsyncStream() -> AsyncStream<Int> {
        AsyncStream { continuation in
            // Produce a value every second, then finish the stream.
            let task = Task {
                for i in 1...5 {
                    try await Task.sleep(for: .seconds(1))
                    continuation.yield(i)
                }
                continuation.finish()
            }

            // If the consumer stops iterating (or the stream is otherwise
            // terminated), cancel the task producing the values.
            continuation.onTermination = { _ in
                task.cancel()
            }
        }
    }

    init() { print("initialised") }
    deinit { print("deinitialised") }
}

do {
    let demo = Demo()
    for await value in demo.getAsyncStream() {
        print(value)
    }
}

PlaygroundPage.current.finishExecution() // see https://stackoverflow.com/questions/78556983/async-await-unexpected-error-in-playground
```
@Pramod-36 … Yep, constraining concurrency has trade-offs. And if you don’t want to limit the degree of concurrency, then don’t: It’s entirely up to you. But I’d suggest that when you start dealing with massive parallelism (like thousands of tasks), all sorts of practical considerations start to crop up, and you might want to revisit whether it would be prudent to constrain the degree of concurrency to something reasonable for your problem:
- You give an example of an “image inverter”, which is likely to be CPU-bound anyway. So, if you start 1,000 image inversions, the number of available CPU cores will constrain you regardless. I can create 1,000 tasks, but if my Mac has 20 cores, it can only run 20 CPU-bound tasks at a time, and the other 980 tasks will just be sitting there, taking up memory, waiting for a thread from the concurrency thread pool (a.k.a. the “cooperative thread pool”). So, even if the first 999 tasks take 10 seconds each and the last one takes 0.1 seconds, you simply won’t see that last one start until it is afforded one of the very limited CPU cores, even if you don’t constrain the degree of concurrency yourself.

  As an aside, when doing a vast number of parallel CPU-bound tasks, we often introduce “striding” (so that rather than 1,000 individual tasks, we might do 40 tasks, each processing 25 items); parallelism isn’t free and each task has a little overhead, so we often stride to maximize performance. It often takes some experimentation to empirically identify the ideal striding factor for the particular use case. Personally, I often start with something like 2 or 4 times the number of processors, and then benchmark the various alternatives. (See the sketch after this list.)
- You focus on how long it takes to finish the fastest task. Often, we focus instead on how long it takes to finish all of the tasks. E.g., if downloading images, affording 6 concurrent requests will dramatically improve performance. Allowing 10 at a time will improve it only marginally. Beyond that, you have diminishing returns while introducing new problems (later network requests failing due to timeouts, excessive peak memory usage in the app, excessive server load, etc.).
- If you are showing results in the UI, you would generally want to favor the tasks associated with the images that are currently visible. We wouldn’t want to sacrifice the speed of the user-visible images simply to maximize the speed at which the 1,000th image is processed.
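To make the striding idea above concrete, here is a minimal sketch. The `invert(_:)` function and `invertAllStrided(_:)` are hypothetical stand-ins (not from the gist), and the “4 × core count” task count is just a starting point to benchmark against:

```swift
import Foundation

/// Hypothetical CPU-bound transform, standing in for the image inverter.
func invert(_ value: Int) -> Int { ~value }

/// Process `items` in chunks (“strides”) rather than one task per item.
func invertAllStrided(_ items: [Int]) async -> [Int] {
    guard !items.isEmpty else { return [] }

    // A starting point: a few tasks per core; benchmark the alternatives.
    let taskCount = min(items.count, 4 * ProcessInfo.processInfo.activeProcessorCount)
    let chunkSize = (items.count + taskCount - 1) / taskCount   // ceiling division

    return await withTaskGroup(of: (Int, [Int]).self) { group in
        // One child task per chunk, rather than one per item.
        for (chunkIndex, start) in stride(from: 0, to: items.count, by: chunkSize).enumerated() {
            let chunk = Array(items[start ..< min(start + chunkSize, items.count)])
            group.addTask { return (chunkIndex, chunk.map(invert)) }
        }

        // Reassemble the chunk results in their original order.
        var results = [[Int]](repeating: [], count: taskCount)
        for await (chunkIndex, chunk) in group {
            results[chunkIndex] = chunk
        }
        return results.flatMap { $0 }
    }
}
```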
Given your hesitancy, feel free to defer this “constrained degree of concurrency” question until you start to experience memory and performance problems. But I wager you will eventually encounter situations where you might want to revisit this.
FWIW, see WWDC 2023’s *Beyond the basics of structured concurrency* for a discussion of constraining concurrency.
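For reference, the pattern in question looks roughly like this. It is a minimal sketch, not the exact code from the earlier discussion: `download(_:)` and `downloadAll(_:maxConcurrent:)` are hypothetical stand-ins, and the default of 6 concurrent requests echoes the figure mentioned above. Note that results come back in completion order, not submission order:

```swift
import Foundation

/// Hypothetical download routine, standing in for your real request logic.
func download(_ url: URL) async throws -> Data {
    try await URLSession.shared.data(from: url).0
}

/// Cap the number of in-flight downloads at `maxConcurrent`.
func downloadAll(_ urls: [URL], maxConcurrent: Int = 6) async throws {
    try await withThrowingTaskGroup(of: (URL, Data).self) { group in
        for (index, url) in urls.enumerated() {
            // After the first `maxConcurrent` tasks are running, wait for
            // one to finish before adding the next.
            if index >= maxConcurrent, let (finished, data) = try await group.next() {
                print("finished", finished, "bytes:", data.count)
            }
            group.addTask { return (url, try await download(url)) }
        }

        // Drain the tasks still in flight.
        for try await (finished, data) in group {
            print("finished", finished, "bytes:", data.count)
        }
    }
}
```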
Hi @robertmryan, thanks for your reply.
I now have a much better understanding of concurrency, especially the difference between structured and unstructured concurrency.
I have one small question regarding the final solution that limits the number of concurrent tasks. This may be a minor concern, but I wanted to check my understanding.
Let’s say we have 5 images and `maxConcurrent = 2`:

- The first two images take 10 seconds and 7 seconds to download.
- The remaining three images each take 1 second.
With the current logic, when processing the third image (`index == maxConcurrent`), the code waits for one of the first two tasks to complete; in this example, that is the second image, at the 7-second mark. Only after it completes is the third image added to the task group. The same pattern repeats for the fourth image, and so on.
From a user-experience perspective, this means the user would not see the third image until roughly the 8-second mark (7 seconds waiting, plus its 1-second download), even though the download itself takes only 1 second. In contrast, if we did not limit concurrency, the third image could be displayed almost immediately.
Which consideration should win here: the user experience, or the memory and performance issues that unlimited concurrency might cause? (This trade-off arises even with 1,000 tasks: some quick images could still end up waiting on slower ones.)
Sample code