```swift
import Foundation
import PlaygroundSupport

PlaygroundPage.current.needsIndefiniteExecution = true

class Demo {
    func getAsyncStream() -> AsyncStream<Int> {
        AsyncStream { continuation in
            let task = Task {
                for i in 1...5 {
                    try await Task.sleep(for: .seconds(1))
                    continuation.yield(i)
                }
                continuation.finish()
            }
            continuation.onTermination = { _ in
                task.cancel()
            }
        }
    }

    init() { print("initialised") }
    deinit { print("deinitialised") }
}

do {
    let demo = Demo()

    for await value in demo.getAsyncStream() {
        print(value)
    }
}

PlaygroundPage.current.finishExecution() // see https://stackoverflow.com/questions/78556983/async-await-unexpected-error-in-playground
```
Hi @robertmryan, thanks for your reply.
I now have a much better understanding of concurrency, especially the difference between structured and unstructured concurrency.
I have one small doubt regarding the final solution that limits the number of concurrent tasks. This may be a minor concern, but I wanted to clarify my understanding.
Let’s say we have 5 images and maxConcurrent = 2.
The first two images take 10 seconds and 7 seconds to download.
The remaining three images each take 1 second.
With the current logic:

```swift
if index >= maxConcurrent, let urlAndResult = await group.next() {
    continuation.yield(urlAndResult)
}
```

When processing the third image (index == maxConcurrent), the code waits for one of the first two tasks to complete. In this example, that would be the second image (7 seconds). Only after that completes is the third image added to the task group. The same pattern repeats for the fourth image, and so on.
From a user-experience perspective, this means the user would not see the third image until 7 seconds, even though it only takes 1 second to download. In contrast, if we did not limit concurrency, the third image could be displayed almost immediately.
What should be taken into consideration here: user experience, or the memory issues that may occur? (This would happen even with 1,000 tasks; some quick images could still end up waiting on slower ones.)
Sample code:

```swift
import Foundation
import PlaygroundSupport

PlaygroundPage.current.needsIndefiniteExecution = true

class ImageInverter {
    func convertToInvertedImage(_ image: Image) async -> Image {
        try? await Task.sleep(for: .seconds(image.timeToDownload))
        return image
    }

    func process(maxConcurrentTasks: Int = 2, _ images: [Image]) -> AsyncStream<Image> {
        AsyncStream { continuation in
            let task = Task {
                await withTaskGroup(of: Image.self) { group in
                    for (index, image) in images.enumerated() {
                        if index >= maxConcurrentTasks, let invertedImage = await group.next() {
                            continuation.yield(invertedImage)
                        }
                        group.addTask {
                            // capturing self strongly; the earlier `[weak self]` + `self!` force-unwrap was unnecessary
                            await self.convertToInvertedImage(image)
                        }
                    }

                    for await invertedImage in group {
                        continuation.yield(invertedImage)
                    }

                    continuation.finish()
                }
            }

            continuation.onTermination = { state in
                print("Stream termination state:", state)
                task.cancel()
            }
        }
    }
}

struct Image {
    let name: String
    let timeToDownload: Double
}

let images: [Image] = [
    .init(name: "1", timeToDownload: 6),
    .init(name: "2", timeToDownload: 9),
    .init(name: "3", timeToDownload: 1),
    .init(name: "4", timeToDownload: 2),
    .init(name: "5", timeToDownload: 3)
]

let imageInverter = ImageInverter()
let startDate = Date.now

for await invertedImage in imageInverter.process(maxConcurrentTasks: 2, images) {
    print("Received \(invertedImage) in \(Date.now.timeIntervalSince(startDate)) seconds")
}

PlaygroundPage.current.finishExecution()
```

@Pramod-36 … Yep, constraining concurrency has trade-offs. And if you don’t want to limit the degree of concurrency, then don’t: it’s entirely up to you. But I’d suggest that when you start dealing with massive parallelism (like thousands of tasks), all sorts of practical considerations start to crop up, and you might want to revisit whether it might be prudent to constrain the degree of concurrency to something reasonable for your problem:
- You give an example of an “image inverter”, which is likely to be CPU-bound anyway. So, if you start 1,000 image inversions, the number of available CPU cores will constrain you regardless. I can create 1,000 tasks, but if my Mac has 20 cores, it can only do 20 CPU-bound tasks at a time, and the other 980 tasks will just be sitting there, taking up memory, waiting for a thread from the concurrency thread pool (a.k.a. the “cooperative thread pool”). So, even if the first 999 tasks take 10 seconds each and the last one takes 0.1 seconds, you simply won’t see that last one start until it is afforded one of the very limited CPU cores, even if you don’t constrain the degree of concurrency yourself.

  As an aside, when doing a vast number of parallel CPU-bound tasks, we often introduce “striding” (so that rather than 1,000 individual tasks, we might do 40 tasks, each processing 25 items); parallelism isn’t free and each task has a little overhead, so we often stride to maximize performance. It often takes some experimentation to empirically identify the ideal striding factor for the particular use case. Personally, I often start with something like 2 or 4 times the number of processors, and then benchmark the various alternatives.
- You focus on how long it takes to finish the fastest task. Often, we focus on how long it takes to finish all of the tasks. E.g., if downloading images, affording 6 concurrent requests will dramatically improve the performance. Allowing 10 at a time will marginally improve it. But beyond that, you have diminishing returns while introducing new problems (like latter network requests failing due to timeouts, excessive app peak memory usage, excessive server load, etc.).
- If you are showing results in the UI, you would generally want to favor those tasks associated with images that are visible. We wouldn’t want to sacrifice the speed of the user-visible images simply to maximize the speed with which it processes the 1,000th image.
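The “striding” aside above can be sketched roughly as follows. Everything here is an illustrative assumption, not the original code: the function name, the chunking scheme, and the trivial `255 - x` stand-in for per-pixel “inversion” work.

```swift
import Foundation

// A hedged sketch of striding for CPU-bound work: rather than one child task
// per item, we create one child task per chunk of items.
func invertAll(_ pixels: [Int]) async -> [Int] {
    // A common starting point: 2–4 times the processor count, then benchmark.
    let chunkCount = ProcessInfo.processInfo.activeProcessorCount * 2
    let strideSize = max(1, (pixels.count + chunkCount - 1) / chunkCount)

    return await withTaskGroup(of: (Int, [Int]).self) { group in
        for (chunkIndex, start) in stride(from: 0, to: pixels.count, by: strideSize).enumerated() {
            let chunk = Array(pixels[start ..< min(start + strideSize, pixels.count)])
            group.addTask {
                // Per-chunk CPU-bound work, tagged with its index.
                return (chunkIndex, chunk.map { 255 - $0 })
            }
        }

        // Chunks finish in arbitrary order, so reassemble by index at the end.
        var results: [(Int, [Int])] = []
        for await result in group { results.append(result) }
        return results.sorted { $0.0 < $1.0 }.flatMap { $0.1 }
    }
}
```

The chunk size is the knob to benchmark empirically: too few chunks and you lose parallelism; too many and per-task overhead dominates.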
Given your hesitancy, feel free to defer this “constrained degree of concurrency” question until you start to experience memory and performance problems. But I wager you will eventually encounter situations where you might want to revisit this.
FWIW, see the WWDC video *Beyond the basics of structured concurrency* for a discussion on constraining concurrency.
@Pramod-36
Yes, we would want a separate task (lowercase “t”) for each. But we would avoid introducing unnecessary unstructured concurrency (a `Task {…}`), and instead remain within structured concurrency. It keeps life much simpler. The idiomatic solution would be a “task group”.
Thus, perhaps a simple solution would look like:
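As a rough sketch of that pattern (the `ImageLoader` class and its `image(for:)` download method are hypothetical stand-ins, not the original code):

```swift
import UIKit

class ImageLoader {
    // Hypothetical, non-throwing download method; details assumed.
    func image(for url: URL) async -> UIImage {
        let data = (try? await URLSession.shared.data(from: url).0) ?? Data()
        return UIImage(data: data) ?? UIImage()
    }

    // One structured child task per URL; images are yielded as they finish,
    // not necessarily in the order they were requested.
    func images(for urls: [URL]) -> AsyncStream<(URL, UIImage)> {
        AsyncStream { continuation in
            let task = Task {
                await withTaskGroup(of: (URL, UIImage).self) { group in
                    for url in urls {
                        group.addTask {
                            let image = await self.image(for: url)
                            return (url, image)
                        }
                    }
                    for await urlAndImage in group {
                        continuation.yield(urlAndImage)
                    }
                    continuation.finish()
                }
            }
            // If the caller stops iterating the stream, cancel the children, too.
            continuation.onTermination = { _ in task.cancel() }
        }
    }
}
```

The caller then simply does `for await (url, image) in loader.images(for: urls) { … }`.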
Now, that having been said, there are two limitations in the above:
Notably, one might allow `image(for:)` to `throw` errors (e.g., if a network request fails), and then have the asynchronous sequence yield a `Result<UIImage, any Error>` instead, so the caller knows which images succeeded and which failed (while not stopping just because one of the images failed). So, perhaps:
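A hedged sketch of that `Result`-yielding variant, again assuming a hypothetical `ImageLoader` with a throwing `image(for:)`:

```swift
import UIKit

class ImageLoader {
    // Hypothetical throwing download method; details assumed.
    func image(for url: URL) async throws -> UIImage {
        let (data, _) = try await URLSession.shared.data(from: url)
        guard let image = UIImage(data: data) else {
            throw URLError(.cannotDecodeContentData)
        }
        return image
    }

    // Each child task catches its own error, so one failed download does not
    // stop the rest; the caller sees a Result for every URL.
    func images(for urls: [URL]) -> AsyncStream<(URL, Result<UIImage, any Error>)> {
        AsyncStream { continuation in
            let task = Task {
                await withTaskGroup(of: (URL, Result<UIImage, any Error>).self) { group in
                    for url in urls {
                        group.addTask {
                            do    { return (url, .success(try await self.image(for: url))) }
                            catch { return (url, .failure(error)) }
                        }
                    }
                    for await urlAndResult in group {
                        continuation.yield(urlAndResult)
                    }
                    continuation.finish()
                }
            }
            continuation.onTermination = { _ in task.cancel() }
        }
    }
}
```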
This enjoys automatic tracking of when the collection of asynchronous tasks is complete, while remaining in a pattern that supports cancellation of the whole sequence.
If you want to go nuts, I might suggest a few more refinements. Notably, the above is fine if you have a dozen images. But what if you have 1,000? You really don’t want to create an unbridled number of tasks, but rather constrain it to some reasonable number that enjoys concurrency while limiting it. People often use something like 6 or 10 concurrent requests because, beyond that, you don’t see discernible performance benefits, but you do suffer memory issues or risk overloading the system with too many tasks.
So, you might end up with something like the following, where we will `await group.next()` above a certain number of pending tasks. This will constrain the degree of concurrency to no more than *n* requests at a time. There are additional refinements one might add, but this strikes me as a good starting point. The key is to remain within structured concurrency, in a pattern that still supports cancellation of the whole sequence.
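A sketch of that constrained pattern, using the same hypothetical `ImageLoader` and throwing `image(for:)` as assumed above:

```swift
import UIKit

class ImageLoader {
    // Hypothetical throwing download method; details assumed.
    func image(for url: URL) async throws -> UIImage {
        let (data, _) = try await URLSession.shared.data(from: url)
        guard let image = UIImage(data: data) else {
            throw URLError(.cannotDecodeContentData)
        }
        return image
    }

    func images(maxConcurrent: Int = 6, for urls: [URL]) -> AsyncStream<(URL, Result<UIImage, any Error>)> {
        AsyncStream { continuation in
            let task = Task {
                await withTaskGroup(of: (URL, Result<UIImage, any Error>).self) { group in
                    for (index, url) in urls.enumerated() {
                        // Once maxConcurrent children are running, wait for
                        // one to finish before adding the next.
                        if index >= maxConcurrent, let urlAndResult = await group.next() {
                            continuation.yield(urlAndResult)
                        }
                        group.addTask {
                            do    { return (url, .success(try await self.image(for: url))) }
                            catch { return (url, .failure(error)) }
                        }
                    }

                    // Drain the last maxConcurrent results.
                    for await urlAndResult in group {
                        continuation.yield(urlAndResult)
                    }

                    continuation.finish()
                }
            }
            continuation.onTermination = { _ in task.cancel() }
        }
    }
}
```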
FWIW, I have not done exhaustive testing of the above, so I apologize for any errors herein. But hopefully it illustrates the idea.