Why coroutines?
Blocking IO & non-blocking IO
If a program calls a blocking API of the operating system (fetching a web page, reading or writing a disk or database, or simply sleeping), it gives up the CPU and waits until the data is ready, at which point the operating system wakes it up to continue execution. This stop-and-wait mode seriously limits the program's throughput, which is why non-blocking APIs appeared.
The non-blocking API is also very simple: you register a callback function, and when the data is ready the callback is invoked while the main program goes on with other work. This programming model greatly increases throughput and is still widely used, for example in the famous nginx.
Originally the story would end with the non-blocking mode, but it has a notorious problem: "callback hell". Suppose you want to implement the following logic in JavaScript: call a() after one second, call b() inside a() one second later, and call c() inside b() one second later. The code roughly becomes the following. (setTimeout here calls a user-defined function after a given delay; you can substitute any disk, network, or other operation that gives up the CPU.)
setTimeout(function() {
console.log('first');
setTimeout(function(){
console.log('second');
setTimeout(function(){
console.log('three');
}, 1000)
}, 1000)
}, 1000)
// Output
first
second
three
The code above quickly becomes hard to understand and hard to maintain. Hence the concept of coroutines. In short: I want the performance of callback-style programming, and I want to keep the sequential logic of blocking IO. How to achieve it? Two sentences:
The coroutine framework implements an “operating system” by itself
User code voluntarily relinquishes the CPU
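To see the contrast, here is the same three delayed prints written with Python coroutines (a sketch using asyncio; the awaits give up the CPU where the JavaScript version would register a callback, but the code reads top to bottom):

```python
import asyncio

# The same logic as the setTimeout chain above, written sequentially:
# each await gives up the CPU instead of nesting a callback.
async def main():
    await asyncio.sleep(1)
    print('first')
    await asyncio.sleep(1)
    print('second')
    await asyncio.sleep(1)
    print('three')

asyncio.run(main())
```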
The implementation of coroutines
Inside a coroutine framework, a miniature "operating system", the main event loop, is implemented. The main event loop contains a set of coroutines and a scheduler. You can think of each coroutine as a function plus a separate stack and a backup of its registers. Whenever a coroutine voluntarily gives up the CPU (yield, await), it saves its own state, and the scheduler then picks a coroutine in the ready state, switches to its registers and stack, and lets it continue to run.
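These two sentences can be made concrete with a toy scheduler. This is only a sketch, not a real framework: Python generators stand in for coroutines, yield is the point where a coroutine voluntarily gives up the CPU, and a deque serves as the ready queue.

```python
from collections import deque

def scheduler(tasks):
    ready = deque(tasks)          # coroutines in the ready state
    while ready:
        task = ready.popleft()    # pick one ready coroutine
        try:
            next(task)            # run it until its next yield
            ready.append(task)    # still alive: back to the ready queue
        except StopIteration:
            pass                  # finished: drop it

def worker(name, steps):
    for i in range(steps):
        print(name, 'step', i)
        yield                     # voluntarily give up the CPU

scheduler([worker('a', 2), worker('b', 2)])
```

Running this interleaves the two workers: a step 0, b step 0, a step 1, b step 1.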
The relationship between coroutines, threads, processes, and OS is as follows:

The bottom layer is the kernel scheduler, which is responsible for scheduling threads. A process may contain more than one thread; they all share the same memory space but each has its own stack and ID. Coroutines live inside a thread, so if the thread sleeps, its coroutines sleep with it. The coroutine framework implements a simple scheduler of its own, and each coroutine has its own stack (large or small, depending on the framework). Each time a coroutine is switched out, its context, i.e. its set of registers, is backed up.
Unlike a real operating system, a coroutine must actively give up the CPU before it can be switched out, whereas the operating system controls the hardware and can switch processes through interrupts and system calls. The cost of switching also differs greatly: different processes have different page tables and different kernel objects, so each switch causes many cache misses, while coroutines share the same address space and switch far more cheaply. Another difference is that coroutine code must not call blocking system calls, otherwise the entire program still blocks.
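A short sketch of why blocking calls are forbidden inside coroutines: time.sleep is a blocking call that stalls the whole event loop, so three coroutines sleep one after another, while awaiting asyncio.sleep gives up the CPU and lets the sleeps overlap (delays shortened to 0.2 s to keep the demo fast):

```python
import asyncio
import time

async def blocking(n):
    time.sleep(0.2)            # blocking system call: the whole loop stalls
    return n

async def cooperative(n):
    await asyncio.sleep(0.2)   # gives up the CPU: others keep running
    return n

async def run_three(coro):
    await asyncio.gather(coro(1), coro(2), coro(3))

start = time.perf_counter()
asyncio.run(run_three(blocking))
blocked = time.perf_counter() - start     # roughly 0.6 s: back to back

start = time.perf_counter()
asyncio.run(run_three(cooperative))
overlapped = time.perf_counter() - start  # roughly 0.2 s: overlapping
print('blocking: {:.2f} s, cooperative: {:.2f} s'.format(blocked, overlapped))
```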
How to use coroutines?
Here is an example of using coroutines in Python:
import asyncio
import time

async def a():
    print("start a!")
    await asyncio.sleep(1)
    print("end a!")

async def b():
    print("start b!")
    await asyncio.sleep(2)
    print("end b!")

async def main():
    await asyncio.gather(a(), b())

if __name__ == "__main__":
    start = time.perf_counter()
    asyncio.run(main())
    print('used time {} s'.format(time.perf_counter() - start))
In the __main__ block, asyncio.run starts the event loop. Every await in a and b voluntarily gives up the CPU, and the scheduler switches to the other coroutine or back to the main program. In main(), gather waits for both a() and b() to finish before returning. Because the two sleeps overlap, the total time is roughly the longer of the two (about 2 seconds) rather than their sum. The result of the run is as follows:
$ python .\test1.py
start a!
start b!
end a!
end b!
used time 1.995282 s
Coroutines are often used to access network resources, disks, or databases. Check the following example:
import aiohttp
import asyncio

async def fetch(session, url):
    print("send request:", url)
    async with session.get(url, verify_ssl=False) as response:
        content = await response.content.read()
        file_name = url.rsplit('/')[-1]
        with open(file_name, mode='wb') as file_object:
            file_object.write(content)

async def main():
    async with aiohttp.ClientSession() as session:
        url_list = [
            'https://test1.jpg',
            'https://test2.jpg',
            'https://test3.jpg'
        ]
        tasks = [asyncio.create_task(fetch(session, url)) for url in url_list]
        await asyncio.wait(tasks)

if __name__ == '__main__':
    asyncio.run(main())
This program starts three coroutines that download three pictures at the same time. Each coroutine gives up the CPU while waiting on its network request and continues running once the response arrives. This programming model is easier to understand than raw callbacks, and it is far more efficient than the primitive stop-and-wait mode.
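When the URL list grows, you usually want to cap the number of requests in flight. Here is a sketch using asyncio.Semaphore; the asyncio.sleep stands in for the real session.get call, and the names and URLs are made up for illustration:

```python
import asyncio

async def fetch_one(sem, url, results):
    # stand-in for session.get + file write; sleep imitates network wait
    async with sem:                # at most 2 "downloads" in flight
        await asyncio.sleep(0.1)
        results.append(url)

async def main():
    sem = asyncio.Semaphore(2)
    results = []
    urls = ['test1.jpg', 'test2.jpg', 'test3.jpg', 'test4.jpg']
    await asyncio.gather(*(fetch_one(sem, u, results) for u in urls))
    return results

print(asyncio.run(main()))
```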
Different languages support coroutines in different ways. Go makes them especially convenient: just add the go keyword before the function call:
package main

import (
    "fmt"
    "time"
)

func say(s string) {
    for i := 0; i < 3; i++ {
        time.Sleep(100 * time.Millisecond)
        fmt.Println(s)
    }
}

func main() {
    go say("hello world") // add go here
    time.Sleep(1000 * time.Millisecond)
    fmt.Println("over!")
}
Note: the time.Sleep(1000 * time.Millisecond) in main is required. If main returns before say finishes, the whole program exits and say never gets to run.
In Go, coroutines can run on multiple cores (multiple threads); the number of OS threads is set with a single call to runtime.GOMAXPROCS:
runtime.GOMAXPROCS(2)
Go also adds a synchronization primitive between coroutines: sync.WaitGroup.
A WaitGroup waits for a collection of goroutines to finish. The main goroutine calls Add to set the number of goroutines to wait for, each goroutine calls Done when it finishes, and Wait blocks until all of them have finished.
package main

import (
    "fmt"
    "sync"
)

func main() {
    var wg sync.WaitGroup
    wg.Add(2)
    go say2("hello", &wg) // start the coroutines with the go keyword
    go say2("world", &wg)
    wg.Wait()
    fmt.Println("over!")
}

func say2(s string, waitGroup *sync.WaitGroup) {
    defer waitGroup.Done()
    for i := 0; i < 3; i++ {
        fmt.Println(s)
    }
}
In this example, the main function no longer sleeps to wait for the coroutines; it uses a WaitGroup instead.
At the beginning, wg.Add(2) declares that there are two coroutines to wait for;
then each coroutine (say2) calls defer waitGroup.Done() at its start, so it marks itself done in the WaitGroup when it returns;
finally, the main function calls Wait to block until both coroutines have finished.
In conclusion
A coroutine is a user-mode concurrency mechanism that is lighter than a thread. Switching between coroutines costs less than switching threads, and within a single thread you usually don't need locks for synchronization and mutual exclusion. Coroutines are especially suitable for IO-intensive jobs, but not for CPU-intensive ones: a coroutine does not add CPU cores, it only multiplexes tasks on one CPU, and the event loop itself plus saving coroutine state consume extra CPU.
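The last point can be seen directly: wrapping CPU-bound work in coroutines does not parallelize it. In this sketch the two tasks never await, so nothing ever yields the CPU and they simply run back to back:

```python
import asyncio

order = []

async def cpu_task(name):
    total = 0
    for i in range(100_000):   # pure computation, no await: never yields
        total += i
    order.append(name)

async def main():
    # gather starts both "concurrently", but without an await point
    # the first task runs to completion before the second even starts
    await asyncio.gather(cpu_task('a'), cpu_task('b'))

asyncio.run(main())
print(order)
```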