Asynchronous Programming - Coroutine Deep Dive

Unity for Game Development

Posted 2026/05/30 (Saturday, May 30, 2026, 12:00 AM)

10 min read, 1476 words

Deep Dive

Previous post: https://minssuy.com/Unity/Asynchronous-Coroutine/

If the previous post was mostly about how to use coroutines, this post digs deep into what a coroutine really is.
If you don’t know coroutines yet, this isn’t the right post for you — I’d recommend reading the previous post and trying them out first.

IEnumerator, IEnumerable

When you use a coroutine, you declare a function with the return type IEnumerator and call it through StartCoroutine.

void Start()
{
    StartCoroutine(TestCoroutine());
}

IEnumerator TestCoroutine()
{
    yield return null;
}

But IEnumerator wasn’t actually created for coroutines.
Its original purpose was as an interface for iterating over collections like List, Dictionary, and Array.

List<int> numbers = new List<int> { 1, 2, 3 };

foreach (int n in numbers)
{
    Debug.Log(n);
}

foreach iterates over the List and runs Debug.Log, but how is foreach able to iterate in the first place?

IEnumerable

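For foreach to work, the collection normally implements IEnumerable internally (strictly speaking, the compiler only needs a suitable GetEnumerator method, but the built-in collections all go through this interface).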
Let’s take a look inside the IEnumerable interface.

Github Link : dotnet/runtime/IEnumerable.cs

public interface IEnumerable
{
    IEnumerator GetEnumerator();
}

If you implement it, you have to provide a function called GetEnumerator whose return type is IEnumerator.
Let’s check whether List actually implements it.

Github Link : dotnet/runtime/List.cs

public Enumerator GetEnumerator() => new Enumerator(this);

IEnumerator<T> IEnumerable<T>.GetEnumerator() =>
    Count == 0 ? SZGenericArrayEnumerator<T>.Empty :
    GetEnumerator();

IEnumerator IEnumerable.GetEnumerator() => ((IEnumerable<T>)this).GetEnumerator();

We can confirm that List does implement the interface and provides GetEnumerator.
Our goal was just to verify that List implements it, so there’s no need to understand the code itself.

That said, as the name GetEnumerator suggests, we can tell it has to return an Enumerator.
So let’s find out what an Enumerator actually is.

IEnumerator

Let’s look at the List.cs code again.

Github Link : dotnet/runtime/List.cs

public struct Enumerator : IEnumerator<T>, IEnumerator
{
    private readonly List<T> _list;
    private readonly int _version;

    private int _index;
    private T? _current;

    internal Enumerator(List<T> list)
    {
        _list = list;
        _version = list._version;
    }

    public void Dispose()
    {
    }

    public bool MoveNext()
    {
        // ... (omitted)
        return false;
    }

    public T Current => _current!;

    object? IEnumerator.Current
    {
        get
        {
            // ... (omitted)
            return _current;
        }
    }

    void IEnumerator.Reset()
    {
        // ... (omitted)
        _index = 0;
        _current = default;
    }
}

As before, I removed parts of the original code since understanding how it works isn’t the goal.
The point to notice here is that Enumerator implements IEnumerator.

Aha, so the structure of List can be understood in the following order.

1. For foreach to iterate, the collection implements IEnumerable internally.
2. IEnumerable requires a function GetEnumerator whose return type is IEnumerator.
3. List's GetEnumerator returns an Enumerator.
4. Enumerator implements IEnumerator.
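To make that chain concrete, here is a minimal sketch of a hand-made collection that follows the same structure. The names Countdown and CountdownEnumerator are made up for this example and do not come from the .NET source; the only real APIs used are IEnumerable and IEnumerator.

```csharp
using System;
using System.Collections;

// Hypothetical collection that counts down from a start value to 1.
public class Countdown : IEnumerable
{
    private readonly int _start;

    public Countdown(int start) => _start = start;

    // foreach calls this once to obtain the enumerator.
    public IEnumerator GetEnumerator() => new CountdownEnumerator(_start);

    private class CountdownEnumerator : IEnumerator
    {
        private readonly int _start;
        private int _current;

        public CountdownEnumerator(int start)
        {
            _start = start;
            _current = start + 1; // positioned "before" the first element
        }

        // Advances to the next element; false means the sequence is over.
        public bool MoveNext()
        {
            if (_current <= 1) return false;
            _current--;
            return true;
        }

        // The element the enumerator currently points at (boxed as object).
        public object Current => _current;

        public void Reset() => _current = _start + 1;
    }
}

public static class CountdownDemo
{
    public static void Main()
    {
        foreach (int n in new Countdown(3))
        {
            Console.WriteLine(n); // prints 3, then 2, then 1
        }
    }
}
```

Nothing here knows about List at all, yet foreach accepts it, because the only contract is GetEnumerator plus MoveNext and Current.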

Finally, it’s IEnumerator’s turn. Let’s check the source code.

Github Link : dotnet/runtime/IEnumerator.cs

public interface IEnumerator
{
    bool MoveNext();
    object Current { get; }
    void Reset();
}

The function names are remarkably intuitive. It feels like the puzzle is coming together.
So foreach was iterating using the functions of the collection’s IEnumerator all along.

Conceptually, foreach does something like this:

// roughly what foreach does behind the scenes
IEnumerator enumerator = list.GetEnumerator();

while (enumerator.MoveNext())
{
    var item = enumerator.Current;
}

So far we’ve looked at what IEnumerator and IEnumerable really are.
Then how did Unity make coroutines by leveraging IEnumerator?

StartCoroutine

Let’s trace inside StartCoroutine().

Github Link : Unity-Technologies/UnityCsReference/MonoBehaviour.bindings.cs

// L108
public Coroutine StartCoroutine(IEnumerator routine)
{
    ...
    return StartCoroutineManaged2(routine);
}

// L195
extern Coroutine StartCoroutineManaged2(IEnumerator enumerator);

StartCoroutineManaged2 is declared as extern.
In Unity's binding files, extern marks a method whose actual implementation lives in the engine's native C++ code, and since the Unity engine core is closed-source, we can’t look beyond that point. Still, we can infer its internal behavior from the header declaration at the top of the file and from the official Unity documentation.

// L19
[NativeHeader("Runtime/Scripting/DelayedCallUtility.h")]

Unity Documentation - Best practice guides/Coroutines

“All of the code from a coroutine’s first resumption point until its completion is executed within the DelayedCallManager inside Unity’s main loop.”

“A coroutine runs as an instance of a class automatically generated by the C# compiler. This object tracks the coroutine’s internal state and remembers where to resume after a yield.”

According to the official Unity documentation, when StartCoroutine is called, the IEnumerator is registered internally with the DelayedCallManager. From then on, the engine presumably inspects Current each frame to decide when to resume, and when the time comes, it calls MoveNext().

Just as foreach iterates over a collection with MoveNext() and Current, Unity repurposed IEnumerator to control execution flow in the exact same way.
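Since the native side is closed, the best we can do is imitate the idea. The following is a toy scheduler in plain C#, written under the assumption that the engine does roughly "look at Current, then call MoveNext". MiniScheduler and WaitFrames are invented names for this sketch and are not Unity types; it is not how DelayedCallManager is actually implemented.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical stand-in for a yield instruction: "wait this many frames".
public sealed class WaitFrames
{
    public int Remaining;
    public WaitFrames(int frames) => Remaining = frames;
}

// Toy scheduler; NOT Unity's implementation, only the same idea.
public sealed class MiniScheduler
{
    private readonly List<IEnumerator> _routines = new List<IEnumerator>();

    public int ActiveCount => _routines.Count;

    public void Start(IEnumerator routine)
    {
        // Run up to the first yield right away, then keep it for later frames.
        if (routine.MoveNext()) _routines.Add(routine);
    }

    // Call once per "frame".
    public void Tick()
    {
        for (int i = _routines.Count - 1; i >= 0; i--)
        {
            IEnumerator routine = _routines[i];

            // Inspect Current to decide whether the routine may resume yet.
            if (routine.Current is WaitFrames wait && --wait.Remaining > 0)
                continue;

            // Resume; MoveNext returning false means the routine finished.
            if (!routine.MoveNext()) _routines.RemoveAt(i);
        }
    }
}

public static class MiniSchedulerDemo
{
    public static readonly List<string> Log = new List<string>();

    static IEnumerator Routine()
    {
        Log.Add("start");
        yield return null;              // resume on the next tick
        Log.Add("after one frame");
        yield return new WaitFrames(2); // resume two ticks later
        Log.Add("done");
    }

    public static void Main()
    {
        var scheduler = new MiniScheduler();
        scheduler.Start(Routine());
        for (int frame = 1; scheduler.ActiveCount > 0; frame++)
        {
            scheduler.Tick();
            Console.WriteLine($"frame {frame}: {string.Join(", ", Log)}");
        }
    }
}
```

The routine never blocks anything. The scheduler simply decides, frame by frame, whether it is time to call MoveNext() again.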

Compilation

Coroutines are built into Unity: they require no external dependencies and are simple and easy to use.
But in this post, let’s peel back the convenience and expose the ugly truth about coroutines.

After all, everything comes with a trade-off.

sharplab.io

The site above lets you enter C# code and see the compiled code.
Let’s write a simple coroutine like the one below and check the compiled result.

Before compiling

// before compiling
using System.Collections;

class Test
{
    IEnumerator TestCoroutine()
    {
        int count = 0;
        yield return null;
        count++;
        yield return null;
        count++;
    }
}

After compiling

// after compiling
internal class Test
{
    [CompilerGenerated]
    private sealed class <TestCoroutine>d__0 : IEnumerator<object>, IEnumerator, IDisposable
    {
        private int <>1__state;

        private object <>2__current;

        public Test <>4__this;

        private int <count>5__1;

        object IEnumerator<object>.Current
        {
            [DebuggerHidden]
            get
            {
                return <>2__current;
            }
        }

        object IEnumerator.Current
        {
            [DebuggerHidden]
            get
            {
                return <>2__current;
            }
        }

        [DebuggerHidden]
        public <TestCoroutine>d__0(int <>1__state)
        {
            this.<>1__state = <>1__state;
        }

        [DebuggerHidden]
        void IDisposable.Dispose()
        {
        }

        private bool MoveNext()
        {
            switch (<>1__state)
            {
                default:
                    return false;
                case 0:
                    <>1__state = -1;
                    <count>5__1 = 0;
                    <>2__current = null;
                    <>1__state = 1;
                    return true;
                case 1:
                    <>1__state = -1;
                    <count>5__1++;
                    <>2__current = null;
                    <>1__state = 2;
                    return true;
                case 2:
                    <>1__state = -1;
                    <count>5__1++;
                    return false;
            }
        }

        bool IEnumerator.MoveNext()
        {
            //ILSpy generated this explicit interface implementation from .override directive in MoveNext
            return this.MoveNext();
        }

        [DebuggerHidden]
        void IEnumerator.Reset()
        {
            throw new NotSupportedException();
        }
    }

    [NullableContext(1)]
    [IteratorStateMachine(typeof(<TestCoroutine>d__0))]
    private IEnumerator TestCoroutine()
    {
        <TestCoroutine>d__0 <TestCoroutine>d__ = new <TestCoroutine>d__0(0);
        <TestCoroutine>d__.<>4__this = this;
        return <TestCoroutine>d__;
    }
}

Understanding the compiled code isn’t the goal. There are two things we should focus on.

First: The State Machine

The TestCoroutine() function was transformed into a class called <TestCoroutine>d__0.
At the same time, a state value called <>1__state appeared inside it.

private sealed class <TestCoroutine>d__0
{
    private int <>1__state;
}

Based on this state value, every time MoveNext() is called it branches with a switch statement.

private bool MoveNext()
{
    switch (<>1__state)
    {
        default:
            return false;

        case 0: // first execution
            <>1__state = -1;
            <count>5__1 = 0;
            <>2__current = null;
            <>1__state = 1;  // move to next state
            return true; // yield return null

        case 1: // after the first yield
            <>1__state = -1;
            <count>5__1++;
            <>2__current = null;
            <>1__state = 2; // move to next state
            return true; // yield return null

        case 2: // after the second yield return
            <>1__state = -1;
            <count>5__1++;
            return false; // coroutine ends
    }
}

The function doesn’t actually pause. The coroutine is turned into a state machine: each MoveNext() call stores the next state value and returns true, and the following MoveNext() call resumes from the case that matches the stored state.
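The generated names like <>1__state are not legal in hand-written C#, so the decompiled class can't be pasted into a project as-is. Below is a hand-written equivalent with ordinary names, as a sketch to play with. TestCoroutineStateMachine is an invented name, and the Count field is public only so the demo can print it.

```csharp
using System;
using System.Collections;

// Hand-written equivalent of the generated class, using legal C# names.
public sealed class TestCoroutineStateMachine : IEnumerator
{
    private int _state;      // plays the role of <>1__state
    private object _current; // plays the role of <>2__current
    public int Count;        // plays the role of <count>5__1

    public object Current => _current;

    public bool MoveNext()
    {
        switch (_state)
        {
            case 0:          // int count = 0; yield return null;
                Count = 0;
                _current = null;
                _state = 1;
                return true;
            case 1:          // count++; yield return null;
                Count++;
                _current = null;
                _state = 2;
                return true;
            case 2:          // count++; then the method body ends
                Count++;
                _state = -1;
                return false;
            default:         // already finished
                return false;
        }
    }

    public void Reset() => throw new NotSupportedException();
}

public static class StateMachineDemo
{
    public static void Main()
    {
        var machine = new TestCoroutineStateMachine();
        while (machine.MoveNext())
        {
            Console.WriteLine($"paused, count = {machine.Count}");
        }
        Console.WriteLine($"finished, count = {machine.Count}");
    }
}
```

Each "pause" is just an ordinary return from MoveNext(). The only thing that survives between calls is what was stored in the fields.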

Second: Local Variables Become Class Fields

Did you notice why the compiler turned this into a class?

private sealed class <TestCoroutine>d__0
{
    private int <count>5__1;
}

A normal function's stack frame is discarded once it finishes executing, so its local variables vanish along with it. But a coroutine pauses with yield return and resumes later, so the value of the local variable count must persist even while it’s paused. Since this isn’t possible on the stack, the compiler generates a class, hoists the local variable up into a field of that class, and keeps it on the heap.

Now that it’s been made into a class, all that’s left is to create it with new and return it.

private IEnumerator TestCoroutine()
{
    <TestCoroutine>d__0 <TestCoroutine>d__ = new <TestCoroutine>d__0(0);
    <TestCoroutine>d__.<>4__this = this;
    return <TestCoroutine>d__;
}

Here, the true face of coroutines is revealed.
The coroutine was turned into a class in order to remember its state, and creating that class with new triggers a heap allocation.
In Unity, a heap allocation means that once the coroutine ends and nothing references the object anymore, it becomes garbage for the GC to collect.

If heap allocations pile up and the GC runs frequently, it can lead to frame drops.
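A quick way to convince yourself that calling the coroutine function only builds an object is the small experiment below. It is plain C# with no Unity involved. IteratorAllocationDemo and BodyRuns are names invented for this sketch.

```csharp
using System;
using System.Collections;

public static class IteratorAllocationDemo
{
    public static int BodyRuns; // counts how often the body actually started

    static IEnumerator TestCoroutine()
    {
        BodyRuns++;
        yield return null;
    }

    public static void Main()
    {
        IEnumerator first = TestCoroutine();
        IEnumerator second = TestCoroutine();

        // Two calls produce two distinct heap objects.
        Console.WriteLine(object.ReferenceEquals(first, second)); // False

        // Calling the method only built the objects; the body has not run yet.
        Console.WriteLine(BodyRuns); // 0

        // The body starts on the first MoveNext().
        first.MoveNext();
        Console.WriteLine(BodyRuns); // 1
    }
}
```

So every call to the coroutine function, and therefore every StartCoroutine(TestCoroutine()), pays for a fresh state machine object.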

Coroutine

In the Compilation section, we confirmed that one state machine class gets allocated on the heap.
But the allocation at startup doesn’t end there.

Let’s check the signature of StartCoroutine we saw earlier once more.

public Coroutine StartCoroutine(IEnumerator routine)

The return type is Coroutine. Let’s find out what Coroutine really is too.

Github Link : Unity-Technologies/UnityCsReference/Coroutine.bindings.cs

public sealed class Coroutine : YieldInstruction
{
    internal IntPtr m_Ptr;
    Coroutine() {}

    ~Coroutine()
    {
        ReleaseCoroutine(m_Ptr);
    }

    [FreeFunction("Coroutine::CleanupCoroutineGC", true)]
    extern static void ReleaseCoroutine(IntPtr ptr);
}

See it? Coroutine is a class, not a struct.
It’s a wrapper class pointing to the actual coroutine the engine manages internally, and a new one is created and returned every time StartCoroutine is called. As you can tell from the finalizer ~Coroutine, which calls ReleaseCoroutine (bound to the native Coroutine::CleanupCoroutineGC), this is a heap object managed by the GC.

In the end, starting a coroutine once triggers two heap allocations.

1. The state machine class created by the compiler
2. The Coroutine wrapper object returned by Unity

YieldInstruction

In the section above, we learned that starting a coroutine once triggers two heap allocations.
But there’s also something we allocate with new by our own hands when we use coroutines.

It’s the Wait family (WaitForSeconds and friends), which are YieldInstruction types.

public class WaitForSeconds : YieldInstruction { ... }
public class WaitForFixedUpdate : YieldInstruction { ... }
public class WaitForEndOfFrame : YieldInstruction { ... }

Hold on — taking a closer look, these are also classes that inherit from something called YieldInstruction.
We’ve been directly creating instances of these classes with the new keyword when using coroutines.

Here the grim reality of coroutines is revealed.
The coroutine itself is turned into a class that gets a heap allocation, and YieldInstruction triggers a heap allocation too.

Let’s take a look at a horrifying example.

void Start()
{
    StartCoroutine(TestCoroutine());
}

IEnumerator TestCoroutine()
{
    while (true)
    {
        yield return new WaitForSeconds(1f); // new on every loop
    }
}

If you can see how horrifying the code above is, you’ve gotten everything this post has to offer.
Two heap allocations happened when TestCoroutine started, and after that the WaitForSeconds allocations keep piling up every second.
Since this is a commonly seen pattern when using coroutines, it’s all the more horrifying.

There’s a way to make the situation above better. It’s caching.

private WaitForSeconds _waitForSeconds = new WaitForSeconds(1f);

IEnumerator TestCoroutine()
{
    while (true)
    {
        yield return _waitForSeconds;
    }
}

Because the WaitForSeconds object is created on the heap only once and then reused, no additional heap allocation occurs inside the loop.

But do we always want to wait exactly 1 second?
Since we want to wait for whatever duration we choose on each call, in practice we mostly use it like this:

IEnumerator TestCoroutine(float waitTime)
{
    while (true)
    {
        yield return new WaitForSeconds(waitTime); // heap alloc per call
    }
}

When the wait time changes dynamically, a single cached instance no longer works.
Caching is an optimization that’s straightforward only when you always wait for the same amount of time; it’s hard to apply in the dynamic case.
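One possible way around this, offered as a sketch rather than a recommendation, is to skip WaitForSeconds entirely and count elapsed time yourself with yield return null. To keep the example runnable outside Unity, the frame time is passed in as a delegate. The deltaTime parameter is a hypothetical stand-in for UnityEngine.Time.deltaTime, and RepeatEvery and WorkCount are invented names.

```csharp
using System;
using System.Collections;

public static class ManualWaitDemo
{
    public static int WorkCount; // how many times the periodic work ran

    // deltaTime is a hypothetical stand-in for UnityEngine.Time.deltaTime.
    public static IEnumerator RepeatEvery(float waitTime, Func<float> deltaTime)
    {
        while (true)
        {
            float elapsed = 0f;
            while (elapsed < waitTime)
            {
                yield return null;      // wait one frame, nothing is allocated
                elapsed += deltaTime(); // accumulate the time that passed
            }
            WorkCount++;                // the periodic work would go here
        }
    }

    public static void Main()
    {
        // Simulate 10 frames of 0.25 s each with a 1 s wait.
        IEnumerator routine = RepeatEvery(1f, () => 0.25f);
        for (int frame = 0; frame < 10; frame++)
        {
            routine.MoveNext();
        }
        Console.WriteLine(WorkCount);
    }
}
```

The trade-off is that the coroutine now wakes up every frame just to add a number, so this swaps a small allocation for a small per-frame cost. Whether that is a win depends on how often the wait would otherwise be created.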

To sum up, heap allocations occur in two places.
One is the moment you start a coroutine (two allocations: the state machine plus the Coroutine wrapper), and the other is the moment you new a fresh YieldInstruction. Conversely, the mechanism of a coroutine pausing with yield and then resuming carries no extra allocation on its own. One caveat: a value type, as in yield return 0, is boxed into an object and allocates every time, so for a one-frame wait prefer yield return null.
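The boxing caveat is easy to check in plain C#. The sketch below compares the objects that come out of Current. BoxingDemo and its helper names are made up for the example.

```csharp
using System;
using System.Collections;

public static class BoxingDemo
{
    static IEnumerator YieldZero()
    {
        yield return 0; // the int is boxed into a new object
        yield return 0; // and boxed again into another object
    }

    static IEnumerator YieldNull()
    {
        yield return null; // no object is created for Current
    }

    // True when the two yielded zeros are two separate heap objects.
    public static bool ZeroIsBoxedEachTime()
    {
        IEnumerator e = YieldZero();
        e.MoveNext();
        object a = e.Current;
        e.MoveNext();
        object b = e.Current;
        return !object.ReferenceEquals(a, b);
    }

    // True when yield return null leaves Current as null.
    public static bool NullYieldsNothing()
    {
        IEnumerator e = YieldNull();
        e.MoveNext();
        return e.Current == null;
    }

    public static void Main()
    {
        Console.WriteLine(ZeroIsBoxedEachTime()); // True
        Console.WriteLine(NullYieldsNothing());   // True
    }
}
```

Both forms make the coroutine wait one frame in Unity, but only the first one leaves a small object behind on every pass.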

The real problem is patterns that repeat allocations. Code that calls StartCoroutine anew every frame, or allocates a fresh YieldInstruction each time inside a loop, is exactly that. On the other hand, keeping a single coroutine alive for a long time and looping with yield return null allocates once at the start and that’s it — so it actually carries less overhead.

Wrapping Up

Coroutines are clearly a powerful and convenient tool.
But as we’ve seen, they aren’t well suited to code that starts new coroutines or allocates fresh YieldInstructions every frame or very frequently.

Coroutines have been around since the early days of Unity, roughly 2005, which makes them a feature about 20 years old as of this writing.
Because the feature itself is so old, it has a bit of a legacy feel, and at the same time it’s so deeply embedded into MonoBehaviour that it seems hard to pull out.

Have people solved these problems?
Isn’t there a more powerful asynchronous tool that can replace coroutines like these?

Reference

Asynchronous Programming - Coroutine