I have been working as a research intern at Microsoft for quite some time now1, and during this time our project2 has grown quite complex. Somewhere along the way, one of our team members realized that we could parallelize a particular operation in our codebase, one that could become a bottleneck as we start working with larger inputs. Essentially, we had something like the following in our C# code3.
foreach (var file in SomeFileCollection) {
    var parsedFile = file.SomeParsingOperation();
    if (parsedFile != null) {
        // store parsedFile in a Dictionary
    }
}
They noticed that SomeParsingOperation can be expensive for large files, and modified this loop to run in parallel using System.Threading.Tasks.Parallel.ForEach. Luckily for us, SomeParsingOperation had an asynchronous counterpart, SomeParsingOperationAsync.
using System.Threading.Tasks;

Parallel.ForEach(SomeFileCollection, file => {
    var parsedFile = file.SomeParsingOperationAsync().Result;
    if (parsedFile != null) {
        // store parsedFile in a ConcurrentDictionary
    }
});
This change seemed very reasonable, and though we didn’t have any performance tests to validate the improvement, we merged it into our project. For many months, everything seemed to be working fine—until one day we noticed that this part of the code was taking several hours on certain large inputs. We had never run the code on inputs of that size before, and immediately concluded that they were likely too large for even the parallelized code to parse quickly. We knew that, in practice, we did not need to parse every file in SomeFileCollection for our functionality. So my solution was to implement a heuristic that selects only the subset of SomeFileCollection we actually need to parse, and discards the rest.
Before opening a pull request, I asked Copilot to review my changes4. There were no major issues, and I asked it to fix the minor nitpicks it found. One of these had something to do with the parallel code above. I thought it was some style suggestion and nonchalantly accepted it like the rest5. I ran our code on a larger input again, and what had taken several hours before now took just a few minutes. Sweet! I asked one of my colleagues to re-run the code on some more large inputs so that we could be sure, and called it a day. Some time later, they texted me back saying that it did indeed take only a few minutes, but they had just realized that they had forgotten to pass the flag that enables my heuristic for cutting down the number of files to parse.
This meant that the only change affecting the performance was the one in the parallel code. Copilot had modified it to use System.Threading.Tasks.Parallel.ForEachAsync, which it claimed would give us “true parallelism”.
Parallel.ForEachAsync(SomeFileCollection, async (file, ct) => {
    var parsedFile = await file.SomeParsingOperationAsync();
    if (parsedFile != null) {
        // store parsedFile in a ConcurrentDictionary
    }
}).GetAwaiter().GetResult();
I created another pull request to isolate this change, and we tested it again. And voilà—this was all we needed to do6! In the original code, Parallel.ForEach blocks a thread until file.SomeParsingOperationAsync().Result is available. That thread cannot be used for any other work while it waits, which is especially wasteful since SomeParsingOperationAsync performs I/O. Parallel.ForEachAsync, on the other hand, awaits the operation without blocking: the thread is returned to the pool and can do other work while the result of file.SomeParsingOperationAsync()7 is still pending.
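To make the non-blocking pattern concrete, here is a minimal, self-contained sketch. ParseDemo, ParseAsync, and fileId are hypothetical stand-ins for our actual SomeFileCollection and SomeParsingOperationAsync, with Task.Delay simulating the I/O latency.

using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;

public static class ParseDemo
{
    // Stand-in for SomeParsingOperationAsync: I/O-bound work that
    // holds no thread while the (simulated) I/O is in flight.
    public static async Task<string> ParseAsync(int fileId)
    {
        await Task.Delay(100); // simulate disk/network latency
        return $"parsed-{fileId}";
    }

    public static async Task<ConcurrentDictionary<int, string>> ParseAllAsync(int count)
    {
        var results = new ConcurrentDictionary<int, string>();

        // Non-blocking: while one file's "I/O" is pending, the worker
        // thread goes back to the pool and can start other files.
        await Parallel.ForEachAsync(Enumerable.Range(0, count), async (fileId, ct) =>
        {
            var parsed = await ParseAsync(fileId);
            if (parsed != null)
                results[fileId] = parsed;
        });

        return results;
    }

    public static async Task Main()
    {
        var results = await ParseAllAsync(32);
        Console.WriteLine($"Parsed {results.Count} files.");
    }
}

Because the delays overlap instead of each parking a thread, the total wall-clock time stays close to one delay per batch rather than growing with the number of files; swapping the await for .Result inside a plain Parallel.ForEach reintroduces the blocking behavior described above.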
The lesson here is to really understand how concurrency and parallelism work in your programming language. It is very easy to write code that looks parallel, but is actually not. In our case, Copilot happened to point out what we had missed for months. Maybe we should have started using Copilot earlier—but better late than never!
1. My manager calls me a resident intern because of how long I have been there. ↩
2. I am working on automatically triaging and patching static analysis warnings in large codebases. Why am I doing this? Here is why. ↩
3. I know you are wondering why we were writing C# code, but I assure you that we are not eccentrics. We just work at Microsoft. ↩
4. Protip: ask Copilot to review your code with 3 different LLMs (I ask it to review with Claude Opus 4.6, Gemini 3 Pro, and GPT-5.4) and then consolidate the feedback into a single report. ↩
5. You can say that I was vibe-reviewing, but I will simply say that the suggestion passed my vibe-check. ↩
6. Frankly, I was slightly disappointed that my elegant fix was after all unnecessary. ↩