Hi Nathaniel,
Just follow the link to my article; I cite them there. I only collected a couple of them, but there are a lot more studies on this.
The first time I noticed this was at a large company with thousands of developers, where some colleagues (also PhDs, like me) ran a few benchmarks. They all pointed in the same direction. Sadly, the numbers were not published.
But if you look at the Atlassian source (they say it is built from thousands of reviews by teams working on Jira, Confluence, and others), you can see that when pull requests have more than 10 files or 50 lines of code, the time spent reviewing each line quickly plummets (more than 100 lines of code reviewed per second). They even provide a tool to take the same measurement on your own teams. And I have seen that it is quite consistent.
Please note that it plummets with just 10 files or 50 lines of code: that is already considered large. Of course, there are some large PRs with plenty of renames and other stuff, but they are a small part. So the "large" PR here is not so large.
I agree that code review is necessary, but that does not mean it should block a pull request. For example, I love doing team code reviews, in which the whole team discusses the new code, or just a Friday review, in which the team discusses the diff from Monday to Friday. That gives a clearer, better picture for improving code quality. Also, the team is continuously reviewing old code while writing new code.
And finally, there are indeed bad actors trying to add nefarious code, and I have seen some cases, but 1) it is not common, 2) if they want to do it, the code will not be as evident as something in a single PR that can be reviewed, and 3) there should be other mechanisms to catch those. You cannot rely on people never having bad days.
Still, that article covers all of this and a lot more :-)
Thanks for the challenge!