Jüri Allik, Professor of Experimental Psychology at the University of Tartu, belongs to the top one per cent of the world’s most cited scientists in his field. His recipes for becoming a top researcher were among the top 10 most popular posts on our blog in 2013. This is the second post in Professor Allik’s three-part series on scientific publishing. See the first post: Brave New World of Scientific Publishing.
Peer review is one of those things that makes people repeat Winston Churchill’s famous quip about democracy: “Democracy is the worst form of government, except for all those other forms that have been tried.” Speaking for myself, I do not think such praise is well-grounded when it comes to peer review: the alternative systems simply have not been tested thoroughly enough.
In my view, peer review is mostly an artificial hindrance, set up to limit the number of publications. The prospect of expert colleagues judging one’s work is in itself a good reason to write scientific papers in the first place. I will always maintain that the very essence of science is the reciprocal verifiability of results, which makes it possible for ideas to compete with each other.
Some ideas win appreciation and live on in future work that builds on them. Others are plain wrong, or not productive enough, and should rather become extinct. In essence, this means that the peer review concept is a system for killing off ideas that are not fertile enough. A single question remains: is there a point in maintaining a complex killing machine for ideas? Might it not cost less to let bad ideas die a natural death? As science is turning irreversibly towards all-electronic publishing, there is no longer much need for an obstacle whose purpose is to save paper and ink.
But before presenting a system superior to the currently reigning peer review, it would be reasonable to dwell on its little-known dark sides. The review system on which closed publishing rests has three main shortcomings: slowness, ineffectiveness, and amorality.
The peer review system is slow
The first bad thing about the peer review system is that it is slow. I had a personal experience with this early on, when we sent our first article to the Journal of Personality and Social Psychology, the most cited journal in psychology. This means that the total number of citations this journal receives exceeds that of every other psychology journal. A high number of citations also means high prestige, and the chance that your article will get noticed is greater than in any other journal. It was only natural that we wanted the challenge that comes with a really demanding audience.
Eight months after we had sent the manuscript (this happened in the paper age), we wrote a concerned letter to the editor, asking when some news might arrive. After some time, the editor replied: unfortunately our manuscript had been lost. The editor had put it on the lowest shelf but wasn’t able to stoop that low because of a bad back. But he promised that now that the article had been found again, he would act quickly. That’s what he did, sending the article to three friends who, after some cursory reading, advised us to make some changes. According to the editor, the real meaning of these suggestions was rejection.
It all seemed as though the editor had not wanted to accept our paper from the very beginning and had merely been looking for a suitable pretext to reject it. Because I cannot stand injustice, I wrote to the official at the American Psychological Association responsible for publishing the association’s journals, saying that in our opinion the rejection had been a biased and unjust decision, and that the fate of the manuscript would have been different had its authors been from Harvard or Stanford instead of a little-known place called Estonia. We received a very polite answer assuring us that the editor with the bad back was known as an honest man, not somebody who would act in a biased manner. We had lost a whole year, because every journal asks for the authors’ written confirmation that the manuscript has not been submitted to another journal at the same time.
Nowadays, all journals run their business through an electronic system that makes sure no manuscript gets lost. When a reviewer who has received an article for evaluation exceeds the time limit, the system starts to bombard him or her with automatically generated reminders. But since reviewing is voluntary and unpaid, there is no guarantee that the reviewer has the time, or sometimes even the opportunity, to keep his or her promise. If the editor does not step in, this can easily waste three or four months or, in extraordinary cases where even the editor’s reminders go unanswered, over half a year. The author is completely helpless against such delays. The only thing he or she can do is withdraw the manuscript – a rather desperate act, because it will not bring back the time already lost.
This period can last even longer, as, after the first round of reviews, the editor can basically choose between two actions – to reject the manuscript, referring to serious flaws, or send it back to the author, asking him or her to present a new version, taking into account remarks and suggestions by reviewers and the editor. Depending on the seriousness of the problems, fixing the article could take up to a couple of months, especially when the suggestions include gathering more data or doing new analyses, the latter possibly as time-consuming as the former.
When the revised manuscript is resubmitted to the journal, the editor will, as a rule, send it to the same reviewers. If the editor sees the need, however, he or she may bring in a new reviewer who has not seen the paper before. In the worst case, there can be three or even four rounds of revision. And revising the article is no guarantee that it will be accepted in the end: even a manuscript that has been reworked several times may finally be rejected.
Finding reviewers has become more time-consuming as well. A few years ago, it was enough to contact three or four potential reviewers for two of them to agree to read and comment on the manuscript. Lately, it is not rare to have to ask 8–10 people before two agree. All the old hands are overloaded with reviewing, so the share of younger reviewers not yet fully ready for the job keeps rising.
The peer review system is ineffective
Is peer review efficient? The existing system has been put to the test repeatedly. One of the most scandalous experiments was conducted by Peters and Ceci. They took 12 articles already published in 12 journals with an average rejection rate of 80 per cent. The authors of all 12 articles worked at leading American universities. Peters and Ceci replaced the real authors’ names with fictional ones, and the institutional addresses with unknown or fictional American universities. The 12 articles, formatted as manuscripts, were resubmitted to the same 12 journals that had published the originals 18–32 months earlier.
In only three cases did the editors detect that the article had already been published in their journal. Nearly 90% of the reviewers recommended rejecting the already published articles, often citing “serious methodological shortcomings”. How much this had to do with the unfamiliar university names remains unknown.
The experiment showed that, at least in psychology, the peer review system is not efficient: it fails even to detect articles that have already been published – in short, plagiarism. Quite obviously, the reviewers judged not the substance of the manuscripts but the academic standing of their fictional authors. Many of them probably acted on the bias that authors working at a Mickey Mouse University could not possibly think clearly.
A further problem was that the new reviewers’ opinions matched poorly with those of the earlier reviewers, who had recommended publishing the very same manuscripts. Many studies show that two or three reviewers seldom agree. When experts cannot agree on the virtues and flaws of a manuscript, the trustworthiness of the whole peer review system becomes questionable. A typical study indicates that in many fields the agreement between reviewers is only a little higher than would be expected by pure chance.
One possible counter-argument is that an “average” article may be hard to evaluate, but a really good work is easily spotted – so when it comes to truly significant scientific achievements, the agreement between reviewers and editors should be much greater. To test this hypothesis, one should take highly renowned, heavily cited articles and ask their authors how easy or hard it was to get them published.
That is exactly what was done in a study examining 205 genuine citation classics. It turned out that at least 10 per cent of them had initially been rejected as unsuitable for publication. For example, one of my own most cited articles was rejected by four journals for a supposedly serious methodological flaw before it was finally published in the Journal of Cross-Cultural Psychology (according to Google Scholar, it had been cited 294 times by February 24, 2014).
Thus, the peer review system in use at the moment is not very efficient at spotting good ideas. At the same time, it is even more incapable of detecting bad ideas. The easiest proof of that would be the case of Diederik Stapel, a social psychologist who over many years successfully published articles in good journals, in spite of the fact that the data used was simply made up.
Ten of Stapel’s articles were retracted from the journal mentioned earlier, the Journal of Personality and Social Psychology, but by then they had already gathered over a thousand citations! It turned out that in all the experiments described in these works, Stapel had simply cooked up the data so that the results sounded like something people would like to hear. For example, his fabricated results claimed that a dirty environment provokes racist thoughts. Or, as good news for vegetarians, Stapel’s fabrications showed them to be more empathetic than carnivores. If we also take into account the startlingly large number of articles that no one ever cites, it should be clear even to a skeptic that the peer review system cannot, at least not always, accomplish its main goal – separating good science from bad (or boring) science.
(To avoid the wrong impression that fraud occurs only in psychology, one has to admit that the scandal surrounding the physicist Jan Hendrik Schön, for example, was in a league of its own. After it turned out that Schön had fabricated his results, 8 articles had to be retracted from the journal Science and 7 from the journal Nature.)
The peer review system is amoral
But what makes the peer review system amoral? I think it is the fact that neither the editor nor the reviewers bear any real responsibility towards the author. For many, becoming chief editor of an influential journal is the high point of a career. When a journal is the most important one in its field, the position of chief editor is extremely powerful. Getting there takes intelligence as well as diplomatic skill. But as soon as he or she is in the desired seat, arrogance can proliferate.
With reviewers, basically the same mechanism is at work. The blind review system makes it easier to be irresponsible. The usual argument is that if a reviewer openly criticizes a well-known author’s work, it could later harm the reviewer’s career, and that without anonymity few would agree to review anything at all. There has been no research on how the quality of a review is linked to anonymity, but a great many authors have experienced anonymity’s negative consequences. Our compatriot, the outstanding astronomer Ernst Julius Öpik, wrote that anonymous reviewing is like hitting another person in the dark, where the victim has no chance to strike back.
Our research group received one of the ugliest reviews for our work just a few months ago from the Journal of Vision. Although the journal has open access, the first sentence of the review was sinister: “Although I found the underlying idea for the experiment to be really good, I was disappointed by the conceptual and technical implementation. The authors are not on top of the matter that they study and confused about fundamental psychophysical concepts such as Weber’s law, mental operations (translation, multiplication), accuracy versus precision, as I demonstrate below. Their mathematical and statistical concepts seem to be surprisingly poor as well. Thus, in all, this is a good idea but very badly worked out. I recommend the authors to sit back and think over matters and concepts (…) I strongly suggest removing the R&S model”.
It went on in a similar style, complete with advice to grab some high school textbooks and look up how a mean value is calculated. I would just like to add that I know more about Weber’s law than the arrogant reviewer, probably fresh from a doctoral degree, could even imagine. One thing I know is that the law was first worked out with the help of Alfred Wilhelm Volkmann, who arrived in Tartu in 1837 carrying his own microscope. And the R&S model is not a from-the-hip invention, but the best explanation so far of how the visual system computes mean values, published in the premier journal of the field – Vision Research. In addition to the insults, the reviewer managed to cram at least ten factual mistakes into the short review, a sign of not having paid much attention at university.
(It all pales next to what an outstanding learning expert wrote in a review of a genuinely groundbreaking discovery by John Garcia: “They are no more likely to be true than you would find bird droppings in a cuckoo clock”.)
A little comfort came from the fact that, after our protests, the editor sent an apology – though it did not change the decision to reject our manuscript. A peer review system is not worth much if it allows borderline sociopaths to throw insults, with no fear of sanctions, from behind the protective shield of anonymity.
Many journals try to cure the ills of blind review and end up making them even worse. An example is double-blind review, in which not only the reviewer is anonymous: the manuscript must also be cleaned of all information that might betray its authors’ identities. In many cases this is an impossible task. Every work has its own style and a host of telltale details that allow an expert to guess the author(s) with high likelihood. Nor is it clear that everyone gains equally from this illusory anonymity. A psychologist cannot hide the fact that the questionnaire was in Estonian, that the participants came from Tartu or its surroundings, and that only a single research group has so far touched the question at hand.
Many journals also allow authors to name, alongside the persons they would like to see reviewing their article, those who should under no circumstances get their hands on it. But this is not very helpful either. Yes, it ensures that the editor will not send your manuscript to a sworn enemy who has publicly attacked you or your approach. But there is a whole small army of colleagues around the world who see you as a competitor in the race for the same discoveries. It takes remarkable generosity for a reviewer to pass up a chance to delay a competitor’s publication a little, especially if the reviewer has a similar article under review at some other journal.
By the way, one incentive for reviewing may be precisely the chance to get an early look at what competing labs are doing before their results are made public.
Coming up next in Professor Jüri Allik’s three-part series on scientific publishing: The Future of the Worst Possible Science World.