In which we learn @acm.org is exceptionally bad at retracting articles that need retracting:
arxiv.org/abs/2602.191...
"Figure 4: NLP Techniques Over Time Period on System Application". It is a bar plot, showing the publication years of 4 groups of natural language processing approaches: 1. "Bag-of-Words", 2. "Word2Vec, GloVe", 3. "ELMo, BERT", 4. "Transformers". All these approaches are relatively recent, but the x-axis goes from the year 0 to the year 2250, making it almost impossible to see a difference between the 4 bars.
How not to show the publication years of prior works.
#IEEE
Happy Advent!
Welcome to a new edition of the Research Integrity Advent Calendar.
Each day brings a small challenge: spot the problem, detect inconsistencies, and sharpen your skills.
Enjoy the season and the daily puzzles!
Day 1. papermills.tilda.ws/advent2025
Reverse Image Forensics Challenge: Try to find a unique area in Figure 9 of this recent Scientific Reports paper: 10.1038/s41598-025-17456-6 [Aug 2025] - Annotated by ImageTwin.ai
Articles with tortured phrases
I invite @acm.org to review their articles with Tortured Phrases dbrech.irit.fr/pls/apex/f?p... (not the first time I'm doing so, here and on LinkedIn www.linkedin.com/posts/guilla...)
Today, our article "The entities enabling scientific fraud at scale are large, resilient, and growing rapidly" is finally published in PNAS. I hope that it proves to be a wake-up call for the whole scientific community.
reeserichardson.blog/2025/08/04/a...
COSIG logo: COSIG (Collection of Open Science Integrity Guides) Now available at cosig.net!
Anyone can do post-publication peer review.
Anyone can be a steward of the scientific literature.
Anyone can do forensic metascience.
Anyone can sleuth.
That's why we are launching COSIG: the Collection of Open Science Integrity Guides, an open source resource for all of the above.
cosig.net
Apparently, they're plotted in "SmartPLS". I'm not familiar with the software, but it seems like the blue-and-yellow might be a default color scheme.
www.pls-sem.net/news-1/new-s...
It's a version of sudoku, except instead of 3x3 squares you have weird shaped blocks. The markings like "4+" mean that the numbers in the block should add up to 4. "60x" = the numbers should multiply to 60, etc
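The cage rules above can be sketched as a small checker (the function name and clue format are mine, not from any particular solver):

```python
from math import prod

def cage_satisfied(values, target, op):
    """Check whether the numbers placed in a cage meet its clue.

    values: the numbers filled into the cage's cells
    target: the number in the clue, e.g. 4 in "4+" or 60 in "60x"
    op: the operator character from the clue
    """
    if op == "+":
        return sum(values) == target
    if op == "x":
        return prod(values) == target
    if op == "-":        # two-cell cages: difference in either order
        a, b = values
        return abs(a - b) == target
    if op == "/":        # two-cell cages: quotient in either order
        a, b = values
        return b != 0 and a / b == target or a != 0 and b / a == target
    raise ValueError(f"unknown operator {op!r}")

print(cage_satisfied([1, 3], 4, "+"))      # True: 1 + 3 = 4
print(cage_satisfied([3, 4, 5], 60, "x"))  # True: 3 * 4 * 5 = 60
```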
Six images of cell cultures, with colored rectangles marking where different pairs of images overlap with each other. Some of the overlaps are just narrow strips at the edge of an image.
This was a tricky one. Not surprised the reviewers hadn't noticed. #ImageForensics
It is fortunate that Springer, Elsevier, etc. usually insist on vector graphics. You obviously can't do this with a JPEG.
Yes. These figures are usually stored as vector graphics, so you can open them up in a vector graphics editor, "ungroup" them into separate graphical objects (Ctrl+Shift+G in Inkscape), and move them around. Those white rectangles are just another object, added to mask a part of the line below.
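A heuristic sketch of how one might flag such masks automatically in an SVG export (the helper name is mine, and this is not how any particular forensics tool works): white rectangles layered over plot elements show up as ordinary `<rect>` objects in the markup.

```python
import xml.etree.ElementTree as ET

SVG_NS = "{http://www.w3.org/2000/svg}"
WHITE = {"#fff", "#ffffff", "white", "rgb(255,255,255)"}

def find_white_rects(svg_path):
    """List <rect> elements with a white fill -- candidate masks
    that may have been layered over parts of a figure."""
    tree = ET.parse(svg_path)
    hits = []
    for rect in tree.iter(SVG_NS + "rect"):
        fill = (rect.get("fill") or "").lower()
        style = (rect.get("style") or "").replace(" ", "").lower()
        if fill in WHITE or any("fill:" + w in style for w in WHITE):
            hits.append({k: rect.get(k) for k in ("x", "y", "width", "height")})
    return hits
```

Any hits would still need eyeballing in an editor like Inkscape, since white rectangles are often legitimate backgrounds.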
Yes, I opened it in Inkscape.
It's actually kinda creative, in a terrible, fraud kind of way...
Twelve! It nearly doubled overnight. Just yesterday we were discussing if it was 6 or 7.
All but one are from IJAMT, one is the Journal of Materials Science.
"average person contains 3 spiders" factoid actualy just statistical error. average person contains 0 spiders. Spiders Elisabe, who lives in Area 51 & is made of spiders, is an outlier adn should not have been counted
Obviously, this depends on the field. Machine learning / AI seems to be the worst affected, since it's such a popular topic with a gazillion niche applications.
Unfortunately, I wouldn't be so sure about that. Pick a narrow enough topic, do a Google Scholar search, and you WILL come across tortured phrases within the first few pages of results.
For example, I tried "Parkinson detection using deep learning" and the 2nd page gave me this:
Honestly, I've been worried about this for a while. There's so much pollution in the literature, it's only a matter of time before people start picking up nonsense like "profound learning" and "irregular woodland" because they *think* it's an accepted term. It's probably happening already...
Yep. Here's your result number 3 (and 5) - turns out I'd already commented on it on PubPeer...
pubpeer.com/publications...
Of course not - these are not "AI" (if by "AI" you meant "large language models").
It's just a dumb synonym replacement script, the same one that's been in use for a decade.
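A toy illustration of that mechanism (my own sketch, not the actual papermill tool): blind word-by-word substitution has no idea that "deep learning" or "random forest" are fixed technical terms, which is exactly what produces tortured phrases.

```python
# Toy synonym table; replacements chosen to reproduce tortured
# phrases seen in the wild ("profound learning", "irregular woodland").
SYNONYMS = {
    "deep": "profound",
    "random": "irregular",
    "forest": "woodland",
    "raw": "uncooked",
    "data": "information",
}

def torture(text):
    """Replace each word independently, ignoring multi-word terms --
    the failure mode that turns standard terminology into nonsense."""
    return " ".join(SYNONYMS.get(w, w) for w in text.split())

print(torture("deep learning"))  # -> "profound learning"
print(torture("random forest"))  # -> "irregular woodland"
print(torture("raw data"))       # -> "uncooked information"
```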
And what a collection that is!
"lung knob recognizable proof", "enhancement calculation", "power esteem between the lung knob and the unwanted foundation spot", ...
Pubpeer thread for the paper "Propagation of Error and the Reliability of Global Air Temperature Projections" by Patrick Frank. It has 346 comments.
Found one by accident (it came up in a search for the one you posted):
It really reads like something written by ChatGPT, too (before all the "paraphrasing", at least)...
"boa constrictor pilot"
(Anaconda Navigator)
screenshot of an article, featuring the sentence "The uncommon ubiquity of Sherlock Holmes and his loyal companion and biographer, Dr. Watson (Watson), slowly got to be portion of a modern mythology, the center of which is still found at 221B Pastry specialist Road in London."
Tortured phrase of the day:
"221B Pastry specialist Road in London"
Goes really well with uncooked information (raw data).
I knew the ICCCNT 2024 conference proceedings were full of papermill nonsense, but seeing the actual number in the Problematic Paper Screener (PPS) still shocked me.
346 (!!) papers with tortured phrases!
Does #IEEE really have zero quality checks?
Brief proceedings-level report here: pubpeer.com/publications...
Fair enough.
I wanted a full "here's why this is papermill" comment so that I could make the "proceedings-level" report on ICCCNT:
pubpeer.com/publications...
(Turns out it wasn't even necessary. I was able to find plenty of other examples.)