ArXiv, the open-access repository for preprint academic research, will impose a one-year ban on authors who submit papers containing obviously AI-generated content with issues such as plagiarism, bias, errors, incorrect references, or misleading information.
In a post on X late Thursday evening, Thomas Dietterich, chair of the computer science section of ArXiv, outlined the new penalties. He stated:
“If generative AI tools generate inappropriate language, plagiarized content, biased content, errors, mistakes, incorrect references, or misleading content, and that output is included in scientific works, it is the responsibility of the author(s). We have recently clarified our penalties for this. If a submission contains incontrovertible evidence that the authors did not check the results of LLM generation, this means we can't trust anything in the paper.”
Dietterich provided examples of incontrovertible evidence, including:
- Hallucinated references
- Meta-comments from LLMs, such as: “Here is a 200-word summary; would you like me to make any changes?” or “The data in this table is illustrative; fill it in with the real numbers from your experiments.”
The penalty includes a one-year ban from ArXiv, followed by a requirement that all subsequent submissions first be accepted at a reputable peer-reviewed venue.
Dietterich clarified in an email on Friday that this is a one-strike rule: authors caught once will face the ban. He emphasized, however, that decisions are open to appeal. He added:
“I want to emphasize that we only apply this to cases of incontrovertible evidence. I should also add that our internal process requires first a moderator to document the problem and then for the Section Chair to confirm before imposing the penalty.”
ArXiv’s Crackdown on AI Slop
In November 2025, ArXiv announced it would no longer accept computer science review articles and position papers due to an influx of AI slop. The repository stated:
“Generative AI/large language models have added to this flood by making papers—especially papers not introducing new research results—fast and easy to write. While categories across arXiv have all seen a major increase in submissions, it’s particularly pronounced in ArXiv’s CS category.”
In January 2026, ArXiv introduced a new policy requiring first-time submitters to obtain an endorsement from an established author, citing a rise in fraudulent submissions.
The Growing Problem of AI-Generated Citations
AI-generated and fabricated citations are increasingly straining the peer-review process. A recent study by Columbia University researchers examined 2.5 million biomedical papers over three years and found:
- In 2023, the rate was one in 2,828.
- In 2025, the rate was one in 458.
- In the first seven weeks of 2026, one in 277 papers contained fabricated references.
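To put those "one in N" figures on a common scale, the short Python sketch below converts each to a percentage of papers and computes how many times more frequent fabricated references became between periods. The numbers are the ones reported above; the function and variable names are illustrative only and are not taken from the study.

```python
# "One in N" rates as reported in the article (N papers per paper with fabricated references).
ONE_IN = {
    "2023": 2828,
    "2025": 458,
    "2026 (first seven weeks)": 277,
}


def share_percent(n: int) -> float:
    """Convert a 'one in n' rate to a percentage of papers."""
    return 100.0 / n


def fold_increase(n_before: int, n_after: int) -> float:
    """How many times more frequent the later rate is than the earlier one."""
    return n_before / n_after


if __name__ == "__main__":
    for period, n in ONE_IN.items():
        print(f"{period}: 1 in {n:,} = {share_percent(n):.3f}% of papers")

    baseline = ONE_IN["2023"]
    for period in ("2025", "2026 (first seven weeks)"):
        factor = fold_increase(baseline, ONE_IN[period])
        print(f"2023 -> {period}: about {factor:.1f}x more frequent")
```

By this arithmetic, the rate in early 2026 is roughly ten times the 2023 rate. The 2026 figure covers only seven weeks, so the comparison with full-year rates should be read with that caveat.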
These issues highlight the broader challenge of AI-generated content infiltrating academic publishing.
ArXiv, managed by Cornell Tech, will become an independent nonprofit corporation in July 2026. Greg Morrisett, dean and vice provost of Cornell Tech, told Science.org that the change aims to help ArXiv secure more funding to address the rise of “AI slop.”