Hidden text in PDFs may be fooling AI paper reviewers

A hidden line of text could be all it takes to game an AI paper reviewer. BITS Pilani researchers show how invisible prompts inside PDFs can inflate scores, flip rejects to accepts, and expose a serious security flaw in automated peer review.

10 Mar 2026 12:57 IST

Hidden text in PDFs may be fooling AI paper reviewers

Listen to this article

0.75x1x1.5x

00:00/ 00:00

What if a weak research paper did not need better ideas, better data, or better science: just a hidden line of text to fool an AI reviewer? That is the unsettling question behind a new security-focused study led by Prof. Dhruv Kumar, Department of Computer Science & Information Systems, BITS Pilani, which examines how invisible prompt injections inside PDFs can manipulate Large Language Model-based review systems. In simple terms, the research argues that an attacker may not need to persuade a human at all. They may only need to quietly plant instructions that the human never sees, but the AI does.

Advertisment

And that is what makes the finding hard to ignore. According to his team, the real danger is not just sloppy automation in peer review, but a security gap hiding in plain sight. If an AI reviewer reads a paper through a parser that also captures hidden text, the submission itself can become an attack vector. A paper is no longer just a paper. It can double as a payload.

How the review pipeline becomes an attack surface

The study, titled When Reject Turns into Accept: Quantifying the Vulnerability of LLM-Based Scientific Reviewers to Indirect Prompt Injection, looks at how modern review pipelines process scientific submissions. In many cases, the PDF is first converted into machine-readable text, preserving layout, tables, and hidden layers, before an LLM scores it against a rubric. That convenience also creates risk. If the parser picks up invisible instructions, the model may treat them as part of the paper rather than as hostile input.

That is where the attack begins.

How the review pipeline becomes an attack surface

The exploit method

The researchers tested 15 attack strategies across 13 models using a dataset of 200 accepted and rejected papers, including official templates and real-world submissions. One of the most striking methods, called “Maximum Mark Magyk,” hid instructions in white 1-point text in the PDF margin. The text was invisible to human reviewers, but readable by the parser. It used deliberate typos, symbolic variables, and JSON-shaped output cues to nudge the model into assigning top scores across the board.

Advertisment

Why the security risk is serious

This is not a minor scoring glitch. The paper frames it as a direct security problem. The attacker’s main goal is to flip a verdict from reject to accept, with score inflation as the secondary goal. To measure that risk, the team introduced the Weighted Adversarial Vulnerability Score, or WAVS, which looks not just at score changes, but at how serious the decision shift is.

Hidden text in PDFs may be fooling AI paper reviewers1

The findings suggest the threat is real. Some open-source models showed decision flip rates of up to 86.26% under attack. Closed-source systems performed better overall, but the paper argues they were not immune. Instead, the weakness changed form. Cruder token tricks lost power, while more advanced deception methods created what the researchers describe as “reasoning traps,” where a model’s own instruction-following ability becomes the thing used against it.

The takeaway

The message is straightforward. AI can help sort and summarize large volumes of submissions, but it should not be trusted as the sole judge in a pipeline that still treats untrusted PDFs as clean input. Until systems add stronger sanitization, adversarial testing, and human oversight, the risk is not just flawed peer review. It is a security hole in the machinery of scientific judgment.

Advertisment

More For You

Copy-Paste This Command and You’re Hacked: New Windows Terminal Attack Spreads Lumma Stealer

Inside Coruna: the exploit kit chaining 23 iOS vulnerabilities to hack older iPhones

Kali Linux just made penetration testing conversational with Claude AI

Deepfakes and automated malware are redefining identity risk

Advertisment

Stay connected with us through our social media channels for the latest updates and news!

Advertisment