Reviewer Guidelines
Everything you need to evaluate AI-generated scholarly articles with accuracy, integrity, and impact. Your review is a public companion piece — not a private report.
Thank you for joining Latent Scholar as a reviewer. Your expertise is essential to our mission of creating a validated benchmark of AI-generated academic content. Because AI-generated content cannot be revised by an author, your review serves a unique purpose: it is not a private report prompting revisions — it is a public companion piece that helps readers understand how much they can trust the article and how to use it.
01 — What Makes This Different from Traditional Peer Review
If you have reviewed articles for academic journals before, you'll notice some key differences:
| Traditional Peer Review | Latent Scholar Review |
|---|---|
| Goal: Decide accept / reject / revise | Goal: Document strengths and weaknesses of AI-generated texts |
| Private feedback to authors | Public review published alongside the article |
| Authors will revise based on feedback | No revision cycle; you are evaluating a fixed AI-generated text |
| Judge originality and contribution to knowledge | Judge accuracy, reasoning, and reliability |
| Usually anonymous (double-blind) | Your choice: named or anonymous |
You're not gatekeeping — you're creating a public record of what AI can and cannot do. Finding flaws is valuable data, not just criticism.
02 — Your Role as a Reviewer
Think of yourself as a trusted expert annotator. Your task is to help readers understand:
- How much they can trust this article — Is it accurate? Are there errors?
- What the AI did well — What aspects are solid, well-reasoned, or useful?
- Where the AI failed — What's wrong, missing, hallucinated, or misleading?
- How this content should be used — Is it a good starting point? Useless? Risky?
| You are asked to: | You are not expected to: |
|---|---|
| Audit the article's accuracy, coherence, ethics, and usefulness | Request revisions or resubmission from an author |
| Explain its strengths and limitations in clear, accessible language | Provide detailed line editing or coaching |
| Advise readers on how (or whether) the article can be used | Fix the paper — your role is to evaluate and contextualise it |
03 — Core Principles
Tone and Style
We ask that all reviews be:
| Principle | What it means |
|---|---|
| Professional & Respectful | Critique the text, not hypothetical authors |
| Clear & Concrete | Explain what is wrong, why it matters, and how readers should interpret it |
| Balanced | Note genuine strengths as well as limitations |
| Accessible | Write for a broad academic audience, not only specialists |
04 — The Review Process: Step by Step
Step 1: Select an Article
Browse available articles at latentscholar.org/articles. Choose one that matches your expertise.
- Look at the discipline / field tag
- Read the title and short description
- Check which AI model generated it (GPT, Gemini, Claude)
- Articles with fewer reviews are higher priority
Step 2: Read the Full Article
Read the article carefully, as you would any paper in your field. As you read, take notes on:
- Claims that seem accurate vs. questionable
- Arguments that are well-constructed vs. weak
- Statements that feel "off" or suspiciously confident
- Citations that look real vs. potentially fabricated
- Gaps in reasoning or missing counterarguments
Don't try to verify every citation on first read. Get a sense of the overall quality first, then spot-check the most important references.
Step 3: Complete the Review Form
The review form has two parts: structured ratings (dropdown menus for 7 criteria) and a written evaluation (your narrative assessment). Submit your review using these form fields:
- Reviewer Information: Name, email, and institution
- Publication Preference: Choose whether to publish anonymously or under your name
- Overall Evaluation: Select Meets Standards, Needs Work, or Below Standards
- Evaluation Criteria: Use the drop-down menus for each of the seven criteria
- Review and Evaluation: Provide your comprehensive written assessment
05 — Automated Quality Checks
To assist your review, the following automated checks are performed on each submission before you see it:
| Check | Description |
|---|---|
| Cite-Ref Score | Verifies that in-text citations appear in the reference section and checks whether references are valid and accessible |
| AI-Generated | Uses detection tools to identify whether the article appears to be purely AI-generated content |
| Plagiarism | Scans for potential plagiarism across academic databases and web sources |
These automated checks handle basic verification, but they can't replace expert judgment. Your role is to catch what automation misses: subtle errors in reasoning, misrepresented sources, disciplinary inaccuracies, and nuanced quality issues that only a domain expert can identify.
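For intuition, the core of a cite-ref check can be sketched in a few lines: pull author-year citations out of the body text and flag any with no matching entry in the reference list. This is an illustrative sketch only, not the platform's actual implementation; the regexes handle single-author "Name (Year)" and "(Name, Year)" forms and are a deliberate simplification.

```python
import re

def find_unmatched_citations(body: str, references: str) -> list[str]:
    """Flag in-text author-year citations with no matching reference entry.

    Simplified sketch: only 'Name (Year)' and '(Name, Year)' forms are
    detected; 'et al.' and multi-author citations are ignored.
    """
    cites = set()
    # 'Sweller (1988)' style
    for m in re.finditer(r"([A-Z][A-Za-z]+)\s*\((\d{4})\)", body):
        cites.add((m.group(1), m.group(2)))
    # '(Morrison, 2022)' style
    for m in re.finditer(r"\(([A-Z][A-Za-z]+),\s*(\d{4})\)", body):
        cites.add((m.group(1), m.group(2)))

    ref_lines = references.splitlines()
    unmatched = []
    for name, year in sorted(cites):
        # A citation counts as matched if some reference line mentions
        # both the surname and the year.
        if not any(name in line and year in line for line in ref_lines):
            unmatched.append(f"{name} ({year})")
    return unmatched
```

A real checker would also parse reference entries properly and query databases for validity; the point here is simply that "every in-text citation must have a home in the reference list" is mechanically checkable, while judging whether a source is *misrepresented* is not.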
06 — Understanding the Evaluation Criteria
You'll rate the article on seven dimensions using dropdown menus.
Overall Evaluation
After rating all seven criteria, you'll select an overall assessment: Meets Standards, Needs Work, or Below Standards.
07 — Spotting Common AI Failures
AI-generated academic content has predictable failure modes. Here's what to watch for:
1. Hallucinated Citations
The most common and serious issue. AI will fabricate citations that look real but don't exist.
- Author names that don't appear in the field
- Journal names that seem slightly wrong
- Very recent dates (AI often creates "2023" or "2024" citations)
- DOIs that don't resolve
- Titles that are unusually descriptive or "perfect" for the claim being made
How to check: Use Google Scholar, Crossref, or the DOI resolver (doi.org). Spot-check 3–5 key citations. Our automated Cite-Ref Score catches some issues, but human verification catches what automation misses.
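If you prefer to script the DOI spot-check, the sketch below is one minimal approach (an illustration, not an official Latent Scholar tool): first check offline that a string has the basic `10.<prefix>/<suffix>` DOI shape, then ask the Crossref API whether the DOI is registered. The shape regex and reliance on Crossref coverage are simplifying assumptions.

```python
import re
import urllib.error
import urllib.parse
import urllib.request

# DOIs have the shape 10.<registrant>/<suffix>, e.g. 10.1037/0033-295X.100.4.589
DOI_SHAPE = re.compile(r"^10\.\d{4,9}/\S+$")

def looks_like_doi(doi: str) -> bool:
    """Cheap offline sanity check on the DOI's shape (no network needed)."""
    return bool(DOI_SHAPE.match(doi))

def doi_is_registered(doi: str, timeout: float = 10.0) -> bool:
    """Ask Crossref whether this DOI is registered (requires network access).

    Crossref covers most scholarly DOIs, but DOIs registered with other
    agencies (e.g. DataCite) will 404 here, so treat a False result as
    'unverified', not 'definitely fabricated'.
    """
    if not looks_like_doi(doi):
        return False
    url = "https://api.crossref.org/works/" + urllib.parse.quote(doi, safe="")
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.HTTPError, urllib.error.URLError):
        return False
```

Double-check any negative result by hand (via doi.org or Google Scholar) before flagging a citation as fabricated in your review.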
2. Confidently Wrong Statements
AI doesn't express uncertainty well. It will state falsehoods with the same confidence as truths.
- Definitive statements about contested topics
- Statistics or percentages without clear sources
- Claims that "research shows" without specific citations
- Historical claims that seem too neat or dramatic
3. Surface-Level Synthesis
AI often produces text that sounds sophisticated but lacks depth.
- Paragraphs that could apply to many topics (generic)
- Correct definitions but no real analysis
- Listing perspectives without engaging with tensions between them
- Conclusion that just restates the introduction
4. Logical Gaps & Non Sequiturs
Arguments that seem to flow but don't actually connect.
- Conclusions that don't follow from the evidence presented
- Missing steps in the argument
- Contradictions between different sections
- Transition phrases ("therefore," "thus") that don't actually connect logically
5. Outdated or Incomplete Information
AI models have knowledge cutoffs and may present old information as current.
- References to "current" events that may no longer be current
- Missing recent developments in fast-moving fields
- Presenting superseded findings as current consensus
- Ignoring major replication failures or retractions
08 — Writing Your Review
Beyond the dropdown ratings, you'll write a narrative review. This is the most valuable part.
What to Include
Opening Summary
2–3 sentences. What is this article about? What's your overall assessment?
Strengths
1–3 points. What did the AI do well? What sections are accurate and useful?
Weaknesses
As many as relevant. What errors? What's missing? What would mislead a non-expert?
Guidance for Readers
1–2 sentences. How should someone use this article? What should they be cautious about?
Example: A Strong Review
"This review article attempts to synthesise research on cognitive load theory in online learning. The AI demonstrates solid understanding of core CLT concepts (intrinsic, extraneous, germane load) and correctly summarises Sweller's foundational work.
However, I identified several significant issues:
1. The citation 'Morrison & Chen (2022)' in paragraph 4 appears to be fabricated — I could not locate this paper in any database, and these authors have not published together.
2. The article claims 'research consistently shows a 40% improvement in retention' — this specific statistic is not supported by the cited sources and seems invented.
3. The section on worked examples conflates element interactivity with task complexity in ways that would confuse readers.
The article could serve as a basic introduction for someone unfamiliar with CLT, but readers should independently verify all citations and treat specific statistics with scepticism."
09 — Time Management: The 30-Minute Review
You don't need to spend hours on each review. Here's an efficient approach:
| Time | Activity | What to Do |
|---|---|---|
| 5 min | Skim & orient | Read abstract, headings, and conclusion. Get the overall structure. |
| 12 min | Careful read | Read the full article. Take quick notes on issues as you go. |
| 5 min | Spot-check citations | Verify 3–5 key citations. Note any that fail. |
| 8 min | Write review | Complete the form. Select ratings, write your assessment. |
You don't need to find everything. Your goal is to provide a useful expert perspective, not to comprehensively audit every claim. Document what you find; don't feel obligated to verify every sentence.
10 — Frequently Asked Questions
"What if I'm not sure if something is wrong?"
Say so. "I couldn't verify this claim" or "This seems inconsistent with my understanding" are perfectly valid observations. You don't need to be certain — your professional judgment is the point.
"Should I be harsh or generous?"
Be accurate. Don't inflate ratings to be nice, and don't deflate them to seem rigorous. Call it as you see it. Finding problems is valuable data, not a criticism of anyone.
"What if I find the article is actually pretty good?"
That's also valuable data. Documenting what AI does well is just as important as documenting failures. A review that says "This is surprisingly competent for these reasons" is useful.
"Should I comment on the writing style?"
Yes, if it's notable. AI-generated text often has a distinctive "voice" — overly formal, repetitive phrases, or suspiciously perfect flow. This is worth noting.
"Can I review articles outside my exact specialty?"
Adjacent areas are fine. If you're a social psychologist, you can review a cognitive psychology paper. But don't review organic chemistry if you're a historian. Your expertise should be relevant.
"Can I use my name or should I be anonymous?"
Your choice. Named reviews get attribution (good for CV). Anonymous reviews are fine if you prefer privacy. Either is equally valued.
"What if I disagree with another reviewer?"
That's fine and expected. Different experts notice different things. Both reviews add value. You don't need to reconcile your views.
"How do I get credit for this on my CV?"
List it under "Professional Service" or "Peer Review." Example: Reviewer, Latent Scholar (AI-generated academic content evaluation), 2025–present. If your review is published with attribution, you can also cite it directly.
11 — Quick Reference Checklist
Use this as a quick guide when reviewing:
- Read the full article carefully, noting questionable claims as you go
- Spot-check 3–5 key citations (Google Scholar, Crossref, doi.org)
- Rate all seven criteria using the dropdown menus
- Write your narrative review: opening summary, strengths, weaknesses, and guidance for readers
- Choose whether to publish under your name or anonymously
Your review does more than evaluate a single article; it contributes to a living record of how LLMs perform in scholarly reasoning across disciplines. Collectively, these reviews:
- Benchmark AI models against human expert expectations
- Build a transparent reference base for detecting AI-assisted writing
- Encourage responsible, verifiable integration of AI in academic work
If you run into problems or have questions:
Email: info@latentscholar.org
Response time: Usually within 24–48 hours
Pick a discipline. Dive in.
Share what you find.
Your expertise is building the foundation for understanding what AI-assisted scholarship can — and cannot — reliably produce.
