Reviewer Guidelines
Everything you need to evaluate AI-generated scholarly articles with accuracy, integrity, and impact. Your review is a public companion piece — not a private report.
Thank you for joining Latent Scholar as a reviewer. Your expertise is essential to our mission of creating a validated benchmark of AI-generated academic content. Because AI-generated content cannot be revised by an author, your review serves a unique purpose: it is not a private report prompting revisions — it is a public companion piece that helps readers understand how much they can trust the article and how to use it.
01 — What Makes This Different from Traditional Peer Review
If you have reviewed articles for academic journals before, you'll notice some key differences:
| Traditional Peer Review | Latent Scholar Review |
|---|---|
| Goal: Decide accept / reject / revise | Goal: Document strengths and weaknesses of AI-generated texts |
| Private feedback to authors | Public review published alongside the article |
| Authors will revise based on feedback | No revision cycle; you are evaluating a fixed AI-generated text |
| Judge originality and contribution to knowledge | Judge accuracy, reasoning, and reliability |
| Usually anonymous (double-blind) | Your choice: named or anonymous |
You're not gatekeeping — you're creating a public record of what AI can and cannot do. Finding flaws is valuable data, not just criticism.
02 — Your Role as a Reviewer
Think of yourself as a trusted expert annotator. Your task is to help readers understand:
- How much they can trust this article — Is it accurate? Are there errors?
- What the AI did well — What aspects are solid, well-reasoned, or useful?
- Where the AI failed — What's wrong, missing, hallucinated, or misleading?
- How this content should be used — Is it a good starting point? Useless? Risky?
| You are asked to: | You are not expected to: |
|---|---|
| Audit the article's accuracy, coherence, ethics, and usefulness | Request revisions or resubmission from an author |
| Explain its strengths and limitations in clear, accessible language | Provide detailed line editing or coaching |
| Advise readers on how (or whether) the article can be used | Fix the paper — your role is to evaluate and contextualise it |
03 — Core Principles
Tone and Style
We ask that all reviews be:
| Principle | What it means |
|---|---|
| Professional & Respectful | Critique the text, not hypothetical authors |
| Clear & Concrete | Explain what is wrong, why it matters, and how readers should interpret it |
| Balanced | Note genuine strengths as well as limitations |
| Accessible | Write for a broad academic audience, not only specialists |
04 — The Review Process: Step by Step
Step 1: Select an Article
Browse available articles at latentscholar.org/articles. Choose one that matches your expertise.
- Look at the discipline / field tag
- Read the title and short description
- Check which AI model generated it (GPT, Gemini, Claude)
- Articles with fewer reviews are higher priority
Step 2: Read the Full Article
Read the article carefully, as you would any paper in your field. As you read, take notes on:
- Claims that seem accurate vs. questionable
- Arguments that are well-constructed vs. weak
- Statements that feel "off" or suspiciously confident
- Citations that look real vs. potentially fabricated
- Gaps in reasoning or missing counterarguments
Don't try to verify every citation on first read. Get a sense of the overall quality first, then spot-check the most important references.
Step 3: Complete the Review Form
The review form has two parts: structured ratings (dropdown menus for 7 criteria) and a written evaluation (your narrative assessment). Submit your review using these form fields:
- Reviewer Information: Name, email, and institution
- Publication Preference: Choose whether to publish anonymously or under your name
- Overall Evaluation: Select Meets Standards, Needs Work, or Below Standards
- Evaluation Criteria: Use the drop-down menus for each of the seven criteria
- Review and Evaluation: Provide your comprehensive written assessment
05 — Automated Quality Checks
To assist your review, the following automated checks are performed on each submission before you see it:
| Check | Description |
|---|---|
| Cite-Ref Score | Verifies that in-text citations appear in the reference section and checks whether references are valid and accessible |
| AI-Generated | Uses detection tools to identify whether the article appears to be purely AI-generated content |
| Plagiarism | Scans for potential plagiarism across academic databases and web sources |
These automated checks handle basic verification, but they can't replace expert judgment. Your role is to catch what automation misses: subtle errors in reasoning, misrepresented sources, disciplinary inaccuracies, and nuanced quality issues that only a domain expert can identify.
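For intuition, the core of a cite-ref check can be sketched in a few lines: pull author-year citations out of the body text and flag any with no matching entry in the reference list. This is an illustrative sketch only, not the platform's actual implementation; the regexes handle single-author "Name (Year)" and "(Name, Year)" forms and are a deliberate simplification.

```python
import re

def find_unmatched_citations(body: str, references: str) -> list[str]:
    """Flag in-text author-year citations with no matching reference entry.

    Simplified sketch: only 'Name (Year)' and '(Name, Year)' forms are
    detected; 'et al.' and multi-author citations are ignored.
    """
    cites = set()
    # 'Sweller (1988)' style
    for m in re.finditer(r"([A-Z][A-Za-z]+)\s*\((\d{4})\)", body):
        cites.add((m.group(1), m.group(2)))
    # '(Morrison, 2022)' style
    for m in re.finditer(r"\(([A-Z][A-Za-z]+),\s*(\d{4})\)", body):
        cites.add((m.group(1), m.group(2)))

    ref_lines = references.splitlines()
    unmatched = []
    for name, year in sorted(cites):
        # A citation counts as matched if some reference line mentions
        # both the surname and the year.
        if not any(name in line and year in line for line in ref_lines):
            unmatched.append(f"{name} ({year})")
    return unmatched
```

A real checker would also parse reference entries properly and query databases for validity; the point here is simply that "every in-text citation must have a home in the reference list" is mechanically checkable, while judging whether a source is *misrepresented* is not.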
06 — Understanding the Evaluation Criteria
You'll rate the article on seven dimensions using dropdown menus.
Overall Evaluation
After rating all seven criteria, you'll select an overall assessment: Meets Standards, Needs Work, or Below Standards.
07 — Spotting Common AI Failures
AI-generated academic content has predictable failure modes. Here's what to watch for:
1. Hallucinated Citations
The most common and serious issue. AI will fabricate citations that look real but don't exist.
- Author names that don't appear in the field
- Journal names that seem slightly wrong
- Very recent dates (AI often creates "2023" or "2024" citations)
- DOIs that don't resolve
- Titles that are unusually descriptive or "perfect" for the claim being made
How to check: Use Google Scholar, Crossref, or the DOI resolver (doi.org). Spot-check 3–5 key citations. Our automated Cite-Ref Score catches some issues, but human verification catches what automation misses.
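If you prefer to script the DOI spot-check, the sketch below is one minimal approach (an illustration, not an official Latent Scholar tool): first check offline that a string has the basic `10.<prefix>/<suffix>` DOI shape, then ask the Crossref API whether the DOI is registered. The shape regex and reliance on Crossref coverage are simplifying assumptions.

```python
import re
import urllib.error
import urllib.parse
import urllib.request

# DOIs have the shape 10.<registrant>/<suffix>, e.g. 10.1037/0033-295X.100.4.589
DOI_SHAPE = re.compile(r"^10\.\d{4,9}/\S+$")

def looks_like_doi(doi: str) -> bool:
    """Cheap offline sanity check on the DOI's shape (no network needed)."""
    return bool(DOI_SHAPE.match(doi))

def doi_is_registered(doi: str, timeout: float = 10.0) -> bool:
    """Ask Crossref whether this DOI is registered (requires network access).

    Crossref covers most scholarly DOIs, but DOIs registered with other
    agencies (e.g. DataCite) will 404 here, so treat a False result as
    'unverified', not 'definitely fabricated'.
    """
    if not looks_like_doi(doi):
        return False
    url = "https://api.crossref.org/works/" + urllib.parse.quote(doi, safe="")
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.HTTPError, urllib.error.URLError):
        return False
```

Double-check any negative result by hand (via doi.org or Google Scholar) before flagging a citation as fabricated in your review.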
2. Confidently Wrong Statements
AI doesn't express uncertainty well. It will state falsehoods with the same confidence as truths.
- Definitive statements about contested topics
- Statistics or percentages without clear sources
- Claims that "research shows" without specific citations
- Historical claims that seem too neat or dramatic
3. Surface-Level Synthesis
AI often produces text that sounds sophisticated but lacks depth.
- Paragraphs that could apply to many topics (generic)
- Correct definitions but no real analysis
- Listing perspectives without engaging with tensions between them
- Conclusion that just restates the introduction
4. Logical Gaps & Non Sequiturs
Arguments that seem to flow but don't actually connect.
- Conclusions that don't follow from the evidence presented
- Missing steps in the argument
- Contradictions between different sections
- Transition phrases ("therefore," "thus") that don't actually connect logically
5. Outdated or Incomplete Information
AI models have knowledge cutoffs and may present old information as current.
- References to "current" events that may no longer be current
- Missing recent developments in fast-moving fields
- Presenting superseded findings as current consensus
- Ignoring major replication failures or retractions
08 — Writing Your Review
Beyond the dropdown ratings, you'll write a narrative review. This is the most valuable part.
What to Include
Opening Summary
2–3 sentences. What is this article about? What's your overall assessment?
Strengths
1–3 points. What did the AI do well? What sections are accurate and useful?
Weaknesses
As many as relevant. What errors? What's missing? What would mislead a non-expert?
Guidance for Readers
1–2 sentences. How should someone use this article? What should they be cautious about?
Example: A Strong Review
"This review article attempts to synthesise research on cognitive load theory in online learning. The AI demonstrates solid understanding of core CLT concepts (intrinsic, extraneous, germane load) and correctly summarises Sweller's foundational work.
However, I identified several significant issues:
1. The citation 'Morrison & Chen (2022)' in paragraph 4 appears to be fabricated — I could not locate this paper in any database, and these authors have not published together.
2. The article claims 'research consistently shows a 40% improvement in retention' — this specific statistic is not supported by the cited sources and seems invented.
3. The section on worked examples conflates element interactivity with task complexity in ways that would confuse readers.
The article could serve as a basic introduction for someone unfamiliar with CLT, but readers should independently verify all citations and treat specific statistics with scepticism."
09 — Time Management: The 30-Minute Review
You don't need to spend hours on each review. Here's an efficient approach:
| Time | Activity | What to Do |
|---|---|---|
| 5 min | Skim & orient | Read abstract, headings, and conclusion. Get the overall structure. |
| 12 min | Careful read | Read the full article. Take quick notes on issues as you go. |
| 5 min | Spot-check citations | Verify 3–5 key citations. Note any that fail. |
| 8 min | Write review | Complete the form. Select ratings, write your assessment. |
You don't need to find everything. Your goal is to provide a useful expert perspective, not to comprehensively audit every claim. Document what you find; don't feel obligated to verify every sentence.
10 — Frequently Asked Questions
"What if I'm not sure if something is wrong?"
Say so. "I couldn't verify this claim" or "This seems inconsistent with my understanding" are perfectly valid observations. You don't need to be certain — your professional judgment is the point.
"Should I be harsh or generous?"
Be accurate. Don't inflate ratings to be nice, and don't deflate them to seem rigorous. Call it as you see it. Finding problems is valuable data, not a criticism of anyone.
"What if I find the article is actually pretty good?"
That's also valuable data. Documenting what AI does well is just as important as documenting failures. A review that says "This is surprisingly competent for these reasons" is useful.
"Should I comment on the writing style?"
Yes, if it's notable. AI-generated text often has a distinctive "voice" — overly formal, repetitive phrases, or suspiciously perfect flow. This is worth noting.
"Can I review articles outside my exact specialty?"
Adjacent areas are fine. If you're a social psychologist, you can review a cognitive psychology paper. But don't review organic chemistry if you're a historian. Your expertise should be relevant.
"Can I use my name or should I be anonymous?"
Your choice. Named reviews get attribution (good for CV). Anonymous reviews are fine if you prefer privacy. Either is equally valued.
"What if I disagree with another reviewer?"
That's fine and expected. Different experts notice different things. Both reviews add value. You don't need to reconcile your views.
"How do I get credit for this on my CV?"
List it under "Professional Service" or "Peer Review." Example: Reviewer, Latent Scholar (AI-generated academic content evaluation), 2025–present. If your review is published with attribution, you can also cite it directly.
11 — Quick Reference Checklist
Use this as a quick guide when reviewing:
- Read the full article carefully, noting questionable claims as you go
- Spot-check 3–5 key citations (Google Scholar, Crossref, doi.org)
- Rate all seven criteria using the dropdown menus
- Write your narrative review: opening summary, strengths, weaknesses, and guidance for readers
- Choose whether to publish under your name or anonymously
Your review does more than evaluate a single article; it contributes to a living record of how LLMs perform in scholarly reasoning across disciplines. Collectively, these reviews:
- Benchmark AI models against human expert expectations
- Build a transparent reference base for detecting AI-assisted writing
- Encourage responsible, verifiable integration of AI in academic work
If you run into problems or have questions:
Email: info@latentscholar.org
Response time: Usually within 24–48 hours
Pick a discipline. Dive in.
Share what you find.
Your expertise is building the foundation for understanding what AI-assisted scholarship can — and cannot — reliably produce.
