What must be covered
- Datasets: name, size (train/dev/test split), language/domain, source, license, preprocessing steps applied.
- Baselines: list every method you compare against. For each: name, reference, key settings used. Include the strongest known baseline.
- Evaluation metrics: define each metric mathematically. Explain why it is appropriate for your task.
- Implementation: framework (PyTorch, TF, ...), hardware (GPU model + count), training time, batch size, learning rate, optimizer, seeds for reproducibility.
- Statistical rigor: report mean ± std over N runs. State the significance tests used (t-test, Wilcoxon, bootstrap) and the threshold (p < 0.05). See the sketch after this list.
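
A minimal sketch of that reporting, assuming per-seed scores for two methods have already been collected (the scores and N=5 seeds below are illustrative placeholders); SciPy's Wilcoxon signed-rank test stands in for whichever significance test you choose:

```python
import numpy as np
from scipy import stats

# Hypothetical per-seed scores (one value per run) for our method and the
# strongest baseline, all evaluated on the SAME test set.
ours     = [0.812, 0.805, 0.819, 0.808, 0.815]
baseline = [0.791, 0.800, 0.795, 0.789, 0.797]

# Mean ± std over N runs (report both, plus N, in the paper).
print(f"ours:     {np.mean(ours):.3f} ± {np.std(ours, ddof=1):.3f} (N={len(ours)})")
print(f"baseline: {np.mean(baseline):.3f} ± {np.std(baseline, ddof=1):.3f} (N={len(baseline)})")

# Paired test over seeds; Wilcoxon signed-rank avoids the normality
# assumption of the t-test. Compare p against the 0.05 threshold.
stat, p = stats.wilcoxon(ours, baseline)
print(f"Wilcoxon signed-rank: statistic={stat:.3f}, p={p:.4f}, significant={p < 0.05}")
```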
Research questions to answer
- RQ1: Does our method outperform all baselines on the primary metric?
- RQ2: How does each component contribute? (ablation study; see the sketch after this list)
- RQ3: How does the method scale / generalize across datasets?
- RQ4: What is the computational cost vs accuracy trade-off?
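
For RQ2, a minimal ablation sketch under stated assumptions: `train_and_evaluate` and the component names are hypothetical placeholders for your own pipeline; each component is switched off one at a time and the primary metric is compared to the full model:

```python
def train_and_evaluate(config: dict) -> float:
    """Placeholder for the real pipeline: train with `config`, return the primary metric."""
    return 0.0  # replace with your training + evaluation code

FULL_CONFIG = {"use_attention": True, "use_pretraining": True, "use_augmentation": True}

results = {"full model": train_and_evaluate(FULL_CONFIG)}
for component in FULL_CONFIG:
    ablated = {**FULL_CONFIG, component: False}  # disable exactly one component
    results[f"w/o {component}"] = train_and_evaluate(ablated)

for name, score in results.items():
    delta = score - results["full model"]
    print(f"{name:>20}: {score:.3f}  (delta vs full: {delta:+.3f})")
```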
Do
- Use the same test set for ALL methods — no cherry-picking
- Re-run baselines yourself where possible and report the results obtained under the same conditions
- State whether you used public checkpoints and which version (see the sketch after this list)
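
If a baseline comes from a public Hugging Face checkpoint, one way to make the version statement concrete is to pin the revision at load time; the model name and revision below are illustrative, not a recommendation:

```python
import transformers
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "bert-base-uncased"  # illustrative public checkpoint
REVISION = "main"                 # replace with the exact commit hash you used

# Pin the exact revision (branch, tag, or commit hash) and report it,
# together with the library version, in the paper.
print(f"transformers version: {transformers.__version__}")
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME, revision=REVISION)
model = AutoModel.from_pretrained(MODEL_NAME, revision=REVISION)
```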
Don't
- Compare only against outdated or weak baselines
- Report only the best run; report the average over multiple seeds instead
- Use different test sets for different methods