Reproducibility map

This file maps every figure and quantitative claim produced by the DeepMapper analyses to the script that produces it and the public dataset it consumes. No new data was generated; everything runs on public accessions.

Install the package and fetch the data first (see the repo README.md and docs/data-sources.md), then run each script from the repo root.

Public datasets

| Accession | Platform | Use |
| --- | --- | --- |
| SRP073767 (Zheng et al. 2017, 10x sorted PBMC) | 10x 3' | sorted CD4+/CD8+ subsets; ribosomal, chord, backbone analyses |
| GSE96583 (Kang et al. 2018) | 10x 3' | IFN-beta-stimulated vs control; interferon signature + per-donor control |
| GSE99254 (Guo et al. 2018, NSCLC) | SMART-seq2 | antisense lncRNA; all-ncRNA chord; exhaustion chord |
| GSE98638 (Zheng et al. 2017, HCC) | SMART-seq2 | antisense replication; exhaustion replication |
| GSE108989 (Zhang et al. 2018, CRC) | SMART-seq2 | antisense replication |
| Elyahu et al. 2019 (Single Cell Portal SCP490) | 10x 3' | mouse CD4 naive vs effector-memory; cross-species ribosomal validation |

Figure / result to script

| Item | Script |
| --- | --- |
| Fig 1 (state separation + ribosomal-only) | bench/ribosomal_validation.py |
| Fig 2 (HVG discards ribosomal genes) | bench/hvg_ribosomal_rank.py |
| Fig 3 (interferon shared genes) | bench/c5_kang/c5_kang_analysis.py |
| Fig 4 (antisense overlap control + cross-cohort) | bench/independent_validation/lncrna_antisense.py, antisense_overlap_control.py |
| Fig 5 (gene chord, held-out) | bench/gene_chord_honest.py |
| Fig 6 (all-non-coding chord) | bench/ncrna_chord.py, ncrna_chord_biotype.py |
| Fig 7 (exhaustion chord + score benchmark) | bench/independent_validation/exhaustion_til_vs_blood.py, exhaustion_vs_score.py |
| Sec 2.2 confound controls (depth/cycle/effectorness) | bench/review_controls.py |
| Sec 2.3 per-donor interferon control | bench/c5_kang/c5_kang_donor_isg.py |
| Sec 2.4 antisense enrichment null | bench/independent_validation/antisense_enrichment_null.py |
| Sec 3 backbone head-to-head (linear ≈ CNN) | bench/backbone_headtohead.py |
| Sec 3 deterministic linear / passes | bench/passes_and_determinism.py, pydeepmapper/linear_baseline.py |
| Cross-species mouse (de-novo ribosomal recovery) | bench/mouse_dm_run.py |
| Cross-species confound control (scanpy DPT) | bench/mouse_phase2_dpt.py |

Each script writes its result to a JSON or CSV file under results/ (gitignored). Re-running a script regenerates its output from the public data.
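
To inspect what a run produced, the outputs can be loaded with the standard library alone. The following is a minimal sketch, not part of the repo: the helper name load_results is hypothetical, and it assumes only what is stated above, namely that outputs are JSON or CSV files somewhere under results/. It makes no assumption about file names or about the keys and columns inside each file.

```python
import csv
import json
from pathlib import Path


def load_results(results_dir="results"):
    """Return {relative path: parsed content} for JSON and CSV files.

    Hypothetical helper: JSON files are parsed with json.load, CSV files
    become a list of row dicts keyed by the header line.
    """
    root = Path(results_dir)
    loaded = {}
    if not root.is_dir():
        # Nothing has been generated yet (results/ is gitignored).
        return loaded
    for path in sorted(root.rglob("*")):
        key = path.relative_to(root).as_posix()
        if path.suffix == ".json":
            with path.open(encoding="utf-8") as fh:
                loaded[key] = json.load(fh)
        elif path.suffix == ".csv":
            with path.open(newline="", encoding="utf-8") as fh:
                loaded[key] = list(csv.DictReader(fh))
    return loaded
```

Calling load_results() from the repo root after a script has finished returns a dictionary whose keys are the output paths relative to results/, which makes it easy to compare two runs of the same script.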

Environment

  • Python 3.9 or newer. Install with pip install -e ".[all]".
  • GPU-heavy steps (DeepMapper training and attribution) need torch with Apple MPS or CUDA. CPU works but is slow. The sklearn-only controls run on CPU in seconds.
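  • Typical invocation: PYTORCH_ENABLE_MPS_FALLBACK=1 python bench/<script>.py. The variable lets PyTorch fall back to the CPU for operators that the MPS backend does not implement; it has no effect on CUDA or CPU-only runs.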
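
The invocation above can also be driven from Python, which is convenient when running several scripts in sequence. This is a sketch under stated assumptions rather than a utility shipped with the repo: pick_device and run_bench_script are hypothetical names, and the only facts taken from this document are the environment variable, the repo-root working directory, and the CUDA / MPS / CPU options.

```python
import os
import subprocess
import sys
from pathlib import Path


def pick_device() -> str:
    """Report which accelerator torch can see: 'cuda', 'mps', or 'cpu'."""
    try:
        import torch
    except ImportError:
        # torch is not installed; only the CPU-based controls can run.
        return "cpu"
    if torch.cuda.is_available():
        return "cuda"
    # torch.backends.mps is absent in old torch releases, so look it up safely.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


def run_bench_script(script, repo_root="."):
    """Run one script from the repo root and return its exit code.

    PYTORCH_ENABLE_MPS_FALLBACK is set to "1" unless the caller's
    environment already defines it.
    """
    env = dict(os.environ)
    env.setdefault("PYTORCH_ENABLE_MPS_FALLBACK", "1")
    completed = subprocess.run(
        [sys.executable, str(script)],
        cwd=str(Path(repo_root)),
        env=env,
    )
    return completed.returncode
```

For example, run_bench_script("bench/ribosomal_validation.py") launched from the repo root mirrors the shell invocation, and a non-zero return value signals that the script failed. Using sys.executable keeps the child process in the same environment where the package was installed.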

Citation

See CITATION.cff. If a release DOI is minted (for example via the Zenodo-GitHub integration), cite that archived release in the Code Availability statement.