Reproducibility map
This file maps every figure and quantitative claim in the DeepMapper analyses to the script that produces it and the public dataset that script consumes. No new data were generated; every analysis runs on public accessions.
Install the package and fetch the data first (see the repo README.md and
docs/data-sources.md), then run each script from the repo root.
Public datasets
| Accession | Platform | Use |
|---|---|---|
| SRP073767 (Zheng et al. 2017, 10x sorted PBMC) | 10x 3' | sorted CD4+/CD8+ subsets; ribosomal, chord, backbone analyses |
| GSE96583 (Kang et al. 2018) | 10x 3' | IFN-beta-stimulated vs control; interferon signature + per-donor control |
| GSE99254 (Guo et al. 2018, NSCLC) | SMART-seq2 | antisense lncRNA; all-ncRNA chord; exhaustion chord |
| GSE98638 (Zheng et al. 2017, HCC) | SMART-seq2 | antisense replication; exhaustion replication |
| GSE108989 (Zhang et al. 2018, CRC) | SMART-seq2 | antisense replication |
| Elyahu et al. 2019 (Single Cell Portal SCP490) | 10x 3' | mouse CD4 naive vs effector-memory; cross-species ribosomal validation |
Figure / result to script
| Item | Script |
|---|---|
| Fig 1 (state separation + ribosomal-only) | bench/ribosomal_validation.py |
| Fig 2 (HVG discards ribosomal genes) | bench/hvg_ribosomal_rank.py |
| Fig 3 (interferon shared genes) | bench/c5_kang/c5_kang_analysis.py |
| Fig 4 (antisense overlap control + cross-cohort) | bench/independent_validation/lncrna_antisense.py, antisense_overlap_control.py |
| Fig 5 (gene chord, held-out) | bench/gene_chord_honest.py |
| Fig 6 (all-non-coding chord) | bench/ncrna_chord.py, ncrna_chord_biotype.py |
| Fig 7 (exhaustion chord + score benchmark) | bench/independent_validation/exhaustion_til_vs_blood.py, exhaustion_vs_score.py |
| Sec 2.2 confound controls (depth/cycle/effectorness) | bench/review_controls.py |
| Sec 2.3 per-donor interferon control | bench/c5_kang/c5_kang_donor_isg.py |
| Sec 2.4 antisense enrichment null | bench/independent_validation/antisense_enrichment_null.py |
| Sec 3 backbone head-to-head (linear ≈ CNN) | bench/backbone_headtohead.py |
| Sec 3 deterministic linear / passes | bench/passes_and_determinism.py, pydeepmapper/linear_baseline.py |
| Cross-species mouse (de-novo ribosomal recovery) | bench/mouse_dm_run.py |
| Cross-species confound control (scanpy DPT) | bench/mouse_phase2_dpt.py |
Each script writes its result to a JSON or CSV file under results/ (gitignored).
Re-running a script regenerates its output from the public data.
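
The re-run step can be wrapped in a small helper. The following is a minimal sketch, not part of the repository: the function names `build_run`, `list_results`, and `rerun` are hypothetical, and it assumes only what is stated above, namely that scripts are launched from the repo root and write JSON or CSV files under results/.

```python
import os
import subprocess
import sys
from pathlib import Path


def build_run(script):
    """Return (command, env) for one bench script, given relative to the repo root."""
    env = dict(os.environ)
    # Same variable as the typical invocation; it only has an effect on Apple MPS.
    env["PYTORCH_ENABLE_MPS_FALLBACK"] = "1"
    return [sys.executable, script], env


def list_results(results_dir="results"):
    """Return the JSON and CSV files currently under results_dir, sorted by path."""
    root = Path(results_dir)
    if not root.is_dir():
        return []
    return sorted(p for p in root.rglob("*") if p.suffix in {".json", ".csv"})


def rerun(script, repo_root="."):
    """Run one script from repo_root and return the result files it added or rewrote."""
    results_dir = Path(repo_root) / "results"
    # Record modification times before the run so changed files can be detected.
    before = {p: p.stat().st_mtime_ns for p in list_results(results_dir)}
    cmd, env = build_run(script)
    subprocess.run(cmd, cwd=repo_root, env=env, check=True)
    return [
        p for p in list_results(results_dir)
        if before.get(p) != p.stat().st_mtime_ns
    ]
```

For example, `rerun("bench/review_controls.py")` would run the confound controls and list whichever result files changed. Comparing modification times is a design choice of this sketch; it avoids assuming any particular output filename.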
Environment
- Python 3.9 or newer. Install with `pip install -e ".[all]"`.
- GPU-heavy steps (DeepMapper training and attribution) need torch with Apple MPS or CUDA. CPU works but is slow. The sklearn-only controls run on CPU in seconds.
- Typical invocation: `PYTORCH_ENABLE_MPS_FALLBACK=1 python bench/<script>.py`.
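
To check which backend a machine offers before starting a GPU-heavy step, a short probe such as the one below may help. It is a sketch under assumptions: `pick_device` is a hypothetical helper, and the DeepMapper scripts may select their device differently. It uses only `torch.cuda.is_available()` and `torch.backends.mps.is_available()`, and falls back to CPU when torch is not installed.

```python
def pick_device():
    """Return "cuda", "mps", or "cpu" depending on what torch reports as usable."""
    try:
        import torch
    except ImportError:
        # Without torch only the sklearn-only controls can run, and those use the CPU.
        return "cpu"
    if torch.cuda.is_available():
        return "cuda"
    # Older torch builds have no MPS backend module, so look it up defensively.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


if __name__ == "__main__":
    print(pick_device())
```

A result of `cpu` means training and attribution will still run, only slowly.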
Citation
See CITATION.cff. If a release DOI is minted (for example via the Zenodo-GitHub
integration), cite that archive in the Code Availability statement.