Proteomics, Metabolomics, and Computational Neuroscience
A Tier 2 bundle covering three quantitative subfields of biology that share a heavy reliance on computational pipelines and large open data repositories; proteomics and metabolomics additionally share mass spectrometry as their core measurement technology. Proteomics and metabolomics are the natural extensions of genomics and transcriptomics into the protein and small-molecule layers; computational neuroscience extends physiology and behavior into mathematical and machine-learning models. All three increasingly converge on foundation-model approaches over the 2023–2026 window.
Part I — Proteomics
Mass spectrometry workflows
Bottom-up shotgun. The dominant workflow. Protein extract → reduce/alkylate → tryptic (or Lys-C, Glu-C, AspN, chymotryptic) digest → reversed-phase nano-LC → ESI ionization → MS1 precursor mass + MS2 fragmentation (HCD higher-energy collisional dissociation, CID collision-induced dissociation, ETD electron-transfer dissociation, EThcD hybrid). Peptide-spectrum matches re-assembled to protein groups; razor / unique peptide assignment.
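The digestion and MS1 precursor-mass steps above can be sketched in a few lines. This is a minimal illustration, assuming the common trypsin rule (cleave C-terminal to K or R, but not before P) and fixed carbamidomethylation of cysteine from the alkylation step. The function names are hypothetical and are not taken from any search engine; the zero-width `re.split` requires Python 3.7 or later.

```python
import re

# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056            # added once per peptide (N-term H + C-term OH)
PROTON = 1.00728            # mass of the charge carrier in positive ESI
CARBAMIDOMETHYL = 57.02146  # assumed fixed modification on every Cys


def tryptic_digest(sequence, missed_cleavages=0):
    """Return peptides from an in-silico tryptic digest of `sequence`."""
    # Split after K or R unless the next residue is P; drop empty pieces.
    pieces = [p for p in re.split(r"(?<=[KR])(?!P)", sequence) if p]
    peptides = []
    for i in range(len(pieces)):
        last = min(i + missed_cleavages, len(pieces) - 1)
        for j in range(i, last + 1):
            peptides.append("".join(pieces[i:j + 1]))
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass of a peptide with alkylated cysteines."""
    mass = sum(RESIDUE_MASS[aa] for aa in peptide) + WATER
    return mass + CARBAMIDOMETHYL * peptide.count("C")


def precursor_mz(peptide, charge):
    """m/z of the [M + zH]z+ precursor ion observed in MS1."""
    return (peptide_mass(peptide) + charge * PROTON) / charge


if __name__ == "__main__":
    for pep in tryptic_digest("MKWVTFISLLFLFSSAYSRGVFRR"):
        print(pep, round(precursor_mz(pep, 2), 4))
```

A search engine compares such theoretical precursor and fragment masses against observed spectra; real tools also handle variable modifications, isotope envelopes, and other proteases, none of which are modeled here.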
Top-down. Intact protein analysis (Kelleher Northwestern; Heck Utrecht; Smith PNNL); requires high-resolution FT-MS (Orbitrap or FT-ICR); proteoform-level information including PTMs and isoforms. Limited dynamic range and throughput compared to bottom-up.
Middle-down. Limited proteolysis (e.g., GluC, IdeS for antibodies) → 5-20 kDa fragments; bridges top- and bottom-up.
Instrumentation
- Thermo Orbitrap family — Q Exactive HF / HF-X, Exploris 240/480/MX, Eclipse Tribrid, Astral (2023) — fastest MS2 acquisition (~200 Hz) plus high resolution; dominant in high-end proteomics labs through 2024-2026. Astral has become the de-facto reference for plasma deep-coverage workflows.
- Bruker timsTOF — TIMS trapped ion mobility separation upstream of TOF; timsTOF Pro 2, timsTOF Ultra (2023), timsTOF HT, timsTOF SCP (single-cell). PASEF (parallel accumulation serial fragmentation; Meier-Mann 2018) drives sensitivity. Particularly strong for low-input and single-cell.
- SCIEX ZenoTOF 7600 — EAD electron-activated dissociation for PTM site localization; growing in glycoproteomics.
Quantification strategies
- Label-free quantification (LFQ). MS1 intensity-based (MaxQuant LFQ, Spectronaut directLFQ, DIA-NN); MS2-spectral counting now legacy.
- SILAC — stable isotope labeling by amino acids in culture (Mann lab 2002). 13C-Lys/Arg metabolic labeling; up to triple SILAC; high accuracy but limited to cell culture or animal SILAC mice.
- TMT — tandem mass tags (Thompson et al. 2003; commercialized by Thermo). Isobaric labels with a reporter-ion cluster in MS2. TMTpro 16-plex (2020), TMTpro 18-plex (2021). MS3 synchronous precursor selection (SPS-MS3) reduces ratio compression. Workhorse for cell line panels, drug treatment matrices, large clinical studies. Competing iTRAQ (Sciex) 8-plex now niche.
- DIA — data-independent acquisition (SWATH; Gillet, Aebersold 2012 MCP). Sequential precursor isolation windows co-fragment all ions in each window, giving high run-to-run reproducibility. Spectronaut (Biognosys), DIA-NN (Demichev, Ralser; open-source, ML-based scoring), Skyline (MacCoss UW). DIA has overtaken DDA as the default acquisition mode in clinical and large-cohort discovery proteomics from 2022 onward.
- PRM — parallel reaction monitoring. Targeted; high-resolution alternative to SRM/MRM (triple-quad).
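The "sequential precursor windows" idea behind DIA can be made concrete with a small window generator. This is a sketch under stated assumptions: fixed-width windows over an illustrative 400 to 1200 m/z range, with an optional overlap at the window edges. The function names and the numbers are hypothetical and do not reproduce any vendor method (real methods often use variable-width windows).

```python
def dia_windows(mz_start, mz_end, width, overlap=0.0):
    """Tile [mz_start, mz_end] with fixed-width isolation windows.

    Returns a list of (lower, upper) m/z bounds. Consecutive windows
    share `overlap` m/z units so precursors at the edges are not lost.
    """
    if width <= overlap:
        raise ValueError("width must be larger than overlap")
    windows = []
    lower = mz_start
    while True:
        upper = min(lower + width, mz_end)
        windows.append((lower, upper))
        if upper >= mz_end:
            break
        lower = upper - overlap
    return windows


def windows_containing(windows, precursor_mz):
    """Indices of every window whose range includes the precursor."""
    return [i for i, (lo, hi) in enumerate(windows) if lo <= precursor_mz <= hi]


if __name__ == "__main__":
    scheme = dia_windows(400.0, 1200.0, 25.0)
    print(len(scheme), "windows per cycle; first:", scheme[0], "last:", scheme[-1])
    print("precursor 612.3 falls in window", windows_containing(scheme, 612.3))
```

Every precursor in the scanned range falls into at least one window on every cycle, which is why DIA does not depend on stochastic precursor picking the way DDA does.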
Software and databases
- MaxQuant (Cox-Mann; Andromeda search engine integrated). Free, dominant in DDA label-free / SILAC / TMT.
- Spectronaut, DIA-NN, FragPipe (Nesvizhskii, Michigan; MSFragger search engine) — DIA leaders.
- Skyline (Michael MacCoss UW) — targeted PRM/SRM.
- Proteome Discoverer (Thermo), Mascot (Matrix Science), Sequest (Yates 1994).
- UniProtKB Swiss-Prot / TrEMBL (~250 million proteins in total; Swiss-Prot is the manually curated subset, TrEMBL holds computationally annotated translated sequences).
- PRIDE (EBI; ~1 million files; the proteome equivalent of GEO).
- PaxDB, ProteomicsDB, MaxQB — proteome-wide abundance.
- CPTAC — NCI Clinical Proteomic Tumor Analysis Consortium; pan-cancer proteogenomics.
Single-cell proteomics
- SCoPE-MS / SCoPE2 (Slavov lab Northeastern 2018, 2021) — TMT-based with carrier channel boosting low-input ionization.
- nanoPOTS — nanodroplet processing in one pot for trace samples (Kelly, Smith, PNNL 2018 Nat Comm). Sub-µL volumes, picogram input.
- plexDIA (Demichev, Ralser, Slavov 2022 Nat Biotech) — mTRAQ-multiplexed DIA single-cell.
- Nautilus (next-gen single-molecule protein sequencing in development), Quantum-Si Platinum (commercial single-molecule peptide sequencer, 2023+), NanoString nCounter / GeoMx / CosMx, nanopore protein sequencing — non-MS platforms, most of them still pre-commercial, competing to displace MS for ultra-low input.
- Encodia / Nanopro / Erisyon — ProteoCode style fingerprinting (truncation-edge sequencing).
Affinity / aptamer proteomics
- Olink (Uppsala; founded 2016; IPO 2021; acquired by Thermo Fisher 2024 for USD 3.1 B). Proximity Extension Assay — a matched antibody pair carries complementary DNA tags that hybridize only when both antibodies bind the same protein; a polymerase extends the duplex into an amplicon read out by qPCR or NGS. Olink Explore HT covers ~5000 plasma proteins; UK Biobank cohort 54k participants → 250k+ by 2026. Has become the reference plasma deep-proteome resource.
- SomaLogic SomaScan — SOMAmer modified aptamers; ~7000 plasma proteins in v4.1, with SomaScan 11K coming online. deCODE/Amgen and UK Biobank reference cohorts.
Plasma proteomics at population scale
UK Biobank PPP (Pharma Proteomics Project) — Olink profiling drives massive pQTL, biomarker, and drug-target discovery (Sun et al. 2023 Nature); SomaScan-based cohorts provide complementary pQTL maps (Pietzner et al. 2021 Science). All of Us, deCODE, FinnGen, Estonian Biobank, China Kadoorie Biobank, and BioBank Japan have similar efforts in progress. See genetics-and-genomics for pQTL methodology.
Part II — Metabolomics
Targeted vs untargeted
Targeted. Defined panel (50–500 metabolites) with authentic standards for absolute quantification (μM units). Commercial kits: AbsoluteIDQ (Biocrates Innsbruck — p180, p400, MxP Quant 500), Lipidyzer (Sciex). Workflow: standards → calibration → run → quant. Gold-standard for clinical biomarker translation.
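The standards → calibration → run → quant workflow reduces, in its simplest form, to fitting a line through the standards and inverting it for unknowns. The sketch below assumes an unweighted linear response and made-up numbers; real targeted assays often use isotope-labeled internal standards and weighted fits, which are not modeled here. Function names are hypothetical.

```python
def fit_calibration(concentrations, responses):
    """Ordinary least-squares line: response = slope * conc + intercept."""
    n = len(concentrations)
    mean_x = sum(concentrations) / n
    mean_y = sum(responses) / n
    sxx = sum((x - mean_x) ** 2 for x in concentrations)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(concentrations, responses))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def quantify(response, slope, intercept):
    """Invert the calibration line to get a concentration (same units as the standards)."""
    return (response - intercept) / slope


if __name__ == "__main__":
    # Illustrative standards in µM and their peak areas (synthetic values).
    standards_um = [0.5, 1.0, 5.0, 10.0, 50.0]
    peak_areas = [1100.0, 2100.0, 10100.0, 20100.0, 100100.0]
    slope, intercept = fit_calibration(standards_um, peak_areas)
    print("unknown sample:", round(quantify(40100.0, slope, intercept), 3), "µM")
```

Reporting in absolute units is only valid inside the calibrated range; responses outside the lowest and highest standards should be flagged rather than extrapolated.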
Untargeted. Broad-coverage discovery; thousands of features per sample; relative quantification. Identification via accurate mass + isotope pattern + MS/MS fragmentation library matching + retention time. Confidence levels follow the Metabolomics Standards Initiative (MSI): L1 (confirmed against an authentic standard), L2 (library MS/MS match), L3 (putative compound class), L4 (unknown).
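The accurate-mass part of untargeted annotation amounts to comparing an observed m/z with theoretical adduct masses inside a ppm tolerance. The sketch below assumes a tiny hand-written candidate table and three common adducts; the names `annotate` and `CANDIDATES` are hypothetical. A mass-only hit is a putative annotation at best and still needs MS/MS and retention-time evidence to reach the higher MSI levels.

```python
# Adduct mass shifts (Da) applied to the neutral monoisotopic mass.
ADDUCTS = {
    "[M+H]+": 1.007276,
    "[M+Na]+": 22.989218,
    "[M-H]-": -1.007276,
}

# Small illustrative candidate list: name -> neutral monoisotopic mass (Da).
CANDIDATES = {
    "glucose": 180.06339,
    "caffeine": 194.08038,
    "citric acid": 192.02700,
}


def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def annotate(feature_mz, tol_ppm=5.0):
    """Return (name, adduct, ppm) hits within tolerance, best match first."""
    hits = []
    for name, neutral_mass in CANDIDATES.items():
        for adduct, shift in ADDUCTS.items():
            err = ppm_error(feature_mz, neutral_mass + shift)
            if abs(err) <= tol_ppm:
                hits.append((name, adduct, err))
    return sorted(hits, key=lambda hit: abs(hit[2]))


if __name__ == "__main__":
    for mz in (203.0526, 195.0877, 300.0):
        print(mz, annotate(mz))
```

In practice the candidate table would come from a library such as those listed under Libraries, and isomers with identical mass would all be returned as equally plausible hits.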
Instrumentation
- High-res MS — Thermo Orbitrap (Q Exactive HF, Exploris); Bruker timsTOF (ion mobility separates isomers); Waters Synapt G2-Si / Cyclic IMS; Agilent 6546 LC/Q-TOF.
- GC-MS for volatile and derivatized (TMS, MOX-TMS) metabolites; Pegasus BT (LECO; TOF); Agilent 5977.
- CE-MS capillary electrophoresis (Human Metabolome Technologies / Soga; polar / charged metabolites).
- NMR metabolomics — ¹H NOESY-presat (urine, plasma); CPMG (T2-edited for macromolecule suppression); Bruker Avance, JEOL. Lower sensitivity vs MS but reproducible, non-destructive, fully quantitative, hands-off; widely used in pharma toxicology and large epidemiology cohorts.
Separation
HILIC (hydrophilic interaction LC; polar metabolites — amino acids, nucleotides, sugars) and reversed-phase C18 (lipids, drugs). UPLC at 0.2–0.5 mL/min, sub-2 µm particles. Ion mobility (DTIMS, TWIMS, TIMS, FAIMS) adds orthogonal CCS dimension.
Libraries
- HMDB — Human Metabolome Database (Wishart, U Alberta; v5 ~220k metabolites with experimental and predicted spectra).
- METLIN (Siuzdak, Scripps; over 1 million spectra; CCS values included).
- MoNA — MassBank of North America (Fiehn UC Davis).
- MassBank (European/Japanese federation).
- LIPID MAPS (Subramaniam, UCSD; comprehensive lipid classification + structures + spectra).
- GNPS — Global Natural Products Social Molecular Networking (Dorrestein UCSD; molecular networking; community-shared library; natural products focus).
Lipidomics
Shotgun lipidomics (Han, Gross 2005 Mass Spectrom Rev) — direct-infusion ESI with class-specific scans; bypasses chromatography. LC-based lipidomics dominates for isomer separation. Lipotype (Dresden; ~5000 lipid species); Metabolon; Owlstone Medical (breath VOC); Q-Linea. LIPID MAPS classification (eight categories): Fatty acyls, Glycerolipids, Glycerophospholipids, Sphingolipids, Sterol lipids, Prenol lipids, Saccharolipids, Polyketides.
Software
- XCMS (Smith, Scripps 2006; R/Bioconductor; XCMS Online cloud).
- MZmine 3 (open-source modular pipeline).
- MetaboAnalyst (Xia, McGill; web-based stats and pathway analysis).
- Compound Discoverer (Thermo), Progenesis QI (Waters).
- MS-DIAL (Tsugawa).
- GNPS Molecular Networking for natural-product clustering.
Applications
Clinical biomarkers (cardiovascular metabolomics — Mayo/JHU; oncology TMA cohorts), drug response / pharmacometabolomics, microbiome-host co-metabolism (SCFAs, secondary bile acids, indole derivatives), exposome (Wild 2005 concept; profiling environmental/diet exposures), cardiometabolic (T2D, obesity), neurodegenerative (CSF profiling Alzheimer’s, Parkinson’s), inborn errors of metabolism (newborn screening MS/MS panel 30+ disorders).
Foundation models for omics
- scGPT (Cui, Wang, Toronto / Vector 2024 Nature Methods) — pre-trained transformer on 33M cells across organs; cell-type annotation, perturbation prediction.
- Geneformer (Theodoris, Ellinor 2023 Nature) — transformer on 30M cells; in silico perturbation.
- Nucleotide Transformer (InstaDeep, NVIDIA, TUM 2023).
- Evo (Arc Institute 2024; long-context StripedHyena on 2.7M genomes).
- HyenaDNA (Stanford 2023).
- ESM3 (EvolutionaryScale 2024; team formerly at Meta AI) — multimodal protein foundation model; the 98B-parameter top variant reasons jointly over sequence, structure, and function.
- Boltz-1 (MIT 2024 open-source) — protein-ligand complex predictor following AlphaFold 3 lineage.
See also genetics-and-genomics and structural-biology.
Part III — Computational Neuroscience
Single-neuron models
- Hodgkin–Huxley 1952 J Physiol — squid giant axon membrane currents; 4-variable model (V, m, h, n); Nobel 1963 (Hodgkin, Huxley, Eccles).
- Cable theory and compartmental models. Rall (NIH 1960s); discretized dendrites for spatial spread of synaptic currents. NEURON simulator (Hines, Carnevale Yale 1997) and GENESIS are the canonical compartmental simulators.
- Reduced models. Leaky integrate-and-fire (LIF; originated with Lapicque 1907). Izhikevich 2003 simple model — two ODEs plus a reset rule capture most cortical firing patterns at low computational cost.
- AdEx — Adaptive exponential I&F (Brette-Gerstner 2005); used in HBP / EBRAINS large-scale models.
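As an illustration of how cheap the reduced models are, the Izhikevich (2003) simple model can be integrated with a few lines of forward Euler. The equations are dv/dt = 0.04v² + 5v + 140 − u + I and du/dt = a(bv − u), with the reset v ← c, u ← u + d whenever v reaches 30 mV. The parameter set below is the published "regular spiking" one; the time step, input current, and function name are illustrative assumptions.

```python
def simulate_izhikevich(current, duration_ms=1000.0, dt=0.25,
                        a=0.02, b=0.2, c=-65.0, d=8.0):
    """Forward-Euler integration of the Izhikevich simple model.

    `current` is a constant input (model units). Returns the list of
    spike times in ms. Defaults are the regular-spiking parameters.
    """
    v = c          # membrane potential (mV)
    u = b * v      # recovery variable
    spike_times = []
    for step in range(round(duration_ms / dt)):
        dv = 0.04 * v * v + 5.0 * v + 140.0 - u + current
        du = a * (b * v - u)
        v += dt * dv
        u += dt * du
        if v >= 30.0:              # spike cutoff, then reset
            spike_times.append(step * dt)
            v = c
            u += d
    return spike_times


if __name__ == "__main__":
    spikes = simulate_izhikevich(current=10.0)
    print(len(spikes), "spikes in 1 s; first few at", spikes[:3], "ms")
```

Changing only (a, b, c, d) switches the same two equations between regular spiking, bursting, and fast-spiking regimes, which is what makes the model attractive for large networks.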
Network-level models
- Rate models. Wilson–Cowan 1972 — excitatory/inhibitory population dynamics; mean-field. Used for cortical oscillations, decision-making.
- Spiking neural networks (SNN). Brian2 (Stimberg, Brette, Goodman 2019 eLife); NEST (Diesmann, Gewaltig); GeNN GPU-accelerated (Nowotny); CARLsim; NEMO.
- Reservoir computing / Liquid State Machine. Maass, Natschläger, Markram 2002. Echo State Network — Jaeger 2001. Random recurrent network as fixed feature map; only readout trained.
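The rate-model entry above can be illustrated with a two-population simulation. This is a simplified Wilson–Cowan form, assuming sigmoidal activation and omitting the refractory term of the 1972 paper: τ_E dE/dt = −E + S_E(w_EE·E − w_EI·I + P) and τ_I dI/dt = −I + S_I(w_IE·E − w_II·I + Q). All parameter values and the function names are illustrative assumptions.

```python
import math


def sigmoid(x, gain, threshold):
    """Population activation function, bounded between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-gain * (x - threshold)))


def simulate_wilson_cowan(duration=200.0, dt=0.1, tau_e=1.0, tau_i=2.0,
                          w_ee=16.0, w_ei=12.0, w_ie=15.0, w_ii=3.0,
                          p_ext=1.25, q_ext=0.0, e0=0.1, i0=0.05):
    """Forward-Euler integration of coupled E/I population rates.

    Returns a list of (E, I) tuples, one per time step including t = 0.
    """
    e, i = e0, i0
    trace = [(e, i)]
    for _ in range(round(duration / dt)):
        de = (-e + sigmoid(w_ee * e - w_ei * i + p_ext, 1.3, 4.0)) / tau_e
        di = (-i + sigmoid(w_ie * e - w_ii * i + q_ext, 2.0, 3.7)) / tau_i
        e, i = e + dt * de, i + dt * di
        trace.append((e, i))
    return trace


if __name__ == "__main__":
    trace = simulate_wilson_cowan()
    e_values = [e for e, _ in trace]
    print("E range:", round(min(e_values), 3), "to", round(max(e_values), 3))
```

Each variable is a mean population activity rather than a single neuron, so the whole cortical patch is two ODEs; whether the system settles to a fixed point or oscillates depends on the coupling weights and the external drive.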
Plasticity rules
- Hebbian. Hebb 1949; commonly summarized as “cells that fire together wire together” (a later paraphrase rather than Hebb's own wording).
- BCM (Bienenstock, Cooper, Munro 1982) — sliding threshold; explains experience-dependent V1 plasticity.
- STDP — spike-timing-dependent plasticity (Markram, Sakmann 1997; Bi-Poo 1998; Magee 1997). Asymmetric pre-post timing kernel.
- Homeostatic (Turrigiano 1998 Nature) — synaptic scaling for stability.
- Three-factor rules — coincidence + neuromodulator (dopamine) for credit assignment (Frémaux-Gerstner 2016).
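The asymmetric pre-post timing kernel of STDP is commonly written as a pair of exponentials: potentiation when the presynaptic spike precedes the postsynaptic one, depression otherwise. The sketch below assumes that exponential form with an all-to-all pairing scheme; the amplitudes, time constants, and function names are illustrative and are not values taken from the cited papers.

```python
import math


def stdp_kernel(delta_t, a_plus=0.01, a_minus=0.012,
                tau_plus=20.0, tau_minus=20.0):
    """Weight change for one spike pair, with delta_t = t_post - t_pre in ms.

    Positive delta_t (pre before post) potentiates; negative depresses.
    """
    if delta_t > 0:
        return a_plus * math.exp(-delta_t / tau_plus)
    if delta_t < 0:
        return -a_minus * math.exp(delta_t / tau_minus)
    return 0.0


def pair_based_update(pre_spikes, post_spikes, **kernel_params):
    """Sum the kernel over every pre/post spike pair (all-to-all pairing)."""
    return sum(stdp_kernel(t_post - t_pre, **kernel_params)
               for t_pre in pre_spikes
               for t_post in post_spikes)


if __name__ == "__main__":
    print("pre leads post by 5 ms:", pair_based_update([10.0], [15.0]))
    print("post leads pre by 5 ms:", pair_based_update([15.0], [10.0]))
```

A three-factor rule would gate this quantity with a neuromodulatory signal, for example by accumulating it in an eligibility trace that changes the weight only when a dopamine-like term arrives.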
Learning theory
- Bayesian brain (Friston, Mumford, Knill, Pouget) — free energy principle; active inference.
- Predictive coding (Rao-Ballard 1999 Nature Neurosci) — feedback predictions, feedforward errors; recently linked to amortized inference / variational autoencoders.
- Reinforcement learning and dopamine. Schultz, Dayan, Montague 1997 Science — “A neural substrate of prediction and reward” — dopamine encodes temporal-difference reward prediction error. Sutton-Barto framework. Recent extensions: distributional RL (Dabney 2020 Nature — dopamine encodes distribution, not mean).
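The temporal-difference reward prediction error that dopamine is proposed to encode is δ = r + γV(s′) − V(s), with the value estimate updated as V(s) ← V(s) + αδ. A minimal tabular TD(0) sketch follows, assuming a toy trial made of a fixed chain of time steps with a single reward at the end; the task layout, parameters, and function name are illustrative assumptions.

```python
def run_td_learning(n_steps=5, n_trials=500, alpha=0.1, gamma=1.0, reward=1.0):
    """Tabular TD(0) on a deterministic chain with reward on the last step.

    Returns (values, history) where history[k][s] is the prediction
    error delta observed at step s of trial k.
    """
    values = [0.0] * (n_steps + 1)   # final entry is the terminal state (value 0)
    history = []
    for _ in range(n_trials):
        deltas = []
        for s in range(n_steps):
            r = reward if s == n_steps - 1 else 0.0
            delta = r + gamma * values[s + 1] - values[s]   # prediction error
            values[s] += alpha * delta                      # value update
            deltas.append(delta)
        history.append(deltas)
    return values[:n_steps], history


if __name__ == "__main__":
    values, history = run_td_learning()
    print("first trial errors:", history[0])
    print("last trial errors: ", [round(d, 4) for d in history[-1]])
    print("learned values:    ", [round(v, 4) for v in values])
```

On the first trial the error appears only at reward delivery; once the reward is fully predicted the error at that step vanishes, mirroring the loss of the dopamine response to an expected reward. The transfer of the response to an unpredicted cue is not modeled, because the chain starts at the cue.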
Connectomics
- C. elegans 1986 (White, Southgate, Thomson, Brenner Phil Trans R Soc B) — 302 neurons, ~7000 connections; full hermaphrodite connectome by EM serial section. Male connectome completed by Cook, Hall, Emmons 2019.
- Drosophila — FAFB Full Adult Fly Brain (Zheng-Bock 2018 Cell) ~25 TB EM; Janelia FlyEM hemibrain (Scheffer 2020 eLife); in 2024 the FlyWire consortium published the full adult-female brain connectome, ~140k neurons, reconstructed from the FAFB volume (Dorkenwald 2024 Nature).
- Mouse cortex MICrONS (Allen Institute + Princeton + Baylor; Nature paper package 2025) — 1 mm³ V1/HVA cortex; ~200k neurons, ~500M synapses, paired with functional 2-photon imaging.
- Human Connectome Project (HCP; Van Essen, Glasser et al. 2010s) — diffusion MRI + resting-state fMRI on 1200 subjects; macro-scale.
Brain atlases
- Allen Brain Atlas family — Allen Mouse Brain Atlas (Lein 2007 Nature); Allen Human Brain Atlas; Allen Brain Cell Atlas (ABCA) 2023 (32 million cells, mouse whole-brain 5300 cell types).
- BICCN — BRAIN Initiative Cell Census Network — multi-modal MERFISH + scRNA + epigenomics primary motor cortex (Yao 2021 Nature package).
- HuBMAP, BRAINSPAN (developmental human, Brain Initiative).
Brain-computer interfaces
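- BrainGate — Hochberg, Donoghue 2006 Nature: first human cursor and device control from a Utah array in motor cortex; robotic-arm reach and grasp followed in Hochberg 2012 Nature.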
- Synchron Stentrode (Oxley) — endovascular electrode through jugular; six-patient ALS/SCI trial reading communication signals (COMMAND 2022).
- Neuralink — N1 device; first human implant (Noland Arbaugh) Jan 2024 controlling cursor / chess; ~5-7 patients by mid-2026; precision threads with surgical robot.
- Precision Neuroscience (Rapoport) Layer 7 cortical interface — high-density flexible array on cortical surface, no penetration.
- Paradromics, Blackrock NeuroPort, Onward Medical, INBRAIN Neuroelectronics — competitive landscape.
Models of computation
- Spaun (Eliasmith 2012 Science) — 2.5M-neuron functional brain model; Semantic Pointer Architecture; 8 simple cognitive tasks.
- Blue Brain Project (Markram EPFL; 2005 founded; ~€1B accumulated funding; concluded 2024 as initially conceived). Detailed cortical column simulation.
- BrainScaleS (Schemmel, Meier Heidelberg) — analog neuromorphic on wafer-scale.
- Loihi (Intel) and TrueNorth (IBM; Modha 2014) — digital neuromorphic chips. Loihi 2 (2021) with up to 1M neurons. NorthPole IBM 2023 inference accelerator successor.
- Akida (BrainChip), SpiNNaker (Furber Manchester) — alternative neuromorphic platforms.
Simulators
NEURON (compartmental), NEST (point neurons), Brian2 (Python-first), GeNN (GPU), CARLsim, NEMO, PyNN (cross-simulator interface), Nengo (Eliasmith; SPA), BrainPy (Wang, Wu; Peking University; JAX-based 2023+), Arbor (multi-compartment GPU; Klijn et al. EBRAINS).
Neural data analysis
- Spike sorting. Kilosort 4 (Pachitariu, Stringer; UCL/Janelia), MountainSort, JRCLUST. Tetrode and high-density Neuropixels (IMEC; Steinmetz 2017+) probes (>3000 sites; up to 384 simultaneous channels in NPX 2.0; Ultra in 2024).
- Two-photon Ca²⁺ imaging. CaImAn (Giovannucci 2019 eLife), Suite2p (Pachitariu, Stringer 2017). Resonant scanning ~30 Hz; mesoscope two-photon (Sofroniew 2016) ~5 mm FOV.
- One-photon miniature endoscope. Inscopix nVista, UCLA Miniscope (Aharoni 2019), CNMF-E (Zhou 2018).
- SLM multiphoton for targeted holographic photostimulation (Vaziri, Yuste, Emiliani).
Behavioral analysis ML
- DeepLabCut (Mathis 2018 Nat Neurosci) — transfer-learned ResNet pose estimation; open-source. Now multi-animal DLC.
- SLEAP (Pereira, Murthy 2022 Nat Methods) — multi-animal pose; alternative to DeepLabCut.
- MoSeq — Motion Sequencing (Wiltschko, Datta 2015 Neuron) — depth video → HMM behavioral syllables.
- B-SOID (Hsu, Yttri 2021 Nat Comm), VAME, A-SOID, Keypoint-MoSeq (Weinreb 2023) — unsupervised behavior segmentation.
Reverse engineering of behavior
- IBL — International Brain Laboratory (22 labs; standardized mouse 2AFC visual decision task; open data and pipelines; founded 2017). The “first big team” model for neuroscience.
- BICAN — BRAIN Initiative Cell Atlas Network (next-gen post-BICCN, 2022-).
Foundation models for neural data
- NeuroFormer (Antoniades 2023; preprint) — transformer pretrained on neural population recordings.
- Brain2Music (Google 2023) — fMRI → music reconstruction.
- fMRI image reconstruction. Takagi, Nishimoto 2023 CVPR — stable diffusion conditioned on BOLD activity reconstructs viewed images.
- MindEye / MindEye2 (Scotti, Banerjee, Cohen 2024) — improved fMRI-to-image with subject generalization.
- Tang, Huth 2023 Nature Neuroscience — “Semantic reconstruction of continuous language from non-invasive brain recordings” — GPT-based decoder of meaning from fMRI.
DeepMind brain-related directions
ICM (Inferred Continuous-Discrete Models for time-series brain dynamics); JEPA-style world models extended to neuroscience; Trans4mer cortex models for next-decade-scale large neural data.
Adjacent
- cell-molecular-biology — protein expression and post-translational machinery that proteomics measures.
- genetics-and-genomics — sequencing technologies; pQTL and eQTL methodology.
- structural-biology — protein structure complements sequence + abundance.
- immunology-foundations — Olink panels include large immune-relevant subsets.
- neuroscience-foundations — physiology and circuits underlying computational models.
- microbiology-foundations — gut microbiome metabolomics.
- marine-biology — marine natural product metabolomics via GNPS.
- virology-and-vaccine-platforms — antibody-based proteomics and population-scale serology.
- analytical-chemistry-methods — LC-MS, NMR, ion mobility instrumentation in detail.
- biochemistry-foundations — metabolic pathway logic.
- medicinal-and-photo-chemistry — pharmacometabolomics, chemoproteomic target ID.
- biomedical-engineering-and-neural-interfaces — Neuralink, BrainGate, Stentrode hardware.
- machine-learning-systems — foundation models, distributed training of biological models.
- surgical-and-neural-robotics — neurosurgical robots (Neuralink R1) for BCI implantation.