Four roles, one record of how each result was produced. Pick yours.
Deliver FASTQs and processed outputs together with the pipeline that produced them, with no shipped drives or shared S3 keys.
Run RNA-seq, ATAC-seq, or variant-calling pipelines without command-line work.
Keep study metadata, sequencing runs, and analyses in one record per study.
Take a count matrix to figures without provisioning compute.
Seven recurring pains in computational biology, and what Horizon does about each.
Before you run a single job, you're filing tickets. Waiting on IT. Explaining what a compute node is to someone who manages Windows laptops. Getting cloud infrastructure stood up is a weeks-long detour that has nothing to do with your science.
A gold-standard RNA-seq pipeline isn't something you download. It's assembled, containerized, stress-tested, and validated — a multi-week effort requiring real computational expertise. Most bench scientists either skip this step and accept substandard results, or wait for an informatician who's already buried.
AWS Batch, IAM roles, S3 permissions, job queues — connecting a bioinformatics pipeline to cloud compute is a discipline unto itself. It's not your discipline, and it shouldn't have to be.
Adjusting a p-value threshold or fold-change cutoff should take seconds — like tasting a dish and adding salt. Instead it means filing a re-run request, joining a queue, and waiting hours or days. Want to change your sample-to-sample comparison? Same process, all over again.
A completed analysis sitting on a server is worth nothing until your collaborators can see it. Exporting figures, packaging outputs, and transmitting them securely can take days, during which your collaborator is blocked and the science is stalled.
Eighteen months after a run, a reviewer asks exactly which pipeline version, parameter set, and input files produced Figure 3. Or a regulatory submission requires a complete computational audit trail. If that record wasn't captured automatically, reconstructing it is an archaeological dig — and sometimes the artifact is just gone.
Datasets stored on an individual's machine, a local server, or a CRO's infrastructure are one offboarding or budget cut away from disappearing. In a field where longitudinal data is irreplaceable, this is an unacceptable fragility.