Base5 Data Generator
Our Sample-to-Solution workflow delivers ground truth resolution in complex immunogenomic regions to enable better clinical outcomes

01
What We Consume
- Large (200 kb to 4 Mb) genomic targets
- Multiple targets in the same tube (e.g. MHC + KIR, LRC, NKC)
A Sample
- Cultured cells (10 million+)
- Whole blood (2 mL+)

02
Our 3-Stage Process
Bias-free high-molecular-weight enrichment
High-accuracy reads of 8 kb to 20 kb+
Reference-free assembly and phasing

03
What We Produce
All genomic sequencing information, including:
- Phased allele calls
- Novel allele detection
- Full-length haplotypes
Full-stack platform with ~3 weeks turnaround time from sample delivery to report
Best foundation for any downstream analysis
Unprecedented resolution in the MHC/HLA locus
Fully contiguous, full-length, haplotype-resolved de novo MHC assemblies for heterozygous samples
- Entire MHC locus assembled with single-base resolution for both haplotypes
- 4-field allele resolution over the 4.5 Mb assembled region
- Phasing possible for allele calls over the entire MHC region
- Exact mapping for structural variants



Unprecedented resolution in the KIR region
Fully contiguous phased KIR haplotypes for heterozygous samples
- Entire 250 kb KIR locus assembled and phased with single-base-pair resolution for both haplotypes
- KIR allele calls at the highest possible resolution (7-digit), including novel allele detection
- Full phasing and haplotyping of all genes across the entire KIR locus
- Exact mapping and characterization of structural variants such as gene-block tandem repeats
- Full visibility into intergenic regions
- No other methods reveal this level of clarity

- How is Base5's Data Generator different?
  Although other methods exist for ultra-high-resolution gene typing, the Base5 Data Generator is different in a number of ways:
  - A single assay for any or all multimegabase regions of interest (e.g. MHC, KIR, and NKC) containing hundreds of genes.
  - Novel allele detection: we sequence each gene at single-base resolution, which allows us to detect and report novel alleles.
  - Phasing information across the entire region: where other platforms can at most phase single genes, Base5’s Data Generator can phase across the entire region, revealing multimegabase haplotypes, e.g. HLA-DRA alleles phased with HLA-DQB2 alleles and all HLA alleles in between.
  - Minimal bias: bias-free enrichment, no amplification, no allele dropout, reference-free assembly, and allele calling from de novo assembly.
- What kinds of samples can Base5’s Data Generator process?
  - Cultured cells (10 million+)
  - Whole blood (2 mL+)
- What regions can Base5’s Data Generator target?
  The Data Generator platform is target agnostic, so we can target any region, or multiple regions, of interest in the human genome. The ideal region size ranges from 50 kilobases to 5 megabases. We routinely target MHC (HLA), KIR, extended LRC, NKC, and CYP2D6.
- What is the turnaround time?
  2-3 weeks from receipt of the samples.
- Is Base5’s Data Generator CLIA certified?
  No. The Data Generator is intended for research use only (RUO).
- What other products does Base5 offer?
  Base5’s Insight Lens is a force multiplier for existing methods that work with short-read sequencing data, including variant detection, HLA/KIR typing, imputation, and GWAS fine-mapping. Insight Lens’ superior performance stems from the most complete pan-immunogenome reference, constructed by the Data Generator. Furthermore, Data Generator customers can gain important context for their results with the Insight Lens pan-immunogenome graph.
  Base5’s Discovery Engine is currently in development and is coming soon. It will take the field to a new era by combining unmatched data quality with machine-learning-powered analysis to find and interpret altogether new information.
- Is Base5 Genomics restricted to human samples?
  Although our current product offerings are for human samples, the underlying technology is more widely applicable. Feel free to reach out if you want to hear more.
- What file format should I use?
  The imputation server accepts VCF files as well as 23andMe or Ancestry TXT files, preferably gzipped.
  For VCF format, the imputation server accepts a gzipped VCF file (recommended) or a VCF file for a single chromosome (chromosome 6). Coordinates should be NCBI36, GRCh37 or GRCh38. You may further filter your VCF file to include only the MHC region:
  - GRCh38 chr6:28509120-33481577
  - GRCh37 chr6:28476897-33449354
  - NCBI36 chr6:28585776-33556332
  For 23andMe or Ancestry files, the imputation server accepts a gzipped TXT file (recommended) or a TXT file. Coordinates should be NCBI36, GRCh37 or GRCh38. The build is indicated in the header of the file, which states assembly or reference build 36, 37, or 38, corresponding to NCBI36, GRCh37 or GRCh38 respectively. You may filter the file to contain only chromosome 6.
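The optional MHC filtering step described in the answer above can be sketched in a few lines of Python. This is a minimal illustration, not part of the imputation server: the function names `open_text` and `filter_vcf_to_mhc` are hypothetical, the region bounds are copied from the answer and treated as inclusive (an assumption), and chromosome 6 is assumed to be named either `6` or `chr6` in the input.

```python
import gzip

# MHC region bounds on chromosome 6, copied from the FAQ answer.
# Treating both ends as inclusive is an assumption of this sketch.
MHC_REGION = {
    "GRCh38": (28509120, 33481577),
    "GRCh37": (28476897, 33449354),
    "NCBI36": (28585776, 33556332),
}


def open_text(path, mode="rt"):
    """Open a plain or gzipped text file, chosen by the .gz extension."""
    if str(path).endswith(".gz"):
        return gzip.open(path, mode)
    return open(path, mode)


def filter_vcf_to_mhc(in_path, out_path, build="GRCh38"):
    """Copy header lines and MHC-region records; return the number of records kept."""
    start, end = MHC_REGION[build]
    kept = 0
    with open_text(in_path) as src, open_text(out_path, "wt") as dst:
        for line in src:
            if not line.strip():
                continue  # skip blank lines
            if line.startswith("#"):
                dst.write(line)  # keep all meta and header lines
                continue
            chrom, pos = line.split("\t", 2)[:2]
            if chrom in ("6", "chr6") and start <= int(pos) <= end:
                dst.write(line)
                kept += 1
    return kept
```

For example, `filter_vcf_to_mhc("cohort.vcf.gz", "cohort.mhc.vcf.gz", build="GRCh37")` would write a gzipped VCF restricted to the GRCh37 MHC interval; the file names here are placeholders.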
- How many samples can I submit at one time?
  We accept VCF files with 1 to 100 samples. Contact us if you would like to submit a job with more than 100 samples. 23andMe and Ancestry files contain one sample per file.
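Before uploading, it can help to check how many samples a VCF file contains. The sketch below reads the `#CHROM` header line, where the VCF format lists sample names after the nine fixed columns. The helper names `vcf_sample_names` and `within_sample_limit` are hypothetical, and the 1 to 100 limit is taken from the answer above; the server may apply its own checks.

```python
import gzip

FIXED_COLUMNS = 9   # CHROM POS ID REF ALT QUAL FILTER INFO FORMAT
MAX_SAMPLES = 100   # limit quoted in the FAQ answer


def vcf_sample_names(path):
    """Return the sample names listed on the #CHROM header line of a VCF file."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        for line in handle:
            if line.startswith("#CHROM"):
                return line.rstrip("\r\n").split("\t")[FIXED_COLUMNS:]
            if not line.startswith("#"):
                break  # reached data records without seeing a header line
    raise ValueError("no #CHROM header line found")


def within_sample_limit(path):
    """True if the file holds between 1 and MAX_SAMPLES samples."""
    return 1 <= len(vcf_sample_names(path)) <= MAX_SAMPLES
```

A file that fails `within_sample_limit` would either need to be split into smaller batches or submitted after contacting Base5, as the answer notes.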
- What is the turnaround time?
  It typically takes just a couple of minutes before you receive an email with a link to your result. If you submit a lot of samples, it may take up to 10 minutes. You can view the status of your job in the jobs table.
- How do I get my results?
  When your job finishes, you get an email with a link to download your results. You can also find the download link in the jobs table. The link expires after 24 hours. It downloads a password-protected compressed archive (zip); the password required to unzip the archive is provided in the email. If you are having trouble downloading the file, copy the link address of the download button and paste it into your browser. Your security settings may be preventing the download from going through directly.
- Is there an example file that I can use to test the imputation server?
  Yes, you can use this file. It contains variant calls for the 7 NIST GIAB samples, subset to the ThermoFisher PMDA array. You will need to select GRCh38 as the array build.
- Can I share the variant imputation Early Access Program with friends or colleagues?
  Yes, you can share the Early Access Program landing page and password with people in your network. Please keep in mind, however, that as stated in our terms of use, we are not currently compliant for non-US samples.
Want to Learn More?
We would love to meet with you and talk about how Base5 can supercharge your development capabilities.