Bringing data and tools together

Genboree is a web-based platform for multi-omic research and data analysis using the latest bioinformatics tools.

You can upload your data and run analyses through a “drag and drop” user interface, then keep the results private or share them with collaborators. Bioinformatics tools and computational infrastructure are available to researchers who lack programming expertise or the time to write their own scripts.

Get Started



The Genboree Workbench manages the many technical software and hardware aspects of genome-centric research for end-users. The Workbench contains bioinformatics tools useful for analyses in genomics, epigenomics, metagenomics, and transcriptomics, and a “drag and drop” interface makes the Workbench easy to use.

Access the Genboree Workbench

The Epigenome Toolset

Tools for analyzing DNA methylation and histone marks at the level of whole genomes, pathways, gene elements (e.g., promoters, enhancers), and genetic loci to identify biological patterns. Read more about the Epigenome Toolset

Epigenome Toolset Heatmap


The heatmap tool performs hierarchical clustering based on pairwise correlation of normalized epigenomic data sets. Users can choose from many available methods implemented in R for normalization, correlation, distance measure calculation, hierarchical clustering, and visualization.
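As a rough illustration of the idea behind the heatmap tool, the sketch below clusters a few profiles by pairwise Pearson correlation using average linkage. It is written in plain Python rather than R, and the sample names, values, and the choice of `1 - r` as the distance are assumptions made for the example, not the tool's actual defaults.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def cluster(profiles, n_clusters):
    """Average-linkage agglomerative clustering on 1 - correlation."""
    names = list(profiles)
    dist = {(a, b): 1.0 - pearson(profiles[a], profiles[b])
            for a in names for b in names}
    clusters = [[name] for name in names]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                pairs = [dist[a, b] for a in clusters[i] for b in clusters[j]]
                avg = sum(pairs) / len(pairs)
                if best is None or avg < best[0]:
                    best = (avg, i, j)
        _, i, j = best
        # Merge the two closest clusters and drop the absorbed one.
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]

# Hypothetical normalized signal values for four samples over four regions.
samples = {
    "liver_1": [0.9, 0.8, 0.1, 0.2],
    "liver_2": [0.8, 0.9, 0.2, 0.1],
    "brain_1": [0.1, 0.2, 0.9, 0.8],
    "brain_2": [0.2, 0.1, 0.8, 0.9],
}
groups = cluster(samples, 2)
```

With these made-up values the two liver profiles end up in one group and the two brain profiles in the other; a heatmap would then order rows and columns according to the merge sequence.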

Epigenome Toolset Spark


Interactive visualization for k-means clustering of epigenomic data. Spark helps discover and visualize patterns of epigenomic profiles on a genome-wide scale. Regions of interest (e.g., promoters and enhancers) can be downloaded in the form of annotation tracks for downstream analyses.
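For readers unfamiliar with k-means, here is a minimal sketch of the standard Lloyd iteration applied to a handful of region profiles. It is not Spark's implementation; the binned signal values and the starting centroids are invented for the example.

```python
def assign(points, centroids):
    """Return, for each point, the index of its nearest centroid."""
    labels = []
    for p in points:
        d = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
        labels.append(d.index(min(d)))
    return labels

def kmeans(points, centroids, iterations=10):
    """Lloyd's algorithm with caller-supplied starting centroids."""
    for _ in range(iterations):
        labels = assign(points, centroids)
        updated = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                updated.append([sum(col) / len(members) for col in zip(*members)])
            else:
                updated.append(old)  # keep an empty cluster's centroid in place
        centroids = updated
    return assign(points, centroids), centroids

# Hypothetical signal in three bins across four genomic regions.
regions = [[5, 1, 0], [6, 1, 1], [0, 1, 5], [1, 0, 6]]
labels, centers = kmeans(regions, [regions[0], regions[2]])
```

Regions that share a label have a similar signal shape; exporting each cluster's regions as an annotation track is the step the description above refers to.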

The Transcriptome Toolset

Tools and pipelines for RNA-Seq data analysis. More details coming soon...

RNA-Seq - Small RNA

Small RNA-seq pipeline:

The small RNA-seq pipeline is for the processing and analysis of RNA-seq data generated to profile small exRNAs. It can handle multiple libraries and outputs abundance estimates, a variety of quality control metrics such as read-length distribution, summaries of reads mapped, and detailed information for each read mapped to each library.
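One of the quality control metrics mentioned above, the read-length distribution, is simple enough to sketch. The snippet assumes standard four-line FASTQ records; the read names and sequences are made up, and the pipeline's own implementation may differ.

```python
from collections import Counter

def read_length_distribution(fastq_lines):
    """Count read lengths from FASTQ text (four lines per record)."""
    lengths = Counter()
    for i, line in enumerate(fastq_lines):
        if i % 4 == 1:  # the sequence is the second line of each record
            lengths[len(line.strip())] += 1
    return dict(sorted(lengths.items()))

# Build a tiny in-memory FASTQ from hypothetical reads.
records = [
    ("read1", "TGAGGTAGTAGGTTGTATAGTT"),
    ("read2", "TAGCTTATCAGACTGATGTTGA"),
    ("read3", "TGGAATGTAAAGAAGT"),
]
lines = []
for name, seq in records:
    lines += ["@" + name, seq, "+", "I" * len(seq)]

distribution = read_length_distribution(lines)
```

For small RNA libraries a peak around 22 nucleotides is commonly expected, so a distribution that looks very different is an early hint of adapter or degradation problems.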

RNA-Seq - Long RNA

Long RNA-seq pipeline:

The long RNA-seq pipeline is for the processing and analysis of RNA-seq data generated from long RNAs. The pipeline performs a quality check (FastQC), maps reads to a reference genome (Bowtie2), and post-processes the aligned reads (Samtools). The pipeline also performs gene-expression quantification, generates tracks for visualization, calculates mapping bias, and computes annotation coverage (all using RSEQtools).
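The first three stages named above can be sketched as a list of command lines. This is only an assumed, minimal single-end invocation of each tool; the file names, the `qc` output directory, and the options shown are illustrative and are not the parameters the Genboree pipeline actually uses. The RSEQtools steps are left out because their invocation is not described here.

```python
def long_rna_commands(fastq, index_prefix, sample):
    """Return argument lists for QC, alignment, and post-processing.

    Nothing is executed here; pass each list to subprocess.run() if the
    tools are installed. The 'qc' directory must exist before FastQC runs.
    """
    sam = f"{sample}.sam"
    bam = f"{sample}.sorted.bam"
    return [
        ["fastqc", fastq, "-o", "qc"],                            # quality check
        ["bowtie2", "-x", index_prefix, "-U", fastq, "-S", sam],  # map reads
        ["samtools", "sort", "-o", bam, sam],                     # sort alignments
        ["samtools", "index", bam],                               # index sorted BAM
    ]

# Hypothetical inputs.
commands = long_rna_commands("sample1.fastq", "hg19_index", "sample1")
for cmd in commands:
    print(" ".join(cmd))
```

Keeping the commands as data makes it easy to log exactly what was run for each library before any quantification step.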

The Microbiome Toolset

An interactive environment to conduct 16S rRNA microbiome and whole genome shotgun (WGS) sequencing analyses. The Toolset drives hypothesis generation by providing a wide range of analyses for studying metagenomic datasets. More details coming soon...

16S rRNA beta diversity clustering by clinical metadata

16S rRNA:

The 16S rRNA metagenomics toolset equips a researcher to explore their microbiome samples for a variety of analysis types such as alpha diversity, beta diversity, phylogenetic profiling, supervised machine learning, and feature selection.
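To make the terms concrete, the sketch below computes one common alpha diversity measure (the Shannon index) and one common beta diversity measure (Bray-Curtis dissimilarity) from taxon counts. These two measures are chosen as familiar examples; the toolset's actual set of metrics is not specified here, and the counts are invented.

```python
from math import log

def shannon(counts):
    """Shannon index (natural log) of one sample's taxon counts."""
    total = sum(counts)
    return -sum((c / total) * log(c / total) for c in counts if c > 0)

def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two samples' taxon counts."""
    return sum(abs(x - y) for x, y in zip(a, b)) / (sum(a) + sum(b))

# Hypothetical counts for four taxa in two samples.
gut = [10, 10, 10, 10]
skin = [37, 1, 1, 1]

alpha_gut = shannon(gut)
alpha_skin = shannon(skin)
beta = bray_curtis(gut, skin)
```

The evenly distributed sample has the higher Shannon index, while the Bray-Curtis value summarizes how different the two communities are on a scale from 0 (identical counts) to 1 (no shared taxa).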

whole metagenomic shotgun cladogram biomarkers

Shotgun Metagenomic Sequencing (WGS) (Coming Soon):

The shotgun metagenomic sequencing toolset empowers users to explore ‘what is there’ and ‘what are they doing’ types of questions. Taxonomic identification assesses which organisms are present and produces summary matrices (relative percentages), biomarker discovery, heatmaps, and cladograms. Functional annotation identifies potentially active pathways from KEGG orthologous genes, which can then be compared and contrasted across sample groups.


Online collaboration

The Genboree Commons is a place to create projects, discussion forums and wikis, and to share documents with your colleagues.

All of your documents and content are private within a project unless you choose to share with others. It’s easy to add colleagues to your projects, change access privileges, and communicate with one another.

Access the Genboree Commons


KB - New!

GenboreeKB - beta

GenboreeKB is a core component of ClinGenDB, the database for the Clinical Genome Resource (ClinGen) project, an NIH-funded program dedicated to creating a database of clinically relevant genomic variants to inform genome interpretation. ClinGen is supporting algorithmic and expert curation of data from a variety of sources, including hundreds of thousands of test results shared by clinical genetics laboratories through ClinVar and other research data and bioinformatic predictions.

Access the Genboree KnowledgeBase



A Data Repository of Tissue-specific Epigenomic States

The Human Epigenome Atlas contains human reference epigenomes and the results of integrative and comparative analyses. Atlas data provides detailed insights into locus-specific epigenomic states, including histone marks and DNA methylation across tissues, cell types, developmental stages, physiological conditions, genotypes, and disease states.

Access the Human Epigenome Atlas


exrna research Image: National Institutes of Health

NIH ExRNA Communication Program

The goal of the ERCP is to better understand the fundamental biological mechanisms of extracellular RNA (exRNA) generation, secretion, and transport, to create a public dataset of where exRNAs exist in normal human body fluids, and to explore their potential as therapeutics and biomarkers.

Learn more about the ERCP.

clingen research Image: National Institutes of Health

The NIH ClinGen Resource

One component of the Clinical Genome Resource project is the ClinGenDB infrastructure, which enables the development of a knowledge base of genetic variants of clinical significance. The ClinGenDB infrastructure consists of databases and web services implemented using Genboree KnowledgeBase (GenboreeKB).

Learn more about ClinGen.

epigenomics research Image: National Institutes of Health

NIH Roadmap Epigenomics Mapping Consortium

The goal of this consortium is to map DNA methylation, histone modifications, chromatin accessibility, and small RNA transcripts in tissues and organs frequently involved in human disease.

Learn more about the Epigenomics Roadmap project.


About Genboree

The Bioinformatics Research Laboratory (BRL) developed Genboree. The laboratory is composed of bioinformaticians, software engineers, biologists, research scientists, graduate students, and interns. Projects span a variety of areas, including epigenomics, clinical genomics, cancer biology, metagenomics, algorithm development, and semantic web technology.

Computing Resources

Genboree computing is supported by an extensive infrastructure including both local clusters and cloud computing access. BRL has 480 compute and server nodes and over 80 TB of disk space, including redundant servers hosting the Genboree system.

Application Programming Interface (API)

The Genboree API exposes the data entities represented within Genboree and also allows authorized users to modify stored data. The API is based on REST principles and a Resource Oriented Architecture (ROA).
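In a resource-oriented design, each data entity has its own URL and the HTTP method says what to do with it. The sketch below illustrates that pattern only: the host, the path segments, and the JSON body are placeholders invented for the example and are not documented Genboree endpoints, and authentication is omitted entirely.

```python
from urllib.parse import quote
from urllib.request import Request

def resource_url(base, *segments):
    """Join path segments into a URL, percent-encoding each one."""
    path = "/".join(quote(s, safe="") for s in segments)
    return f"{base.rstrip('/')}/{path}"

def build_request(url, method="GET", body=None):
    """Create (but do not send) an HTTP request for a resource."""
    return Request(url, data=body, method=method)

# Placeholder host and resource hierarchy, purely hypothetical.
url = resource_url("https://genboree.example.org/api",
                   "groups", "My Lab", "tracks", "H3K4me3")

read = build_request(url)                                         # fetch the entity
update = build_request(url, "PUT", b'{"description": "updated"}')  # modify it
```

The same URL serves both reading and modification; only the method and body change, which is the property the REST description above relies on.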

Genboree Network

The Genboree Network is composed of Genboree servers that host tools and services across the web and connect Genboree Workbench installations in different geographical locations via HTTP-based Application Programming Interfaces (APIs). The Genboree Network addresses problems that hamper basic and translational applications of massively parallel sequencing: availability of bioinformatic tools for non-programmers, data access, access to cloud computing resources, and web-based resources for collaboration and data management.

BRL gratefully acknowledges the generous support of the following funding agencies: