Please read the ISDB manual for guidelines on using HIRIS and the data contained within.
These downloadable datasets are broad summaries of the full data contained within HIRIS. Many questions are best answered by querying the database directly. (See below for connection details.) If you’re unfamiliar with SQL databases, you’re encouraged to collaborate with your local bioinformatics practitioner.
Note that an integration site may fall within multiple genes (for example, genes on opposite strands or read-through variants). Such a site is reported once per gene, so a single integration site can occupy multiple rows. To count integration sites, use the integration summary without the gene annotations.
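To illustrate why row counts and site counts differ in the gene-annotated summary, here is a minimal sketch in Python. The sample rows, gene names, and orientation values are invented for illustration, and the choice of columns that identify a single site (`SITE_KEY`) is an assumption; check the manual before relying on it.

```python
import csv
import io

# Hypothetical sample rows using field names from the gene-annotated dataset;
# all values are invented for illustration only.
SAMPLE_CSV = """environment,subject,gene,landmark,location,orientation_in_landmark,multiplicity
in vivo,S1,GENE_A,chr1,1000,F,2
in vivo,S1,GENE_B,chr1,1000,F,2
in vivo,S1,GENE_C,chr2,5000,R,1
"""

# Assumption: these columns together identify one integration site.
SITE_KEY = ("environment", "subject", "landmark", "location", "orientation_in_landmark")


def count_unique_sites(csv_file):
    """Count distinct integration sites, ignoring per-gene duplication."""
    reader = csv.DictReader(csv_file)
    sites = {tuple(row[col] for col in SITE_KEY) for row in reader}
    return len(sites)


rows_total = sum(1 for _ in csv.DictReader(io.StringIO(SAMPLE_CSV)))
unique_sites = count_unique_sites(io.StringIO(SAMPLE_CSV))
print(rows_total, unique_sites)
```

In this sample the first two rows describe the same site annotated with two different genes, so there are fewer unique sites than rows. Using the summary without gene annotations avoids the problem entirely.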
Download latest version as: CSV, Excel, JS, or JSON
The latest version is from Sat May 26 07:15 2018 PDT.
This dataset contains the following fields:
environment
subject
landmark
location
orientation_in_landmark
multiplicity
source_names
pubmed_ids
The manual contains more information on this downloadable dataset.
Download latest version as: CSV, Excel, JS, or JSON
The latest version is from Sat May 26 07:15 2018 PDT.
This dataset contains the following fields:
environment
subject
ncbi_gene_id
gene
gene_type
landmark
location
orientation_in_landmark
orientation_in_gene
multiplicity
source_names
pubmed_ids
The manual contains more information on this downloadable dataset.
Download latest version as: CSV, Excel, JS, or JSON
The latest version is from Sat May 26 07:15 2018 PDT.
This dataset contains the following fields:
environment
ncbi_gene_id
gene
gene_type
subjects
unique_sites
proliferating_sites
total_in_gene
The manual contains more information on this downloadable dataset.
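As a rough sketch of how the per-gene summary might be used, the Python below ranks genes by `unique_sites` within one environment. The sample rows are invented, and the `environment` values shown (`in vivo`, `in vitro`) are an assumption about how that column is encoded.

```python
import csv
import io

# Hypothetical sample rows using the field names of the per-gene summary;
# all values are invented for illustration only.
SAMPLE_CSV = """environment,ncbi_gene_id,gene,gene_type,subjects,unique_sites,proliferating_sites,total_in_gene
in vivo,1,GENE_A,protein-coding,3,12,2,20
in vivo,2,GENE_B,protein-coding,5,40,7,95
in vitro,2,GENE_B,protein-coding,1,8,0,8
in vivo,3,GENE_C,ncRNA,2,15,1,18
"""


def top_genes(csv_file, environment, n=2):
    """Return the n genes with the most unique sites in one environment."""
    rows = [r for r in csv.DictReader(csv_file) if r["environment"] == environment]
    rows.sort(key=lambda r: int(r["unique_sites"]), reverse=True)
    return [(r["gene"], int(r["unique_sites"])) for r in rows[:n]]


ranking = top_genes(io.StringIO(SAMPLE_CSV), "in vivo")
print(ranking)
```

With a real download, replace `io.StringIO(SAMPLE_CSV)` with an open file handle for the CSV export.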
When it’s time to do a set of analyses, we freeze a version of HIRIS at a moment in time. All analysis is then done on that frozen version, ensuring that the data doesn’t drift as analyses are tweaked and finalized. This is similar to using a single, specific version of the human genome while working on a study.
The following frozen versions are available:
If you’re handy with code, you can connect to the database directly as a read-only user. We like both writing SQL and using R’s dplyr package. To connect to the database, try:
| Setting | Value |
|---|---|
| Type | PostgreSQL |
| Host | hiris.mullins.microbiol.washington.edu |
| User | hiris_ro |
| Database name | hiris |
You may also find the reference of available tables handy.
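For those who would rather connect from Python than from R, the sketch below turns the connection details above into a connection URI. The use of the third-party `psycopg2` driver and the helper names `hiris_uri` and `fetch_table_names` are assumptions for illustration. Whether a password or a non-default port is required is not stated here, so neither is included.

```python
from urllib.parse import quote

# Connection details as listed above.
HOST = "hiris.mullins.microbiol.washington.edu"
USER = "hiris_ro"
DATABASE = "hiris"


def hiris_uri():
    """Build a PostgreSQL connection URI from the settings listed above."""
    return f"postgresql://{quote(USER)}@{HOST}/{quote(DATABASE)}"


def fetch_table_names(uri):
    """List non-system tables (requires psycopg2 and network access)."""
    import psycopg2  # third-party driver; an assumption, not named on this page

    with psycopg2.connect(uri) as conn:
        with conn.cursor() as cur:
            cur.execute(
                "SELECT table_schema, table_name "
                "FROM information_schema.tables "
                "WHERE table_schema NOT IN ('pg_catalog', 'information_schema') "
                "ORDER BY 1, 2"
            )
            return cur.fetchall()


uri = hiris_uri()
print(uri)
```

Calling `fetch_table_names(uri)` would list the tables the read-only user can see, which can be compared against the table reference.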
If you’re interested in a specific region of the human genome, such as a gene, the UCSC Genome Browser is a great way to start visualizing the data. We provide two custom tracks which you can load into the genome browser, one for in vivo sites and another for in vitro sites.
If you’d rather view the data in other genomics software such as IGV, you can download the track data as BED files.
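If you want to read the BED files programmatically, a minimal parser might look like the following sketch. It relies only on the standard BED convention that the first three columns are chromosome, 0-based start, and exclusive end. The sample lines are invented, and the layout of any further columns in the HIRIS files, as well as tab separation, is an assumption.

```python
from typing import Iterator, NamedTuple


class BedInterval(NamedTuple):
    chrom: str
    start: int    # 0-based, inclusive
    end: int      # 0-based, exclusive
    extra: tuple  # any remaining columns, kept as raw strings


def parse_bed(lines) -> Iterator[BedInterval]:
    """Yield intervals from BED lines, skipping track, browser, and comment lines."""
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith(("track", "browser", "#")):
            continue
        fields = line.split("\t")  # assumes tab-separated columns
        yield BedInterval(fields[0], int(fields[1]), int(fields[2]), tuple(fields[3:]))


# Hypothetical sample data, invented for illustration only.
SAMPLE_LINES = [
    'track name="example"\n',
    "chr1\t999\t1000\tsite_1\t2\t+\n",
    "chr2\t4999\t5000\tsite_2\t1\t-\n",
]

intervals = list(parse_bed(SAMPLE_LINES))
print(intervals)
```

For a downloaded file, pass an open file handle to `parse_bed` instead of `SAMPLE_LINES`.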
A GFF3 file is produced for each gene containing integration sites, letting you quickly focus on a specific gene of interest.
Get started by learning how to use these exports. Once familiar with their data, you can browse the files or download a zip file containing them all.
HIRIS is under active development. However, we don’t know of any issues affecting data quality at the moment. Hooray!
If you spot a problem in HIRIS, please let us know. We strive to maintain high quality data ready to be used in analysis.