This page is part of a copy of HIRIS frozen in time on 26 October 2017. The permanent URL of this frozen version is: https://mullinslab.microbiol.washington.edu/hiris/frozen/Edlefsen_et_al_2018/.
If you’re not using this frozen version for analysis, you may want to switch to the latest version.
Please read the ISDB manual for guidelines on using HIRIS and the data contained within.
These downloadable datasets are broad summaries of the full data contained within HIRIS. Many questions are best answered by querying the database directly. (See below for connection details.) If you’re unfamiliar with SQL databases, you’re encouraged to collaborate with your local bioinformatics practitioner.
Note that an integration site may fall within multiple genes (such as genes on opposite strands or read-through variants). Such a site is reported once per gene, so a single integration site can appear in multiple rows. If you want to count integration sites, use the integration summary without the gene annotations.
Download Edlefsen_et_al_2018 version as: CSV, Excel, or JSON
The Edlefsen_et_al_2018 version is from Thu Oct 26 11:46 2017 PDT.
This dataset contains the following fields:
environment
subject
landmark
location
orientation_in_landmark
multiplicity
source_names
pubmed_ids
The manual contains more information on this downloadable dataset.
Download Edlefsen_et_al_2018 version as: CSV, Excel, or JSON
The Edlefsen_et_al_2018 version is from Thu Oct 26 11:46 2017 PDT.
This dataset contains the following fields:
environment
subject
ncbi_gene_id
gene
landmark
location
orientation_in_landmark
orientation_in_gene
multiplicity
source_names
pubmed_ids
The manual contains more information on this downloadable dataset.
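If you only have the gene-annotated export at hand, the per-gene duplication can be collapsed before counting. The following is a minimal sketch, not the HIRIS authors’ method: it assumes that the five columns in `SITE_KEY` together identify one integration site, and the sample rows (subjects, genes, locations, orientation codes) are invented for illustration and are not real HIRIS data.

```python
import csv
import io

# Hypothetical rows in the shape of the gene-annotated CSV export.
# All values are made up for illustration; they are not real HIRIS data.
SAMPLE_CSV = """environment,subject,ncbi_gene_id,gene,landmark,location,orientation_in_landmark,orientation_in_gene,multiplicity,source_names,pubmed_ids
in vivo,subject-A,1001,GENE1,chr1,12345,F,F,2,source-x,
in vivo,subject-A,1002,GENE2,chr1,12345,F,R,2,source-x,
in vivo,subject-B,1001,GENE1,chr1,20000,R,R,1,source-y,
"""

# Assumption: these five columns together identify a single integration site.
SITE_KEY = ("environment", "subject", "landmark", "location", "orientation_in_landmark")


def count_rows(csv_text):
    """Count data rows, which includes one row per overlapping gene."""
    return sum(1 for _ in csv.DictReader(io.StringIO(csv_text)))


def count_unique_sites(csv_text):
    """Count distinct integration sites, ignoring per-gene duplication."""
    reader = csv.DictReader(io.StringIO(csv_text))
    sites = {tuple(row[column] for column in SITE_KEY) for row in reader}
    return len(sites)


print(count_rows(SAMPLE_CSV), count_unique_sites(SAMPLE_CSV))
```

In the sample, the first two rows describe the same site annotated with two different genes, so the row count and the site count differ. The integration summary without gene annotations avoids the problem entirely and remains the simpler choice for counting.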
Download Edlefsen_et_al_2018 version as: CSV, Excel, or JSON
The Edlefsen_et_al_2018 version is from Thu Oct 26 11:45 2017 PDT.
This dataset contains the following fields:
environment
ncbi_gene_id
gene
subjects
unique_sites
proliferating_sites
total_in_gene
The manual contains more information on this downloadable dataset.
If you’re handy with code, you can connect to the database directly as a read-only user. We like both writing SQL and using R’s dplyr package. To connect to the database, try:
Type | PostgreSQL |
---|---|
Host | hiris.mullins.microbiol.washington.edu |
User | hiris_ro |
Database name | isdb_2016-10-25_transcribed_genes_public |
You may also find the reference of available tables handy.
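Many PostgreSQL clients accept the connection details as a single libpq-style URI. The sketch below assembles one from the values in the table above using only the Python standard library. The function name `build_postgres_uri` is hypothetical, and the URI deliberately carries no port or password because this page does not state either.

```python
from urllib.parse import quote

# Values copied from the connection table above.
HOST = "hiris.mullins.microbiol.washington.edu"
USER = "hiris_ro"
DATABASE = "isdb_2016-10-25_transcribed_genes_public"


def build_postgres_uri(host, user, database):
    """Build a libpq-style connection URI with no port or password set."""
    # Percent-encode the user and database name in case they contain
    # characters that are reserved in URIs.
    return "postgresql://{}@{}/{}".format(
        quote(user, safe=""), host, quote(database, safe="")
    )


HIRIS_URI = build_postgres_uri(HOST, USER, DATABASE)
print(HIRIS_URI)
```

A URI of this form can typically be handed to `psql` on the command line or to a PostgreSQL driver; whether the server is reachable and whether it asks for a password is not something this sketch can confirm.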
If you’re interested in a specific region of the human genome, such as a gene, the UCSC Genome Browser is a great way to start visualizing the data. We provide two custom tracks which you can load into the genome browser, one for in vivo sites and another for in vitro sites.
If you’d rather view the data in other genomics software such as IGV, you can download the track data as BED files.
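BED files are plain text and can also be filtered with a few lines of code. The sketch below reads only the three columns that the BED format requires (chromosome, start, end) and selects records overlapping a region. It assumes standard BED conventions (0-based, half-open intervals); which optional columns the HIRIS files carry is not covered here, and the sample lines are invented rather than taken from HIRIS.

```python
from collections import namedtuple

BedRecord = namedtuple("BedRecord", "chrom start end")

# Hypothetical BED content for illustration only; not real HIRIS data.
SAMPLE_BED = """track name="hypothetical example"
chr1\t12345\t12346
chr1\t20000\t20001
chr2\t500\t501
"""


def parse_bed(text):
    """Parse the three required BED columns, skipping header and blank lines."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("track", "browser", "#")):
            continue
        fields = line.split()
        records.append(BedRecord(fields[0], int(fields[1]), int(fields[2])))
    return records


def sites_in_region(records, chrom, region_start, region_end):
    """Return records overlapping the half-open region [region_start, region_end)."""
    return [
        r for r in records
        if r.chrom == chrom and r.start < region_end and r.end > region_start
    ]


records = parse_bed(SAMPLE_BED)
print(sites_in_region(records, "chr1", 12000, 13000))
```

The overlap test uses strict inequalities because the end coordinate of a BED interval is exclusive.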
A GFF3 file is produced for each gene containing integration sites, letting you quickly focus on a specific gene of interest.
Get started by learning how to use these exports. Once you’re familiar with the data they contain, you can browse the files or download a zip file containing them all.
HIRIS is under active development. However, we don’t know of any issues affecting data quality at the moment. Hooray!
If you spot a problem in HIRIS, please let us know. We strive to maintain high-quality data that is ready to be used in analysis.