Get the Data — HIRIS: HIV-1 Reservoirs Integration Sites

Download Edlefsen_et_al_2018 datasets

Please read the ISDB manual for guidelines on using HIRIS and the data contained within.

These downloadable datasets are broad summaries of the full data contained within HIRIS. Many questions are best answered by querying the database directly. (See “Access the database” below for connection details.) If you’re unfamiliar with SQL databases, we encourage you to collaborate with your local bioinformatics practitioner.

Note that an integration site may fall within more than one gene (for example, genes on opposite strands or read-through variants). In the gene-annotated dataset, such a site is reported once per gene, so a single integration site can occupy multiple rows. To count integration sites, use the integration summary without gene annotations.

Integration summary

Download Edlefsen_et_al_2018 version as: CSV, Excel, or JSON

The Edlefsen_et_al_2018 version is from Thu Oct 26 11:46 2017 PDT.

This dataset contains the following fields:

environment
subject
landmark
location
orientation_in_landmark
multiplicity
source_names
pubmed_ids

The manual contains more information on this downloadable dataset.
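
Since this is the dataset recommended for counting integration sites, a count can be as simple as tallying rows. The following is a minimal sketch in Python using only the standard library. It assumes each row describes one distinct integration site; the sample rows, their values (including the environment and orientation codes), and the function name `count_sites_by_environment` are invented for illustration and are not taken from HIRIS.

```python
import csv
import io
from collections import Counter

# Invented sample rows that reuse the documented column names.
# Real data would come from the CSV download instead.
SAMPLE_CSV = """environment,subject,landmark,location,orientation_in_landmark,multiplicity,source_names,pubmed_ids
in vivo,S1,chr1,1000,F,1,example,0
in vivo,S1,chr2,2000,R,3,example,0
in vitro,,chr1,1500,F,1,example,0
"""


def count_sites_by_environment(csv_file):
    """Tally rows per environment (assumption: one row per integration site)."""
    reader = csv.DictReader(csv_file)
    return Counter(row["environment"] for row in reader)


counts = count_sites_by_environment(io.StringIO(SAMPLE_CSV))
print(dict(counts))
```

To run this against a real download, pass an open file handle (for example `open("integration_summary.csv", newline="")`, where the filename is a placeholder) in place of the `io.StringIO` object.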

Integration summary with annotated genes

Download Edlefsen_et_al_2018 version as: CSV, Excel, or JSON

The Edlefsen_et_al_2018 version is from Thu Oct 26 11:46 2017 PDT.

This dataset contains the following fields:

environment
subject
ncbi_gene_id
gene
landmark
location
orientation_in_landmark
orientation_in_gene
multiplicity
source_names
pubmed_ids

The manual contains more information on this downloadable dataset.
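
Because a site that falls within several genes appears once per gene in this dataset, it can help to group rows back into sites before counting anything. The sketch below does that with the Python standard library. It assumes that the combination of `environment`, `subject`, `landmark`, `location`, and `orientation_in_landmark` identifies a site; that choice of key, the sample rows, and the name `genes_per_site` are illustrative assumptions, so check the manual before relying on them.

```python
import csv
import io

# Assumed site identity: these columns together are treated as one site.
SITE_KEY = ("environment", "subject", "landmark", "location",
            "orientation_in_landmark")

# Invented sample rows: the first two describe the same site in two genes.
SAMPLE_CSV = """environment,subject,ncbi_gene_id,gene,landmark,location,orientation_in_landmark,orientation_in_gene,multiplicity,source_names,pubmed_ids
in vivo,S1,1,GENE_A,chr1,1000,F,F,1,example,0
in vivo,S1,2,GENE_B,chr1,1000,F,R,1,example,0
in vivo,S1,3,GENE_C,chr2,2000,R,R,2,example,0
"""


def genes_per_site(csv_file):
    """Map each site key to the list of genes reported for it."""
    sites = {}
    for row in csv.DictReader(csv_file):
        key = tuple(row[field] for field in SITE_KEY)
        sites.setdefault(key, []).append(row["gene"])
    return sites


sites = genes_per_site(io.StringIO(SAMPLE_CSV))
for key, genes in sites.items():
    print(key, genes)
```

Here three rows collapse into two sites, which is why row counts from this dataset overstate the number of integration sites.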

Summary by gene

Download Edlefsen_et_al_2018 version as: CSV, Excel, or JSON

The Edlefsen_et_al_2018 version is from Thu Oct 26 11:45 2017 PDT.

This dataset contains the following fields:

environment
ncbi_gene_id
gene
subjects
unique_sites
proliferating_sites
total_in_gene

The manual contains more information on this downloadable dataset.

Access the database

If you’re handy with code, you can connect to the database directly as a read-only user. We like both writing SQL and using R’s dplyr package. To connect to the database, try:

Type	PostgreSQL
Host	`hiris.mullins.microbiol.washington.edu`
User	`hiris_ro`
Database name	`isdb_2016-10-25_transcribed_genes_public`

You may also find the reference of available tables handy.
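
If your client accepts a connection URI, the details above can be assembled into one. This is a small sketch in Python; the helper name `build_postgres_uri` is hypothetical. No port or password is listed above, so the sketch omits both and leaves them to the client's defaults, which may or may not be sufficient.

```python
from urllib.parse import quote

# Connection details as listed above.
HOST = "hiris.mullins.microbiol.washington.edu"
USER = "hiris_ro"
DBNAME = "isdb_2016-10-25_transcribed_genes_public"


def build_postgres_uri(host, user, dbname):
    """Build a postgresql:// URI, percent-encoding the user and database name."""
    return f"postgresql://{quote(user, safe='')}@{host}/{quote(dbname, safe='')}"


uri = build_postgres_uri(HOST, USER, DBNAME)
print(uri)
```

libpq-based clients such as `psql` generally accept a URI of this form as their connection string.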

Genome Browser and IGV

If you’re interested in a specific region of the human genome, such as a gene, the UCSC Genome Browser is a great way to start visualizing the data. We provide two custom tracks which you can load into the genome browser, one for in vivo sites and another for in vitro sites.

If you’d rather view the data in other genomics software such as IGV, you can download the track data as BED files.
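
BED files are plain tab-separated text, so they are also easy to read in a script. The sketch below parses the standard BED columns (chrom, start, end, and the optional name, score, and strand) and selects records overlapping a region. BED start coordinates are 0-based and end coordinates are exclusive. The sample lines, the record names, and the helpers `parse_bed` and `overlapping` are invented for illustration; the actual HIRIS track files may carry different or additional columns.

```python
from typing import NamedTuple, Optional


class BedRecord(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # exclusive
    name: Optional[str]
    strand: Optional[str]


def parse_bed(lines):
    """Parse BED lines, skipping blank lines and track/browser/comment headers."""
    records = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("track", "browser", "#")):
            continue
        fields = line.split("\t")
        name = fields[3] if len(fields) > 3 else None
        strand = fields[5] if len(fields) > 5 else None
        records.append(
            BedRecord(fields[0], int(fields[1]), int(fields[2]), name, strand)
        )
    return records


def overlapping(records, chrom, start, end):
    """Return records on chrom that overlap the half-open interval [start, end)."""
    return [r for r in records
            if r.chrom == chrom and r.start < end and r.end > start]


# Invented sample track for illustration only.
SAMPLE_BED = [
    'track name="example"',
    "chr1\t999\t1000\tsite_a\t0\t+",
    "chr1\t4999\t5000\tsite_b\t0\t-",
    "chr2\t99\t100\tsite_c\t0\t+",
]

records = parse_bed(SAMPLE_BED)
hits = overlapping(records, "chr1", 0, 2000)
print([r.name for r in hits])
```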

Annotation files per gene

A GFF3 file is produced for each gene containing integration sites, letting you quickly focus on a specific gene of interest.

Get started by learning how to use these exports. Once you are familiar with their contents, you can browse the files or download a zip file containing them all.
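
For scripted use, a GFF3 file can be read with a few lines of Python. GFF3 defines nine tab-separated columns (seqid, source, type, start, end, score, strand, phase, attributes), with 1-based inclusive coordinates and percent-encoded `key=value` attributes separated by semicolons. The sketch below follows that general layout only; the sample line, its feature type, and its attribute names are invented, and the HIRIS exports may use different ones.

```python
from urllib.parse import unquote

GFF3_COLUMNS = ("seqid", "source", "type", "start", "end",
                "score", "strand", "phase", "attributes")


def parse_gff3_line(line):
    """Parse one GFF3 feature line into a dict with a nested attributes dict."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != len(GFF3_COLUMNS):
        raise ValueError(f"expected 9 columns, got {len(fields)}")
    record = dict(zip(GFF3_COLUMNS, fields))
    record["start"] = int(record["start"])  # 1-based, inclusive
    record["end"] = int(record["end"])      # 1-based, inclusive
    attributes = {}
    for pair in record["attributes"].split(";"):
        if pair:
            key, _, value = pair.partition("=")
            attributes[unquote(key)] = unquote(value)
    record["attributes"] = attributes
    return record


def parse_gff3(lines):
    """Parse feature lines, skipping blank lines and '#' directives/comments."""
    return [parse_gff3_line(line) for line in lines
            if line.strip() and not line.startswith("#")]


# Invented sample content for illustration only.
SAMPLE_GFF3 = [
    "##gff-version 3",
    "chr1\texample\tintegration_site\t1000\t1000\t.\t+\t.\tID=site_a;Name=Site%20A",
]

features = parse_gff3(SAMPLE_GFF3)
print(features[0]["seqid"], features[0]["start"], features[0]["attributes"])
```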

Known issues

HIRIS is under active development. However, we don’t know of any issues affecting data quality at the moment. Hooray!

If you spot a problem in HIRIS, please let us know. We strive to maintain high quality data ready to be used in analysis.

HIRIS: HIV-1 Reservoirs Integration Sites “high-rise”
