HIRIS: HIV-1 Reservoirs Integration Sites “high-rise”

Download latest datasets

Please read the ISDB manual for guidelines on using HIRIS and the data contained within.

These downloadable datasets are broad summaries of the full data contained within HIRIS. Many questions are best answered by querying the database directly. (See below for connection details.) If you’re unfamiliar with SQL databases, you’re encouraged to collaborate with your local bioinformatics practitioner.

Note that an integration site may fall within multiple genes (such as genes on opposite strands or read-through variants). Such a site is reported once per gene, resulting in multiple rows for a single integration site. If you want to count integration sites, use the integration summary without the gene annotations.
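
The overcounting described above can be illustrated with a minimal Python sketch. The rows, gene names, and field values are made up for illustration. Treating the combination of environment, subject, landmark, location, and orientation_in_landmark as the identity of a site is an assumption, not HIRIS's documented definition. The integration summary without gene annotations remains the recommended source for counts.

```python
import csv
import io

# Made-up rows in the shape of the gene-annotated summary (subset of columns).
# The second and third rows are the same site reported under two genes.
SAMPLE = """environment,subject,gene,landmark,location,orientation_in_landmark
in vivo,S1,GENE_A,chr1,1000,F
in vivo,S1,GENE_B,chr2,5000,R
in vivo,S1,GENE_C,chr2,5000,R
"""

# Assumption: these columns together identify one integration site.
SITE_KEY = ("environment", "subject", "landmark", "location", "orientation_in_landmark")

def count_rows_and_sites(text):
    """Return (number of rows, number of distinct sites under SITE_KEY)."""
    rows = list(csv.DictReader(io.StringIO(text)))
    sites = {tuple(row[k] for k in SITE_KEY) for row in rows}
    return len(rows), len(sites)

row_count, site_count = count_rows_and_sites(SAMPLE)
print(row_count, site_count)
```

Here the row count exceeds the site count because one site appears once per overlapping gene.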

Integration summary

Download latest version as: CSV, Excel, JS, or JSON

The latest version is from Sat May 26 07:15 2018 PDT.

This dataset contains the following fields:

  1. environment
  2. subject
  3. landmark
  4. location
  5. orientation_in_landmark
  6. multiplicity
  7. source_names
  8. pubmed_ids

The manual contains more information on this downloadable dataset.

Integration summary with annotated genes

Download latest version as: CSV, Excel, JS, or JSON

The latest version is from Sat May 26 07:15 2018 PDT.

This dataset contains the following fields:

  1. environment
  2. subject
  3. ncbi_gene_id
  4. gene
  5. gene_type
  6. landmark
  7. location
  8. orientation_in_landmark
  9. orientation_in_gene
  10. multiplicity
  11. source_names
  12. pubmed_ids

The manual contains more information on this downloadable dataset.

Summary by gene

Download latest version as: CSV, Excel, JS, or JSON

The latest version is from Sat May 26 07:15 2018 PDT.

This dataset contains the following fields:

  1. environment
  2. ncbi_gene_id
  3. gene
  4. gene_type
  5. subjects
  6. unique_sites
  7. proliferating_sites
  8. total_in_gene

The manual contains more information on this downloadable dataset.

Frozen versions

When it’s time to do a set of analyses, we freeze a version of HIRIS at a moment in time. All analysis is then done on that frozen version, ensuring that the data doesn’t drift as analyses are tweaked and finalized. This is similar to using a single, specific version of the human genome while working on a study.

The following frozen versions are available:

Access the database

If you’re handy with code, you can connect to the database directly as a read-only user. We like both writing SQL and using R’s dplyr package. To connect to the database, try:

Type: PostgreSQL
Host: hiris.mullins.microbiol.washington.edu
User: hiris_ro
Database name: hiris

You may also find the reference of available tables handy.
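
For Python users, the connection details above can be assembled into a libpq-style connection URI. This is a sketch only. psycopg2 is one possible driver and is an assumption, since only SQL and R's dplyr are mentioned here. The default PostgreSQL port is assumed, and no password is included because none is listed.

```python
from urllib.parse import quote

# Connection details as listed above.
HOST = "hiris.mullins.microbiol.washington.edu"
USER = "hiris_ro"
DBNAME = "hiris"

def build_dsn(host=HOST, user=USER, dbname=DBNAME):
    """Return a postgresql:// URI; the port is left to the driver's default."""
    return f"postgresql://{quote(user)}@{host}/{quote(dbname)}"

def connect():
    """Open a connection. Assumes the third-party psycopg2 package is installed."""
    import psycopg2  # imported lazily so building the URI works without it
    return psycopg2.connect(build_dsn())

print(build_dsn())
```

Calling `connect()` requires network access to the host. The table and column names to query are best taken from the reference of available tables.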

Genome Browser and IGV

If you’re interested in a specific region of the human genome, such as a gene, the UCSC Genome Browser is a great way to start visualizing the data. We provide two custom tracks which you can load into the genome browser, one for in vivo sites and another for in vitro sites.

If you’d rather view the data in other genomics software such as IGV, you can download the track data as BED files.
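
If you want to process the BED files with a script instead, the first three columns of the BED format (chromosome, 0-based start, exclusive end) are enough to filter by region. A minimal Python sketch follows. The sample lines, coordinates, and the `sites_in_region` helper are hypothetical, and nothing is assumed about the columns HIRIS includes beyond the first three.

```python
# Made-up BED lines; real HIRIS tracks may carry additional columns.
SAMPLE_BED = """track name="example"
chr1\t999\t1000
chr2\t4999\t5000
chr2\t8000\t8001
"""

def parse_bed(text):
    """Yield (chrom, start, end) tuples, skipping track/browser/comment lines."""
    for line in text.splitlines():
        if not line.strip() or line.startswith(("track", "browser", "#")):
            continue
        chrom, start, end = line.split("\t")[:3]
        yield chrom, int(start), int(end)

def sites_in_region(text, chrom, region_start, region_end):
    """Return features overlapping the half-open interval [region_start, region_end)."""
    return [
        (c, s, e)
        for c, s, e in parse_bed(text)
        if c == chrom and s < region_end and e > region_start
    ]

hits = sites_in_region(SAMPLE_BED, "chr2", 4000, 6000)
print(hits)
```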

Annotation files per gene

A GFF3 file is produced for each gene containing integration sites, letting you quickly focus on a specific gene of interest.
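
Because GFF3 is a nine-column, tab-separated format, these files are also easy to read with a short script. The Python sketch below parses one made-up feature line. The source, feature type, attribute keys, and values are illustrative assumptions, not the contents of an actual HIRIS export.

```python
from urllib.parse import unquote

GFF3_COLUMNS = ("seqid", "source", "type", "start", "end",
                "score", "strand", "phase", "attributes")

# Made-up feature line for illustration only.
SAMPLE_LINE = "chr2\texample\tsite\t5000\t5000\t.\t-\t.\tID=site1;Name=example%20site"

def parse_gff3_line(line):
    """Split one GFF3 feature line into a dict, decoding the attributes column."""
    fields = dict(zip(GFF3_COLUMNS, line.rstrip("\n").split("\t")))
    fields["start"] = int(fields["start"])
    fields["end"] = int(fields["end"])
    attributes = {}
    for pair in fields["attributes"].split(";"):
        if pair:
            key, _, value = pair.partition("=")
            attributes[unquote(key)] = unquote(value)
    fields["attributes"] = attributes
    return fields

feature = parse_gff3_line(SAMPLE_LINE)
print(feature["seqid"], feature["start"], feature["attributes"])
```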

Get started by learning how to use these exports. Once familiar with their data, you can browse the files or download a zip file containing them all.

Known issues

HIRIS is under active development. However, we don’t know of any issues affecting data quality at the moment. Hooray!

If you spot a problem in HIRIS, please let us know. We strive to maintain high-quality data that is ready to be used in analysis.