This page is part of a copy of HIRIS frozen in time on 26 October 2017. The permanent URL of this frozen version is: https://mullinslab.microbiol.washington.edu/hiris/frozen/Edlefsen_et_al_2018/.

If you’re not using this frozen version for analysis, you may want to switch to the latest version.

HIRIS: HIV-1 Reservoirs Integration Sites “high-rise”

Download Edlefsen_et_al_2018 datasets

Please read the ISDB manual for guidelines on using HIRIS and the data contained within.

These downloadable datasets are broad summaries of the full data contained within HIRIS. Many questions are best answered by querying the database directly. (See below for connection details.) If you’re unfamiliar with SQL databases, you’re encouraged to collaborate with your local bioinformatics practitioner.

Note that an integration site may fall within multiple genes (for example, genes on opposite strands or read-through variants). Such a site is reported once per gene, so a single integration site can appear in multiple rows. To count integration sites, use the integration summary without the gene annotations.

Integration summary

Download Edlefsen_et_al_2018 version as: CSV, Excel, or JSON

The Edlefsen_et_al_2018 version is from Thu Oct 26 11:46 2017 PDT.

This dataset contains the following fields:

  1. environment
  2. subject
  3. landmark
  4. location
  5. orientation_in_landmark
  6. multiplicity
  7. source_names
  8. pubmed_ids

The manual contains more information on this downloadable dataset.

Integration summary with annotated genes

Download Edlefsen_et_al_2018 version as: CSV, Excel, or JSON

The Edlefsen_et_al_2018 version is from Thu Oct 26 11:46 2017 PDT.

This dataset contains the following fields:

  1. environment
  2. subject
  3. ncbi_gene_id
  4. gene
  5. landmark
  6. location
  7. orientation_in_landmark
  8. orientation_in_gene
  9. multiplicity
  10. source_names
  11. pubmed_ids

The manual contains more information on this downloadable dataset.
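
Because a site covered by several genes appears once per gene in this dataset, counting rows overcounts sites. The following is a minimal sketch of collapsing rows back to distinct sites using only the Python standard library. The sample rows are invented for illustration and are not real HIRIS data, and the choice of columns that identify a single site (SITE_KEY) is an assumption; consult the manual before relying on it.

```python
import csv
import io

# Hypothetical sample rows in the gene-annotated layout. All values are
# invented for illustration and are not real HIRIS data.
SAMPLE_CSV = """\
environment,subject,ncbi_gene_id,gene,landmark,location,orientation_in_landmark,orientation_in_gene,multiplicity,source_names,pubmed_ids
in vivo,S1,1001,GENE_A,chr1,12345,F,F,2,example_source,
in vivo,S1,1002,GENE_B,chr1,12345,F,R,2,example_source,
in vivo,S2,1003,GENE_C,chr2,67890,R,R,1,example_source,
"""

# Assumption: these columns together identify one integration site.
SITE_KEY = ("environment", "subject", "landmark", "location",
            "orientation_in_landmark")


def count_rows(csv_file):
    """Count data rows (one per site/gene pair)."""
    return sum(1 for _ in csv.DictReader(csv_file))


def count_unique_sites(csv_file):
    """Count distinct sites, ignoring the per-gene row duplication."""
    reader = csv.DictReader(csv_file)
    return len({tuple(row[col] for col in SITE_KEY) for row in reader})


total_rows = count_rows(io.StringIO(SAMPLE_CSV))
unique_sites = count_unique_sites(io.StringIO(SAMPLE_CSV))
print(f"{total_rows} rows, {unique_sites} unique sites")
```

In the sample, the first two rows describe the same site annotated with two different genes, so three rows collapse to two sites. For real analyses, the integration summary without gene annotations avoids the need for this step.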

Summary by gene

Download Edlefsen_et_al_2018 version as: CSV, Excel, or JSON

The Edlefsen_et_al_2018 version is from Thu Oct 26 11:45 2017 PDT.

This dataset contains the following fields:

  1. environment
  2. ncbi_gene_id
  3. gene
  4. subjects
  5. unique_sites
  6. proliferating_sites
  7. total_in_gene

The manual contains more information on this downloadable dataset.

Access the database

If you’re handy with code, you can connect to the database directly as a read-only user. We like both writing SQL and using R’s dplyr package. To connect to the database, try:

Type: PostgreSQL
Host: hiris.mullins.microbiol.washington.edu
User: hiris_ro
Database name: isdb_2016-10-25_transcribed_genes_public

You may also find the reference of available tables handy.
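
As a rough sketch, the connection details above can be assembled into a PostgreSQL connection URI in Python. This page lists no port or password, so the sketch assumes the default port and a passwordless read-only login, which may not hold. The helper names build_postgres_uri and open_connection are hypothetical, and psycopg2 is a third-party driver chosen here only for illustration.

```python
from urllib.parse import quote

# Connection details as listed on this page.
HOST = "hiris.mullins.microbiol.washington.edu"
USER = "hiris_ro"
DBNAME = "isdb_2016-10-25_transcribed_genes_public"


def build_postgres_uri(user, host, dbname):
    """Assemble a connection URI; no port or password is included (assumption)."""
    return f"postgresql://{quote(user, safe='')}@{host}/{quote(dbname, safe='')}"


def open_connection(connection_uri):
    """Open a connection using psycopg2, imported lazily so the rest runs without it."""
    import psycopg2  # third-party package; an assumption, not a HIRIS requirement
    return psycopg2.connect(connection_uri)


uri = build_postgres_uri(USER, HOST, DBNAME)
print(uri)
```

Calling open_connection(uri) would attempt the actual connection. The same URI form is generally accepted by other PostgreSQL clients, such as psql.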

Genome Browser and IGV

If you’re interested in a specific region of the human genome, such as a gene, the UCSC Genome Browser is a great way to start visualizing the data. We provide two custom tracks which you can load into the genome browser, one for in vivo sites and another for in vitro sites.

If you’d rather view the data in other genomics software such as IGV, you can download the track data as BED files.
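
If you download the BED files and want to narrow them to a region of interest outside a genome browser, a small script can do the filtering. The sketch below relies only on the standard BED convention that the first three columns are chromosome, 0-based start, and exclusive end. The sample lines and the names BedRecord, parse_bed, and in_region are hypothetical, and the actual HIRIS tracks may carry additional columns.

```python
from typing import NamedTuple


class BedRecord(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # exclusive


def parse_bed(lines):
    """Yield a BedRecord per data line, skipping track/browser/comment lines."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("track", "browser", "#")):
            continue
        fields = line.split()
        yield BedRecord(fields[0], int(fields[1]), int(fields[2]))


def in_region(record, chrom, region_start, region_end):
    """True if the record overlaps the half-open region [region_start, region_end)."""
    return (record.chrom == chrom
            and record.start < region_end
            and record.end > region_start)


# Hypothetical BED content, invented for illustration (not real HIRIS data).
SAMPLE_BED = """\
track name="hypothetical example"
chr1\t100\t101\tsite_a\t1\t+
chr1\t5000\t5001\tsite_b\t3\t-
chr2\t150\t151\tsite_c\t1\t+
"""

hits = [r for r in parse_bed(SAMPLE_BED.splitlines())
        if in_region(r, "chr1", 0, 1000)]
print(hits)
```

Only the first sample record falls on chr1 within the first 1000 bases, so hits holds that single record.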

Annotation files per gene

A GFF3 file is produced for each gene containing integration sites, letting you quickly focus on a specific gene of interest.

Get started by learning how to use these exports. Once familiar with their data, you can browse the files or download a zip file containing them all.

Known issues

HIRIS is under active development. However, we don’t know of any issues affecting data quality at the moment. Hooray!

If you spot a problem in HIRIS, please let us know. We strive to maintain high quality data ready to be used in analysis.