This page documents functions that use the IGVF REST API, documented at https://api.catalog.igvf.org/#.
Note that each function returns only a limited number of responses; see the limit and page arguments below to control the number of responses.
gene_variants() locates variants associated with a gene. Only one of gene_id, hgnc, gene_name, or alias should be specified.
variant_genes() locates genes associated with a variant. Only one of spdi, hgvs, rsid, variant_id, or chr + position should be specified.
gene_elements() locates elements associated with a gene.
elements() locates genomic elements based on a genomic range query.
element_genes() locates genomic elements and associated genes based on a genomic range query.
Usage
gene_variants(
  gene_id = NULL,
  hgnc = NULL,
  gene_name = NULL,
  alias = NULL,
  organism = "Homo sapiens",
  log10pvalue = NULL,
  effect_size = NULL,
  page = 0L,
  limit = 25L,
  verbose = FALSE
)
variant_genes(
  spdi = NULL,
  hgvs = NULL,
  rsid = NULL,
  variant_id = NULL,
  chr = NULL,
  position = NULL,
  organism = "Homo sapiens",
  log10pvalue = NULL,
  effect_size = NULL,
  page = 0L,
  limit = 25L,
  verbose = FALSE
)
gene_elements(gene_id = NULL, page = 0L, limit = 25L, verbose = FALSE)
elements(range = NULL, page = 0L, limit = 25L)
element_genes(range = NULL, page = 0L, limit = 25L, verbose = FALSE)
Arguments
- gene_id
character(1) Ensembl gene identifier, e.g., "ENSG00000106633"
- hgnc
character(1) HGNC identifier
- gene_name
character(1) Gene symbol, e.g., "GCK"
- alias
character(1) Gene alias
- organism
character(1) Either 'Homo sapiens' (default) or 'Mus musculus'
- log10pvalue
character(1) Optional threshold on the negative log10 p-value, written as an operator followed by ":" and a value, e.g., "gt:5.0". The available operators are gt (>), gte (>=), lt (<), and lte (<=).
- effect_size
character(1) Optional string used for thresholding on the effect size of the variant on the gene. See 'log10pvalue'. E.g., "gt:0.5"
- page
integer(1) When there are more response items than limit, offers pagination. Pages start at 0L; the next page is 1L, and so on.
- limit
integer(1) The limit parameter controls the page size and cannot exceed 1000
- verbose
logical(1) return additional information about variants and genes
- spdi
character(1) SPDI of variant
- hgvs
character(1) HGVS of variant
- rsid
character(1) RSID of variant
- variant_id
character(1) IGVF variant ID
- chr
character(1) UCSC-style chromosome name of variant, e.g. "chr1"
- position
character(1) 0-based position of variant
- range
the query GRanges (expects 1-based start position)
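The threshold strings accepted by log10pvalue and effect_size follow the operator:value pattern described above. As a minimal sketch, the helper below assembles such a string and rejects unknown operators; make_threshold() is a hypothetical name used for illustration and is not part of rigvf. It is also an assumption that the API treats "gt:5" and "gt:5.0" identically.

```r
# Hypothetical helper (not part of rigvf): build an "operator:value" string
# for the log10pvalue or effect_size arguments.
make_threshold <- function(op = c("gt", "gte", "lt", "lte"), value) {
    op <- match.arg(op)  # only the four documented operators are allowed
    stopifnot(is.numeric(value), length(value) == 1L, !is.na(value))
    paste0(op, ":", format(value, scientific = FALSE))
}

thr <- make_threshold("gt", 0.5)
thr

# Possible use with a live query (requires network access):
# rigvf::gene_variants(gene_name = "GCK", effect_size = thr)
```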
Value
gene_variants() returns a tibble describing variants associated with the gene; use verbose = TRUE to retrieve more extensive information.
variant_genes() returns a tibble describing genes associated with a variant; use verbose = TRUE to retrieve more extensive information.
gene_elements() returns a tibble describing elements associated with the gene; use verbose = TRUE to retrieve more extensive information.
elements() returns a GRanges object describing elements.
element_genes() returns a tibble describing genomic element and gene pairs.
Examples
rigvf::gene_variants(gene_name = "GCK")
#> # A tibble: 25 × 9
#> `sequence variant` gene label log10pvalue effect_size source source_url
#> <chr> <chr> <chr> <dbl> <dbl> <chr> <chr>
#> 1 variants/8c6a683829bcb… gene… eQTL 4.89 0.274 GTEx https://s…
#> 2 variants/cf796b5a16212… gene… eQTL 5.76 0.221 GTEx https://s…
#> 3 variants/9a36af4633321… gene… eQTL 6.17 -0.266 GTEx https://s…
#> 4 variants/2fefe07a0750b… gene… eQTL 3.69 0.158 GTEx https://s…
#> 5 variants/ab6df1152a643… gene… eQTL 16.9 -0.353 GTEx https://s…
#> 6 variants/92833b52621e5… gene… eQTL 4.86 -0.170 GTEx https://s…
#> 7 variants/bceca4e6ac3cd… gene… eQTL 4.63 -0.340 GTEx https://s…
#> 8 variants/0a8ba63e5451a… gene… eQTL 4.94 0.215 GTEx https://s…
#> 9 variants/80f639e0da643… gene… eQTL 6.59 -0.330 GTEx https://s…
#> 10 variants/7f4ca6f1cfd70… gene… eQTL 4.10 -0.165 GTEx https://s…
#> # ℹ 15 more rows
#> # ℹ 2 more variables: biological_context <chr>, chr <chr>
rigvf::gene_variants(gene_name = "GCK", effect_size = "gt:0.5")
#> # A tibble: 13 × 9
#> `sequence variant` gene label log10pvalue effect_size source source_url
#> <chr> <chr> <chr> <dbl> <dbl> <chr> <chr>
#> 1 variants/2e1a52f2551d6… gene… eQTL 5.08 0.502 GTEx https://s…
#> 2 variants/513df90562f2e… gene… eQTL 3.68 0.628 GTEx https://s…
#> 3 variants/ba959d0d02884… gene… eQTL 3.90 0.515 GTEx https://s…
#> 4 variants/965035c14bc1a… gene… eQTL 3.65 0.548 GTEx https://s…
#> 5 variants/e0d388dd97ea5… gene… eQTL 5.74 0.541 GTEx https://s…
#> 6 variants/a2362bdd461e7… gene… eQTL 4.63 0.770 GTEx https://s…
#> 7 variants/a4919ef974d2c… gene… eQTL 4.17 0.732 GTEx https://s…
#> 8 variants/c117f48f3e28e… gene… eQTL 3.98 0.589 GTEx https://s…
#> 9 variants/f72c43e10b242… gene… eQTL 4.03 0.581 GTEx https://s…
#> 10 variants/ba959d0d02884… gene… eQTL 4.09 0.502 GTEx https://s…
#> 11 variants/7fc7d0a82445b… gene… eQTL 5.36 0.552 GTEx https://s…
#> 12 variants/74b4b8f7b37f5… gene… eQTL 4.13 0.604 GTEx https://s…
#> 13 variants/829ce03f1d460… gene… eQTL 4.50 0.631 GTEx https://s…
#> # ℹ 2 more variables: biological_context <chr>, chr <chr>
rigvf::gene_variants(gene_name = "GCK", verbose = TRUE)
#> # A tibble: 25 × 9
#> `sequence variant` gene label log10pvalue effect_size source
#> <list> <list> <chr> <dbl> <dbl> <chr>
#> 1 <named list [14]> <named list [11]> eQTL 4.89 0.274 GTEx
#> 2 <named list [14]> <named list [11]> eQTL 5.76 0.221 GTEx
#> 3 <named list [14]> <named list [11]> eQTL 6.17 -0.266 GTEx
#> 4 <named list [14]> <named list [11]> eQTL 3.69 0.158 GTEx
#> 5 <named list [14]> <named list [11]> eQTL 16.9 -0.353 GTEx
#> 6 <named list [14]> <named list [11]> eQTL 4.86 -0.170 GTEx
#> 7 <named list [14]> <named list [11]> eQTL 4.63 -0.340 GTEx
#> 8 <named list [14]> <named list [11]> eQTL 4.94 0.215 GTEx
#> 9 <named list [14]> <named list [11]> eQTL 6.59 -0.330 GTEx
#> 10 <named list [14]> <named list [11]> eQTL 4.10 -0.165 GTEx
#> # ℹ 15 more rows
#> # ℹ 3 more variables: source_url <chr>, biological_context <chr>, chr <chr>
rigvf::variant_genes(spdi = "NC_000001.11:920568:G:A")
#> # A tibble: 6 × 12
#> `sequence variant` gene label log10pvalue effect_size source source_url
#> <chr> <chr> <chr> <dbl> <dbl> <chr> <chr>
#> 1 variants/c41b54297becfa… gene… eQTL 5.40 1.60 GTEx https://s…
#> 2 variants/c41b54297becfa… gene… eQTL 4.97 1.92 GTEx https://s…
#> 3 variants/c41b54297becfa… gene… eQTL 4.79 1.67 GTEx https://s…
#> 4 variants/c41b54297becfa… gene… eQTL 5.35 -0.671 GTEx https://s…
#> 5 variants/c41b54297becfa… gene… eQTL 4.82 0.719 GTEx https://s…
#> 6 variants/c41b54297becfa… gene… eQTL 4.31 0.562 GTEx https://s…
#> # ℹ 5 more variables: biological_context <chr>, chr <chr>, intron_chr <list>,
#> # intron_start <list>, intron_end <list>
res <- rigvf::gene_elements(gene_id = "ENSG00000187961")
res
#> # A tibble: 25 × 2
#> gene regions
#> <list> <list>
#> 1 <named list [5]> <named list [8]>
#> 2 <named list [5]> <named list [8]>
#> 3 <named list [5]> <named list [8]>
#> 4 <named list [5]> <named list [8]>
#> 5 <named list [5]> <named list [8]>
#> 6 <named list [5]> <named list [8]>
#> 7 <named list [5]> <named list [8]>
#> 8 <named list [5]> <named list [8]>
#> 9 <named list [5]> <named list [8]>
#> 10 <named list [5]> <named list [8]>
#> # ℹ 15 more rows
res |>
    dplyr::select(regions) |>
    tidyr::unnest_wider(regions)
#> # A tibble: 25 × 8
#> id cell_type score model dataset enhancer_type enhancer_start
#> <chr> <chr> <dbl> <chr> <chr> <chr> <int>
#> 1 genomic_elements… hela 0.0244 ENCO… https:… accessible d… 1000976
#> 2 genomic_elements… cd4-posi… 0.0168 ENCO… https:… accessible d… 1000976
#> 3 genomic_elements… cd8-posi… 0.0271 ENCO… https:… accessible d… 1000976
#> 4 genomic_elements… esophagu… 0.0180 ENCO… https:… accessible d… 1000976
#> 5 genomic_elements… astrocyte 0.0188 ENCO… https:… accessible d… 1000976
#> 6 genomic_elements… endometr… 0.0156 ENCO… https:… accessible d… 1000976
#> 7 genomic_elements… ependyma… 0.0160 ENCO… https:… accessible d… 1000976
#> 8 genomic_elements… osteobla… 0.0109 ENCO… https:… accessible d… 1000976
#> 9 genomic_elements… macropha… 0.0137 ENCO… https:… accessible d… 1000976
#> 10 genomic_elements… astrocyte 0.0172 ENCO… https:… accessible d… 1000976
#> # ℹ 15 more rows
#> # ℹ 1 more variable: enhancer_end <int>
rng <- GenomicRanges::GRanges("chr1", IRanges::IRanges(1157520, 1158189))
rigvf::elements(range = rng)
#> GRanges object with 25 ranges and 5 metadata columns:
#> seqnames ranges strand | name
#> <Rle> <IRanges> <Rle> | <character>
#> [1] chr1 1158136-1158445 * | EH38E1310547
#> [2] chr1 1157437-1157782 * | EH38E2777055
#> [3] chr1 1157886-1158053 * | EH38E2777056
#> [4] chr1 1157528-1158185 * | enhancer_chr1_115752..
#> [5] chr1 1157528-1158185 * | enhancer_chr1_115752..
#> ... ... ... ... . ...
#> [21] chr1 1157528-1158185 * | enhancer_chr1_115752..
#> [22] chr1 1157528-1158185 * | enhancer_chr1_115752..
#> [23] chr1 1157528-1158185 * | enhancer_chr1_115752..
#> [24] chr1 1157528-1158185 * | enhancer_chr1_115752..
#> [25] chr1 1157528-1158185 * | enhancer_chr1_115752..
#> source_annotation type source
#> <character> <character> <character>
#> [1] dELS: distal Enhance.. candidate cis regula.. ENCODE_SCREEN (ccREs)
#> [2] dELS: distal Enhance.. candidate cis regula.. ENCODE_SCREEN (ccREs)
#> [3] dELS: distal Enhance.. candidate cis regula.. ENCODE_SCREEN (ccREs)
#> [4] enhancer accessible dna eleme.. ENCODE_EpiRaction
#> [5] enhancer accessible dna eleme.. ENCODE_EpiRaction
#> ... ... ... ...
#> [21] enhancer accessible dna eleme.. ENCODE_EpiRaction
#> [22] enhancer accessible dna eleme.. ENCODE_EpiRaction
#> [23] enhancer accessible dna eleme.. ENCODE_EpiRaction
#> [24] enhancer accessible dna eleme.. ENCODE_EpiRaction
#> [25] enhancer accessible dna eleme.. ENCODE_EpiRaction
#> source_url
#> <character>
#> [1] https://data.igvf.or..
#> [2] https://data.igvf.or..
#> [3] https://data.igvf.or..
#> [4] https://www.encodepr..
#> [5] https://www.encodepr..
#> ... ...
#> [21] https://www.encodepr..
#> [22] https://www.encodepr..
#> [23] https://www.encodepr..
#> [24] https://www.encodepr..
#> [25] https://www.encodepr..
#> -------
#> seqinfo: 1 sequence from hg38 genome; no seqlengths
rigvf::element_genes(range = rng)
#> # A tibble: 25 × 6
#> score source source_url significant gene biosample
#> <dbl> <chr> <chr> <list> <chr> <chr>
#> 1 0.0102 ENCODE_EpiRaction https://www.encodeproje… <NULL> gene… prostate…
#> 2 0.0113 ENCODE_EpiRaction https://www.encodeproje… <NULL> gene… hela
#> 3 0.0170 ENCODE_EpiRaction https://www.encodeproje… <NULL> gene… monocyte
#> 4 0.0137 ENCODE_EpiRaction https://www.encodeproje… <NULL> gene… fibrobla…
#> 5 0.0106 ENCODE_EpiRaction https://www.encodeproje… <NULL> gene… prostate…
#> 6 0.0185 ENCODE_EpiRaction https://www.encodeproje… <NULL> gene… endometr…
#> 7 0.0341 ENCODE_EpiRaction https://www.encodeproje… <NULL> gene… bone mar…
#> 8 0.0454 ENCODE_EpiRaction https://www.encodeproje… <NULL> gene… hela
#> 9 0.0460 ENCODE_EpiRaction https://www.encodeproje… <NULL> gene… stomach
#> 10 0.0126 ENCODE_EpiRaction https://www.encodeproje… <NULL> gene… hepg2
#> # ℹ 15 more rows
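
Several of the results above end with "15 more rows" because each call returns a single page of at most limit items. The following is a minimal sketch of how the page and limit arguments could be combined to collect successive pages. fetch_all_pages() is a hypothetical helper, not part of rigvf, and it assumes that a page shorter than limit is the last one. The check at the end runs against a mock data source, so no network access is needed.

```r
# Hypothetical helper (not part of rigvf): collect successive pages into one table.
# `fetch_page` is any function(page, limit) returning a data frame or tibble.
fetch_all_pages <- function(fetch_page, limit = 25L, max_pages = 10L) {
    pages <- list()
    for (page in seq_len(max_pages) - 1L) {  # pages are numbered from 0L
        res <- fetch_page(page, limit)
        if (nrow(res) == 0L) break           # nothing left to fetch
        pages[[length(pages) + 1L]] <- res
        if (nrow(res) < limit) break         # assumption: a short page is the last page
    }
    if (length(pages) == 0L) return(NULL)
    do.call(rbind, pages)
}

# Offline check with a mock data source of 60 rows
mock <- data.frame(id = seq_len(60))
mock_fetch <- function(page, limit) {
    start <- page * limit + 1L
    if (start > nrow(mock)) return(mock[0, , drop = FALSE])
    end <- min(start + limit - 1L, nrow(mock))
    mock[start:end, , drop = FALSE]
}
all_rows <- fetch_all_pages(mock_fetch, limit = 25L)
nrow(all_rows)

# Possible use with a live query (requires network access):
# all_gck <- fetch_all_pages(function(page, limit)
#     rigvf::gene_variants(gene_name = "GCK", page = page, limit = limit))
```

The max_pages cap keeps the loop from issuing an unbounded number of requests. With verbose = TRUE the results contain list columns, for which dplyr::bind_rows() may be a more robust way to combine pages than rbind().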