library(SeedMatchR)
library(gt)

Generating a SeedMatchReport

A SeedMatchReport is an analysis that runs through a pre-defined list of sequence definitions for your siRNA, scans the annotations found in your DESeq2 results file, and reports some basic statistics.

Example workflow

Load annotations

annodb = load_annotations(reference.name = "rnor6", canonical = FALSE, min.feature.width = 8, longest.utr = TRUE)
#> Build AnnotationFilter for transcript features based on the following parameters: 
#> Keep only standard chroms: TRUE
#> Remove rows with NA in transcript ID: TRUE
#> Keep only protein coding genes and transcripts: TRUE
#> Filtering for transcripts with support level: FALSE
#> Keep only the ENSEMBL canonical transcript: FALSE
#> Filtering for specific genes: FALSE
#> Filtering for specific transcripts: FALSE
#> Filtering for specific gene symbols: FALSE
#> Filtering for specific entrez id: FALSE
#> Loading annotations from AnnotationHub for rnor6
#> loading from cache
#> require("rtracklayer")
#> Warning: replacing previous import 'S4Arrays::makeNindexFromArrayViewport' by
#> 'DelayedArray::makeNindexFromArrayViewport' when loading 'SummarizedExperiment'
#> loading from cache
#> require("ensembldb")
#> Extracting 3UTR from ensembldb object.
#> Keeping the longest UTR per gene.
#> Extracting sequences for each feature.
#> Keeping sequences that are >= 8

Load example DESeq2 data

get_example_data("sirna")
#> Example data directory being created at: /home/runner/.local/share/R/SeedMatchR
#> Warning in dir.create(data.path, recursive = TRUE):
#> '/home/runner/.local/share/R/SeedMatchR' already exists

sirna.data = load_example_data("sirna")

res <- sirna.data$Schlegel_2022_Ttr_D1_30mkg

res = filter_res(res)

Generate report

The report can be generated from searches with or without indels. It is important to think about how indels will alter the results of the analysis. The edit distance (D) is the total number of mismatches plus indels allowed during the search. Therefore, if the indel.bool flag is set to TRUE, any insertion or deletion counts towards the edit distance. An edit distance of 4 could be 4 mismatches, 3 mismatches + 1 indel, or any other combination of mismatches and indels that adds up to 4.
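To make the edit distance concrete, here is a minimal sketch in base R. It is not how SeedMatchR performs its search; the sequences and the helper `hamming()` are hypothetical and exist only for illustration. It contrasts a mismatch-only (Hamming) distance with the Levenshtein distance from `adist()`, which also counts insertions and deletions, and then lists the ways an edit distance of 4 can be split.

```r
# Hypothetical 8-nt target site and two altered copies (illustration only).
site         <- "ACAGUGUU"
one_mismatch <- "ACAGAGUU"  # position 5 substituted (U -> A)
one_deletion <- "ACAGGUU"   # position 5 deleted, so the string is 7 nt long

# Mismatch-only distance: count differing positions in equal-length strings.
hamming <- function(a, b) {
  stopifnot(nchar(a) == nchar(b))
  sum(strsplit(a, "")[[1]] != strsplit(b, "")[[1]])
}

hamming(site, one_mismatch)            # substitutions only

# adist() (utils) returns the Levenshtein distance:
# substitutions, insertions and deletions each cost 1 by default.
as.vector(adist(site, one_mismatch))   # one substitution
as.vector(adist(site, one_deletion))   # one deletion

# Every way to split an edit distance of D between mismatches and indels.
D <- 4
splits <- data.frame(mismatches = D:0, indels = 0:D)
splits
```

Only the first row of `splits` (4 mismatches, 0 indels) is possible when indels are not allowed; the remaining rows become possible once indels count towards the distance.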

Generate report without indels

default.report = SeedMatchReport(res = res, seqs = annodb$seqs, guide.seq = "UUAUAGAGCAAGAACACUGUUUU", indel.bool = FALSE)

default.report$table
In-silico siRNA Binding Prediction
Identifying siRNA hits in the transcriptome
SeedMatchReport

Full Guide Strand (g2:g23)
                               D0       D1       D2       D3       D4
In silico predictions           0        1        0        0        2
Expressed predictions           0        1        0        0        0
Off-target predictions          0        1        0        0        0
% off-target                0.00%  100.00%    0.00%    0.00%    0.00%

18-mer (g2:g19)
                               D0       D1       D2       D3       D4
In silico predictions           0        0        0        7      102
Expressed predictions           0        0        0        7       64
Off-target predictions          0        0        0        1        3
% off-target                0.00%    0.00%    0.00%   14.29%    4.69%

15-mer (g2:g19)
                               D0       D1       D2       D3       D4
In silico predictions           0        0       15      266    2,226
Expressed predictions           0        0       11      166    1,267
Off-target predictions          0        0        3       11       50
% off-target                0.00%    0.00%   27.27%    6.63%    3.95%

                             8mer  7mer-m8  7mer-A1     6mer    Total
In silico predictions          86      214      409      764    4,092
Expressed predictions          37      104      231      409    2,297
Off-target predictions          3        6       14       24      116
% off-target                8.11%    5.77%    6.06%    5.87%    5.05%
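The "% off-target" row in the table above is consistent with dividing the off-target predictions by the expressed predictions. The following sketch recomputes three of the reported percentages under that assumption; the formula is inferred from the numbers and not taken from the SeedMatchR source, and the variable names are hypothetical.

```r
# Counts copied from the 15-mer D2-D4 columns of the report without indels.
expressed  <- c(D2 = 11, D3 = 166, D4 = 1267)
off_target <- c(D2 = 3,  D3 = 11,  D4 = 50)

# Assumed formula: off-target predictions as a share of expressed predictions.
# Columns with no expressed predictions are shown as 0.00% in the table,
# so division by zero is mapped to 0 here.
pct_off_target <- ifelse(expressed > 0, 100 * off_target / expressed, 0)

sprintf("%.2f%%", pct_off_target)
# "27.27%" "6.63%" "3.95%"
```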

Generate report with indels

indel.report = SeedMatchReport(res = res, seqs = annodb$seqs, guide.seq = "UUAUAGAGCAAGAACACUGUUUU", indel.bool = TRUE)

indel.report$table
In-silico siRNA Binding Prediction
Identifying siRNA hits in the transcriptome
SeedMatchReport

Full Guide Strand (g2:g23)
                               D0       D1       D2       D3       D4
In silico predictions           0        1        0        1       20
Expressed predictions           0        1        0        1       11
Off-target predictions          0        1        0        1        2
% off-target                0.00%  100.00%    0.00%  100.00%   18.18%

18-mer (g2:g19)
                               D0       D1       D2       D3       D4
In silico predictions           0        0        1       82    1,124
Expressed predictions           0        0        0       53      663
Off-target predictions          0        0        0        5       31
% off-target                0.00%    0.00%    0.00%    9.43%    4.68%

15-mer (g2:g19)
                               D0       D1       D2       D3       D4
In silico predictions           0        0        8    1,100    6,765
Expressed predictions           0        0        3      642    3,483
Off-target predictions          0        0        0       28      124
% off-target                0.00%    0.00%    0.00%    4.36%    3.56%

                             8mer  7mer-m8  7mer-A1     6mer    Total
In silico predictions          15       27       55      135    9,334
Expressed predictions           6        9       31       66    4,969
Off-target predictions          1        0        2        7      202
% off-target               16.67%    0.00%    6.45%   10.61%    4.07%

Generate report with wobbles

wobble.report = SeedMatchReport(res = res, seqs = annodb$seqs, guide.seq = "UUAUAGAGCAAGAACACUGUUUU", indel.bool = FALSE, allow_wobbles = TRUE)

wobble.report$table
In-silico siRNA Binding Prediction
Identifying siRNA hits in the transcriptome
SeedMatchReport

Full Guide Strand (g2:g23)
                               D0       D1       D2       D3       D4
In silico predictions           0        0        0        0        0
Expressed predictions           0        0        0        0        0
Off-target predictions          0        0        0        0        0
% off-target                0.00%    0.00%    0.00%    0.00%    0.00%

18-mer (g2:g19)
                               D0       D1       D2       D3       D4
In silico predictions           0        0        0        1        4
Expressed predictions           0        0        0        1        4
Off-target predictions          0        0        0        1        0
% off-target                0.00%    0.00%    0.00%  100.00%    0.00%

15-mer (g2:g19)
                               D0       D1       D2       D3       D4
In silico predictions           0        0        0       37      524
Expressed predictions           0        0        0       24      321
Off-target predictions          0        0        0        3       17
% off-target                0.00%    0.00%    0.00%   12.50%    5.30%

                             8mer  7mer-m8  7mer-A1     6mer    Total
In silico predictions           0        0        0        0      566
Expressed predictions           0        0        0        0      350
Off-target predictions          0        0        0        0       21
% off-target                0.00%    0.00%    0.00%    0.00%    6.00%

Generate report with wobbles and with indels

indel.wobble.report = SeedMatchReport(res = res, seqs = annodb$seqs, guide.seq = "UUAUAGAGCAAGAACACUGUUUU", indel.bool = TRUE, allow_wobbles = TRUE)

indel.wobble.report$table
In-silico siRNA Binding Prediction
Identifying siRNA hits in the transcriptome
SeedMatchReport

Full Guide Strand (g2:g23)
                               D0       D1       D2       D3       D4
In silico predictions           0        0        0        0        0
Expressed predictions           0        0        0        0        0
Off-target predictions          0        0        0        0        0
% off-target                0.00%    0.00%    0.00%    0.00%    0.00%

18-mer (g2:g19)
                               D0       D1       D2       D3       D4
In silico predictions           0        0        0        1       15
Expressed predictions           0        0        0        1       12
Off-target predictions          0        0        0        1        0
% off-target                0.00%    0.00%    0.00%  100.00%    0.00%

15-mer (g2:g19)
                               D0       D1       D2       D3       D4
In silico predictions           0        0        0      124    2,471
Expressed predictions           0        0        0       74    1,384
Off-target predictions          0        0        0        8       55
% off-target                0.00%    0.00%    0.00%   10.81%    3.97%

                             8mer  7mer-m8  7mer-A1     6mer    Total
In silico predictions           0        0        0        0    2,611
Expressed predictions           0        0        0        0    1,471
Off-target predictions          0        0        0        0       64
% off-target                0.00%    0.00%    0.00%    0.00%    4.35%