Function to compute overall, positive, and negative Wasserstein differences between ECDFs

Usage

wass_dist(seed, non.seed, n.interp = 10000)

Arguments

seed

A vector of log2FC values for the pre-specified seed-matching genes.

non.seed

A vector of log2FC values for the pre-specified non-seed-matching genes.

n.interp

Number of interpolation points in (0, 1) used to integrate the eCDF difference. Defaults to 10000.

Value

A list containing:

abs.auc: absolute Wasserstein statistic
neg.cens.auc: negative Wasserstein statistic
pos.cens.auc: positive Wasserstein statistic
dens.plot: density plot
cdf.plot: CDF plot
diff.plot: difference plot
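
The three statistics can be illustrated with a minimal base-R sketch. This is not the SeedMatchR implementation. The helper name `wass_sketch`, the quantile-based integration, and the sign convention (seed minus non-seed) are all assumptions made for illustration. It evaluates both empirical quantile functions on a shared grid of probabilities in (0, 1) and averages the absolute, positive-only, and negative-only differences.

```r
# Hypothetical helper for illustration only; not part of SeedMatchR.
wass_sketch <- function(seed, non.seed, n.interp = 10000) {
  # Evenly spaced probabilities strictly inside (0, 1)
  p <- seq_len(n.interp) / (n.interp + 1)

  # Empirical quantile functions (inverse eCDFs) on the shared grid
  q.seed <- quantile(seed, probs = p, names = FALSE, type = 1)
  q.non <- quantile(non.seed, probs = p, names = FALSE, type = 1)

  # Assumed sign convention: seed minus non-seed
  d <- q.seed - q.non

  list(
    abs.auc = mean(abs(d)),           # overall difference
    neg.cens.auc = mean(pmax(-d, 0)), # part where seed lies below non-seed
    pos.cens.auc = mean(pmax(d, 0))   # part where seed lies above non-seed
  )
}

# Simulated log2FC values: seed genes shifted towards down-regulation
set.seed(1)
seed.fc <- rnorm(200, mean = -0.5)
non.seed.fc <- rnorm(2000, mean = 0)

out <- wass_sketch(seed.fc, non.seed.fc)
str(out)
```

Under these assumptions the positive and negative parts sum to the overall statistic, and a seed set shifted towards negative log2FC yields a larger negative than positive component.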

Examples

if (FALSE) { # interactive()
library(dplyr)

guide.seq = "UUAUAGAGCAAGAACACUGUUUU"

anno.db = load_annotations("rnor7")

# Load test data
get_example_data("sirna")

sirna.data = load_example_data("sirna")

res <- sirna.data$Schlegel_2022_Ttr_D1_30mkg

# Filter DESeq2 results for SeedMatchR
res = filter_res(res, fdr.cutoff=1, fc.cutoff=0, rm.na.log2fc = TRUE)

res = SeedMatchR(res = res, gtf = anno.db$gtf, seqs = anno.db$seqs,
                 sequence = guide.seq, seed.name = "mer7m8", tx.id.col = FALSE)

# Gene set 1: log2FC of genes with at least one mer7m8 seed match
mer7m8.list = res$log2FoldChange[res$mer7m8 >= 1]

# Gene set 2: log2FC of the remaining genes (no mer7m8 seed match)
background.list = res$log2FoldChange[!(res$mer7m8 >= 1)]

# wass_dist() takes the two log2FC vectors directly (see Usage)
ecdf.res = SeedMatchR::wass_dist(mer7m8.list, background.list)
}