
This function takes differential expression results, such as those from DESeq2, as a data.frame and plots the ECDF of log2 fold changes for each gene set in gene.lists.

The gene sets to plot should be provided as a named list of character vectors, one vector of gene names per set.

Example:

gene.lists = list("Background" = c("gene1", "gene2"), "Target" = c("gene2", "gene3"), "Overlap" = c("gene2"))
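As a minimal sketch of the structure shown above, the list can be assembled from ordinary character vectors in base R. The gene identifiers here are hypothetical placeholders, and the overlap set is derived with intersect() purely for illustration.

```r
# Hypothetical gene identifiers, for illustration only
background <- c("gene1", "gene2", "gene4")
target <- c("gene2", "gene3")

# A named list of character vectors; the names label each gene set
gene.lists <- list(
  "Background" = background,
  "Target" = target,
  "Overlap" = intersect(background, target)
)

# Number of genes in each set
set.sizes <- lengths(gene.lists)
print(set.sizes)
```

The list names are what null.name, target.name, and factor.order refer to, so they should be unique.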

This function will also perform statistical testing if plot.hist is TRUE. The output will be saved to a PDF if an output.filename is provided.

Users can define the groups that are to be compared in the statistical test using the null.name and target.name arguments. The names must be found in gene.lists. The factor.order is used to order the groups in the analysis.
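The comparison described above can be sketched in base R with stats::ecdf() and stats::ks.test(). This is an illustration on simulated data, not the package's internal implementation: the res table, the size of the shift, and the order in which the two groups are passed to ks.test() are all assumptions made for this example.

```r
set.seed(1)

# Simulated results table (hypothetical data, not produced by the package)
res <- data.frame(
  gene_id = paste0("gene", 1:400),
  log2FoldChange = c(rnorm(300, mean = 0, sd = 0.5),
                     rnorm(100, mean = -0.4, sd = 0.5))
)

gene.lists <- list(
  "Background" = res$gene_id[1:300],
  "Target" = res$gene_id[301:400]
)

null.name <- "Background"
target.name <- "Target"

# Both names must be present in gene.lists
stopifnot(all(c(null.name, target.name) %in% names(gene.lists)))

# Log2 fold changes for each group
null.fc <- res$log2FoldChange[res$gene_id %in% gene.lists[[null.name]]]
target.fc <- res$log2FoldChange[res$gene_id %in% gene.lists[[target.name]]]

# Empirical cumulative distribution functions for the two groups
null.ecdf <- ecdf(null.fc)
target.ecdf <- ecdf(target.fc)

# Two-sample Kolmogorov-Smirnov test. In stats::ks.test, alternative =
# "greater" means the CDF of the first sample lies above that of the second,
# i.e. the first sample is shifted toward smaller values.
ks.res <- ks.test(target.fc, null.fc, alternative = "greater")
print(ks.res$p.value)
```

In this sketch the target group is simulated with lower fold changes than the background, so its ECDF sits to the left of the background curve, which is the pattern the "greater" alternative is designed to detect.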

This function returns:

  • $plot: The ECDF plot

  • $stats: The stats results object

Usage

de_fc_ecdf(
  res,
  gene.lists,
  title = "ECDF",
  output.filename = NULL,
  palette = SeedMatchR.palette,
  factor.order = NULL,
  x.lims = c(-1, 1),
  stats.test = c("KS", "Wilcoxen"),
  alternative = c("greater", "less", "two.sided"),
  null.name = 1,
  target.name = 2,
  height = 5,
  width = 5,
  dpi = 320,
  l2fc.col = "log2FoldChange"
)

Arguments

res

The DESeq2 results dataframe

gene.lists

A named list of gene name vectors. Example: gene.lists = list("Background" = gene.list2, "Target" = gene.list1, "Overlap" = gene.list3)

title

The title of the plot

output.filename

If the output filename is provided, then the plot is saved.

palette

The color palette to use for your curves

factor.order

The order to use for the legends

x.lims

The x-axis limits, given as c(min, max)

stats.test

The statistical test to use. Options: KS, Kuiper, DTS, CVM, AD, Wass

alternative

The alternative hypothesis to test. Options: greater, less, two.sided

null.name

The name in gene.lists to use as the null group for ECDF plots

target.name

The name in gene.lists to use as the target group for ECDF plots

height

Plot height in inches

width

Plot width in inches

dpi

The dpi resolution for the figure

l2fc.col

The name of the column containing log2 fold change values. Defaults to the DESeq2 column name, "log2FoldChange".

Value

A list containing the ggplot object for the ECDF plot ($plot) and the statistical test results ($stats)

Examples

if (FALSE) { # interactive()
library(dplyr)

guide.seq = "UUAUAGAGCAAGAACACUGUUUU"

anno.db = load_species_anno_db("human")

features = get_feature_seqs(anno.db$tx.db, anno.db$dna)

# Load test data
get_example_data("sirna")

sirna.data = load_example_data("sirna")

res <- sirna.data$Schlegel_2022_Ttr_D1_30mkg

# Filter DESeq2 results for SeedMatchR
res = filter_deseq(res, fdr.cutoff=1, fc.cutoff=0, rm.na.log2fc = TRUE)

res = SeedMatchR(res, anno.db$gtf, features$seqs, guide.seq, "mer7m8")

# Gene set 1
mer7m8.list = res$gene_id[res$mer7m8 >= 1]

# Gene set 2: all genes that are not in gene set 1
background.list = res$gene_id[!(res$gene_id %in% mer7m8.list)]

ecdf.results = de_fc_ecdf(res,
                          list("Background" = background.list, "mer7m8" = mer7m8.list),
                          stats.test = "KS",
                          factor.order = c("Background", "mer7m8"),
                          null.name = "Background",
                          target.name = "mer7m8")
}