Plot the ECDF for differential expression analysis log2(Fold Changes)
Source: R/de_fc_ecdf.R
This function takes differential expression results, such as those from DESeq2, as a data.frame and plots the ECDF for the input gene.lists.
The gene sets to plot should be provided as a named list of gene ID vectors.
Example:
gene.lists = list("Background" = c("gene1", "gene2"), "Target" = c("gene2", "gene3"), "Overlap" = c("gene2"))
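To make the relationship between gene.lists and the plotted curves concrete, here is a minimal base R sketch that builds one ECDF per gene set. It is not the package's implementation: the toy res data.frame and its values are made up for illustration, and stats::ecdf stands in for whatever de_fc_ecdf does internally.

```r
# Toy stand-in for a DESeq2 results data.frame (values are invented).
res <- data.frame(
  gene_id = c("gene1", "gene2", "gene3"),
  log2FoldChange = c(0.2, -0.8, -0.5)
)

# Same shape as the example above: a named list of gene ID vectors.
gene.lists <- list(
  "Background" = c("gene1", "gene2"),
  "Target" = c("gene2", "gene3"),
  "Overlap" = c("gene2")
)

# Build one empirical CDF per gene set from the matching log2 fold changes.
ecdfs <- lapply(gene.lists, function(ids) {
  stats::ecdf(res$log2FoldChange[res$gene_id %in% ids])
})

# Fraction of genes in each set with log2FoldChange <= 0.
frac.down <- sapply(ecdfs, function(f) f(0))
print(frac.down)
```

Each element of ecdfs is a step function, so evaluating it at a threshold gives the fraction of genes in that set at or below the threshold.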
This function will also perform statistical testing if plot.hist is TRUE.
The output will be saved to a PDF if an output.filename is provided.
Users can define the groups to be compared in the statistical test using the null.name and target.name arguments. Both names must be present in gene.lists. The factor.order argument sets the order of the groups in the analysis.
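The comparison between the null and target groups can be sketched in base R as a two-sample Kolmogorov-Smirnov test on the log2 fold changes of each group. This is an illustration under stated assumptions, not the package's own code: the simulated res data.frame is invented, and stats::ks.test is used here only as a plausible stand-in for the "KS" option.

```r
# Simulated stand-in for DESeq2 results (values are invented).
set.seed(1)
res <- data.frame(
  gene_id = paste0("gene", 1:200),
  log2FoldChange = c(rnorm(150, mean = 0, sd = 0.5),     # unshifted genes
                     rnorm(50, mean = -0.4, sd = 0.5))   # down-shifted genes
)

gene.lists <- list(
  "Background" = res$gene_id[1:150],
  "Target" = res$gene_id[151:200]
)

# Pick the two groups by name, as null.name and target.name would.
null.name <- "Background"
target.name <- "Target"
null.fc <- res$log2FoldChange[res$gene_id %in% gene.lists[[null.name]]]
target.fc <- res$log2FoldChange[res$gene_id %in% gene.lists[[target.name]]]

# alternative = "greater" asks whether the target ECDF lies above the
# null ECDF, i.e. whether target genes tend toward lower log2 fold changes.
ks <- stats::ks.test(target.fc, null.fc, alternative = "greater")
print(ks$statistic)
print(ks$p.value)
```

In stats::ks.test, the direction of alternative is defined relative to the first sample, so swapping the two vectors reverses the meaning of "greater" and "less".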
This function returns:
- $plot: The ECDF plot
- $stats: The stats results object
Usage
de_fc_ecdf(
res,
gene.lists,
title = "ECDF",
output.filename = NULL,
palette = SeedMatchR.palette,
factor.order = NULL,
x.lims = c(-1, 1),
stats.test = c("KS", "Wilcoxen"),
alternative = c("greater", "less", "two.sided"),
null.name = 1,
target.name = 2,
height = 5,
width = 5,
dpi = 320,
l2fc.col = "log2FoldChange"
)
Arguments
- res
The DESeq2 results dataframe
- gene.lists
A named list of gene name vectors. Example: gene.lists = list("Background" = gene.list2, "Target" = gene.list1, "Overlap" = gene.list3)
- title
The title of the plot
- output.filename
If the output filename is provided, then the plot is saved.
- palette
The color palette to use for your curves
- factor.order
The order to use for the legends
- x.lims
The x-axis limits, given as c(min, max)
- stats.test
The statistical test to use. Options: KS, Wilcoxen
- alternative
The alternative hypothesis to test. Options: greater, less, two.sided
- null.name
The name in gene.lists to use as the null for ECDF plots
- target.name
The name in gene.lists to use as the target for ECDF plots
- height
Plot height in inches
- width
Plot width in inches
- dpi
The dpi resolution for the figure
- l2fc.col
The name of the column containing log2FoldChange values. Defaults to the DESeq2 column name.
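The vector-valued defaults for stats.test and alternative shown in Usage resemble a common R idiom in which match.arg() selects the first element when the caller supplies nothing. Whether de_fc_ecdf resolves its arguments this way is an assumption; the helper below, pick_alternative, is hypothetical and only demonstrates the idiom.

```r
# Hypothetical helper, not part of SeedMatchR: shows how match.arg()
# resolves a vector-valued default to a single validated choice.
pick_alternative <- function(alternative = c("greater", "less", "two.sided")) {
  match.arg(alternative)
}

# No argument supplied: the first listed option is used.
default.choice <- pick_alternative()

# Explicit argument: it is checked against the listed options.
explicit.choice <- pick_alternative("two.sided")

# An unknown value raises an error rather than being silently accepted.
bad.choice <- try(pick_alternative("bogus"), silent = TRUE)

print(default.choice)
print(explicit.choice)
print(inherits(bad.choice, "try-error"))
```

Under that assumption, passing the option explicitly, as the example below does with stats.test = "KS", avoids relying on the order of the defaults.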
Examples
if (FALSE) { # interactive()
library(dplyr)
guide.seq = "UUAUAGAGCAAGAACACUGUUUU"
anno.db = load_species_anno_db("human")
features = get_feature_seqs(anno.db$tx.db, anno.db$dna)
# Load test data
get_example_data("sirna")
sirna.data = load_example_data("sirna")
res <- sirna.data$Schlegel_2022_Ttr_D1_30mkg
# Filter DESeq2 results for SeedMatchR
res = filter_deseq(res, fdr.cutoff=1, fc.cutoff=0, rm.na.log2fc = TRUE)
res = SeedMatchR(res, anno.db$gtf, features$seqs, guide.seq, "mer7m8")
# Gene set 1
mer7m8.list = res$gene_id[res$mer7m8 >= 1]
# Gene set 2
background.list = res$gene_id[!(res$gene_id %in% mer7m8.list)]
ecdf.results = de_fc_ecdf(res,
                          list("Background" = background.list, "mer7m8" = mer7m8.list),
                          stats.test = "KS",
                          factor.order = c("Background", "mer7m8"),
                          null.name = "Background",
                          target.name = "mer7m8")
}