Changelog • SeedMatchR

SeedMatchR 2.0.0

Requires R >= 4.3.2
Removed the requirement for a GTF file. It was mainly used to map and aggregate counts across features, but most updated transcript objects report a single transcript per gene. Users can summarize counts across multiple transcripts per gene with their own code if needed. The names in the sequences and the names in the gene_id column must match or there will be an error.
The argument universal_set was renamed to shared_genes to avoid confusion. With this argument, only the intersection of features in the sequence database and the results data frame is reported.
Added articles describing how to generate k-mer counts for a sequence library, and specifics of mismatches/indels in the search.
P-values are reported according to standards mentioned in this publication.
Test statistics are rounded to 3 digits when plotted.
Use ensembldb objects from AnnotationHub instead of making them from the GTF file. The ensembldb objects provide much better metadata coverage for filtering. As a result, the GTF generated by load_species_anno_db is now extracted from the txdb object itself.
Added the function build_annotation_filter to help create AnnotationFilterList objects for selecting transcripts from ensembldb objects.
Updated documentation with examples of generating annotations.
Added a filter slot to the list of results returned by load_annotations.
Added functions for searching with wobbles or bulges.
Updated statistics functions to include the DTS package and custom Wasserstein statistic.
Added the option to return a data.frame of matches or a column that is added to the DESeq2 input file.
Added the SeedMatchLogo function for plotting the logos around seed matches.
Reworked the function load_annotations. This function now takes as input a reference.name that corresponds to one of hg38, mm10, mm9, rnor6, or rnor7.
Added the argument fixed to the search functions to enable searching of ambiguous nucleotides.
You can now specify custom seed definitions using the start.pos and stop.pos arguments in the SeedMatchR and get_seed functions.
The function load_annotations now replaces load_species_anno_db. This function now returns an object with the dna, gtf, seqs, and features slots. The function has additional flags for reducing ranges and filtering for protein coding genes.
Example data is saved to a temp directory.
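
The name-matching requirement and the shared_genes behavior noted above both come down to a set intersection between sequence names and gene_id values. The following is a minimal base R sketch, not SeedMatchR's implementation: it calls no SeedMatchR functions, and the named character vector, the data frame, and all identifiers and values are made-up stand-ins chosen for illustration.

```r
# Stand-in for the sequence object: a named character vector (hypothetical data).
seqs <- c(geneA = "ACGTACGT", geneB = "TTGCAAGC", geneC = "GGCATCAA")

# Stand-in for a differential expression results table (hypothetical values).
res <- data.frame(
  gene_id = c("geneB", "geneA", "geneD"),
  log2FoldChange = c(-1.2, 0.4, 2.1)
)

# Identifiers present in only one of the two inputs.
missing_seqs <- setdiff(res$gene_id, names(seqs))  # in results, no sequence
missing_res <- setdiff(names(seqs), res$gene_id)   # sequence, not in results

# Keep only the features present in both inputs.
shared <- intersect(names(seqs), res$gene_id)
seqs_shared <- seqs[shared]
res_shared <- res[res$gene_id %in% shared, , drop = FALSE]
```

Checking missing_seqs and missing_res before running a search is one way to find mismatched identifiers early, for example versioned versus unversioned Ensembl IDs.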

SeedMatchR 1.0.0

First release prepared for CRAN submission