Remove sequences from fasta or fastq file
Seqtk

# get list of subset IDs
# get all gene IDs from a fasta file
grep '^>' genes.fasta | cut -f 1 -d ' ' | sed 's/^>//' > list_of_geneIDs.txt

# edit the gene list to get the subset you need, for example: take the first 3 genes as the subset
head -3 list_of_geneIDs.txt > subsetIDs.txt

# subsetIDs.txt now contains:
# gene_001
# gene_002
# gene_003

# extract the subset of gene sequences based on the list of IDs in the .txt file
seqtk subseq genes.fasta subsetIDs.txt > genes_subset.fasta

Python
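A minimal Python sketch of the same extraction, using only the standard library. It is an assumed equivalent for illustration, not how seqtk works internally: the function names `read_ids` and `extract_subset` are hypothetical, it handles FASTA only (not FASTQ), and it treats the header text up to the first whitespace as the sequence ID.

```python
def read_ids(ids_path):
    """Return the set of sequence IDs listed one per line in ids_path."""
    with open(ids_path) as handle:
        # Use the first whitespace-delimited token; skip blank lines.
        return {line.split()[0] for line in handle if line.strip()}


def extract_subset(fasta_path, ids_path, out_path):
    """Copy FASTA records whose ID is listed in ids_path to out_path.

    Returns the number of records written.
    """
    wanted = read_ids(ids_path)
    written = 0
    keep = False  # True while we are inside a record that should be copied
    with open(fasta_path) as src, open(out_path, "w") as dst:
        for line in src:
            if line.startswith(">"):
                # Record ID = header up to the first whitespace, without '>'.
                fields = line[1:].split()
                keep = bool(fields) and fields[0] in wanted
                if keep:
                    written += 1
            if keep:
                dst.write(line)
    return written
```

With the files from the seqtk example, `extract_subset("genes.fasta", "subsetIDs.txt", "genes_subset.fasta")` would write the records for the listed IDs, in the order they appear in `genes.fasta`, and return how many were written. The file is streamed line by line, so memory use depends on the ID list rather than on the size of the fasta file.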