Hello,
Could you give me more information about the required format for the FASTA headers? I have been running pasteTaxID, and while I don't get any errors, the tax IDs do not show up in the results.
This header:
'>acc|GENBANK|AB866984.1|Human_immunodeficiency_virus_1_gene_for_pol_protein,_partial_cds,_isolate:_F10-5112353-1.|Human_immunodeficiency_virus_1|VRL|25-JUL-2014'
Comes out as:
'>ti||acc|GENBANK|AB866984.1|Human_immunodeficiency_virus_1_gene_for_pol_protein,_partial_cds,_isolate:_F10-5112353-1.|Human_immunodeficiency_virus_1|VRL|25-JUL-2014'
I am guessing it has something to do with the header format. I tried removing the GENBANK field, so the header was:
'>acc|AB866984.1|Human_immunodeficiency_virus_1_gene_for_pol_protein,_partial_cds,_isolate:_F10-5112353-1.|Human_immunodeficiency_virus_1|VRL|25-JUL-2014'
With that change a few of the tax IDs were found, but most were not. For example, the first header below got no tax ID, while the second one did:
' >ti||acc|FJ640294.1|Uncultured_marine_virus_isolate_CBSM-188_genomic_sequence.|uncultured_marine_virus|ENV|07-APR-2009'
'>ti|186617|acc|FJ640295.1|Uncultured_marine_virus_isolate_CBSM-189_genomic_sequence.|uncultured_marine_virus|ENV|07-APR-2009'
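To make the field layout concrete, here is a minimal Python sketch that splits these headers on `|` and reads the value following the `ti` and `acc` tags. It only illustrates the header structure shown above; it makes no claim about how pasteTaxID actually parses headers, and the function names `field_after` and `parse_header` are hypothetical.

```python
def field_after(fields, tag):
    """Return the non-empty field that follows `tag`, or None if absent or empty."""
    if tag in fields:
        i = fields.index(tag)
        if i + 1 < len(fields) and fields[i + 1]:
            return fields[i + 1]
    return None


def parse_header(header):
    """Split a '|'-delimited FASTA header and return (tax_id, accession).

    Assumption: the value of interest is the field right after 'ti' or 'acc'.
    """
    # Drop surrounding whitespace and the leading '>' before splitting.
    fields = header.strip().lstrip(">").split("|")
    return field_after(fields, "ti"), field_after(fields, "acc")


if __name__ == "__main__":
    headers = [
        ">ti||acc|FJ640294.1|Uncultured_marine_virus_isolate_CBSM-188_genomic_sequence.|uncultured_marine_virus|ENV|07-APR-2009",
        ">ti|186617|acc|FJ640295.1|Uncultured_marine_virus_isolate_CBSM-189_genomic_sequence.|uncultured_marine_virus|ENV|07-APR-2009",
        ">acc|GENBANK|AB866984.1|Human_immunodeficiency_virus_1|VRL|25-JUL-2014",
    ]
    for h in headers:
        print(parse_header(h))
```

Under this layout, the field right after `acc` in the original header is `GENBANK` rather than the accession number, which may or may not be relevant to the lookup.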
Any help would be appreciated.
Thanks,
Maddy