VIRUS_DB2.0
An Online Crowdsourcing Virus Database for Classification Based on Natural Vector
Submit Sequences
The natural vector is a fast and accurate tool for classifying DNA sequences. Sequences may contain A, C, G, T, W,
S, M, K, R, Y, B, D, H, V, and N (IUPAC symbols only). Any other symbols are ignored in the
computation.
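As a rough illustration of the symbol rule above, the following minimal sketch drops every character outside the accepted alphabet. The function name `clean_sequence` is hypothetical, and converting lowercase input to uppercase is an assumption; this page does not say how VirusDB treats lowercase letters.

```python
# Accepted nucleotide symbols, as listed in the text above.
IUPAC_DNA = set("ACGTWSMKRYBDHVN")


def clean_sequence(raw: str) -> str:
    """Keep only the accepted symbols and silently drop everything else.

    Assumption: lowercase letters are upper-cased before filtering.
    """
    return "".join(ch for ch in raw.upper() if ch in IUPAC_DNA)


if __name__ == "__main__":
    # Gaps, digits, whitespace, and non-IUPAC letters such as X are removed.
    print(clean_sequence("acgt-NX 12\n"))
```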
Choose one way to submit your sequence: either enter it directly or upload a FASTA file containing a single
sequence. If both inputs are provided, only the directly entered sequence is used and the FASTA file is ignored.
For multiple-segment sequences, only a FASTA file is accepted.
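The precedence rule above can be sketched as follows. This is only an illustration of the described behaviour, not the server's actual code; the function name `choose_submission` and the returned labels `"direct"` and `"fasta"` are hypothetical.

```python
from typing import Optional, Tuple


def choose_submission(direct: Optional[str], fasta: Optional[str]) -> Tuple[str, str]:
    """Pick which input to use when a direct sequence and/or a FASTA file is given.

    The directly entered sequence wins if both are present; the FASTA
    content is used only when no direct sequence was entered.
    """
    if direct and direct.strip():
        return ("direct", direct)
    if fasta and fasta.strip():
        return ("fasta", fasta)
    raise ValueError("no sequence provided")


if __name__ == "__main__":
    # Both inputs present: the direct sequence is used.
    print(choose_submission("ACGT", ">seq1\nTTTT\n"))
```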
If you provide only THE WHOLE GENOME SEQUENCE, VirusDB outputs its 5 nearest neighbors in the
whole dataset and the distances between each pair, or sends an email informing you that no prediction was made,
depending on the distance between the query sequence and its nearest neighbor in the whole dataset.
If you provide THE WHOLE GENOME SEQUENCE, THE BALTIMORE GROUP, AND THE FAMILY LABEL, the
output is its 5 nearest neighbors in that family group and the distances between each pair, or an email
informing you that no prediction was made, depending on the distance between the query sequence and its nearest
neighbor in the FAMILY group.
When the distance is not small enough, VirusDB does not make a prediction, since it may be unreliable. In that
case, a no-prediction email is sent to you. For more details on prediction, please see
reference [3].
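The nearest-neighbor behaviour described above can be sketched as follows. This is a minimal illustration under several assumptions: sequences are already encoded as numeric vectors, the distance is Euclidean, and the cutoff `threshold` is a hypothetical parameter. The actual distance measure and cutoff used by VirusDB are given in reference [3], not here.

```python
import math
from typing import Dict, List, Optional, Sequence, Tuple


def nearest_neighbors(
    query: Sequence[float],
    database: Dict[str, Sequence[float]],
    k: int = 5,
    threshold: float = 1.0,  # hypothetical cutoff, not VirusDB's real value
) -> Optional[List[Tuple[str, float]]]:
    """Return the k nearest (label, distance) pairs, or None for "no prediction".

    Assumption: Euclidean distance between fixed-length numeric vectors.
    """
    ranked = sorted(
        ((label, math.dist(query, vec)) for label, vec in database.items()),
        key=lambda pair: pair[1],
    )
    # No prediction when the closest entry is still too far away.
    if not ranked or ranked[0][1] > threshold:
        return None
    return ranked[:k]


if __name__ == "__main__":
    toy_db = {"a": (0.0, 0.0), "b": (3.0, 4.0), "c": (1.0, 0.0)}
    print(nearest_neighbors((0.0, 0.0), toy_db, k=2, threshold=0.5))
    print(nearest_neighbors((10.0, 10.0), toy_db))  # too far: None
```

Passing only the entries of one family as `database` would correspond to the second mode, where neighbors are searched within the given family group.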