Training datasets

FASTA-format files of the sequences used to train and test PA are available below, sorted by localization site.
For a description of how the dataset was created, please see the paper.
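
As an illustration of how the FASTA files could be read, here is a minimal sketch in Python. The function name `parse_fasta` and the example records are hypothetical and are not taken from the PA datasets; the sketch only assumes the standard FASTA layout, in which a header line starting with ">" is followed by one or more sequence lines.

```python
from typing import Dict, Iterable


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Return a {header: sequence} mapping for FASTA-formatted lines."""
    records: Dict[str, str] = {}
    header = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new record starts; keep the header text without the ">"
            header = line[1:].strip()
            records[header] = ""
        elif header is not None:
            # Sequence lines are concatenated until the next header
            records[header] += line
    return records


# Hypothetical records, for illustration only (not real PA data)
example = """>seq1 hypothetical protein
MKTAYIAK
QRQISFVK
>seq2 another hypothetical protein
MSDNE
""".splitlines()

records = parse_fasta(example)
```

To read one of the downloaded files, the same function can be given an open file handle, for example `parse_fasta(open(path))`, where `path` is wherever the file was saved.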



Sub-cellular localization abbreviations

TERM                    ABBREV.   TERM                ABBREV.
cell wall               wal       membrane            mem
chloroplast             chl       mitochondrial       mit
cytoplasmic             cyt       nuclear             nuc
endoplasmic reticulum   end       outer membrane      out
extracellular           ext       periplasmic         per
golgi                   gol       peroxisomal         pex
inner membrane          inn       vacuolar            vac
lysosomal               lys       no prediction       N.P.
Overall Recall          O.R.      Overall Precision   O.P.
Overall Specificity     O.S.      Sequence Coverage   S.C.
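
For scripts that process the datasets or prediction results, the localization abbreviations can be kept in a lookup table. The following is a minimal sketch in Python; the names `LOCALIZATION_ABBREVIATIONS` and `expand_abbreviation` are hypothetical and are not part of PA. The mapping covers only the localization sites plus "no prediction", not the summary statistics (O.R., O.P., O.S., S.C.).

```python
# Abbreviation -> term, as listed in the table of abbreviations
LOCALIZATION_ABBREVIATIONS = {
    "wal": "cell wall",
    "mem": "membrane",
    "chl": "chloroplast",
    "mit": "mitochondrial",
    "cyt": "cytoplasmic",
    "nuc": "nuclear",
    "end": "endoplasmic reticulum",
    "out": "outer membrane",
    "ext": "extracellular",
    "per": "periplasmic",
    "gol": "golgi",
    "pex": "peroxisomal",
    "inn": "inner membrane",
    "vac": "vacuolar",
    "lys": "lysosomal",
    "n.p.": "no prediction",
}


def expand_abbreviation(abbrev: str) -> str:
    """Return the full term for an abbreviation, or the input if unknown."""
    key = abbrev.strip().lower()
    return LOCALIZATION_ABBREVIATIONS.get(key, abbrev)
```

The lookup is case-insensitive, so both `expand_abbreviation("pex")` and `expand_abbreviation("N.P.")` resolve to their full terms, while an unrecognized string is returned unchanged.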