PhyloSift Reference Marker Genes
https://figshare.com/articles/PhyloSift_markers_database/5755404/1
We are updating PhyloSift with an expanded marker set that will add marker genes for eukaryotes (e.g. plastid genes), taxon-specific sets of gene families, and support for viral marker genes.
The following markers (IDs beginning with DNGNGWU) were originally mined from bacterial and archaeal genomes, but our in-house tests show that at least 33 of them have full-length eukaryotic homologs (based on searches against the yeast genome). The percentages given for Archaea and Bacteria indicate the proportion of taxa in each group whose genomes contain the marker gene (values from Wu et al. 2013). An asterisk after a marker ID denotes a potentially multi-copy marker, as determined by in-house assessments of microbial genome assemblies at UC Davis. NOTE: As of 1/10/14, the three markers with an asterisk (DNGNGWU00004, DNGNGWU00008, and DNGNGWU00038) are disabled during PhyloSift runs, even though they are still included in automatic marker package downloads.
PhyloSift Marker | Gene Name |
DNGNGWU00001 | ribosomal protein S2 rpsB (Archaea: 100%, Bacteria: 99.5%) |
DNGNGWU00002 | ribosomal protein S10 rpsJ (Archaea: 100%, Bacteria: 98.51%) |
DNGNGWU00003 | ribosomal protein L1 rplA (Archaea: 100%, Bacteria: 99.83%) |
DNGNGWU00004* | translation elongation factor EF-2 (Archaea: 100%, Bacteria: 99.67%) |
DNGNGWU00005 | translation initiation factor IF-2 (Archaea: 100%, Bacteria: 99.83%) |
DNGNGWU00006 | metalloendopeptidase (Archaea: 100%, Bacteria: 99.83%) |
DNGNGWU00007 | ribosomal protein L22 (Archaea: 100%, Bacteria: 99.67%) |
DNGNGWU00008* | ffh signal recognition particle protein (Archaea: 100%, Bacteria: 98.18%) |
DNGNGWU00009 | ribosomal protein L4/L1e rplD (Archaea: 100%, Bacteria: 99.67%) |
DNGNGWU00010 | ribosomal protein L2 rplB (Archaea: 100%, Bacteria: 99.5%) |
DNGNGWU00011 | ribosomal protein S9 rpsI (Archaea: 100%, Bacteria: 100%) |
DNGNGWU00012 | ribosomal protein L3 rplC (Archaea: 100%, Bacteria: 99.5%) |
DNGNGWU00013 | phenylalanyl-tRNA synthetase beta subunit (Archaea: 100%, Bacteria: 99.67%) |
DNGNGWU00014 | ribosomal protein L14b/L23e rplN (Archaea: 100%, Bacteria: 99.34%) |
DNGNGWU00015 | ribosomal protein S5 (Archaea: 100%, Bacteria: 99.5%) |
DNGNGWU00016 | ribosomal protein S19 rpsS (Archaea: 100%, Bacteria: 99.17%) |
DNGNGWU00017 | ribosomal protein S7 (Archaea: 100%, Bacteria: 99.67%) |
DNGNGWU00018 | ribosomal protein L16/L10E rplP (Archaea: 100%, Bacteria: 99.67%) |
DNGNGWU00019 | ribosomal protein S13 rpsM (Archaea: 100%, Bacteria: 99.17%) |
DNGNGWU00020 | phenylalanyl-tRNA synthetase alpha subunit (Archaea: 100%, Bacteria: 99.83%) |
DNGNGWU00021 | ribosomal protein L15 (Archaea: 100%, Bacteria: 99.5%) |
DNGNGWU00022 | ribosomal protein L25/L23 (Archaea: 100%, Bacteria: 99.17%) |
DNGNGWU00023 | ribosomal protein L6 rplF (Archaea: 100%, Bacteria: 99.5%) |
DNGNGWU00024 | ribosomal protein L11 rplK (Archaea: 100%, Bacteria: 99.83%) |
DNGNGWU00025 | ribosomal protein L5 rplE (Archaea: 100%, Bacteria: 99.83%) |
DNGNGWU00026 | ribosomal protein S12/S23 (Archaea: 100%, Bacteria: 99.17%) |
DNGNGWU00027 | ribosomal protein L29 (Archaea: 98.39%, Bacteria: 98.68%) |
DNGNGWU00028 | ribosomal protein S3 rpsC (Archaea: 100%, Bacteria: 99.83%) |
DNGNGWU00029 | ribosomal protein S11 rpsK (Archaea: 100%, Bacteria: 99.17%) |
DNGNGWU00030 | ribosomal protein L10 (Archaea: 98.39%, Bacteria: 99.67%) |
DNGNGWU00031 | ribosomal protein S8 (Archaea: 100%, Bacteria: 99.5%) |
DNGNGWU00032 | tRNA pseudouridine synthase B (Archaea: 95.16%, Bacteria: 97.35%) |
DNGNGWU00033 | ribosomal protein L18P/L5E (Archaea: 100%, Bacteria: 99.83%) |
DNGNGWU00034 | ribosomal protein S15P/S13e (Archaea: 100%, Bacteria: 99.84%) |
DNGNGWU00035 | Porphobilinogen deaminase (Archaea: 85.48%, Bacteria: 86.59%) |
DNGNGWU00036 | ribosomal protein S17 (Archaea: 100%, Bacteria: 99.17%) |
DNGNGWU00037 | ribosomal protein L13 rplM (Archaea: 100%, Bacteria: 99.83%) |
DNGNGWU00038* | phosphoribosylformylglycinamidine cyclo-ligase purM (Archaea: 90.32%, Bacteria: 92.38%) |
DNGNGWU00039 | ribonuclease HII (Archaea: 100%, Bacteria: 98.51%) |
DNGNGWU00040 | ribosomal protein L24 (Archaea: 100%, Bacteria: 99.5%) |
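The rows above follow a regular pattern (marker ID, optional asterisk, gene name, and two coverage percentages), so they can be read programmatically. The following is a minimal sketch, assuming the table has been saved as plain text with one row per line in exactly the layout shown here. The function names `parse_marker_row` and `single_copy_markers` are hypothetical helpers for illustration and are not part of PhyloSift.

```python
import re

# Pattern for one row of the table above, e.g.
# "DNGNGWU00004* | translation elongation factor EF-2 (Archaea: 100%, Bacteria: 99.67%) |"
# Assumption: every row uses exactly this layout.
ROW = re.compile(
    r"^(?P<marker>DNGNGWU\d{5})(?P<star>\*?)\s*\|\s*"
    r"(?P<gene>.+?)\s*"
    r"\(Archaea:\s*(?P<arch>[\d.]+)%,\s*Bacteria:\s*(?P<bact>[\d.]+)%\)"
    r"\s*\|?\s*$"
)


def parse_marker_row(line):
    """Return a dict for one table row, or None if the line is not a marker row."""
    m = ROW.match(line.strip())
    if m is None:
        return None
    return {
        "marker": m.group("marker"),
        "gene": m.group("gene"),
        "archaea_pct": float(m.group("arch")),
        "bacteria_pct": float(m.group("bact")),
        # The asterisk flags a potentially multi-copy marker.
        "multi_copy": m.group("star") == "*",
    }


def single_copy_markers(lines):
    """Return the IDs of markers that are not flagged with an asterisk."""
    parsed = (parse_marker_row(line) for line in lines)
    return [row["marker"] for row in parsed if row and not row["multi_copy"]]


rows = [
    "DNGNGWU00003 | ribosomal protein L1 rplA (Archaea: 100%, Bacteria: 99.83%) |",
    "DNGNGWU00004* | translation elongation factor EF-2 (Archaea: 100%, Bacteria: 99.67%) |",
]
print(single_copy_markers(rows))
```

Filtering on the asterisk in this way mirrors the note above that the three flagged markers are disabled during runs. It only inspects the table text and does not change PhyloSift's own behavior.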
PhyloSift also includes a suite of markers that are more narrowly focused on eukaryotes, including both nuclear and mitochondrial markers:
PhyloSift Marker (Eukaryotic) | Gene Name |
14-3-3 | tyrosine 3-monooxygenase/tryptophan 5-monooxygenase activation protein (HomoloGene ID: 100743) |
40S | 40S ribosomal protein S4 (HomoloGene ID: 90857) |
Actin | actin, beta (HomoloGene ID: 110648) |
Atub | tubulin, alpha 4a (HomoloGene ID: 68496) |
Btub | tubulin, beta 4 (HomoloGene ID: 55952) |
ef1aLike | eukaryotic translation elongation factor 1, alpha 1 (HomoloGene ID: 105313) |
ef2 | eukaryotic translation elongation factor 2 (HomoloGene ID: 100816) |
enolase | enolase 1 (HomoloGene ID: 68183) |
gamma | tubulin, gamma |
grc5 | 60S ribosomal protein L10 (HomoloGene ID: 68830) |
hsp70 | Hsp70 protein |
hsp70cyt | heat shock 70kDa protein 8 (HomoloGene ID: 68524) |
hsp70er | predicted Hsp70 protein |
Hsp90 | heat shock protein 90kDa alpha (cytosolic) (HomoloGene ID: 74306) |
metk | methionine adenosyltransferase II alpha, S-adenosylmethionine synthetase (HomoloGene ID: 38112) |
Rad51 | RAD-associated protein |
rps22 | Rps15a (ribosomal protein S15A) (HomoloGene ID: 128371) |
Rps23a | 40S ribosomal protein S23 (HomoloGene ID: 799) |
TFIIH | (hypothetical protein) |
Tsec61 | Sec61 alpha 1 subunit (HomoloGene ID: 55537) |
U5 | splicing factor Prp8 |
mtDNA_ATP6 | Mitochondrial ATP synthase subunit 6 |
mtDNA_ATP8 | Mitochondrial ATP synthase subunit 8 |
mtDNA_Cox1 | Mitochondrial cytochrome c oxidase subunit 1 |
mtDNA_Cox2 | Mitochondrial cytochrome c oxidase subunit 2 |
mtDNA_Cox3 | Mitochondrial cytochrome c oxidase subunit 3 |
mtDNA_CytB | Mitochondrial cytochrome b |
mtDNA_ND1 | Mitochondrial NADH dehydrogenase subunit 1 |
mtDNA_ND2 | Mitochondrial NADH dehydrogenase subunit 2 |
mtDNA_ND4 | Mitochondrial NADH dehydrogenase subunit 4 |
mtDNA_ND4L | Mitochondrial NADH dehydrogenase subunit 4L |
mtDNA_ND5 | Mitochondrial NADH dehydrogenase subunit 5 |
mtDNA_ND6 | Mitochondrial NADH dehydrogenase subunit 6 |
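The eukaryotic table can be split into nuclear and mitochondrial markers in a similar way. This sketch assumes that the `mtDNA_` prefix is what identifies a mitochondrial marker, as the naming in the table suggests, and that HomoloGene IDs always appear in the form `(HomoloGene ID: N)`. The function name `parse_euk_row` is hypothetical.

```python
import re

# Matches the optional "(HomoloGene ID: 12345)" suffix of a gene description.
HOMOLOGENE = re.compile(r"\(HomoloGene ID:\s*(\d+)\)")


def parse_euk_row(line):
    """Split one 'marker | description |' row into its parts."""
    name, _, rest = line.partition("|")
    name = name.strip()
    desc = rest.strip().rstrip("|").strip()
    match = HOMOLOGENE.search(desc)
    return {
        "marker": name,
        # Description with the HomoloGene suffix removed.
        "description": HOMOLOGENE.sub("", desc).strip(),
        "homologene_id": int(match.group(1)) if match else None,
        # Assumption: the mtDNA_ prefix marks mitochondrial markers.
        "genome": "mitochondrial" if name.startswith("mtDNA_") else "nuclear",
    }


examples = [
    "Actin | actin, beta (HomoloGene ID: 110648) |",
    "mtDNA_Cox1 | Mitochondrial cytochrome c oxidase subunit 1 |",
]
for row in examples:
    print(parse_euk_row(row))
```

Rows without a HomoloGene ID, such as the mitochondrial markers, simply get `None` in that field.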