DECODAGE – Communauté d’Annotation des Génomes

TriAnnot Databanks

The databank versions used by the pipeline for a given analysis are displayed in the first line of the GFF3 output files. For each databank used by the TriAnnot pipeline, the following are given below:

  • The UNIX file name, which includes the databank version
    • For example: repeat_mips_poaceaeplus_v1308
  • The name of the TriAnnot output EMBL file for which this databank has been used, followed in [ ] by the plant species for which an annotation is available within the TriAnnot pipeline
    • For example: 2_REPEATMASKER_MIPSrepeatPoa.embl [wheat, barley, maize, rice]
  • The EMBL files have been specifically designed for use with GenomeView. The feature name, which should be unique, is therefore used to name the GenomeView track for this specific EMBL output
    • For example: RM_MIPSrepeatPo
  • Therefore, for the MIPS repeat databank we have
    • unix=> repeat_mips_poaceaeplus_v1308
    • embl=> 2_REPEATMASKER_MIPSrepeatPoa.embl
      • genomeView: RM_MIPSrepeatPo
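
Across the entries listed on this page, the GenomeView track name looks like a short prefix for the analysis step followed by the label of the EMBL file, cut to 15 characters. The sketch below encodes that observation. The prefix table and the 15-character limit are inferred from the examples on this page, not taken from the TriAnnot source, and the function name `guess_track_name` is hypothetical.

```python
import re

# Prefixes inferred from the examples listed on this page
# (an assumption, not the pipeline's documented rule).
TRACK_PREFIX = {
    ("2", "REPEATMASKER"): "RM_",
    ("3", "BLASTX"): "BLASTX_",
    ("6", "EXONERATE"): "EXO_N_",
    ("7", "EXONERATE"): "EXO_X_",
    ("12", "EXONERATE"): "EXO_P_",
    ("14", "BLASTN"): "BLASTN_",
}
MAX_TRACK_LENGTH = 15  # observed length of the longest track names (assumption)

# <step>_<PROGRAM>_<label>.embl
EMBL_NAME = re.compile(r"^(\d+)_([A-Z]+)_(\w+)\.embl$")


def guess_track_name(embl_file: str) -> str:
    """Guess the GenomeView track name for a TriAnnot EMBL output file name."""
    match = EMBL_NAME.match(embl_file)
    if match is None:
        raise ValueError(f"unexpected EMBL file name: {embl_file}")
    step, program, label = match.groups()
    try:
        prefix = TRACK_PREFIX[(step, program)]
    except KeyError:
        raise ValueError(f"no known prefix for step {step}_{program}") from None
    return (prefix + label)[:MAX_TRACK_LENGTH]


print(guess_track_name("2_REPEATMASKER_MIPSrepeatPoa.embl"))
```

For the MIPS repeat example above, this returns the track name given in the list, RM_MIPSrepeatPo.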

Transposable Elements and repeat databanks for annotation and masking

  • TREP - GrainGenes
    • v10
    • The Triticeae Repeat Sequence Databases from Thomas Wicker et al. [Wicker et al., 2002 Trends in Plant Science 7:561-562]
      • TREPtotal
      • TREPprot
        • unix=> repeat_trepprot_triticeae_v10
        • embl=> 3_BLASTX_TREPprot.embl [wheat, barley, maize, rice]
          • genomeView: BLASTX_TREPprot
      • TREPcons - Databank not publicly available and not displayed on its own
        • v2012_04
        • repeat_trepcons_v1203 - 593 sequences
        • Refined by Josquin DARON based on the work of Timothée FLUTRE
      • Specific local databanks
        • repeat_urgi_v2013 (3bSeqIt2AnnotF_refTEs) - 2,159 sequences
        • repeat_choulet_v1.2 (fcREP_all_1_2) - 3,212 sequences
        • repeat_lesur_oak_v1202 (oak_repeats) - 41 sequences
  • MIPS repeats from The MIPS Repeat Element Database (mips-REdat)
    • v9_3p
    • The current public version, mips-REdat_v9.3p, has a size of ~450 Mb and contains ~62,000 sequences. To reduce redundancy, the sequences were clustered at >=95% identity over >=95% length coverage, taking the longest element as the representative. The public release does not contain as-yet-unpublished data or sequences from Repbase. The repeat database can be browsed on the mips-REdat website, and customized subsets can be downloaded with user-defined taxon and/or repeat type restrictions. A bulk download is available via FTP.
  • For TriAnnot repeat annotation and masking two major databanks are then used:
    • for wheat, barley, rice and maize
      1. unix=> repeat_mips_poaceaeplus_v1308 which contains
        • mipsREdat_9.3p_Poaceae_TEs
        • repeat_choulet_v1.2
        • repeat_trepcons_v1203
        • repeat_urgi_v2013
      2. embl=> 2_REPEATMASKER_MIPSrepeatPoa.embl
        1. genomeView: RM_MIPSrepeatPo
    • for oak
      1. unix=> repeat_mips_eudicot_plus_v1308 which contains
        • mipsREdat_9.3p_Gossypium
        • mipsREdat_9.3p_Eudicot_TEs
        • repeat_lesur_oak_v1202
      2. embl=> 2_REPEATMASKER_MIPSrepeatEud.embl
        1. genomeView: RM_MIPSrepeatEu
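
The redundancy reduction described for mips-REdat (clustering at >=95% identity over >=95% length coverage, keeping the longest element as representative) can be illustrated with a small greedy sketch. The source states only the two thresholds and the choice of representative. The single-pass greedy strategy, the function name, and the input layout (precomputed pairwise identity and coverage values) are assumptions made for illustration.

```python
def cluster_longest_representative(lengths, hits,
                                   min_identity=0.95, min_coverage=0.95):
    """Greedy clustering sketch (not the mips-REdat implementation).

    lengths: dict mapping sequence name -> sequence length
    hits:    dict mapping (query, representative) -> (identity, coverage),
             assumed to come from a pairwise alignment step run beforehand
    Returns a dict mapping each representative to the members of its cluster.
    """
    clusters = {}
    # Visit sequences from longest to shortest, so that the first member of
    # every cluster is also its longest element.
    for name in sorted(lengths, key=lambda n: (-lengths[n], n)):
        for representative in clusters:
            identity, coverage = hits.get((name, representative), (0.0, 0.0))
            if identity >= min_identity and coverage >= min_coverage:
                clusters[representative].append(name)
                break
        else:
            # No existing representative is similar enough: start a new cluster.
            clusters[name] = [name]
    return clusters


# Hypothetical toy input: TE_b is nearly identical to TE_a over its whole
# length, while TE_c matches TE_a over only 60% of its length.
lengths = {"TE_a": 1000, "TE_b": 980, "TE_c": 500}
hits = {("TE_b", "TE_a"): (0.97, 0.99), ("TE_c", "TE_a"): (0.99, 0.60)}
print(cluster_longest_representative(lengths, hits))
```

In this toy input, TE_b joins the cluster represented by TE_a, while TE_c fails the coverage threshold and becomes its own representative.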

Other Databanks for annotation and masking

  • UniVec from NCBI
    • v7.1
    • UniVec is a database that can be used to quickly identify segments within nucleic acid sequences which may be of vector origin (vector contamination). In addition to vector sequences, UniVec also contains sequences for those adapters, linkers, and primers commonly used in the process of cloning cDNA or genomic DNA
      • unix=> univec_v7.1
      • embl=> 2_REPEATMASKER_univec.embl [wheat, barley, maize, rice, oak]
        • genomeView: RM_univec
  • Escherichia coli genomes
    • v116
    • SRS specific request databanks with EMBL. Only three K12 genomes are used:
      • AP009048 Escherichia coli str. K12 substr. W3110 DNA, complete genome
      • CP000948 Escherichia coli str. K12 substr. DH10B, complete genome
      • U00096 Escherichia coli str. K-12 substr. MG1655, complete genome
        • unix=> ecoli_v116
        • embl=> 2_REPEATMASKER_Ecoli.embl [wheat, barley, maize, rice, oak]
          • genomeView: RM_Ecoli

Nucleic Databanks

  • RNA-seq databanks
    • Wheat: Comprehensive ensemble de novo transcriptome assembly of Illumina short RNA-seq reads sampled from five different tissues, made by MIPS in December 2012 for the IWGSC sequence survey project - Databank not publicly available
      • v1212
        • unix=> tae_rnas_mipsAssemblydenovo_v1212
        • embl=> 6_EXONERATE_rnaSeqWheat.embl
          • genomeView: EXO_N_rnaSeqWhe
    • Oak: Singletons and assemblies of 454/Sanger/Illumina reads from Quercus robur and Quercus petraea - provided by Isabelle Lesur (file vOCV3) - Databank not publicly available
      • vOCV3
        • unix=> quer_rnas_vOCV3
        • embl=> 6_EXONERATE_rnaSeqOak.embl
          • genomeView: EXO_N_rnaSeqOak
    • Barley: RNA-seq contig set representing 23,797 genes (91%) of the IBSC high-confidence gene set published by Mayer et al. (Nature 2012), made by MIPS - Databank not publicly available
      • v1211
        • unix=> hvu_rnas_v1211
        • embl=> 6_EXONERATE_rnaSeqBarley.embl
          • genomeView: EXO_N_rnaSeqBar
  • EMBL-ENA specific request databanks for transcript resources (polyA/T removed, vector cleaning with RepeatMasker against Univec)
    • v116
      • Triticum Full-Length cDNA + Riken FL-cDNA: TriFLDB_Ta6162 (6162 wheat full-length cDNA sequences, 3.2 MB) + TriFLDB_Ta5740 (new dataset of 5740 wheat FL-cDNAs, 2.8 MB) + TriFLDB4905 (newly sequenced wheat FL-cDNA, 2.7 MB)
        • unix=> trit_plus_fl_v116
        • embl=> 6_EXONERATE_TRITfl.embl [wheat]
          • genomeView: EXO_N_TRITfl
      • Triticum & Aegilops EST
        • unix=> trit_aegi_est_v116
        • embl=> 6_EXONERATE_TRITest.embl [wheat]
          • genomeView: EXO_N_TRITest
      • Hordeum Full-Length cDNA
        • unix=> hord_fl_v116
        • embl=> 6_EXONERATE_HORDfl.embl [barley]
          • genomeView: EXO_N_HORDfl
      • Hordeum EST
        • unix=> hord_est_v116
        • embl=> 6_EXONERATE_HORDest.embl [barley]
          • genomeView: EXO_N_HORDest
      • Oryza Full-Length cDNA
        • unix=> oryz_fl_v116
        • embl=> 6_EXONERATE_ORYZfl.embl [rice]
          • genomeView: EXO_N_ORYZfl
      • Oryza EST
        • unix=> oryz_est_v113
        • embl=> 6_EXONERATE_ORYZest.embl [rice]
          • genomeView: EXO_N_ORYZest
      • Zea Full-Length cDNA
        • unix=> zea_fl_v116
        • embl=> 6_EXONERATE_ZEAfl.embl [maize]
          • genomeView: EXO_N_ZEAfl
      • Zea EST
        • unix=> zea_est_v116
        • embl=> 6_EXONERATE_ZEAest.embl [maize]
          • genomeView: EXO_N_ZEAest
      • Arabidopsis Full-Length cDNA
        • unix=> arab_fl_v116
        • embl=> 6_EXONERATE_ARABfl.embl [oak]
          • genomeView: EXO_N_ARABfl
      • Prunus Full-Length cDNA
        • unix=> prun_fl_v116
        • embl=> 6_EXONERATE_PRUNfl.embl [oak]
          • genomeView: EXO_N_PRUNfl
      • Populus Full-Length cDNA
        • unix=> popu_fl_v116
        • embl=> 6_EXONERATE_POPUfl.embl [oak]
          • genomeView: EXO_N_POPUfl
      • Poaceae Full-Length cDNA
        • unix=> poac_fl_v116
        • embl=> 6_EXONERATE_POACfl.embl [wheat, barley, maize, rice]
          • genomeView: EXO_N_POACfl
      • Rosids Full-Length cDNA
        • unix=> rosi_fl_v116
        • embl=> 6_EXONERATE_ROSIfl.embl [oak]
          • genomeView: EXO_N_ROSIfl
      • Quercus EST
        • unix=> quer_est_v116
        • embl=> 6_EXONERATE_QUERest.embl [oak]
          • genomeView: EXO_N_QUERest
  • NCBI UniGene [Wheeler et al., 2003 Nucleic Acids Research 31:28-33]
    • UniGene computationally identifies transcripts from the same locus; analyzes expression by tissue, age, and health status; and reports related proteins (protEST) and clone resources
      • Triticum aestivum - build #63
        • unix=> tae_ugs_v63
        • embl=> 6_EXONERATE_TAEugs.embl [wheat]
          • genomeView: EXO_N_TAEugs
      • Hordeum vulgare - build #59
        • unix=> hvu_ugs_v59
        • embl=> 6_EXONERATE_HVUugs.embl [barley]
          • genomeView: EXO_N_HVUugs
      • Oryza sativa - build #86
        • unix=> osa_ugs_v86
        • embl=> 6_EXONERATE_OSAugs.embl [rice]
          • genomeView: EXO_N_OSAugs
      • Zea mays - build #84
        • unix=> zma_ugs_v84
        • embl=> 6_EXONERATE_ZMAugs.embl [maize]
          • genomeView: EXO_N_ZMAugs
      • Quercus robur - build #1
        • unix=> qro_ugs_v1
        • embl=> 6_EXONERATE_QROugs.embl [oak]
          • genomeView: EXO_N_QROugs
  • NCBI RefSeq for organelle genomes
    • The NCBI RefSeq project is an ongoing effort to provide a curated, non-redundant collection of reference sequences, representative of the central dogma, for each major organism
      • v1309
      • mitochondrial genomes
        • unix=> viri_refmt_v1309
        • embl=> 14_BLASTN_refSeqMito.embl [wheat, barley, maize, rice, oak]
          • genomeView: BLASTN_refSeqMi
      • chloroplastic genomes
        • unix=> viri_refcp_v1309
        • embl=> 14_BLASTN_refSeqChloro.embl [wheat, barley, maize, rice, oak]
          • genomeView: BLASTN_refSeqCh
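
The EMBL-ENA transcript resources above are described as having polyA/T removed before use. A minimal sketch of that kind of trimming is shown below. The page does not say which tool or thresholds were used, so the function name and the minimum run length of 10 are hypothetical, and the sketch handles only exact homopolymer runs at the sequence ends (trailing A, leading T). The vector cleaning step (RepeatMasker against UniVec) is not covered here.

```python
import re


def trim_poly_tails(seq: str, min_run: int = 10) -> str:
    """Remove a trailing poly-A run and a leading poly-T run.

    min_run is a hypothetical threshold: shorter runs are left untouched.
    """
    # Trailing poly-A (transcript read in the sense orientation).
    seq = re.sub(rf"A{{{min_run},}}$", "", seq, flags=re.IGNORECASE)
    # Leading poly-T (transcript read in the antisense orientation).
    seq = re.sub(rf"^T{{{min_run},}}", "", seq, flags=re.IGNORECASE)
    return seq


print(trim_poly_tails("ACGT" + "A" * 12))
print(trim_poly_tails("T" * 15 + "GCGC"))
```

A real cleaning step would usually also tolerate a few mismatches inside the tail, which this sketch deliberately leaves out.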

Protein Databanks

  • UniProtKB/Swissprot
    • v2013_09
    • A curated protein sequence database which strives to provide a high level of annotation, a minimal level of redundancy and high level of integration with other databases: see description at ExPASy web site
      • unix=> swissprot_v2013_09
      • embl=> 12_EXONERATE_uniprotSprot.embl [wheat, barley, maize, rice, oak]
        • genomeView: EXO_P_uniprotSp
  • EMBL-ENA specific request databanks for protein resources
    • Magnoliophyta (flowering plants) genus
      • Only used for functional annotation and within SIMprot
      • unix=> magn_prot_v2013_09
    • Triticum genus
      • unix=> trit_prot_v2013_09
      • embl=> 7_EXONERATE_protTRI.embl [wheat]
        • genomeView: EXO_X_protTRI
      • embl=> 12_EXONERATE_protTRI.embl [wheat, barley, maize, rice, oak]
        • genomeView: EXO_P_protTRI
    • Hordeum genus
      • unix=> hord_prot_v2013_09
      • embl=> 7_EXONERATE_protHOR.embl [barley]
        • genomeView: EXO_X_protHOR
      • embl=> 12_EXONERATE_protHOR.embl [wheat, barley, maize, rice, oak]
        • genomeView: EXO_P_protHOR
    • Oryza genus
      • unix=> oryz_prot_v2013_09
      • embl=> 7_EXONERATE_protORIZ.embl [rice]
        • genomeView: EXO_X_protORIZ
    • Zea genus
      • unix=> zea_prot_v2013_09
      • embl=> 7_EXONERATE_protZEA.embl [maize]
        • genomeView: EXO_X_protZEA
    • Saccharum genus
      • unix=> sacc_prot_v2013_09
      • embl=> 12_EXONERATE_protSACC.embl [wheat, barley, maize, rice, oak]
        • genomeView: EXO_P_protSACC
    • Rosids genus
      • unix=> rosi_prot_v2013_09
      • embl=> 7_EXONERATE_protROSI.embl [oak]
        • genomeView: EXO_X_protROSI

Protein Domain

  • Pfam at Sanger
    • v27
    • Pfam is a collection of protein family alignments which were constructed semi-automatically using hidden Markov models (HMMs). Pfam families have permanent accession numbers and contain functional annotation and cross-references to other databases.
    • Pfam is used within the functional annotation and the InterProScan modules (see Software).

Genome Models, CDS & CDS-derived peptides

  • Aegilops tauschii [Jia et al., 2013 Nature 496:91-5; PubMed]. See GenBank
    • Genome - unmasked
      • v1302 - see file GenBank
        • unix=> ata_geno_v1302
        • embl=> 14_BLASTN_genoATA.embl [wheat, barley, maize, rice]
          • genomeView: BLASTN_genoATA
    • Peptides
      • v1302 - see file GenBank
        • unix=> ata_pep_v1302
        • embl=> 7_EXONERATE_pepATA.embl [wheat, barley]
          • genomeView: EXO_X_pepATA
        • embl=> 12_EXONERATE_pepATA.embl [wheat, barley, maize, rice, oak]
          • genomeView: EXO_P_pepATA
  • Arabidopsis thaliana at TAIR - see readme [The Arabidopsis Genome Initiative 2000 Nature 408:796-815; PubMed]. See also AtGDB - EnsemblPlants
  • Brachypodium distachyon at Phytozome - see readme [International Brachypodium Initiative 2010 Nature 463:763-768; PubMed]. See also BdGDB - EnsemblPlants
    • Genome - strong masked
      • v192 - see file
        • unix=> bdi_geno_v192
        • embl=> 14_BLASTN_genoBDI.embl [wheat, barley, maize, rice]
          • genomeView: BLASTN_genoBDI
    • CDS
      • v192 - see file
        • unix=> bdi_cds_v192
        • embl=> 6_EXONERATE_cdsBDI.embl [wheat, barley, maize, rice]
          • genomeView: EXO_N_cdsBDI
    • Peptides
      • v192 - see file
        • unix=> bdi_pep_v192
        • embl=> 7_EXONERATE_pepBDI.embl [wheat, barley, maize, rice]
          • genomeView: EXO_X_pepBDI
        • embl=> 12_EXONERATE_pepBDI.embl [wheat, barley, maize, rice, oak]
          • genomeView: EXO_P_pepBDI
  • Brassica rapa at Phytozome - see readme [Wang et al., 2011 Nat Genet. 43:1035-1039 ; PubMed]. See also EnsemblPlants
    • Peptides
      • v197 - see file
        • unix=> bra_pep_v197
        • embl=> 12_EXONERATE_pepBRA.embl [wheat, barley, maize, rice, oak]
          • genomeView: EXO_P_pepBRA
  • Glycine max at Phytozome - see readme [Schmutz et al., 2010 Nature 463:178-183; PubMed]. See also SbGDB - EnsemblPlants
    • Genome - strong masked
      • v189 - see file
        • unix=> gma_geno_v189
        • embl=> 14_BLASTN_genoGMA.embl [oak]
          • genomeView: BLASTN_genoGMA
    • Peptides
      • v189 - see file
        • unix=> gma_pep_v189
        • embl=> 12_EXONERATE_pepGMA.embl [wheat, barley, maize, rice, oak]
          • genomeView: EXO_P_pepGMA
  • Hordeum vulgare [International Barley Genome Sequencing Consortium, 2012 Nature 491:711-716; PubMed]. See also SbGDB - EnsemblPlants. FASTA files obtained directly from MIPS
    • Genome - unmasked
      • v2012
        • unix=> hvu_geno_v2012
        • embl=> 14_BLASTN_genoHVU.embl [wheat, barley, maize, rice]
          • genomeView: BLASTN_genoHVU
    • CDS
      • v2012
        • High Confidence genes
          • unix=> hvu_cdsHC_v2012
          • embl=> 6_EXONERATE_cdsHVUhc.embl [wheat, barley, maize, rice]
            • genomeView: EXO_N_cdsHVUhc
        • Low Confidence genes
          • unix=> hvu_cdsLC_v2012
          • embl=> 6_EXONERATE_cdsHVUlc.embl [wheat, barley, maize, rice]
            • genomeView: EXO_N_cdsHVUlc
    • Peptides
      • v2012
        • High Confidence proteins
          • unix=> hvu_pepHC_v2012
          • embl=> 7_EXONERATE_pepHVUhc.embl [wheat, barley, maize, rice]
            • genomeView: EXO_X_pepHVUhc
          • embl=> 12_EXONERATE_pepHVUhc.embl [wheat, barley, maize, rice, oak]
            • genomeView: EXO_P_pepHVUhc
        • Low Confidence proteins
          • unix=> hvu_pepLC_v2012
          • embl=> 7_EXONERATE_pepHVUlc.embl [wheat, barley]
            • genomeView: EXO_X_pepHVUlc
          • embl=> 12_EXONERATE_pepHVUlc.embl [wheat, barley, maize, rice, oak]
            • genomeView: EXO_P_pepHVUlc
  • Medicago truncatula at Phytozome - see readme [Young et al., 2011 Nature 480:520-524; PubMed]. See also EnsemblPlants
    • Peptides
      • v198 - see file
        • unix=> mtr_pep_v198
        • embl=> 12_EXONERATE_pepMTR.embl [wheat, barley, maize, rice, oak]
          • genomeView: EXO_P_pepMTR
  • Oryza sativa at MSU - see current annotation [Ouyang et al., 2007 Nucleic Acids Res. 35:D883-887; PubMed]. See also OsGDB - EnsemblPlants. However, TriAnnot uses the Phytozome FASTA files - see readme
    • CDS
      • v204 - see file
        • unix=> osa_cds_v204
        • embl=> 6_EXONERATE_cdsOSA.embl [maize, rice]
          • genomeView: EXO_N_cdsOSA
    • Peptides
      • v204 - see file
        • unix=> osa_pep_v204
        • embl=> 7_EXONERATE_pepOSA.embl [wheat, barley, maize, rice]
          • genomeView: EXO_X_pepOSA
        • embl=> 12_EXONERATE_pepOSA.embl [wheat, barley, maize, rice, oak]
          • genomeView: EXO_P_pepOSA
  • Oryza sativa at RAP-DB (IRGSP, Nipponbare) [Tanaka et al., 2008 Nucleic Acids Research 36:D1028-D1033; PubMed]. See also OsGDB - EnsemblPlants
    • Genome - low masked
    • CDS
      • v1a - see file CDS sequences in FASTA format
        • unix=> osa_irgsp_cds_v1a
        • embl=> 6_EXONERATE_cdsOSAirgsp.embl [wheat, barley, maize, rice]
          • genomeView: EXO_N_cdsOSAirg
    • Peptides
      • v1a - see file Protein sequences (translated CDSs) in FASTA format
        • unix=> osa_irgsp_pep_v1a
        • embl=> 7_EXONERATE_pepOSAirgsp.embl [wheat, barley, maize, rice]
          • genomeView: EXO_X_pepOSAirg
        • embl=> 12_EXONERATE_pepOSAirgsp.embl [wheat, barley, maize, rice, oak]
          • genomeView: EXO_P_pepOSAirg
  • Populus trichocarpa at Phytozome - see readme [Kelleher et al., 2007 Plant J. 50:1063-78; PubMed]. See also PtGDB - EnsemblPlants
    • Genome - strong masked
      • v210 - see file
        • unix=> ptr_geno_v210
        • embl=> 14_BLASTN_genoPTR.embl [oak]
          • genomeView: BLASTN_genoPTR
    • CDS
      • v210 - see file
        • unix=> ptr_cds_v210
        • embl=> 6_EXONERATE_cdsPTR.embl [oak]
          • genomeView: EXO_N_cdsPTR
    • Peptides
      • v210 - see file
        • unix=> ptr_pep_v210
        • embl=> 7_EXONERATE_pepPTR.embl [oak]
          • genomeView: EXO_X_pepPTR
        • embl=> 12_EXONERATE_pepPTR.embl [wheat, barley, maize, rice, oak]
          • genomeView: EXO_P_pepPTR
  • Prunus persica at Phytozome - see readme [Ahmad et al., 2011 BMC Genomics 12:569; PubMed]. See also PeGDB
    • Genome - strong masked
      • v139 - see file
        • unix=> ppe_geno_v139
        • embl=> 14_BLASTN_genoPPE.embl [oak]
          • genomeView: BLASTN_genoPPE
    • CDS
      • v139 - see file
        • unix=> ppe_cds_v139
        • embl=> 6_EXONERATE_cdsPPE.embl [oak]
          • genomeView: EXO_N_cdsPPE
    • Peptides
      • v139 - see file
        • unix=> ppe_pep_v139
        • embl=> 7_EXONERATE_pepPPE.embl [oak]
          • genomeView: EXO_X_pepPPE
        • embl=> 12_EXONERATE_pepPPE.embl [wheat, barley, maize, rice, oak]
          • genomeView: EXO_P_pepPPE
  • Setaria italica at Phytozome - see readme [Bennetzen et al., 2012 Nature Biotechnology 30:555-564; PubMed]. See also PtGDB - EnsemblPlants
    • Genome - strong masked
      • v164 - see file
        • unix=> sit_geno_v164
        • embl=> 14_BLASTN_genoSIT.embl [wheat, barley, maize, rice]
          • genomeView: BLASTN_genoSIT
    • Peptides
      • v164 - see file
        • unix=> sit_pep_v164
        • embl=> 12_EXONERATE_pepSIT.embl [wheat, barley, maize, rice, oak]
          • genomeView: EXO_P_pepSIT
  • Sorghum bicolor at Phytozome - see readme [Paterson et al., 2009 Nature 457:551-556; PubMed]. See also SbGDB - EnsemblPlants
    • CDS
      • v79 - see file
        • unix=> sbi_cds_v79
        • embl=> 6_EXONERATE_cdsSBI.embl [wheat, barley, maize, rice]
          • genomeView: EXO_N_cdsSBI
    • Peptides
      • v79 - see file
        • unix=> sbi_pep_v79
        • embl=> 7_EXONERATE_pepSBI.embl [wheat, barley, maize, rice]
          • genomeView: EXO_X_pepSBI
        • embl=> 12_EXONERATE_pepSBI.embl [wheat, barley, maize, rice, oak]
          • genomeView: EXO_P_pepSBI
  • Triticum urartu [Ling et al., 2013 Nature 496:87-90; PubMed]. See GenBank
    • Genome - unmasked
      • v1302 - see file GenBank
        • unix=> tur_geno_v1302
        • embl=> 14_BLASTN_genoTUR.embl [wheat, barley, maize, rice]
          • genomeView: BLASTN_genoTUR
    • Peptides
      • v1302 - see file GenBank
        • unix=> tur_pep_v1302
        • embl=> 7_EXONERATE_pepTUR.embl [wheat, barley]
          • genomeView: EXO_X_pepTUR
        • embl=> 12_EXONERATE_pepTUR.embl [wheat, barley, maize, rice, oak]
          • genomeView: EXO_P_pepTUR
  • Solanum lycopersicum at Phytozome - see readme [Tomato Genome Consortium 2012 Nature 485:635-41; PubMed]. See also EnsemblPlants
    • Peptides
      • v225 - see file
        • unix=> sly_pep_v225
        • embl=> 12_EXONERATE_pepSLY.embl [wheat, barley, maize, rice, oak]
          • genomeView: EXO_P_pepSLY
  • Theobroma cacao at Phytozome - see readme [Argout et al., 2011 Nat Genet. 43:101-108; PubMed].
    • Peptides
      • v233 - see file
        • unix=> tca_pep_v233
        • embl=> 12_EXONERATE_pepTCA.embl [wheat, barley, maize, rice, oak]
          • genomeView: EXO_P_pepTCA
  • Vitis vinifera at Phytozome - see readme [Jaillon et al., 2007 Nature 449:463-467; PubMed]. See also PtGDB - EnsemblPlants
    • Genome - strong masked
      • v145 - see file
        • unix=> vvi_geno_v145
        • embl=> 14_BLASTN_genoVVI.embl [oak]
          • genomeView: BLASTN_genoVVI
    • CDS
      • v145 - see file
        • unix=> vvi_cds_v145
        • embl=> 6_EXONERATE_cdsVVI.embl [oak]
          • genomeView: EXO_N_cdsVVI
    • Peptides
      • v145 - see file
        • unix=> vvi_pep_v145
        • embl=> 7_EXONERATE_pepVVI.embl [oak]
          • genomeView: EXO_X_pepVVI
        • embl=> 12_EXONERATE_pepVVI.embl [wheat, barley, maize, rice, oak]
          • genomeView: EXO_P_pepVVI
  • Zea mays at MaizeSequence.org [Schnable et al., 2009 Science 326:1112-1115; PubMed]. See also ZmGDB - EnsemblPlants. However, TriAnnot uses the Phytozome FASTA files - see readme
    • Genome - strong masked
      • v181 - see file
        • unix=> zma_geno_v181
        • embl=> 14_BLASTN_genoZMA.embl [wheat, barley, maize, rice]
          • genomeView: BLASTN_genoZMA
    • CDS
      • v181 - see file
        • unix=> zma_cds_v181
        • embl=> 6_EXONERATE_cdsZMA.embl [wheat, barley, maize, rice]
          • genomeView: EXO_N_cdsZMA
    • Peptides
      • v181 - see file
        • unix=> zma_pep_v181
        • embl=> 7_EXONERATE_pepZMA.embl [wheat, barley, maize, rice]
          • genomeView: EXO_X_pepZMA
        • embl=> 12_EXONERATE_pepZMA.embl [wheat, barley, maize, rice, oak]
          • genomeView: EXO_P_pepZMA

Other Genome resources from TGAC

These WGS genome assemblies are publicly available at URGI

  • Triticum durum, cv. Cappelli (listed as durum_v1) - see file
    • unix=> tgac_WGSA_tdu_v1
    • embl=> 14_BLASTN_tgacWGSAtdu.embl
      • genomeView: BLASTN_tgacWGSA
  • Triticum durum, cv. Strongfield - see file
    • unix=> tgac_WGSA_strong_v1
    • embl=> 14_BLASTN_tgacWGSAstrong.embl
      • genomeView: BLASTN_tgacWGSA
  • Triticum monococcum - see file
    • unix=> tgac_WGSA_tmo_v1
    • embl=> 14_BLASTN_tgacWGSAtmo.embl
      • genomeView: BLASTN_tgacWGSA
  • Triticum speltoides - see file
    • unix=> tgac_WGSA_tsp_v1
    • embl=> 14_BLASTN_tgacWGSAtsp.embl
      • genomeView: BLASTN_tgacWGSA
  • Aegilops sharonensis - see file
    • unix=> tgac_WGSA_ash_v1
    • embl=> 14_BLASTN_tgacWGSAash.embl
      • genomeView: BLASTN_tgacWGSA
  • Triticum urartu - see file
    • unix=> tgac_WGSA_tur_v1
    • embl=> 14_BLASTN_tgacWGSAtur.embl
      • genomeView: BLASTN_tgacWGSA
  • Aegilops tauschii - see file
    • unix=> tgac_WGSA_ata_v1
    • embl=> 14_BLASTN_tgacWGSAata.embl
      • genomeView: BLASTN_tgacWGSA
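
All seven TGAC entries above are listed with the same GenomeView track name, although the introduction of this page says the feature name should be unique. This page does not say how these files are meant to be loaded together, so the sketch below only shows one way to detect shared track names from the (EMBL file, track name) pairs listed above. The function name is hypothetical.

```python
from collections import defaultdict

# (EMBL file, GenomeView track name) pairs copied from the list above.
TGAC_TRACKS = [
    ("14_BLASTN_tgacWGSAtdu.embl", "BLASTN_tgacWGSA"),
    ("14_BLASTN_tgacWGSAstrong.embl", "BLASTN_tgacWGSA"),
    ("14_BLASTN_tgacWGSAtmo.embl", "BLASTN_tgacWGSA"),
    ("14_BLASTN_tgacWGSAtsp.embl", "BLASTN_tgacWGSA"),
    ("14_BLASTN_tgacWGSAash.embl", "BLASTN_tgacWGSA"),
    ("14_BLASTN_tgacWGSAtur.embl", "BLASTN_tgacWGSA"),
    ("14_BLASTN_tgacWGSAata.embl", "BLASTN_tgacWGSA"),
]


def find_shared_tracks(pairs):
    """Return the track names that are used by more than one EMBL file."""
    files_by_track = defaultdict(list)
    for embl_file, track in pairs:
        files_by_track[track].append(embl_file)
    return {track: files for track, files in files_by_track.items()
            if len(files) > 1}


for track, files in find_shared_tracks(TGAC_TRACKS).items():
    print(f"{track}: shared by {len(files)} EMBL files")
```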

SIMsearch specific databanks

  • SIMnuc
    • v1310
    • for wheat, barley, rice and maize
      • bdi_cds_v192
      • osa_irgsp_cds_v1a
      • hvu_cdsHC_v2012
    • for oak
      • vvi_cds_v145
      • ptr_cds_v210
      • ppe_cds_v139
  • SIMprot
    • v1310
    • for wheat, barley, rice, maize and oak
      • magn_prot_v2013_09
      • bdi_pep_v192
      • hvu_pepHC_v2012
      • osa_irgsp_pep_v1a
      • osa_pep_v204
      • ppe_pep_v139
      • ptr_pep_v210
      • sbi_pep_v79
      • zma_pep_v181

Databanks used with Tallymer

  • TriAnnot uses k-mer composition to mask repeated regions, relying on an index of 17-mer frequencies (called MDR, for Mathematically Defined Repeats) computed with Tallymer (Kurtz et al., 2008 BMC Genomics 9:517)
    • Genome sequences used: wheat genomic sequences yielding ~1× genome coverage of Chinese Spring (Kumar et al., 2011 Current Science 100:455-457) - see paper
      • CS1x_20mer_occ1 (CS1xBbsrcDec2009)
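
The idea behind frequency-based masking can be shown with a toy sketch: count every k-mer in a reference set of sequences, then mask the positions of a query that are covered by k-mers seen at least a given number of times. This is not how Tallymer builds or queries its index, and it would not scale to a genome. The function names, the use of N for masking, the small k, and the occurrence threshold are all assumptions chosen to keep the example readable.

```python
from collections import Counter


def kmer_counts(reference: str, k: int) -> Counter:
    """Count every overlapping k-mer of the reference sequence."""
    return Counter(reference[i:i + k] for i in range(len(reference) - k + 1))


def mask_repeats(seq: str, counts: Counter, k: int, min_occ: int) -> str:
    """Replace with N every base covered by a k-mer seen >= min_occ times."""
    masked = list(seq)
    for i in range(len(seq) - k + 1):
        if counts[seq[i:i + k]] >= min_occ:
            for j in range(i, i + k):
                masked[j] = "N"
    return "".join(masked)


# Toy data: the 4-mer ACGT occurs three times in the reference.
counts = kmer_counts("ACGTACGTACGTGGCATC", k=4)
print(mask_repeats("TTACGTGGCATC", counts, k=4, min_occ=3))  # TTNNNNGGCATC
```

With the index described above, k would be 17 and the counts would come from the ~1× Chinese Spring sequences rather than from a short string.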

Other Annotations

  • MIPS on wheat IWGSC sequence survey
    • v1210
    • Annotation made by MIPS in December 2012
      • unix=> tae_annot_mips_v1210
      • embl=> 6_EXONERATE_AnnotMIPS.embl
        • genomeView: EXO_N_AnnotMIPS
  • GDEC on wheat Chromosome 3B
    • v4.4
    • Genes manually validated by S. Theil at GDEC for the 3BSEQ project assembly 4_1, Annotation 4_2 (automatic annotation made with a modified TriAnnot 3.5)
      • unix=> tae_annot_gdec_v4_4
      • embl=> 6_EXONERATE_AnnotSEB.embl
        • genomeView: EXO_N_AnnotSEB