Pathway Tools File Formats

This page documents file formats for input files accepted by Pathway Tools, and for output files generated by Pathway Tools.

Input File Formats for Building New Pathway/Genome Databases

Given an annotated genome as input, Pathway Tools constructs a new Pathway/Genome Database containing the input genome plus new computationally inferred information such as the metabolic network of the organism. The input formats accepted by Pathway Tools are listed below, with notes on the strengths and limitations of each.

Overall, we recommend GFF3 format because a single GFF3 file can contain all sequence and annotation data for a genome with multiple replicons, and because we have observed problems with the representation of alternative splicing in GenBank-format files.
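
As a rough illustration of the multi-replicon point, the Python sketch below summarizes which replicons a GFF3 file annotates and which of them also carry embedded sequence. It relies only on general GFF3 conventions (nine tab-separated columns with the sequence ID first, and a `##FASTA` directive introducing sequence data). The function name `summarize_gff3` and the sample data are hypothetical, and this is not how Pathway Tools itself reads GFF3 files.

```python
from collections import Counter


def summarize_gff3(text):
    """Return (feature counts per sequence ID, IDs that have embedded FASTA sequence)."""
    features = Counter()
    fasta_ids = set()
    in_fasta = False
    for line in text.splitlines():
        line = line.rstrip()
        if not line:
            continue
        if line.startswith("##FASTA"):
            # Everything after this directive is sequence data, not annotation.
            in_fasta = True
            continue
        if in_fasta:
            if line.startswith(">"):
                header = line[1:].split()
                if header:
                    fasta_ids.add(header[0])
            continue
        if line.startswith("#"):
            # Other directives and comments are skipped.
            continue
        columns = line.split("\t")
        if len(columns) == 9:
            # Column 1 is the sequence ID (the replicon the feature lies on).
            features[columns[0]] += 1
    return features, fasta_ids


# Invented two-replicon example: one chromosome and one plasmid.
GFF3_SAMPLE = "\n".join([
    "##gff-version 3",
    "chr1\texample\tgene\t1\t9\t.\t+\t.\tID=gene1",
    "chr1\texample\tCDS\t1\t9\t.\t+\t0\tID=cds1;Parent=gene1",
    "plasmidA\texample\tgene\t1\t6\t.\t-\t.\tID=gene2",
    "##FASTA",
    ">chr1",
    "ATGAAATAA",
    ">plasmidA",
    "TTACAT",
])

features, fasta_ids = summarize_gff3(GFF3_SAMPLE)
for seqid in sorted(features):
    has_sequence = "yes" if seqid in fasta_ids else "no"
    print(f"{seqid}: {features[seqid]} features, sequence included: {has_sequence}")
```

A check of this kind can be run before a build to confirm that every annotated replicon also has sequence in the same file.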

  • Input and output format: Sample GFF3 Format file
    • One GFF3 file can contain all sequence plus annotation for a genome with multiple replicons
    • Better representation of alternative splicing than RefSeq .gbk files
    • Producers of GFF3 files that are known to work with Pathway Tools:
      • RefSeq prokaryotic and eukaryotic files
      • NCBI Prokaryotic Annotation Pipeline

  • Input and output format: Sample GenBank Format file
    • One .gbff file can contain all sequence plus annotation for multiple replicons (.gbk files cannot contain multiple replicons)
    • Representation of alternative splicing is sometimes problematic
    • Producers of GenBank files that are known to work with Pathway Tools:
      • RefSeq prokaryotic and eukaryotic files
      • NCBI Prokaryotic Annotation Pipeline
      • Prokka
      • RAST

  • Input format: Sample PathoLogic Format file
    • Separate files required for each replicon
    • Separate files required for sequence
    • Best for simple genomes with a single replicon
    • Cannot represent alternative splicing
    • Requires a separate genetic-elements.dat file to list annotation and sequence files for each replicon: Sample genetic-elements.dat file
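
For the PathoLogic format, the per-replicon bookkeeping described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the field names (`ID`, `NAME`, `TYPE`, `CIRCULAR?`, `ANNOT-FILE`, `SEQ-FILE`), the tab separator, the `//` record terminator, and the `:CHRSM` / `:PLASMID` type values are not given on this page and should be checked against the sample genetic-elements.dat file. The function name, replicon IDs, and file names are hypothetical.

```python
def format_genetic_elements(replicons):
    """Build genetic-elements.dat text with one record per replicon.

    ASSUMPTION: field names, tab separation, and the '//' terminator follow
    the layout of the sample genetic-elements.dat file; verify before use.
    """
    records = []
    for replicon in replicons:
        lines = [
            f"ID\t{replicon['id']}",
            f"NAME\t{replicon['name']}",
            f"TYPE\t{replicon['type']}",
            f"CIRCULAR?\t{'Y' if replicon['circular'] else 'N'}",
            # Each replicon points at its own annotation and sequence files.
            f"ANNOT-FILE\t{replicon['annot_file']}",
            f"SEQ-FILE\t{replicon['seq_file']}",
            "//",
        ]
        records.append("\n".join(lines))
    return "\n".join(records) + "\n"


# Hypothetical genome with one chromosome and one plasmid.
example_replicons = [
    {"id": "CHROM-1", "name": "Chromosome 1", "type": ":CHRSM",
     "circular": True, "annot_file": "chrom1.pf", "seq_file": "chrom1.fsa"},
    {"id": "PLASMID-1", "name": "Plasmid 1", "type": ":PLASMID",
     "circular": True, "annot_file": "plasmid1.pf", "seq_file": "plasmid1.fsa"},
]

genetic_elements_text = format_genetic_elements(example_replicons)
print(genetic_elements_text)
```

The example shows why this format suits single-replicon genomes best: every additional replicon adds one record plus two more files to keep in sync.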

Other Input File Formats

Input File Formats