The entire OMA database is available for download in several formats. Individual groups can also be downloaded separately from the group view. Please read our terms and conditions before integrating OMA data into your own research or database.
The orthology relationships are available in two forms: as groups of orthologs or as pairs of orthologs. In both cases the proteins are referred to by their OMA identifiers (of the form HUMAN04376).
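As an illustration of the identifier format, the following Python sketch splits an OMA identifier into a species code and a protein number. It assumes a five-character species code followed by a run of digits, as in the example HUMAN04376. The function name `split_oma_id` and the regular expression are hypothetical and not part of any OMA tool.

```python
import re

# Assumed pattern: five uppercase letters or digits (species code),
# followed by one or more digits (protein number within that genome).
_OMA_ID = re.compile(r"([A-Z0-9]{5})(\d+)")


def split_oma_id(oma_id):
    """Return (species_code, protein_number) for an OMA identifier."""
    match = _OMA_ID.fullmatch(oma_id)
    if match is None:
        raise ValueError(f"not an OMA identifier: {oma_id!r}")
    return match.group(1), int(match.group(2))


print(split_oma_id("HUMAN04376"))  # ('HUMAN', 4376)
```

Splitting the identifier this way makes it easy to group the entries of a pairs or groups file by species.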
All protein and coding sequences can be downloaded as FASTA files, in which the sequences are identified by their OMA identifiers. Cross-references to UniProt, RefSeq and Ensembl are also available as TSV files (see the Mapping section below). The proteins are all in one file, while the coding DNA is split into two files, one for the eukaryotes and one for the prokaryotes.
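Once downloaded, the sequence files can be read with a few lines of Python. The sketch below is a minimal FASTA reader. It assumes that the first whitespace-separated token of each header line is the OMA identifier, which should be checked against the actual files. The function name `read_fasta` and the demo records are made up for illustration.

```python
import io


def read_fasta(handle):
    """Yield (identifier, sequence) pairs from an open FASTA file."""
    identifier, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if identifier is not None:
                yield identifier, "".join(chunks)
            # Assumption: the first token of the header is the OMA identifier.
            fields = line[1:].split()
            identifier = fields[0] if fields else ""
            chunks = []
        elif identifier is not None:
            # Sequence lines of one record are concatenated.
            chunks.append(line)
    if identifier is not None:
        yield identifier, "".join(chunks)


# Made-up records for demonstration, not real OMA sequences.
example = io.StringIO(">HUMAN00001 demo\nMKT\nAYI\n>HUMAN00002\nMSD\n")
sequences = dict(read_fasta(example))
print(sequences)
```

For a real download, replace the `io.StringIO` object with an open file handle (using `gzip.open(path, "rt")` if the file is compressed).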
Mappings from OMA identifiers to various other databases are available. Mappings to UniProt, RefSeq and EntrezGene IDs are based on exact sequence matches; other cross-references are taken directly from the source genome files.
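A mapping file can be loaded into a dictionary for lookups. The sketch below assumes a tab-separated layout with the OMA identifier in the first column and the external identifier in the second, and treats lines starting with `#` as comments. The actual column layout should be verified against the downloaded file. The function name `load_mapping` and the identifiers `EXT_A`, `EXT_B` and `EXT_C` are placeholders.

```python
import csv
import io
from collections import defaultdict


def load_mapping(handle):
    """Map each OMA identifier to a list of external identifiers."""
    mapping = defaultdict(list)
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        # Skip blank lines, comment lines and rows with too few columns.
        if len(row) < 2 or row[0].startswith("#"):
            continue
        # One OMA entry may map to several external identifiers.
        mapping[row[0]].append(row[1])
    return dict(mapping)


# Placeholder content, not real cross-references.
example = io.StringIO(
    "# OMA ID\texternal ID\n"
    "HUMAN00001\tEXT_A\n"
    "HUMAN00001\tEXT_B\n"
    "HUMAN00002\tEXT_C\n"
)
xrefs = load_mapping(example)
print(xrefs["HUMAN00001"])
```

A list is kept per OMA identifier because an exact sequence match can hit more than one entry in the target database.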
Mappings of the OMA identifiers of updated genomes from one release to another are also available. Only proteins whose amino acid sequences are identical in both releases are tracked.
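The idea of tracking proteins by identical sequence can be sketched as follows. This is not the procedure used to build the OMA mapping files, only an illustration of exact sequence matching between two releases. The function name `map_between_releases` is hypothetical, and dropping ambiguous matches is a design choice of this sketch.

```python
def map_between_releases(old, new):
    """Map old identifiers to new ones for proteins with identical sequences.

    `old` and `new` are dicts of identifier -> amino acid sequence.
    """
    # Index the new release by sequence.
    by_sequence = {}
    for new_id, seq in new.items():
        by_sequence.setdefault(seq, []).append(new_id)

    mapping = {}
    for old_id, seq in old.items():
        candidates = by_sequence.get(seq, [])
        # Assumption of this sketch: keep only unambiguous matches.
        if len(candidates) == 1:
            mapping[old_id] = candidates[0]
    return mapping


# Made-up sequences for demonstration.
old_release = {"HUMAN00001": "MKTAYI", "HUMAN00002": "MSDQ"}
new_release = {"HUMAN00001": "MSDQ", "HUMAN00002": "MKTAYV"}
print(map_between_releases(old_release, new_release))
```

In this example only the old `HUMAN00002` is carried over (as the new `HUMAN00001`), because the other sequence changed between the two releases.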