Connecting to the KEGG API Web Service

This example shows how to access the KEGG [1,2] system using the REST-style KEGG API from within MATLAB. It uses the KEGG API to retrieve information about the human fatty acid degradation pathway and its components.

This example was developed on October 17, 2013 (KEGG Release 68.0). Note that data in public repositories are frequently curated and updated; therefore the results shown in this example may differ from the results you get when you use up-to-date data.

The KEGG database [1,2] is developed by Kanehisa Laboratories (http://www.kanehisa.jp/), and its use is subject to various subscription or license terms and charges depending on your use of the data. For details, see http://www.kegg.jp/kegg/download/.

For more details about the KEGG API, see http://www.kegg.jp/kegg/rest/keggapi.html.

Use the info operation to display the current statistics of the Pathway database

Define the base URL and info operation.

base = 'http://rest.kegg.jp/';
operation = 'info/';

Define the Pathway database.

database = 'pathway';

Retrieve the current statistics of the Pathway database.

pathwayDbInfo = urlread(strcat(base,operation,database))
pathwayDbInfo =

pathway          KEGG Pathway Database
path             Release 68.0+/10-25, Oct 13
                 Kanehisa Laboratories
                 272,505 entries
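Every request in this example follows the same pattern: the base URL, an operation, and then one or more arguments separated by slashes. The following is a minimal sketch of that pattern, not part of the original example. The name keggUrl is a hypothetical helper introduced here for illustration, and no request is actually sent.

```matlab
% Hypothetical helper (an assumption, not a KEGG or MATLAB function):
% join an operation and its arguments onto the base URL with slashes.
keggUrl = @(varargin) ['http://rest.kegg.jp/' strjoin(varargin, '/')];

infoUrl = keggUrl('info', 'pathway');         % statistics of the Pathway database
listUrl = keggUrl('list', 'pathway', 'hsa');  % pathways for one organism code

disp(infoUrl)
disp(listUrl)
```

Passing the arguments without trailing slashes avoids having to remember which of the operation and database strings must end in '/'.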


Use the conv operation to convert KEGG identifiers to/from outside identifiers.

The following code shows how to use the conv operation to convert an NCBI GI identifier and an NCBI Gene ID to KEGG identifiers.

operation = 'conv/';
database = 'genes/';
dbentry1 = 'ncbi-gi:10047086';
dbentry2 = 'ncbi-geneid:14751';
kegg_id1 = regexpi(urlread(strcat(base,operation,database,dbentry1)),'(?<=(??@dbentry1)\s+)\w+\W+\w*','match')
kegg_id2 = regexpi(urlread(strcat(base,operation,database,dbentry2)),'(?<=(??@dbentry2)\s+)\w+\W+\w*','match')
kegg_id1 = 

    'hsa:54206'


kegg_id2 = 

    'mmu:14751'
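The regular expressions above embed the query identifier through a dynamic expression. As an alternative sketch, the response can be split into its two columns directly. The response string below is an assumed layout (outside identifier, tab, KEGG identifier, newline) built locally for illustration; it is not fetched from the server.

```matlab
% Assumed layout of one conv response line; constructed locally, not downloaded.
response = sprintf('ncbi-gi:10047086\thsa:54206\n');

% Capture the two tab-separated columns after trimming the trailing newline.
tok = regexp(strtrim(response), '^(\S+)\t(\S+)$', 'tokens', 'once');
outsideId = tok{1};   % identifier that was sent
keggId    = tok{2};   % identifier that KEGG returned

disp(keggId)
```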

Retrieve the list of organisms from the KEGG taxonomic classification database.

Use the list operation to retrieve the list of organisms.

operation = 'list/';
database = 'organism';
organisms = urlread(strcat(base,operation,database));
organisms = regexpi(organisms,'[^\n]+','match')'; % convert to cellstr

organisms is a cell array of strings, one for each organism in the KEGG taxonomic classification database. Find the entry containing the string Homo sapiens and note that the KEGG organism code for Homo sapiens is hsa.

hsa_idx = find(~cellfun(@isempty,regexpi(organisms,'Homo sapiens')));
organisms(hsa_idx)
ans = 

    'T01001	hsa	Homo sapiens (human)	Eukaryotes;Animals;Vertebrates;Mammals'
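Each organism entry is a tab-separated record, so the organism code can also be read programmatically rather than by eye. This sketch splits the entry shown above into fields; the variable names are illustrative and the entry is typed in locally rather than downloaded.

```matlab
% The Homo sapiens entry from the output above, rebuilt with explicit tabs.
entry = sprintf('T01001\thsa\tHomo sapiens (human)\tEukaryotes;Animals;Vertebrates;Mammals');

fields  = strsplit(entry, sprintf('\t'));   % one cell per tab-separated column
orgCode = fields{2};                        % second column: KEGG organism code
lineage = strsplit(fields{4}, ';');         % fourth column: taxonomic lineage

disp(orgCode)
```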

Get the list of pathways for Homo sapiens in the KEGG PATHWAY database

operation = 'list/';
database = 'pathway/';
organismCode = 'hsa';
pathway_list = urlread(strcat(base,operation,database,organismCode));
pathway_list = regexpi(pathway_list,'[^\n]+','match')'; % convert to cellstr

Get the total number of pathways.

num_pathways = numel(pathway_list)
num_pathways =

   278

Retrieve the lists of genes, compounds, enzymes, and reactions involved in the Fatty acid degradation pathway

Locate the string Fatty acid degradation in the list of pathways and extract its ID.

fadp_idx = find(~cellfun(@isempty,regexpi(pathway_list,'Fatty acid degradation')));
pathway_list(fadp_idx)
fadp_id = regexpi(pathway_list(fadp_idx),'(?<=path:)\w+','match');
fadp_id{1}
ans = 

    'path:hsa00071	Fatty acid degradation - Homo sapiens (human)'


ans = 

    'hsa00071'
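A substring search can match more than one pathway if several names share the same words. The sketch below separates each list entry into an ID and a name and then matches the name exactly. The two entries are sample lines typed in locally; the second is taken from the output above and the first is included only to give the lookup something to reject.

```matlab
% Sample pathway list entries (local stand-ins for the downloaded list).
pathwayLines = { ...
    sprintf('path:hsa00010\tGlycolysis / Gluconeogenesis - Homo sapiens (human)'); ...
    sprintf('path:hsa00071\tFatty acid degradation - Homo sapiens (human)')};

% Capture the ID after "path:" and the name up to the " - organism" suffix.
tok   = regexp(pathwayLines, '^path:(\w+)\t(.*?) - ', 'tokens', 'once');
ids   = cellfun(@(t) t{1}, tok, 'UniformOutput', false);
names = cellfun(@(t) t{2}, tok, 'UniformOutput', false);

% Exact, case-sensitive name match.
fadpId = ids{strcmp(names, 'Fatty acid degradation')};
disp(fadpId)
```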

Get the complete record of the fatty acid degradation pathway.

operation = 'get/';
fadp_record = urlread(char(strcat(base,operation,fadp_id{1})))
fadp_record =

ENTRY       hsa00071                    Pathway
NAME        Fatty acid degradation - Homo sapiens (human)
CLASS       Metabolism; Lipid metabolism
PATHWAY_MAP hsa00071  Fatty acid degradation
MODULE      hsa_M00086  beta-Oxidation, acyl-CoA synthesis [PATH:hsa00071]
            hsa_M00087  beta-Oxidation [PATH:hsa00071]
DISEASE     H00162  Sjogren-Larsson syndrome
            H00178  Glutaric acidemia
            H00407  Peroxisomal beta-oxidation enzyme deficiency
            H00525  Disorders of fatty-acid oxidation
            H01267  Familial hyperinsulinemic hypoglycemia (HHF)
DRUG        D00123  Cyanamide (JP16)
            D00131  Disulfiram (JP16/USP/INN)
            D00707  Fomepizole (JAN/USAN/INN)
            D05292  Oxfenicine (USAN/INN)
ORGANISM    Homo sapiens (human) [GN:hsa]
GENE        39  ACAT2; acetyl-CoA acetyltransferase 2 [KO:K00626] [EC:2.3.1.9]
            38  ACAT1; acetyl-CoA acetyltransferase 1 [KO:K00626] [EC:2.3.1.9]
            30  ACAA1; acetyl-CoA acyltransferase 1 [KO:K07513] [EC:2.3.1.16]
            10449  ACAA2; acetyl-CoA acyltransferase 2 [KO:K07508] [EC:2.3.1.16]
            3032  HADHB; hydroxyacyl-CoA dehydrogenase/3-ketoacyl-CoA thiolase/enoyl-CoA hydratase (trifunctional protein), beta subunit [KO:K07509] [EC:2.3.1.16]
            3033  HADH; hydroxyacyl-CoA dehydrogenase [KO:K00022] [EC:1.1.1.35]
            3030  HADHA; hydroxyacyl-CoA dehydrogenase/3-ketoacyl-CoA thiolase/enoyl-CoA hydratase (trifunctional protein), alpha subunit [KO:K07515] [EC:1.1.1.211 4.2.1.17]
            1962  EHHADH; enoyl-CoA, hydratase/3-hydroxyacyl CoA dehydrogenase [KO:K07514] [EC:5.3.3.8 1.1.1.35 4.2.1.17]
            1892  ECHS1; enoyl CoA hydratase, short chain, 1, mitochondrial [KO:K07511] [EC:4.2.1.17]
            8310  ACOX3; acyl-CoA oxidase 3, pristanoyl [KO:K00232] [EC:1.3.3.6]
            51  ACOX1; acyl-CoA oxidase 1, palmitoyl [KO:K00232] [EC:1.3.3.6]
            35  ACADS; acyl-CoA dehydrogenase, C-2 to C-3 short chain [KO:K00248] [EC:1.3.8.1]
            34  ACADM; acyl-CoA dehydrogenase, C-4 to C-12 straight chain [KO:K00249] [EC:1.3.8.7]
            33  ACADL; acyl-CoA dehydrogenase, long chain [KO:K00255] [EC:1.3.8.8]
            36  ACADSB; acyl-CoA dehydrogenase, short/branched chain [KO:K09478] [EC:1.3.99.12]
            37  ACADVL; acyl-CoA dehydrogenase, very long chain [KO:K09479] [EC:1.3.8.9]
            2639  GCDH; glutaryl-CoA dehydrogenase [KO:K00252] [EC:1.3.8.6]
            23305  ACSL6; acyl-CoA synthetase long-chain family member 6 [KO:K01897] [EC:6.2.1.3]
            2182  ACSL4; acyl-CoA synthetase long-chain family member 4 [KO:K01897] [EC:6.2.1.3]
            2180  ACSL1; acyl-CoA synthetase long-chain family member 1 [KO:K01897] [EC:6.2.1.3]
            51703  ACSL5; acyl-CoA synthetase long-chain family member 5 [KO:K01897] [EC:6.2.1.3]
            2181  ACSL3; acyl-CoA synthetase long-chain family member 3 [KO:K01897] [EC:6.2.1.3]
            23205  ACSBG1; acyl-CoA synthetase bubblegum family member 1 [KO:K15013] [EC:6.2.1.3]
            81616  ACSBG2; acyl-CoA synthetase bubblegum family member 2 [KO:K15013] [EC:6.2.1.3]
            1374  CPT1A; carnitine palmitoyltransferase 1A (liver) [KO:K08765] [EC:2.3.1.21]
            1375  CPT1B; carnitine palmitoyltransferase 1B (muscle) [KO:K08765] [EC:2.3.1.21]
            126129  CPT1C; carnitine palmitoyltransferase 1C [KO:K08765] [EC:2.3.1.21]
            1376  CPT2; carnitine palmitoyltransferase 2 [KO:K08766] [EC:2.3.1.21]
            1632  ECI1; enoyl-CoA delta isomerase 1 [KO:K13238] [EC:5.3.3.8]
            10455  ECI2; enoyl-CoA delta isomerase 2 [KO:K13239] [EC:5.3.3.8]
            1579  CYP4A11; cytochrome P450, family 4, subfamily A, polypeptide 11 [KO:K07425] [EC:1.14.15.3]
            284541  CYP4A22; cytochrome P450, family 4, subfamily A, polypeptide 22 [KO:K07425] [EC:1.14.15.3]
            124  ADH1A; alcohol dehydrogenase 1A (class I), alpha polypeptide [KO:K13951] [EC:1.1.1.1]
            125  ADH1B; alcohol dehydrogenase 1B (class I), beta polypeptide [KO:K13951] [EC:1.1.1.1]
            126  ADH1C; alcohol dehydrogenase 1C (class I), gamma polypeptide [KO:K13951] [EC:1.1.1.1]
            131  ADH7; alcohol dehydrogenase 7 (class IV), mu or sigma polypeptide [KO:K13951] [EC:1.1.1.1]
            127  ADH4; alcohol dehydrogenase 4 (class II), pi polypeptide [KO:K13980] [EC:1.1.1.1]
            128  ADH5; alcohol dehydrogenase 5 (class III), chi polypeptide [KO:K00121] [EC:1.1.1.1 1.1.1.284]
            130  ADH6; alcohol dehydrogenase 6 (class V) [KO:K13952] [EC:1.1.1.1]
            217  ALDH2; aldehyde dehydrogenase 2 family (mitochondrial) [KO:K00128] [EC:1.2.1.3]
            224  ALDH3A2; aldehyde dehydrogenase 3 family, member A2 [KO:K00128] [EC:1.2.1.3]
            219  ALDH1B1; aldehyde dehydrogenase 1 family, member B1 [KO:K00128] [EC:1.2.1.3]
            501  ALDH7A1; aldehyde dehydrogenase 7 family, member A1 [KO:K14085] [EC:1.2.1.3 1.2.1.8 1.2.1.31]
            223  ALDH9A1; aldehyde dehydrogenase 9 family, member A1 [KO:K00149] [EC:1.2.1.3 1.2.1.47]
COMPOUND    C00010  CoA
            C00024  Acetyl-CoA
            C00071  Aldehyde
            C00136  Butanoyl-CoA
            C00154  Palmitoyl-CoA
            C00162  Fatty acid
            C00173  Acyl-[acyl-carrier protein]
            C00226  Primary alcohol
            C00229  Acyl-carrier protein
            C00249  Hexadecanoic acid
            C00332  Acetoacetyl-CoA
            C00340  Reduced rubredoxin
            C00435  Oxidized rubredoxin
            C00489  Glutarate
            C00517  Hexadecanal
            C00527  Glutaryl-CoA
            C00638  Long-chain fatty acid
            C00823  1-Hexadecanol
            C00877  Crotonoyl-CoA
            C01144  (S)-3-Hydroxybutanoyl-CoA
            C01371  Alkane
            C01832  Lauroyl-CoA
            C01944  Octanoyl-CoA
            C02593  Tetradecanoyl-CoA
            C02990  L-Palmitoylcarnitine
            C03221  2-trans-Dodecenoyl-CoA
            C03547  omega-Hydroxy fatty acid
            C03561  (R)-3-Hydroxybutanoyl-CoA
            C05102  alpha-Hydroxy fatty acid
            C05258  (S)-3-Hydroxyhexadecanoyl-CoA
            C05259  3-Oxopalmitoyl-CoA
            C05260  (S)-3-Hydroxytetradecanoyl-CoA
            C05261  3-Oxotetradecanoyl-CoA
            C05262  (S)-3-Hydroxydodecanoyl-CoA
            C05263  3-Oxododecanoyl-CoA
            C05264  (S)-Hydroxydecanoyl-CoA
            C05265  3-Oxodecanoyl-CoA
            C05266  (S)-3-Hydroxyoctanoyl-CoA
            C05267  3-Oxooctanoyl-CoA
            C05268  (S)-Hydroxyhexanoyl-CoA
            C05269  3-Oxohexanoyl-CoA
            C05270  Hexanoyl-CoA
            C05271  trans-Hex-2-enoyl-CoA
            C05272  trans-Hexadec-2-enoyl-CoA
            C05273  trans-Tetradec-2-enoyl-CoA
            C05274  Decanoyl-CoA
            C05275  trans-Dec-2-enoyl-CoA
            C05276  trans-Oct-2-enoyl-CoA
            C05279  trans,cis-Lauro-2,6-dienoyl-CoA
            C05280  cis,cis-3,6-Dodecadienoyl-CoA
REFERENCE   PMID:869535
  AUTHORS   Parekh VR, Traxler RW, Sobek JM
  TITLE     N-Alkane oxidation enzymes of a pseudomonad.
  JOURNAL   Appl Environ Microbiol 33:881-4 (1977)
KO_PATHWAY  ko00071
///
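In the record above, each field keyword starts in the first column and continuation lines start with spaces. Under that assumption, one field can be pulled out by walking the lines and tracking the current keyword. This sketch extracts gene IDs and symbols from a shortened copy of the record; it is an illustration, not a complete KEGG flat-file parser.

```matlab
% Shortened copy of the record above, rebuilt locally line by line.
record = sprintf('%s\n', ...
    'ENTRY       hsa00071                    Pathway', ...
    'GENE        39  ACAT2; acetyl-CoA acetyltransferase 2 [KO:K00626] [EC:2.3.1.9]', ...
    '            38  ACAT1; acetyl-CoA acetyltransferase 1 [KO:K00626] [EC:2.3.1.9]', ...
    'COMPOUND    C00010  CoA', ...
    '///');

recLines = regexp(record, '[^\n]+', 'match');
genes  = cell(0, 2);     % columns: gene ID, gene symbol
inGene = false;
for k = 1:numel(recLines)
    ln = recLines{k};
    if ~isspace(ln(1))                        % a keyword starts in column 1
        inGene = strncmp(ln, 'GENE', 4);
    end
    if inGene
        body = regexprep(ln, '^\S*\s+', '');  % drop keyword and leading spaces
        tok  = regexp(body, '^(\d+)\s+(\w+);', 'tokens', 'once');
        if ~isempty(tok)
            genes(end+1, :) = tok;            %#ok<SAGROW>
        end
    end
end
disp(genes)
```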


Retrieve the KO_PATHWAY id.

ko_id = regexpi(fadp_record,'(?<=KO\w+PATHWAY\s+)\w*','match')
ko_id = 

    'ko00071'

Retrieve all the other alias pathway entries for ko_id.

operation = 'link/';
database = 'pathway/';
allPathwayIDs = urlread(strcat(base,operation,database,ko_id{1}))
allPathwayIDs =

path:ko00071	path:ec00071
path:ko00071	path:ko00071
path:ko00071	path:map00071
path:ko00071	path:rn00071


Retrieve the map_id for later use.

map_id = regexpi(allPathwayIDs,'(?<=\w+\W+\w+\s+path:)(?=map)\w*', 'match')
map_id = 

    'map00071'
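The link output is a two-column, tab-separated table, so the map ID can also be selected by first collecting the second column and then filtering on its prefix. This sketch rebuilds the four lines shown above locally; it is one possible alternative to the single lookaround expression, not a replacement for it.

```matlab
% The four link lines from the output above, rebuilt with explicit tabs.
allPathwayIDs = sprintf('path:ko00071\tpath:%s\n', ...
    'ec00071', 'ko00071', 'map00071', 'rn00071');

% Collect the identifier that follows "path:" in the second column.
targets = regexp(allPathwayIDs, '(?<=\tpath:)\w+', 'match');

% Keep only the entries whose ID starts with "map".
mapId = targets(strncmp(targets, 'map', 3));
disp(mapId)
```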

Retrieve the list of genes involved in the pathway.

operation = 'link/';
database = 'genes/';
fadp_genes = urlread(char(strcat(base,operation,database,fadp_id{1})));
fadp_genes = regexpi(fadp_genes, '[^\n]+','match'); % convert to cellstr
num_genes = numel(fadp_genes)
num_genes =

    44

Retrieve the list of compounds involved in the pathway using the ko_id.

operation = 'link/';
database = 'cpd/';
fadp_cpds = urlread(strcat(base,operation,database,ko_id{1}));
fadp_cpds = regexpi(fadp_cpds,'(?<=\w+\:\w*\s+)cpd:\w*','match');
num_cpds = numel(fadp_cpds)
num_cpds =

    50
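The counts in this section are line or match counts, so a repeated entry would be counted twice. The sketch below shows one way to split a link response into columns, strip the database prefix, and count distinct IDs. The three-line response is an assumed layout (pathway ID, tab, compound ID) typed in locally; the compound IDs are taken from the record shown earlier.

```matlab
% Assumed layout of a link response; constructed locally, not downloaded.
sample = sprintf('path:ko00071\tcpd:%s\n', 'C00010', 'C00024', 'C00071');

% Each match yields a 1-by-2 cell: {source ID, target ID}.
cols    = regexp(sample, '(\S+)\t(\S+)', 'tokens');
cpdIds  = cellfun(@(c) c{2}, cols, 'UniformOutput', false);

bareIds = regexprep(cpdIds, '^cpd:', '');   % remove the database prefix
numCpds = numel(unique(bareIds));           % count distinct compounds
disp(numCpds)
```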

Get the list of enzymes using the ko_id of the pathway.

operation = 'link/';
database = 'enzyme/';
fadp_enzymes = urlread(strcat(base,operation,database,ko_id{1}));
fadp_enzymes = regexpi(fadp_enzymes, '[^\n]+','match');
num_enzymes = numel(fadp_enzymes)
num_enzymes =

    31

Get the list of reactions using the map_id of the pathway.

operation = 'link/';
database = 'rn/';
fadp_reactions = urlread(strcat(base,operation,database,map_id{1}));
fadp_reactions = regexpi(fadp_reactions,'[^\n]+','match');
num_reactions = numel(fadp_reactions)
num_reactions =

    47

Color Pathways

In KEGG pathway maps, a gene or enzyme is represented by a rectangle, and a compound is shown as a circle. In this example, the fatty acid degradation pathway map returned by KEGG already shows the enzymes related to Homo sapiens in green.

base_pathway_map = 'http://www.kegg.jp/pathway/';
web(char(strcat(base_pathway_map,fadp_id{1})),'-browser')

You can color more components in the pathway. These additional components are highlighted in red by default. Suppose you want to color the first five compounds from the compound list.

additional_components = [fadp_cpds(1:5)'];
final_url = char(strcat(base_pathway_map,fadp_id{1}));
for i = 1:size(additional_components,1)
    final_url = strcat(final_url,'+',additional_components{i});
end
web(final_url,'-browser')
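The loop above appends each component to the pathway ID with a plus sign. The same URL can be assembled in one step with strjoin, as in this sketch. The pathway ID and compound IDs are written out literally here so that the snippet runs on its own, and the browser call is omitted.

```matlab
basePathwayMap = 'http://www.kegg.jp/pathway/';
pathwayId  = 'hsa00071';
components = {'cpd:C00010', 'cpd:C00024', 'cpd:C00071'};   % illustrative subset

% Join the pathway ID and all components with '+'.
finalUrl = [basePathwayMap strjoin([{pathwayId} components], '+')];
disp(finalUrl)
```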

Add custom colors to selected components of the pathway

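To specify a color by hexadecimal code, write %23 (the URL-encoded form of #) in place of #. In the code below, %09 is the URL-encoded tab character that separates each component from its background and foreground colors.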

base_custom_color_map = 'http://www.kegg.jp/kegg-bin/show_pathway?';
final_url_custom_color_map = char(strcat(base_custom_color_map,fadp_id{1},'/'));
fgcolor = {'red','blue','green','magenta','yellow'}';
bgcolor = {'blue','magenta','cyan','red','blue'}';
for i = 1:size(additional_components,1)
    final_url_custom_color_map = strcat(final_url_custom_color_map,...
                                        additional_components{i},'%09',...
                                        bgcolor{i},',',fgcolor{i},'/');
end
web(final_url_custom_color_map,'-browser')
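The example above uses color names. The following sketch shows how hexadecimal background colors could be encoded for the same URL pattern by replacing # with %23. The specific colors and component IDs are arbitrary choices for illustration, and whether the server renders a given color specification is not checked here because no request is sent.

```matlab
baseColorMap = 'http://www.kegg.jp/kegg-bin/show_pathway?';
pathwayId  = 'hsa00071';
components = {'cpd:C00010', 'cpd:C00024'};
bgHex      = {'#ff0000', '#0000ff'};   % background colors as hex codes
fgName     = {'yellow', 'white'};      % foreground colors as names

url = [baseColorMap pathwayId '/'];
for k = 1:numel(components)
    bg  = strrep(bgHex{k}, '#', '%23');              % URL-encode the '#'
    url = [url components{k} '%09' bg ',' fgName{k} '/']; %#ok<AGROW>
end
disp(url)
```

Plain concatenation is used instead of sprintf so that the percent signs in %09 and %23 are not interpreted as format specifiers.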

Apply one custom color to a selected list

You can apply a single custom color to all selected components. Use the map_id instead of the hsa ID so that KEGG does not automatically highlight the gene products related to Homo sapiens.

dcolor = 'cyan';
final_url_one_color = strcat(base_custom_color_map,map_id{1},'/default%3d',dcolor,'/');
for i = 1:size(additional_components,1)
    final_url_one_color = strcat(final_url_one_color,additional_components{i},'/');
end
web(final_url_one_color,'-browser')

Display a static map in a figure

You can display the static pathway map in a figure. You need to set the color map of the figure to cmap. Note: If you have Image Processing Toolbox™, just use imshow(x,cmap) to display the pathway map.

operation = 'get/';
static_url = char(strcat(base,operation,fadp_id{1},'/image'));
[x,cmap] = imread(static_url);
hfig = figure('Colormap', cmap);
hax = axes('Parent', hfig);
himg = image(x, 'Parent', hax);
set(hax, 'Visible', 'off');
scaleimagefigure(hfig, hax, himg);
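An alternative to setting the figure colormap is to convert the indexed image to truecolor with ind2rgb, which is part of base MATLAB. The sketch below uses a tiny synthetic indexed image and colormap as stand-ins for the x and cmap returned by imread, so it runs without a network connection.

```matlab
% Synthetic stand-ins for the indexed image and colormap returned by imread.
x    = uint8([0 1; 2 1]);          % uint8 indices are zero-based
cmap = [1 0 0; 0 1 0; 0 0 1];      % red, green, blue

rgb = ind2rgb(x, cmap);            % 2-by-2-by-3 truecolor array

% To display it, pass the truecolor array to image, for example:
%   image(rgb); axis image off
disp(size(rgb))
```

A truecolor image does not depend on the figure colormap, so it displays the same way regardless of other axes in the figure.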

References

[1] Kanehisa, M., Goto, S., Sato, Y., Furumichi, M., and Tanabe, M. "KEGG for integration and interpretation of large-scale molecular datasets", Nucleic Acids Research, 40, D109-D114, 2012.

[2] Kanehisa, M. and Goto, S. "KEGG: Kyoto Encyclopedia of Genes and Genomes", Nucleic Acids Research, 28, 27-30, 2000.
