INRIA data sets for APrIL2

Some Data Sets for Model Revision in Systems Biology
Contraintes project-team
INRIA Rocquencourt, France

created : 10/06, updated with Prolog and SLP files : 12/06 and 01/07
contact : François Fages

This page describes some data sets used by INRIA for learning model revisions from temporal logic properties in biochemical networks, in the framework of the 6th PCRD STREP project APrIL2 on "Applications of Probabilistic Inductive Logic programming".

The syntax used for these data sets is the one of the BIOCHAM modeling environment.

The method we use is decribed in
Calzone L., Chabrier-Rivier N., Fages F. and Soliman S. Machine Learning Biochemical Networks from Temporal Logic Properties. Transactions on Computational Systems Biology. LNBI 4220, Springer-Verlag. October 2006. pdf.

A Prolog file (generated in BIOCHAM by export_prolog) containing both the model and its specification is provided for each example (except the last for our current limitation to 256 state variables in Prolog). These Prolog files include the file ctl.pl which can be used to check the temporal properties, and experiment with inductive logic programming techniques. A Stochastic Logic Program file (generated in BIOCHAM by export_slp) is also provided.

Small cell cycle kinetic model after [Tyson 91]

Model of biochemical reactions : tyson91.bc (tyson91.pl, tyson91.slp)

biocham: list_rules.
k1 for _=>Cyclin.
k2*[Cyclin] for Cyclin=>_.
k3*[Cyclin]*[Cdc2~{p1}] for Cyclin+Cdc2~{p1}=>Cdc2-Cyclin~{p1,p2}.
k4p*[Cdc2-Cyclin~{p1,p2}] for Cdc2-Cyclin~{p1,p2}=>Cdc2-Cyclin~{p1}.
k4*[Cdc2-Cyclin~{p1}]^2*[Cdc2-Cyclin~{p1,p2}] for Cdc2-Cyclin~{p1,p2}=[Cdc2-Cyclin~{p1}]=>Cdc2-Cyclin~{p1}.
k5*[Cdc2-Cyclin~{p1}] for Cdc2-Cyclin~{p1}=>Cdc2-Cyclin~{p1,p2}.
k6*[Cdc2-Cyclin~{p1}] for Cdc2-Cyclin~{p1}=>Cyclin~{p1}+Cdc2.
k7*[Cyclin~{p1}] for Cyclin~{p1}=>_.
k8*[Cdc2] for Cdc2=>Cdc2~{p1}.
k9*[Cdc2~{p1}] for Cdc2~{p1}=>Cdc2.

Elementary CTL properties : tyson91.CTL

biocham: add_genCTL.
Ei(EF(Cyclin)).
Ei(EF(!(Cyclin))).
Ai(AG(Cyclin->EF(!(Cyclin))&(!(Cyclin)->EF(Cyclin)))).
Ei(EF(Cdc2~{p1})).
Ei(EF(!(Cdc2~{p1}))).
Ai(AG(Cdc2~{p1}->EF(!(Cdc2~{p1}))&(!(Cdc2~{p1})->EF(Cdc2~{p1})))).
Ai(AG(!(Cdc2~{p1})->!(E(!(Cdc2) U Cdc2~{p1})))).
Ei(EF(Cdc2-Cyclin~{p1,p2})).
Ei(EF(!(Cdc2-Cyclin~{p1,p2}))).
Ai(AG(Cdc2-Cyclin~{p1,p2}->EF(!(Cdc2-Cyclin~{p1,p2}))&(!(Cdc2-Cyclin~{p1,p2})->EF(Cdc2-Cyclin~{p1,p2})))).
Ei(EF(Cdc2-Cyclin~{p1})).
Ei(EF(!(Cdc2-Cyclin~{p1}))).
Ai(AG(Cdc2-Cyclin~{p1}->EF(!(Cdc2-Cyclin~{p1}))&(!(Cdc2-Cyclin~{p1})->EF(Cdc2-Cyclin~{p1})))).
Ai(AG(!(Cdc2-Cyclin~{p1})->!(E(!(Cdc2-Cyclin~{p1,p2}) U Cdc2-Cyclin~{p1})))).
Ei(EF(Cdc2)).
Ei(EF(!(Cdc2))).
Ai(AG(Cdc2->EF(!(Cdc2))&(!(Cdc2)->EF(Cdc2)))).
Ei(EF(Cyclin~{p1})).
Ei(EF(!(Cyclin~{p1}))).
Ai(AG(Cyclin~{p1}->EF(!(Cyclin~{p1}))&(!(Cyclin~{p1})->EF(Cyclin~{p1})))).
Ai(AG(!(Cyclin~{p1})->!(E(!(Cdc2-Cyclin~{p1}) U Cyclin~{p1})))).

Model compression :

biocham: reduce_model.
1: deleting Cyclin=>_
2: deleting Cdc2-Cyclin~{p1,p2}=[Cdc2-Cyclin~{p1}]=>Cdc2-Cyclin~{p1}
3: deleting Cdc2-Cyclin~{p1}=>Cdc2-Cyclin~{p1,p2}
4: deleting Cdc2~{p1}=>Cdc2
After reduction, 6 rules remain corresponding to the bias ? => ?
Deletion(s):
Cyclin=>_.
Cdc2-Cyclin~{p1,p2}=[Cdc2-Cyclin~{p1}]=>Cdc2-Cyclin~{p1}.
Cdc2-Cyclin~{p1}=>Cdc2-Cyclin~{p1,p2}.
Cdc2~{p1}=>Cdc2.

biocham: list_rules.
k1 for _=>Cyclin.
k3*[Cyclin]*[Cdc2~{p1}] for Cyclin+Cdc2~{p1}=>Cdc2-Cyclin~{p1,p2}.
k4p*[Cdc2-Cyclin~{p1,p2}] for Cdc2-Cyclin~{p1,p2}=>Cdc2-Cyclin~{p1}.
k6*[Cdc2-Cyclin~{p1}] for Cdc2-Cyclin~{p1}=>Cyclin~{p1}+Cdc2.
k7*[Cyclin~{p1}] for Cyclin~{p1}=>_.
k8*[Cdc2] for Cdc2=>Cdc2~{p1}.

Learning rules :

One rule is deleted from the original model and recovered by model revision from the temporal properties

biocham: delete_rules(Cdc2=>Cdc2~{p1}).
Cdc2=>Cdc2~{p1}
biocham: check_all.
The specification is not satisfied.
This formula is the first not verified: Ei(EF(Cdc2~{p1}))
Loading and Querying time: 0.03 s
biocham: revise_model.
1: adding Cdc2-Cdc2~{p1}=>Cdc2+Cdc2~{p1}
2: adding Cdc2=>_
2: backtracking on previous add -> deleting Cdc2=>_
2: adding Cdc2=[Cyclin]=>_
2: backtracking on previous add -> deleting Cdc2=[Cyclin]=>_
2: adding Cdc2=[Cdc2-Cdc2~{p1}]=>_
3: adding Cdc2=>Cdc2~{p1}
4: deleting Cdc2=[Cdc2-Cdc2~{p1}]=>_
5: deleting Cdc2-Cdc2~{p1}=>Cdc2+Cdc2~{p1}
Success
Time: 45.00 s
21 properties treated
Modifications found:
Deletion(s):
Addition(s):
Cdc2=>Cdc2~{p1}.
After reduction, 1 rules remain corresponding to the bias {Cdc2=>Cdc2~{p1}}
Deletion(s):

Considering all rules of some elementary pattern provides two solutions in this problem

biocham: learn_one_addition(elementary_interaction_rules).
_=> $A where $A in all_simple and $A diff _
_=[$G]=> $A where $A in all_simple and $G in all and $G diff $A and $A diff _ and $G diff _
$A=>_ where $A in all and $A diff _
$A=[$D]=>_ where $A in all and $D in all and $D diff $A and $A diff _ and $D diff _
$A- $B=> $A+ $B where $A in all and $B in all and $B diff $A and $B diff _ and $A diff _
$A+ $B=> $A- $B where $A in all and $B in all and $B diff $A and $B diff _ and $A diff _
$A=> $B where $B in all and $A in all and $A more_phos_than $B and $A diff $B
$A=[$C]=> $B where $C in all and $B in all and $A in all and $A more_phos_than $B and $A diff $B and $C diff _
$A=> $B where $A in all and $B in all and $B more_phos_than $A and $B diff $A
$A=[$C]=> $B where $A in all and $C in all and $B in all and $B more_phos_than $A and $B diff $A and $C diff _
Time: 5.00 s
Rules tested: 112
Good rules to be added: 2
Cdc2=>Cdc2~{p1}
Cdc2=[Cyclin]=>Cdc2~{p1}

Learning parameter values :

The kinetic parameter values, like k8 for the recovered rule, can be learned from quantitative temporal properties

biocham: learn_parameters([k8],[(0,200)],20,oscil(Cdc2-Cyclin~{p1},3),150).
Estimated maximum search time: 1.00 s
First values found that make oscil(Cdc2-Cyclin~{p1},3) true:
parameter(k8,10).
Search time: 0.34 s

biocham: learn_parameters([k3, k4], [(0, 200), (0, 200)], 20, oscil(Cdc2-Cyclin~{p1},3),150).
Estimated maximum search time: 30.00 s
First values found that make oscil(Cdc2-Cyclin~{p1},3) true:
parameter(k3,10).
parameter(k4,70).
Search time: 1.75 s

biocham: learn_parameters([k3, k4], [(0, 200), (0, 200)], 20, period(Cdc2-Cyclin~{p1},35),150).
Estimated maximum search time: 30.00 s
First values found that make period(Cdc2-Cyclin~{p1},35) true:
parameter(k3,10).
parameter(k4,190).
Search time: 4.84 s

MAPK model

Biochemical reactions : mapk.bc

CTL specification : mapk.CTL (mapk.pl, mapk.slp)

Model compression :

biocham: reduce_model.
1: deleting RAF-RAFK=>RAF+RAFK
2: deleting RAFPH-RAF~{p1}=>RAFPH+RAF~{p1}
3: deleting MEK-RAF~{p1}=>MEK+RAF~{p1}
4: deleting MEKPH-MEK~{p1}=>MEKPH+MEK~{p1}
5: deleting MAPK-MEK~{p1,p2}=>MAPK+MEK~{p1,p2}
6: deleting MAPKPH-MAPK~{p1}=>MAPKPH+MAPK~{p1}
7: deleting MEK~{p1}-RAF~{p1}=>MEK~{p1}+RAF~{p1}
8: deleting MEKPH-MEK~{p1,p2}=>MEKPH+MEK~{p1,p2}
9: deleting MAPK~{p1}-MEK~{p1,p2}=>MAPK~{p1}+MEK~{p1,p2}
10: deleting MAPKPH-MAPK~{p1,p2}=>MAPKPH+MAPK~{p1,p2}
After reduction, 20 rules remain corresponding to the bias ? => ?
Deletion(s):
RAF-RAFK=>RAF+RAFK.
RAFPH-RAF~{p1}=>RAFPH+RAF~{p1}.
MEK-RAF~{p1}=>MEK+RAF~{p1}.
MEKPH-MEK~{p1}=>MEKPH+MEK~{p1}.
MAPK-MEK~{p1,p2}=>MAPK+MEK~{p1,p2}.
MAPKPH-MAPK~{p1}=>MAPKPH+MAPK~{p1}.
MEK~{p1}-RAF~{p1}=>MEK~{p1}+RAF~{p1}.
MEKPH-MEK~{p1,p2}=>MEKPH+MEK~{p1,p2}.
MAPK~{p1}-MEK~{p1,p2}=>MAPK~{p1}+MEK~{p1,p2}.
MAPKPH-MAPK~{p1,p2}=>MAPKPH+MAPK~{p1,p2}.

biocham: list_rules.
MAPKPH+MAPK~{p1}=>MAPKPH-MAPK~{p1}.
0.1*[RAF-RAFK] for RAF-RAFK=>RAFK+RAF~{p1}.
0.1*[RAF~{p1}-RAFPH] for RAF~{p1}-RAFPH=>RAF+RAFPH.
0.1*[MEK~{p1}-RAF~{p1}] for MEK~{p1}-RAF~{p1}=>MEK~{p1,p2}+RAF~{p1}.
0.1*[MEK-RAF~{p1}] for MEK-RAF~{p1}=>MEK~{p1}+RAF~{p1}.
0.1*[MEK~{p1}-MEKPH] for MEK~{p1}-MEKPH=>MEK+MEKPH.
0.1*[MEK~{p1,p2}-MEKPH] for MEK~{p1,p2}-MEKPH=>MEK~{p1}+MEKPH.
0.1*[MAPK-MEK~{p1,p2}] for MAPK-MEK~{p1,p2}=>MAPK~{p1}+MEK~{p1,p2}.
0.1*[MAPK~{p1}-MEK~{p1,p2}] for MAPK~{p1}-MEK~{p1,p2}=>MAPK~{p1,p2}+MEK~{p1,p2}.
0.1*[MAPK~{p1}-MAPKPH] for MAPK~{p1}-MAPKPH=>MAPK+MAPKPH.
0.1*[MAPK~{p1,p2}-MAPKPH] for MAPK~{p1,p2}-MAPKPH=>MAPK~{p1}+MAPKPH.
RAF+RAFK=>RAF-RAFK.
RAFPH+RAF~{p1}=>RAFPH-RAF~{p1}.
MEK~{p1}+RAF~{p1}=>MEK~{p1}-RAF~{p1}.
MEK+RAF~{p1}=>MEK-RAF~{p1}.
MEKPH+MEK~{p1,p2}=>MEKPH-MEK~{p1,p2}.
MEKPH+MEK~{p1}=>MEKPH-MEK~{p1}.
MAPK~{p1}+MEK~{p1,p2}=>MAPK~{p1}-MEK~{p1,p2}.
MAPK+MEK~{p1,p2}=>MAPK-MEK~{p1,p2}.
MAPKPH+MAPK~{p1,p2}=>MAPKPH-MAPK~{p1,p2}.

Cell cycle kinetic model after [Qu et al. 03]

Biochemical reactions : qu.bc

CTL specification : qu.CTL (qu.pl, qu.slp)

Mammalian cell cycle model after [Kohn 99]

This large example of 776 rules over 165 proteins and genes (500 variables) has been transcribed from Kohn's map [Kohn 99]. This model contains many errors as can be checked in CTL (e.g. cyclin B does not oscillate as there is no synthesis rule for this cyclin).

Biochemical reactions (776 rules) : cell_cycle.bc

CTL specification: cell_cycle.CTL