# Process Calculi and Biology of Molecular Networks

Action de Recherche Coopérative INRIA

http://contraintes.inria.fr/cpbio

Years 2004, 2003, 2002

# Position

In recent years, biology has clearly begun to elucidate high-level biological processes in terms of their biochemical bases at the molecular scale. It is probably unnecessary to describe here the concrete applications and prospects of this research (a faster drug development cycle, new diagnostic methods, gene therapies, etc.). At the end of the 1990s, the research front in bioinformatics shifted from the analysis of the genomic sequence to the analysis of the varied data mass-produced by so-called post-genomic technologies (RNA and protein expression, SNPs and haplotypes, protein-protein interactions, 3D structures, etc.). This effort of disassembly, through the identification and measurement of certain characteristics of the elementary components (genes and proteins), is beginning to serve as a basis for the opposite systematic effort: the reconstruction of the biological mechanisms in which these components play a role.

The complexity of the systems concerned makes everyone agree on the need for a large parallel effort on the symbolic notation of biological processes and data. This is particularly true of network biology (metabolic networks, extra- and intracellular networks, genetic regulatory networks), which is the focus of this project. To give an idea, it is estimated that 2,500 of the 10,000 kinds of proteins present in a cell are involved in information transfer tasks. The scientific community is still far from having all the keys to this undertaking, and the language in which the plans of this cellular machinery could be drawn up remains to be defined.

Many works attempt to model and analyze biological processes, and they are structured by various modeling formalisms. Among this diversity of approaches, our project focuses on process algebras. In 1998, R. Hofestädt (Bonn) and S. Thelen (Magdeburg) used modified Petri nets to represent metabolic networks. In 1999, A. Regev and E. Shapiro (Weizmann Institute) outlined a surprising formalization of a cellular signaling pathway (the RTK/MAPK cascade) in R. Milner's Pi-calculus, and showed how to describe the molecular "lego" that implements these communication tasks in a way that remains relatively readable for the biologist. This suggests that a suitably adapted Pi-calculus could be an excellent tool for describing mesoscopic dynamics in biology.

Among the requirements for a biological modeling language, an important point, and probably the one least appreciated by non-computer scientists, is that the language should allow a compositional, or modular, approach: as descriptions accumulate, and they accumulate very quickly, the model must be able to integrate the new data. One consequence is that the model must remain open and must probably be able to move up and down between levels of description, down to a rather fine one (for example, the molecular level). Process calculi in the broad sense are particularly well suited to this task. Moreover, the work of G. Berry and G. Boudol on the Chemical Abstract Machine, which is used nowadays as an intermediate language for process calculi, already invoked the chemical metaphor explicitly.

More recently, hybrid Petri nets (work of Matsuno et al.) and hybrid systems (works of Alur et al., and of Ghosh and Tomlin) have appeared in biology. A. Bockmayr and A. Courtois rebuilt models using hybrid concurrent constraint languages, which make it possible to combine discrete interactions with global continuous-time dynamics governed by systems of differential equations. The impression that emerges from these models is that continuous-time constraint languages provide an interesting framework and reliable algorithms for representing multiscale dynamical systems.
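To make the idea of combining discrete interactions with continuous dynamics concrete, here is a minimal sketch in Python. It is not written in a hybrid concurrent constraint language and is not taken from any of the models cited above: the two-mode system, the function name `simulate`, the parameter names and all numerical values are assumptions chosen for illustration. A single concentration `x` follows a differential equation whose right-hand side depends on a discrete mode, and the mode switches when `x` crosses a threshold.

```python
def simulate(x0=0.0, t_end=10.0, dt=0.01,
             k_syn=1.0, k_deg=0.5, on_below=0.5, off_above=1.5):
    """Forward-Euler simulation of a hypothetical two-mode hybrid system.

    Continuous part (one ODE per discrete mode):
        gene on :  dx/dt = k_syn - k_deg * x
        gene off:  dx/dt =       - k_deg * x
    Discrete part (mode switches triggered by thresholds on x):
        on  -> off when x >= off_above
        off -> on  when x <= on_below
    Returns a list of (time, x, gene_on) samples.
    """
    x, gene_on = x0, True
    trace = [(0.0, x, gene_on)]
    steps = int(round(t_end / dt))
    for i in range(1, steps + 1):
        # Continuous evolution within the current mode.
        rate = (k_syn if gene_on else 0.0) - k_deg * x
        x += dt * rate
        # Discrete transition when a threshold is crossed.
        if gene_on and x >= off_above:
            gene_on = False
        elif not gene_on and x <= on_below:
            gene_on = True
        trace.append((i * dt, x, gene_on))
    return trace


if __name__ == "__main__":
    trace = simulate()
    switches = sum(1 for a, b in zip(trace, trace[1:]) if a[2] != b[2])
    print("mode switches:", switches)
```

A fixed-step Euler scheme is used only to keep the sketch short; it detects threshold crossings with a delay of up to one step, which dedicated hybrid solvers handle more carefully.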

# Objective

The objective of the ARC CPBIO is to push forward a declarative and compositional approach to a "language of life".

By working with the biologists of the ARC on well-understood biological models, we seek:

• to identify, in the family of concurrent models coming from Functional Programming (Pi-calculus, Join-calculus and their derivatives) and from Logic Programming (Constraint Logic Programming, Concurrent Constraint languages and their extensions to discrete and continuous time, TCC, HCC), the ingredients of a language for the modular and multi-scale representation of biological processes;
• to provide, in close collaboration with biologists, a series of examples of biomolecular processes transcribed in formal languages, and a set of biological questions of interest about these models;
• to design and apply to these examples formal computational reasoning tools for the simulation, the analysis and the querying of the models.

# Results

The results achieved so far concern:
• on the one hand, our work on the mammalian cell cycle control after Kohn's diagram, from the design of a core modeling language (M. Chiaverini, V. Danos and C. Laneve), to the use of symbolic model checking methods for querying the temporal properties of the model (N. Chabrier and F. Fages),
• on the other hand, the multiscale modeling of alternative splicing regulation (D. Eveillard, D. Ropers, H. de Jong, C. Branlant and A. Bockmayr).
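As a rough illustration of what querying the temporal properties of a model can mean, the Python sketch below checks a reachability query, written EF(goal) in the temporal logic CTL, over a tiny set of reaction rules. Everything in it is an assumption made for the example: the molecule names and rules are hypothetical and not taken from Kohn's diagram, the rule encoding is not the syntax of any of the project's tools, and the search is a plain explicit-state breadth-first traversal rather than the symbolic model checking used in the work cited above. The semantics is also simplified: a rule fires when all its reactants are present and consumes them completely.

```python
from collections import deque

# Hypothetical toy rules: (reactants, products), as sets of molecule names.
RULES = [
    # complexation: cyclin + cdk => cyclin-cdk
    (frozenset({"cyclin", "cdk"}), frozenset({"cyclin-cdk"})),
    # phosphorylation catalyzed by a kinase, which is left unchanged
    (frozenset({"cyclin-cdk", "kinase"}), frozenset({"cyclin-cdk~P", "kinase"})),
]


def successors(state, rules):
    """Yield the states reachable from `state` by firing one rule."""
    for reactants, products in rules:
        if reactants <= state:  # all reactants are present
            yield (state - reactants) | products


def exists_finally(initial, rules, goal):
    """Check EF(goal): is some state satisfying `goal` reachable from `initial`?"""
    seen = {initial}
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        if goal(state):
            return True
        for nxt in successors(state, rules):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


if __name__ == "__main__":
    init = frozenset({"cyclin", "cdk", "kinase"})
    # Can the phosphorylated complex be produced from the initial state?
    print(exists_finally(init, RULES, lambda s: "cyclin-cdk~P" in s))
```

Explicit enumeration is only practical for very small state spaces; symbolic methods represent sets of states compactly instead of visiting them one by one.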

## Publications:

Modelling and querying interaction networks in the biochemical abstract machine BIOCHAM, by François Fages, Sylvain Soliman and Nathalie Chabrier-Rivier. Journal of Biological Physics and Chemistry 4(2), pp. 64-73. October 2004. Preprint available as PDF.

The Biochemical Abstract Machine BIOCHAM, by Nathalie Chabrier, François Fages and Sylvain Soliman.
Computational Methods in Systems Biology, CMSB'04, Paris, April 2004. To appear in Lecture Notes in Bioinformatics, Springer-Verlag.

Modeling and querying biochemical interaction networks, by Nathalie Chabrier, Marc Chiaverini, Vincent Danos, François Fages and Vincent Schächter. Theoretical Computer Science 325(1), pp. 25-44. September 2004.

Multiscale modeling of alternative splicing regulation, by Damien Eveillard, Delphine Ropers, Hidde de Jong, Christiane Branlant and Alexander Bockmayr.
Computational Methods in Systems Biology, CMSB'03, Rovereto, Italy, February 2003. Springer LNCS 2602, pp. 75-87. © Springer-Verlag.

Symbolic model checking of biochemical networks, by Nathalie Chabrier and François Fages.
Computational Methods in Systems Biology, CMSB'03, Rovereto, Italy, February 2003. Springer LNCS 2602, pp. 149-162. © Springer-Verlag.

A Core Modeling Language for the Working Molecular Biologist, by Marc Chiaverini and Vincent Danos. November 2002.

Using hybrid concurrent constraint programming to model dynamic biological systems, by Alexander Bockmayr and Arnaud Courtois.
18th International Conference on Logic Programming, ICLP'02, Copenhagen, July 2002. Springer LNCS 2401, pp. 85-99. © Springer-Verlag.

## Software:

BIOCHAM: a programming environment for modeling biochemical systems, running simulations and querying the model in the temporal logic CTL.

CMBSlib: a library of computational models of biological systems.

## Events:

Third International Workshop on Computational Methods in Systems Biology CMSB'05, co-located with ETAPS'05, Edinburgh, Scotland, April 2005.

Second International Workshop on Computational Methods in Systems Biology CMSB'04, Paris, France, March 2004.

First International Workshop on Computational Methods in Systems Biology CMSB'03, Rovereto, Italy, February 2003.

Formal Methods and Biological Reasoning workshop of the 3rd International Conference on Systems Biology ICSB'02, Stockholm, December 2002.

## Teaching:

Formal bioinformatics course of the Master Parisien de Recherche en Informatique.