Eclipse Formal Modeling Project

Monday, April 6, 2015 - 05:19 by Boutheina Bannour

Basics

This proposal is in the Project Proposal Phase (as defined in the Eclipse Development Process) and is written to declare its intent and scope. We solicit additional participation and input from the community. Please login and add your feedback in the comments section.

Project

Eclipse Formal Modeling Project™

Parent Project

Eclipse Modeling Project

Proposal State

Created

Background

The Eclispe Formal Modeling Project (E-FMP for short) is based on research work that has been started at the CEA LIST in the nineties. That research focused on methods and tools to validate and test concurrent systems. The first resulting implementation was the AGATHA toolset [1]. The main functionality of AGATHA was to analyze automata-based models specified by means of a proprietary entry language. AGATHA offered several analysis modes including automated test generation and verification of predicates.
AGATHA toolset has been used in the scope of industrial languages such as ESTELLE, SDL or STATEMATE by translation targeting the entry language of AGATHA. The complexity of the resulting translation showed the expressiveness limitations of the entry language.
In the last decade, the entry language of AGATHA has considerably evolved in order to easily cover a wide range of communication and execution semantics. The evolutions also implied deep modifications in the symbolic execution engine, which led to a complete rewriting of the source code. This re-engineering work resulted in the creation of the DIVERSITY platform. Lately, an ongoing work has been initiated in order to integrate DIVERSITY in the SDL platform RTDS of the SME PragmaDev [2]. Currently, several projects at the CEA LIST are being undertaken on the subject of analyzing UML/SysML models using DIVERSITY platform. For this, model transformation gateways are being developed (e.g. UML Interaction models as sequence diagrams [13]), together with gateways to export analysis results in standardized/specific formats (e.g. TTCN-3 test cases [2]).
In this context, the E-FMP project aims at gathering together and maturing formal analysis tools to be used in the scope of the Eclipse modeling technologies. The first tool to be contributed to the E-FMP project is a symbolic execution platform which is the open source version of DIVERSITY. During the early phases, the E-FMP project will unify and enable the development of import/export gateways around this symbolic execution platform.

What is DIVERSITY ?

DIVERSITY is a multi-purpose and customizable platform for formal analysis based on symbolic execution. Symbolic execution was first defined for programs [3]. The underlying concept consists in executing programs, not for concrete numerical values but for symbolic parameters, and computing logical constraints on those parameters at each step of the execution. Symbolic execution techniques allow computing semantics of programs or models and representing them efficiently in an abstract manner. DIVERSITY has been designed for the purpose of managing the diversity of different semantics, but also the diversity of possible analyses based on symbolic execution (test generation, proof, deadlock search, etc.). The design of DIVERSITY has been guided by the following concerns:

[The ability to analyze a wide range of (modeling) languages] This has led to the definition of a pivot language called "xLIA” which stands for eXecutable Language for Interaction & Assemblage. xLIA is a rich language with a variety of primitives which allow encoding all classical semantics.
[The ability to express a wide range of coverage objectives] A large class of coverage objectives are already implemented, such as structural coverage, Condition/Decision coverage MC/DC, inclusion criterion (detects already encountered behaviors), etc. Moreover, DIVERSITY provides facilities to define new ones.
[The customizability of the exploration strategies] DIVERSITY implements specific strategies for exploring the models in addition to the usual ones: BFS (Breadth First Search), DFS (Depth First Search), random traversal, weighted traversal, etc. Besides, DIVERSITY provides heuristics to alleviate the combinatory explosion problem when targeting a coverage objective. DIVERSITY has been designed in a way which facilitates the definition of new and more advanced exploration strategies.

Scope

The general purpose of E-FMP project is to develop a series of formal analysis tools and enable a broader use of them thanks to their integration with the Eclipse modeling technologies. The first tool to be provided by the E-FMP project is a symbolic execution platform. In the following, we define precisely the scope this platform.

The first objective of the symbolic execution platform of the E-FMP project is to provide researchers with a common platform :

generic enough to take into account all the objects they want to analyze;
extensible in order to allow them to customize the basic symbolic treatments to implement their specific formal functionalities.

Offering such functionalities will enable the share and the coordination of different researches that will be implemented in a common symbolic platform. This will permit using conjointly different formal techniques in a common framework. Moreover, from the contributors’ perspective, they do not have to re-implementing basic symbolic execution functionalities. Rather, they are expected to concentrate on the formal usage that they want to make of them.

The second objective of the symbolic execution platform of the E-FMP project is to provide users with user-friendly formal functionalities:

This will be made possible thanks to a catalogue of import and export gateways that will be developed all along the project life. At the beginning of the project, the import gateway from UML/SysML will be released and enriched. The export gateway to TTCN-3 will be also released.

Description

The symbolic execution platform of the E-FMP project will be organized around a symbolic execution infrastructure as illustrated in Figure 1.

Figure 1. The symbolic execution platform of the E-FMP project

a. All the steps of the symbolic execution processing in this infrastructure can be customized by adding modules specific to a kind of formal analysis. Many such modules are already available in DIVERSITY: modules to highlight live-locks, deadlocks, bad synchronizations between sub-systems; modules to check predicates; and modules to generate test cases and compute test verdicts [4] [5] [6].

b. In order to make use of the (existing) formal analysis modules, input models are meant to be expressed with the xLIA pivot language. As glimpsed at the beginning of the section, xLIA has proven its flexibility to capture various semantics. Some of the targeted modelling formalisms are state-based notations, communicating process languages, scenario-oriented notations, etc. In order to enable transparent usages of the formal analysis modules for non-experts of formal methods, users may access to a catalogue of import gateways. Those gateways permit to edit models in user-friendly environments. Advanced users may develop their own import gateways. For instance, a gateway from a subset of UML/SysML using model transformation techniques has been developped by the CEA as a plug-in for Papyrus [7] which is an Eclipse integrated modeling environment. It allows analyzing sequence diagram models [8] [9].

c. Similarly to import gateways, export gateways can be developed. Those gateways are used to export the results of the analysis towards specific modeling environment. For instance, a gateway has been developed in order to export test cases computed by DIVERSITY to TTCN-3 [2]. Again, advanced users may develop their own export gateways.

The symbolic execution infrastructure (SE for short) is a central component of the symbolic execution platform of the E-FMP project . In paragraph A, we describe its structuration (which reflects the architecture of the DIVERSITY kernel) and we emphasize all its customization possibilities. In paragraph B, we present the formal analysis functionalities that have been implemented in this infrastructure and that will be made available at the initial release. Similarly in paragraph C, we introduce the import/export gateways that will be made available as well.

A. Customizable symbolic execution infrastructure

Symbolic execution consists in computing the semantics of a model/program (in the sequel we simplicity talk about a model) in an abstract manner by constraint propagation on input symbolic values. The result of the symbolic execution is a tree-like structure where a path abstracts a class of concrete execution of the analyzed object. Symbolic trees are potentially infinite structures due to the possibility of repeatedly executing an unbounded loop for example. Classically, SE allows setting stopping criteria related to the tree size. Once such a symbolic tree is computed, SE provides post processing facilities to conduct formal analysis on the symbolic tree. In addition, the SE infrastructure has been designed so as to allow customized formal treatment on the fly during the construction of the symbolic tree. We will discuss in the remaining of the section, the entry points offered by the SE infrastructure to access on the fly the generic symbolic execution processing in order to insert new formal treatments.

Figure 2. The running process of the symbolic execution platform

The symbolic processing consists of five Steps (i), …, (v) depicted in Figure 2. The scheduling of these steps is cyclic. Each cycle consists in updating a queue of so-called Execution Contexts, denoted ECs. Intuitively, an EC fully characterizes a class of executions (all those who follow a common path in the model control graph) and is defined by the following pieces of information:

[Data part] the set of symbolic values associated with variables of the model; and a path condition consisting in a conjunction of constraints on those symbolic values; The path condition fully characterizes the constraints to be satisfied in order to follow the path associated to the EC.
[Control part] the control points of the model control graph reached at the end of the executions characterized by the EC.

At the beginning of the first iteration of the cycle, the queue contains one EC which characterizes the initial symbolic values associated with the variables and the path condition is restricted to True because no constraint has not yet been encountered. Each iteration step consists in: selecting one or more EC(s) (removed from the queue); computing their children ECs by symbolically executing all possible atomic instructions subsequent to the control points reached in the parent ECs; deciding whether or not the parent ECs are added to the tree; in which case, their children ECs are added in the queue. The whole symbolic processing is based on the notation of the so-called filter. The purpose of a filter is to dynamically accept or reject ECs according to a specific user coverage purpose. It can be seen as a selection strategy to complement the traversal strategy in order to increase the chances of reaching the targeted coverage while avoiding combinatorial explosion. The processing is detailed as follows:

Step (i): Selection of EC candidates for Step (ii)

One or more EC are selected from the queue according to a customizable strategy. For the moment, the following exploration strategies for generating the symbolic tree are implemented: Random traversal, classical BFS (Breadth First Search) and DFS (Depth First Search) traversals. Some heuristics may however associate a weight with each of the EC, and thus induce an order thereof in the queue which becomes a priority queue.

Step (ii): Pre-filtering

Pre-filtering consists in applying one or more filters to reason on ECs before computing their children. If the EC successfully passes the scrutiny of each of the chained filters, it continues its way in the symbolic processing flow, through Step (ii-a). Otherwise Step (ii-b), the EC will be ignored or possibly tagged (by an informative note on the reason for its failure) and inserted into the symbolic tree under construction. In the favorable case where all user coverage objectives are met, the symbolic processing stops.

Step (iii): Symbolic execution

Each EC issued from Step (ii-a) is evaluated symbolically. During the evaluation its children EC1, ..., ECn are computed. For example, in the context of classical state-machines, it consists in symbolically executing outgoing transitions from active states indicated by the control part of the EC. For each transition, the newly computed child EC denotes:

New symbolic values associated with state-machine variables after executing the assignment of the transition and applying the communication rules;
New path condition computed by taking into account the guard of the transition;
New active states being targets of the transition effectively evaluated.

More information can be kept in the EC. For example, the evaluated transitions can be saved in each of the children ECs together with the (symbolic) input/output sequences which result from the symbolic evaluation of the communication actions.

Step (iv): Post-filtering

Step (iv) is similar to Step (ii), except that the filter involved in post-filtering reasons on the EC and its children to decide of the future of the symbolic processing. After passing the post-filters, there are two possibilities:

Step (iv-a) If successful, the symbolic processing continue with Phase (v) in which case the the EC is added to the tree.
Step (iv-b) If failed, the EC and its children EC1, ..., ECn are ignored or inserted in the symbolic tree (possibly tagged by an informative note on the reason for their failure);

As in Step (ii), in the favorable case where all user coverage objectives are met, the symbolic processing stops.

Step (v): All the children EC1, ..., ECn resulting from Step (iv-a) are enqueued and the symbolic processing iterates with Step (i).

B. Modules of formal functionalities

Formal functionalities are implemented in the SE infrastructure using the filter mechanism. The formal functionalities available at the beginning of the E-FMP project will be:

Verification of predicates: These functionalities allow passive verification of the usual boolean predicates on variables, extended to communication actions. The verification occurs on the fly on each execution context EC built by the symbolic processing. This functionality acts as a passive observer not influencing the symbolic processing.

Hit-or-Jump exploration strategy: Classical search algorithms like Breadth First Search (BFS) are implemented in the SE infrastructure. However in some cases, using BFS results in exploring a large number of paths in the symbolic tree which are irrelevant to the coverage criteria. The SE infrastructure implements an adaptation of the heuristic traversal called Hit-or-Jump [11]. This heuristics aims at computing a symbolic tree covering a declared sequence or set of transitions. The idea is to define a maximal depth N for which a symbolic tree is computed in BFS manner. Once it is computed, an analysis is realized to study whether or not a part of the tree satisfies the coverage: (Hit) If some non-empty prefixes of the sequence has been covered, the set of paths that have covered the greatest prefix are identified, and then one or several ones among them are chosen at random, else one or several paths are chosen at random; (Jump) Once a path is chosen the whole process starts again from the last symbolic state of the path (i.e. the target state of the last symbolic transition of the path) until the sequence is fully covered.

Eliciting unitary behaviors: In a component oriented approach, components are designed, developed and validated in order to be widely used. However one cannot always foresee which specific uses will be made of components depending on the system they will constitute. The SE infrastructure provides a filter implementing the approach in [5] whose goal is extract accurate component behaviors using information given by the system specification. By projecting symbolic execution of a system on its components, unitary symbolic component behaviors are derived from symbolic system behaviors. In practice, those component behaviors can be seen as typical behaviors of the component in the context of the system.

Off-line Test Case Generation: Model-based conformance testing of reactive systems consists in taking advantage of the model for mechanizing both test data generation and verdicts computation. In off-line approaches, test suites are pre-computed from the model and stored to be later performed on test-beds. The SE infrastructure implements two filters, one for test case generation and one for verdict computation, both of them based on the work presented in [6]. A generated test case is a timed sequence of numerical stimuli data and waiting delays. This sequence is intended to be submitted to the System Under Test (SUT), and then the timed sequence of output data resulting from the corresponding execution of the SUT is post-processed according to the model to deliver a test verdict.

C. Functionalities of import/export

The following import/export functionalities will be available in E-FMP project:

Import of UML scenario models: Imported scenario models are described by means of a subpart of UML sequence diagrams (enriched with timing properties expressed in VSL language from the UML profile MARTE [12]). Sequence diagrams are translated into Timed Input Output Symbolic Transition Systems (TIOSTS) [8] which are symbolic automata that can be encoded straightforwardly in xLIA. The symbolic semantics has been given in practice by a model transformation and has been implemented as a plug-in for Papyrus [7]. A typical usage scenario is to analyze system specification to extract test cases that can be later exploited by the offline testing filters presented in Section 3.2. This usage scenario has been experimented with several cases studies [13] [14].

Export test cases in TTCN-3 format: A path in the symbolic tree represents the intended sequence of input/output communication actions between the SUT and its environment. From such paths, numeric sequences of inputs and outputs are computed using constraint solving techniques on path conditions. The SE infrastructure allows this treatment and provides a filter permitting to export these sequences in the form of TTCN-3 test cases (refer to the work in [2] for a practical usage of this gateway).

Licenses

Eclipse Public License 1.0

Legal Issues

There is no legal issue with the E-FMP project.

The following 3rd Party libraries will be integrated with the initial contribution to the E-FMP project.

CVC4 BSD (http://cvc4.cs.nyu.edu/web/copyright/)

ANTLR BSD (http://www.antlr.org/license.html)

OMEGA https://github.com/davewathaverford/the-omega-project/blob/master/LICENSE

Why Here?

Eclipse Foundation is a perfect place for the E-FMP project for the following reasons:

The CEA LIST’s previous experience with Papyrus as an Eclipse project has accelerated its development and increased its acceptance in academia as well as in industry. This encouraged the open sourcing of the symbolic execution platform DIVERSITY within the Eclipse Foundation.

The E-FMP project being a top-level project will provide formal model-based analysis and testing functionalities that will benefit from all existing Eclipse modeling projects.

The submission of Titan [10] which is a general purpose test execution environment based on the TTCN-3 language and that of the symbolic execution platform of the E-FMP project to the Eclipse Foundation provides within the Eclipse ecosystem a complete tool-chain of MBT connected thanks to the standard TTCN-3.

Future Work

The CEA LIST is developing the Papyrus gateway in order to support formal analysis of UML/SysML models, among others for model-based testing (MBT).

The CEA LIST is in touch with Ericsson in order to develop the Titan gateway in order to offer a complete MBT tool-chain. This integration will be based on TTCN-3 as a standardized specification language for tests.

CEA LIST has created a Joint research group “MOSAIC” with the MAS lab of the Ecole Centrale de Paris that will be directly involved with committers. The objective is to reach a critical mass in order to increase the visibility of the common activities on symbolic analysis and to upgrade the scope of the E-FMP project.

The CEA LIST will promote the project by publishing research results in known related conferences such as ICTSS (formerly TestCom/FATES), ICST, MODELS, FM, and participate in related seminars like those of Dagstuhl as well as related meetings such as EclipseCon (where a first communication has been realized in 2015).

The CEA LIST will promote the teaching activities on the applications the symbolic execution platform provided by the E-FMP project. The CEA LIST is involved in MBT courses using symbolic techniques at Ecole Centrale de Paris and at UPMC engineering school.

Bibliography

[1]

J. Gallois et A. Lanusse, «Le test structurel pour la vérification de spécifications de systèmes industriels,» Génie Logiciel, 1997.

[2]

J. Deltour, A. Faivre, E. Gaudin et A. Lapitre, «Model-Based Testing: an Approach with SDL/RTDS and DIVERSITY,» SAM, Springer, 2014.

[3]

J.-C. King, «A new approach to program testing,» Reliable software, 1975.

[4]

C. Gaston, P. Le Gall, N. Rapin and A. Touil, "Symbolic Execution Techniques for Test Purpose Definition," TestCom, 2006.

[5]

A. Faivre, C. Gaston and P. Le Gall, "Symbolic Model Based Testing for Component Oriented Systems," FATES, 2007.

[6]

B. Bannour, J. P. Escobedo, C. Gaston et P. Le Gall, «Off-line Test Case Generation For Timed Symbolic Model-Based Conformance Testing,» ICTSS, Springer, 2012.

[7]

CEA LIST, «Papyrus».http://www.eclipse.org/papyrus/.

[8]

B. Bannour, C. Gaston et D. Servat, «Eliciting Unitary Constraints from Timed Sequence Diagram with Symbolic Techniques: Application to Testing,» APSEC, IEEE, 2011.

[9]

M. Arnaud, B. Bannour, A. Cuccuru, C. Gaston, S. Gérard et A. Lapitre, «Timed symbolic testing framework for executable models using high-level scenarios,» CSD&M, Springer, 2014.

[10]

Ericsson, Titan, http://projects.eclipse.org/projects/tools.titan/.

[11]

A. Cavalli, D. Lee, C. Rinderknecht et F. Zaïdi, «Hit-or-Jump: An Algorithm for Embedded Testing with Applications to in Services,» Formal Methods for Protocol Engineering and Distributed Systems, Springer, 1999.

[12]

OMG, UML Profile for MARTE: Modeling and Analysis of Real-time Embedded Systems, http://www.omg.org/spec/MARTE/, 2011.

[13]

B. Bannour, C. Gaston, A. Lapitre et J. P. Escobedo, «Incremental Symbolic Conformance Testing from UML MARTE Sequence Diagrams: Railway Use Case,» HASE, IEEE, 2012.

[14]

B. Bannour, J. P. Escobedo, C. Gaston, P. Le Gall et G. Pedroza, «Security Weaknesses Detection by Symbolic Analysis of Scenarios,» APSEC, IEEE, 2014.

Project Scheduling

Once the project creation process is finished, the existing source code of symbolic execution infrastructure and the import/export gateways developed at the CEA will be contributed to the project code base.

People

Project Leads

Committers

Interested Parties

David Mentré - Mitsubishi Electric R&D Centre Europe (MERCE)

Dr. David Mentré received en Engineering degree from Ecole Nationale Supérieure des Télécommunications de Bretagne (Télécom Bretagne) in 1996 and a Ph.D. from INRIA Rennes in 2001. After a post-doctorate in INRIA Rocquencourt, he joined Mitsubishi Electric R&D Centre Europe in 2001 as Research Engineer. Since 2001, he has worked on various subjects related to the telecommunication field and more generally computer science including network processor programming, deep packet inspection, IP protocol analysis, QoS, security, LTE 3GPP protocol stack, etc. Since

2009, he is focusing his research on formal methods as a way to ensure 100% correctness of properties in Mitsubishi Electric safety critical systems (railway, factory automation, ...) at a lower cost. He has experience in various formal paradigms like Deductive Verification and Correct by Construction program development, Abstract Interpretation or Model Checking. He is participating to European project openETCS and French project BWare. He is co-author of more than 7 papers and 8 patent families.

Rob Hierons - Professor of Computing, Department of Computer Science, Brunel University

Rob Hierons received a BA in Mathematics (Trinity College, Cambridge), and a Ph.D. in Computer Science (Brunel University). He then joined the Department of Mathematical and Computing Sciences at Goldsmiths College, University of London, before returning to Brunel University in 2000. He was promoted to full Professor in 2003. His main research largely concerns the automated generation of efficient, systematic test suites on the basis of program code, models or specifications. He also has a significant interest in program analysis and automated transformation techniques such as program slicing. He is joint Editor of the Journal of Software Testing, Verification, and Reliability (STVR). He has organised or been on the steering committee of several international conferences and workshops. He has published over 150 papers in international workshops, conferences and journals including in top journals such as SIAM Journal of Computing, IEEE Transactions on Computers, IEEE Transactions on Software Engineering, and ACM Transactions on Software Engineering and Methodology.

Juergen Dingel - Associate Professor in School of Computing and Adjunct Associate Professor in ECE at RMC School of Computing, Queen's University

Juergen Dingel is an Associate Professor in the School of Computing at Queen's University. He received an M.Sc. in Computer Science from Berlin University of Technology in 1992, an M.Sc. in Pure and Applied Logic in 1994 and a Ph.D. in Computer Science in 1999 from Carnegie Mellon University. He is on the editorial board of the Springer journal Software and Systems Modeling (SoSyM) and was PC Co-chair of the ACM/IEEE 17th International Conference on Model Driven Engineering Languages and Systems (MODELS'14) and of the IFIP International Conference on Formal Techniques for Distributed Systems (FMOODS-FORTE'11). At Queen's, he leads the Modeling and Analysis in Software Engineering Group (MASE). His research interests include software modeling, model-driven engineering, and formal methods. His research interests include software modeling, model-driven engineering, and formal methods.

Pascale Le Gall - full professor of Computer Science in CentraleSupélec

Pascale Le Gall leads the research team Logimas at the laboratory of Mathematics Applied to Systems (MAS) and she co-leads the Mosaic joint team with the CEA LIST and devoted to Modeling for Validation of Component-based Systems. Her scientific interests concern formal methods in Software Engineering, Verification and Validation (formal testing, model-checking), in various application domains: geometric modelers, services in telecommunications, embedded systems, systems biology, smart grids ... She is particularly interested in conformance testing of reactive component-based systems, timed or not, by combining symbolic execution technics and constraint solving. She supervises or has supervised more than 25 PhD students or post-docs, she is the author of more than 50 papers of international audience and participates to numerous collaborative national or international projects.

Yvan Labiche – Professor at Carleton University

Yvan Labiche was born in France. In 1995, he received an engineering degree in Computer System Engineering, from the graduate school of engineering: CUST (Centre Universitaire des Science et Techniques, Clermont-Ferrand), France. He completed a Master of fundamental computer science and production systems in 1995 (Université Blaise Pascal, Clermont Ferrand, France). While doing his Ph.D. in Software Engineering, completed in 2000 at LAAS/CNRS in Toulouse, France, Yvan worked with Aerospatiale Matra Airbus (now EADS Airbus) on the definition of testing strategies for safety-critical, on-board object-oriented software systems. In January 2001, Dr. Yvan Labiche joined the Department of Systems and Computer Engineering at Carleton University, as an Assistant Professor. He is now Associate Professor.

György Rhéty - Ericsson

György Rhéthy is one of the leaders of the Eclipse TITAN project [10]. As Titan has become an open source, Eclipse TTCN-3 tool lately, the E-FMP project and Titan perfectly complementing each other that gives a unique opportunity for both tools and also to Eclipse to create an industry-scale open source tool chain for model-based testing.

Source Code

Initial Contribution

The code which will be contributed initially includes: the core of the symbolic execution infrastructure; the formal functionalities related to testing and coverage; and the import/export gateways presented in details in the Description section.

Source Repository Type

Git