Publications
The book describes 12 case studies which use RAISE (Rigorous Approach to Industrial Software Engineering) to con- struct, analyse, develop and apply formal specifications. The case studies present a wide range of application areas: authentication of communications protocols, error detection in programs, geographic information systems, library management, multi-lingual documents, production planning, run-time verification, software design patterns, software reuse, taxation management, telephony, and travel planning. Most have little or nothing to do with safety-critical systems, traditionally considered as the area in which formal methods are most applicable, and many of them do not involve verification. The cases studies illustrate diverse uses of formal specifications: to capture requirements for new software, to carry out software development from requirements, to formalise algorithms in order to prove their correctness, to check if software behaves correctly at run-time, to build models of application domains, to relate descriptive and prescriptive models, to formalise notations for software development, and various other uses. They also try to put the task of creating formal specifications into perspective, asking questions about the purpose, scope and use of the formal models before the models are built.
The case study chapters have each been written by one of this book's editors. Each case study describes work done at the United Nations University's International Institute for Software Technology (UNU-IIST) in Macau by that editor in collaboration with fellows (lecturers or post-graduate students from developing countries). Each case study chapter concludes with information about the fellows involved, the relevant UNU-IIST technical report(s), and other related work.
Download: tj-pub-29.pdf (70.68 KB)The goal of implementation verification is to prove that a given program behaves correctly under all possible executions, for instance that it produces correct output for all legal inputs. There are several conditions on carrying out such verification in practice: (1) the implementation is available in a form suitable for analysis; (2) the implementation “size” is such that the proof remains feasible; (3) it is written in a language which supports specification and proof; and (4) there is enough human expertise to guide the proof. Such conditions are often not satisfied in practice. Result verification is a possible alternative. It relies not on the implementation of the program but on its execution record. This record may be a sequence of inputs and the corresponding outputs from the program when it computes a certain function, or a sequence of values observed about the changing state of the program as it interacts with its environment, or perhaps such values augmented with time-stamps. Whatever the form, the goal is to decide if this record describes a correct execution of the program. A positive answer does not mean that the program is fault-free, only that this execution did not exhibit any errors. On the other hand, finding an error is like constructing a counter-example for claims of correctness about the program. Due to its modest goals, result verification has several advantages over implementation verification: (1) It can be carried out off-line, after the execution terminates, or on-line, as the execution unfolds; (2) Its application scope includes both off-the-shelf components (binary files) and remote service providers (distributed objects); (3) Due to the simple structure involved (a sequence of states), it is easier to automate; and (4) Based solely on the execution record, its complexity is largely independent of the “size” of the implementation and the language used, which makes it particularly suitable for complex, heterogeneous systems. In this chapter we propose a technical framework for carrying out automated result verification in practice: 1. The execution record is a simple text file. Many programs produce such files during their normal operations, for administrative purposes. A general technique for recording exactly the information needed for verification, is to introduce a program monitor by wrapping. 2. The execution record is given as input to the verifier program, which does the actual verification. Given the execution record in a text file, we consider result verification as a text-processing task. Accordingly, the verifier is written in AWK, which is a special-purpose language for text processing [121]. Verification is done by the AWK interpreter, given the execution record and the verifier program as inputs. 3. The verifier program is not written by hand but specification-generated. The generator takes the property the verifier should check, called result specification, as input and produces the code of the verifier program as output. The specification-to-verifier generator is itself written in AWK. 4. A result specification is a first-order property built with two kinds of variables, state and record variables, using the functions and predicates over their respective types. A record variable refers to the contents of the record file. A state variable allows the calculation of the values that are not written in the record file but are derived from it. For each state variable we define its type, initial value and how this value changes for different operations invoked. We also allow specifications to be combined. The rest of the chapter is as follows. Section 12.2 shows how to model the behaviour of a program. Section 12.3 describes how to wrap a program to record the relevant observations during its execution. Result verification, using a hand-written verifier program, is the subject of Section 12.4. Section 12.5 defines a result specification. Section 12.6 describes various ways such specifications can be combined. Section 12.7 describes the generator which produces the code of the verifier program given the specification. Section 12.8 contains an example and Section 12.9 presents some conclusions.
Download: tj-pub-25.pdf (167.4 KB)Object-based distributed systems challenge the traditional ways of applying formal methods via specification and proof. One of the problems is the large number of the components involved, partly decided at compile time (static invocation) and partly at run-time (dynamic invocation). Another problem is having to rely on the vendor’s claims about correctness of individual components, without being able (lacking implementation details) to verify such claims ourselves. Yet another is expressing component specifications in interface definition languages, which describe how to communicate with a component (syntax), but not the expected results of such communication (semantics). Such problems make static verification difficult, at best. On the other hand, the structuring of the whole system in terms of the independent, distributed components, is particularly suitable for making such a system fault-tolerant [85]. The goal is to ensure that failures of individual components (violation of their specifications) will not cause the whole system to fail (violation of the system’s specification). The latter specification can be used at design-time to prove if the system is indeed fault-tolerant. The former specifications can be used at run-time to detect if components have failed. This chapter describes an approach to formally specify software components in order to make such error detection possible. Defining such specifications is not without problems. Specifications may contain infinite constructs like quantifiers (for all values of a type), liveness properties (for all states in a sequence) or modal properties of branching time (for all transitions from a state). Such constructs are generally non-executable and non-checkable. This means we cannot execute them directly on the machine and we cannot check effectively at run-time that they indeed hold. On the other hand, specifications based on propositional logic which can be checked at run-time, are insufficiently expressive in practice. Also, checking specifications which use equality of states is not possible when the state can be only accessed via defined operations; the best we can do is checking observational equivalence [97]. Such problems require a different kind of specification to carry out effective run-time checking from those to support static verification. In this chapter we propose an approach to define and check such specifications at run-time. Technically, the approach is as follows: 1. Specifications are formally-based. They are defined as logic-based regular expressions built from the propositions about the states or pairs of states of a component, via its observer operations. 2. Specifications are checkable at run-time, based on the recorded history of the component’s execution. This history is a sequence of values observed about individual states and the operations, with arguments, whose invocation caused state-changes. 3. Checking is carried out by a wrapper which is generated from the component and its specification. The wrapper takes over all communication between the component and its environment. It remains transparent to the clients except being able to detect (as it carries out run-time checking after every state-change) and announce (via the additional boolean observer error) the occurrence of an error. 4. The required effect of run-time checking is described formally as the fail-stop property. The wrapper generator is to transform a given component, which may fail in an arbitrary way, into a component which only fails by refusing to operate, which fact is also announced by the observer error. The rest of this chapter is as follows. Section 13.2 explains and illustrates the concept of a ‘component’. Section 13.3 discusses and compares ‘fault-free’ and ‘fail-stop’ behaviours of components. Section 13.4 shows how to ensure fail-stop behaviour by pattern matching. Section 13.5 presents an example, a line editor. Sections 13.6 and 13.7 describe how pattern matching can become part of an automatically generated wrapper. Section 13.6 specifies the wrapper generator and Section 13.7 describes a prototype implementation. Section 13.8 provides some conclusions.
Download: tj-pub-24.pdf (200.5 KB)Business process modelling can have many applications. It can be used to: understand the flow of work in an organisation; monitor and control progress of work; measure and optimise work performance; predict the effects of changes in management and operations; plan for implementation of changes; design interaction patterns between processes running in different organisations; and specify, develop and deploy software to further organisational/business goals, etc. Formal analysis is a particularly good reason to carry out such modelling, as it can disclose problems long before the process is actually deployed in an organisation. This early detection is especially important for processes that cross organisational boundaries where errors, as in all distributed systems, are inherently hard to detect and expensive to correct. One approach to model a business process is to describe its concrete execution in an enterprise, using enterprise resources to produce tangible results like products or services. Typically, such descriptive modelling would be carried out to explore the analogy between business and computing and would concentrate on the mechanics of a process. A prescriptive model, in contrast, aims to express the intended purpose of each process. The challenge for business process modelling is, we believe, finding suitable abstractions that can be applied at both levels. By formally relating such descriptive and prescriptive models, we could verify if a process is correct (satisfies its intended purpose) and further develop an engineering approach to design such processes in a rigorous way. This chapter is set to contribute to this general goal. We consider a particular, although broadly defined, business domain: customer-driven production. Production refers to the process of creating goods – tangible products like cars, phones or shovels. Products are produced by assembly from sub-products, carried out within independent business entities called production cells. Each cell contains the resources to store, manufacture and deliver products to its customers. On a certain level, a cell represents formally what is a manufacturing enterprise with its resources like a warehouse, shop-floor and product stocks. The behaviour of a cell is driven by the orders received from the customers and how such orders are implemented. The customers include other production cells. The implementation of a customer order is described by a production process. The aim of this chapter is to define formally what it means for such a process to be feasible (possible to carry out given the resources delegated for its execution) and correct (satisfying a customer order, if feasible). We adopt the following business model, explained for sequential, concurrent and distributed production: 1. Sequential production (one process, one cell). Each process responds to a customer order which specifies the product to deliver, the number of items and possibly the latest time of delivery. The process operates within a production cell which offers the resources for its execution, in terms of product stocks, storage space, and capacity to carry out manufacturing and transportation. It describes in detail what operations should be performed on the cell and in which sequence. The process is feasible if the cell has enough resources for its execution. A feasible process is correct if its execution satisfies the order: the stocks for the products reach the required volumes within the deadline. 2. Concurrent production (many processes, one cell). A cell may contain several processes, each created in response to a particular customer order and all executed concurrently. In order to resolve conflicts for the shared resources (such as product stocks, manufacturing workstations or a transportation system), processes are assigned priorities to represent their importance. Feasibility means the cell has enough resources for all such processes, executed concurrently. Correctness means all processes satisfy their corresponding customer orders, when executed concurrently under a priority-based scheduler. 3. Distributed production (many processes, many cells). Several processes running in different cells may satisfy an order in a cooperative way. Trying to utilise the resources that one cell lacks and another has available, they form customer-supplier relations dynamically, by receiving customer orders and implementing with their own processes, perhaps sending more orders at the same time. As a result, each cell may be running several processes, each contributing part of the original order. This mechanism leads to the distribution of production activities. Feasibility means each cell has enough resources for its processes executed concurrently. Correctness means a process satisfies its customer order provided all supplier processes satisfy their own. Moreover, when several such processes run in the same cell, they must satisfy their orders concurrently. The rest of this chapter is as follows. Section 7.2 is about product modelling. Section 7.3 presents a descriptive production model, including the concepts like production cells, operations and processes, and defines what it means for a process to be feasible. Section 7.4 presents a corresponding prescriptive model. Section 7.5 defines what it means for a feasible process to be correct, by relating such descriptive and prescriptive models. It starts with sequential production, then introduces concurrency, distribution, and real-time constraints. Section 7.6 describes related work and provides some conclusions.
Download: tj-pub-27.pdf (222.89 KB)Travel planning can be a challenge these days. There is an abundance of travel-related information in the magazines, newspapers, guides, brochures, on the internet. The information covers all imaginable aspects of travel including flights, hotels, cars, restaurants, maps, events and others. Travel is offered in the form of individual products or complete packages, its contents predefined or customised to individual needs, definite or based on waiting lists, sold for a regular price or as special deals (under different restrictions when, where and for whom the deal applies). With so many options available and so many constraints involved, there is a great deal of uncertainty how to carry out the process of travel planning. Internet contributes significantly to making this information available for individual travellers, including on-line availability-checking, reservations and sale. However, internet-based travel agents tend to focus more on the presentation of information and less on its actual contents. Moreover, they implement little in the way of a method for the process of systematic travel planning – formulating the requirements and then moving gradually toward the solution. Here are some observations when visiting such agents: (1) a visitor has to know in detail what service he looks for before he can learn this for himself; (2) when presented with an offer a visitor can only accept or reject it, but cannot explore the alternatives; (3) an error message communicates to the visitor the presence of an error but not the reasons for it or how to avoid it; (4) there is no recollection of previous interactions with a visitor – the system always returns to its pre-defined state; (5) when a little amount of automation is possible in support of visitors’ decisions it is not implemented. This chapter is about methods and tools in support of travel planning. We show how to approach travel planning as essentially an engineering problem. We define precisely what constitutes travel requirements, what is a travel plan, and in what sense the plan correctly implements the requirements. The approach is model-based, with models representing all stages of travel planning, from abstract requirements to concrete itineraries. Such models are designed, analysed and redesigned to represent decisions made by the travel engineer (the person who designs the travel plan). The models are minimal – expressed with the minimum number of modelling concepts – but also open-ended – new concepts can be added when needed. They are also formally-based – their syntax is well-defined and their semantics is expressed in precise mathematical terms, using numbers, sets, maps, functions, abstract and concrete types. The formal machinery used for semantic definitions is basic set theory and logic. Formality makes it possible to prove correctness of a travel plan with respect to travel requirements, and also to justify the soundness of the design rules that allow travel plans to be made more concrete. Such rules are applied during a development process, leading gradually from the abstract travel requirements to concrete solutions. We present such rules and discuss their soundness. In presenting such models we have four goals in mind. First, we want to formalise some of the travel-related concepts in a way which is independent of the current technology, providing the basis for an engineering approach. The latter means that the models must capture different levels of abstraction as well as relate those levels by the concept of correctness. Second, we want to apply such models to define a formally-based method for systematic travel planning, including definitions of the design rules and justification of why they are sound. Third, we want to explore the opportunities and limitations for automation in support of travel-related decision-making. The models will be used to formulate precise problems and seek possibly automated solutions. Fourth, we would like to seek an implementation of a prototype travel assistant, based on the models, in support of the method. The rest of this chapter is as follows. Section 8.2 introduces the basic concepts. Section 8.3 describes the models for travel planning, with definitions and examples. As the models capture different levels of abstraction, Section 8.4 defines what it means for a concrete travel plan to correctly implement an abstract plan. Section 8.5 presents a method for systematic travel planning, based on the models in Section 8.3 and the notion of correctness in Section 8.4. Section 8.6 describes the ideas for a travel assistant, a software system to support travel development. Finally, Section 8.7 contains some conclusions.
Download: tj-pub-26.pdf (145.75 KB)Nowadays, acquisition of trustable information is increasingly important in both professional and private contexts. However, establishing what information is trustable and what is not, is a very challenging task. For example, how can information quality be reliably assessed? How can sources’ credibility be fairly assessed? How can gatekeeping processes be found trustworthy when filtering out news and deciding ranking and priorities of traditional media? An Internet-based solution to a human-based ancient issue is being studied, and it is called Polidoxa, from Greek “poly” (πολύ), meaning “many” or “several” and “doxa” (δόξα), meaning “common belief” or “popular opinion.” This old problem will be solved by means of ancient philosophies and processes with truly modern tools and technologies. This is why this work required a collaborative and interdisciplinary joint effort from researchers with very different backgrounds and institutes with significantly different agendas. Polidoxa aims at offering: 1) a trust-based search engine algorithm, which exploits stigmergic behaviours of users’ network, 2) a trust-based social network, where the notion of trust derives from network activity and 3) a holonic system for bottom-up self-protection and social privacy. By presenting the Polidoxa solution, this work also describes the current state of traditional media as well as newer ones, providing an accurate analysis of major search engines such as Google and social network (e.g., Facebook). The advantages that Polidoxa offers, compared to these, are also clearly detailed and motivated. Finally, a Twitter application (Polidoxa@twitter) which enables experimentation of basic Polidoxa principles is presented.
Formal methods have grown out of their infancy: firm foundations are established, a variety of languages are available to describe specifications at different levels of abstraction, and methods and calculi exist to guide development of specifications into correct programs, supported by a wide range of tools. There is also a growing number of well-documented experience reports and case studies which present stories and analyses of both successful and less successful applications. A further evidence for maturing of formal methods is their increasing presence in industrial, large-scale software development. Faced with realistic expectations, they become integrated into the industrial software development process in such a way as to complement, not replace, traditional methods of software design and analysis. The purpose of this book is to present our experience in developing and applying formal specifications. The book describes 12 case studies which use RAISE – Rigorous Approach to Industrial Software Engineering – to construct, analyse, develop and apply formal specifications. The case studies present a wide range of application areas: authentication of communications protocols, error detection in programs, geographic information systems, library management, multi-lingual documents, production planning, run-time verification, software design patterns, software reuse, taxation management, telephony, and travel planning. Most have little or nothing to do with safety-critical systems, traditionally considered as the area in which formal methods are most applicable, and many of them do not involve verification. The case studies illustrate diverse uses of formal specifications: to capture requirements for new software, to carry out software development from requirements, to formalise algorithms in order to prove their correctness, to check if software behaves correctly at run-time, to build models of application domains, to relate descriptive and prescriptive models, to formalise notations for software development, and various other uses. They also try to put the task of creating formal specifications into perspective, asking questions about the purpose, scope and use of the formal models before the models are built. The case study chapters have each been written by one of this book’s editors. Each case study describes work done at the United Nations University’s International Institute for Software Technology (UNU/IIST) in Macau by that editor in collaboration with “fellows” – lecturers or post-graduate students – from developing countries. Each case study chapter concludes with information about the fellows involved, the relevant UNU-IIST technical report(s), and other related work. This introduction consists of six sections. Section 1.1 describes some factors promoting the application of formal specifications in industry. Section 1.2 presents some paradigms for software development which require the use of formal or semi-formal specifications. Section 1.3 gives a high-level overview of specification languages, development methods and tools. It concludes with a summary description of RAISE. Sections 1.4 and 1.5 contain respectively an overview of the case studies in the book and a discussion of several themes recurring throughout these case studies. Finally, Section 1.6 describes the lessons learned from the case studies. A more detailed technical introduction to the RAISE language and method is given in Chapter 2.
Download: tj-pub-28.pdf (99.71 KB)Recent developments in data de-identification technologies offer sophisticated solutions to protect medical data when, especially the data is to be provided for secondary purposes such as clinical or biomedical research. So as to determine to what degree an approach-- along with its tool-- is usable and effective, this paper takes into consideration a number of de-identification tools that aim at reducing the re-identification risk for the published medical data, yet preserving its statistical meanings. We therefore evaluate the residual risk of re-identification by conducting an experimental evaluation of most stable research-based tools, as applied to our Electronic Health Records (EHRs) database, to assess which tool exhibits better performance with different quasi-identifiers. Our evaluation criteria are quantitative as opposed to other descriptive and qualitative assessments. We notice that on comparing individual disclosure risk and information loss of each published data, mu-Argus performs better. Also, the generalization method is considerably better than the suppression method in terms of reducing risk and avoiding information loss.
This book presents thoroughly revised tutorial papers based on lectures given by leading researchers at the International Training School on Domain Modeling and the Duration Calculus, held in Shanghai, China, in September 2007 as an associated event of ICTAC 2007, the 4th International Colloquium on Theoretical Aspects of Computing.The four tutorial papers presented provide competent coverage of software security, domain modeling of software engineering, and duration calculus for real time systems - originating from lectures of leading experts in these fields from Europe and Asia. Topics addressed in detail are: development of real-time systems, domain engineering using abstract modeling, the area of duration calculus, and formal methods like language description using the operational semantics approach.
This Festschrift volume, published to honour both Dines Bjørner and Zhou Chaochen on the occasion of their 70th birthdays in October and November 2007, includes 25 refereed papers by leading researchers, current and former colleagues, who congregated at a celebratory symposium held in Macao, China, in the course of the International Colloquium on Theoretical Aspects of Computing, ICTAC 2007. The papers cover a broad spectrum of subjects, from foundational and theoretical topics to algorithms and systems issues and to applications, comprising formal methods, systems modelling, hybrid and real-time systems, specification and verification, as well as interval temporal logic.Dines Bjørner is known for his many contributions to the theory and practice of formal methods for software engineering with special focus on abstraction and modelling, specification of systems and languages, and domains, requirements, and software design. He was a professor at the Technical University of Denmark (DTU) in Lyngby, near Copenhagen; he was the founding director of the United Nations University International Institute for Software Technology (UNU-IIST) in Macao during the 1990s; and a co-founder of VDM-Europe, which became Formal Methods Europe, an organisation that promotes the use of formal methods.Zhou Chaochen is known for his seminal contributions to the theory and practice of timed and hybrid systems. Starting at Peking University and as a postgraduate at the Institute for Computing Technology of the Chinese Academy of Sciences, he continued his career with an extended visit to Oxford University Computing Laboratory, where he was the prime instigator of the Duration Calculus, an interval logic for real-time systems. He also worked as a visiting professor at the Technical University of Denmark, Lyngby, at the invitation of Prof. Dines Bjørner. He was a principal research fellow at UNU-IIST during the period 1992-97, before becoming its director, an appointment he held from 1997 to 2002.
This book constitutes the refereed proceedings of the 8th International Conference on Formal Engineering Methods, ICFEM 2006, held in Macao, China, in November 2006.The 38 revised full papers presented together with 3 keynote talks were carefully reviewed and selected from 108 submissions. The papers address all current issues in formal methods and their applications in software engineering. They are organized in topical sections on specification and verification, internetware and Web-based systems, concurrent, communicating, timing and probabilistic systems, object and component orientation, testing and model checking, tools, fault-tolerance and security, as well as specification and refinement.
Graph transformation techniques, and the Double-Pushout approach in particular, have been successfully applied in the modeling of concurrent systems. In this area, a research thread has addressed the definition of concurrent semantics for process calculi. In this paper, we show how graph transformation can cope with advanced features of service-oriented process calculi, such as several logical notions of scoping (like sessions and pipelines) together with the interplay between linking and containment. This is illustrated by encoding CaSPiS, a recently proposed process calculus with such sophisticated features. We show how to represent the congruence and reduction relations between CaSPiS processes by exploiting concurrent graph transformations over hierarchical graphs.
Download: FACS10.pdf (256.36 KB)We present a mathematical model of class graphs, object graphs and state graphs which naturally capture the essential oo features. A small-step operational semantics of oo programs is defined in the style of classical structural operational semantics, in which an execution step of a command is defined as a transition from one state graph to another obtained by simple operations on graphs. To validate this semantics, we give it an implementation in Java. This implementation can also be used for simulation and validation of oo programs, with the visualization of state graph transitions during the execution. A distinct feature of this semantics is location or address independent. Properties of objects and oo programs can be described as properties of graphs in terms of relations of navigation paths (or attribute strings).
Download: ICFEM09.pdf (412.35 KB)
his compendium consists of the papers presented at the TTSS’09 workshop for the convenience of the workshop participants.
Download: report421.pdf (3.21 MB)This volume contains the papers presented at the 4th International Conference on Distributed Computing and Internet Technology (ICDCIT2007) held during 17-20 December 2007 in Bangalore, India. The conference was organized by Kalinga Institute of Industrial Technology (KIIT), Bhubaneshwar, India, www.kiit.org, and co-organized by the Center for Electronic Governance at United Nations University International Institute for Software Technology (UNU-IIST-EGOV), Macao, www.egov.iist.unu.edu.
Download: tj-pub-46.pdf (41.28 KB)Free and open source software projects are often perceived to be of high quality. To a great extent the success of open source software seems to be due to an implicit but effective connection between communication and contribution in its development process. In this paper, we present a snapshot of the state the art on quality and success of Open Source Software (OSS) based on a review of the literature. For each of these concepts, we describe various measures considered in the literature and a number of methods by which they are obtained. Contributions to an Open Source Software (OSS) project are made through communication among developers and users. We elaborate on the concrete notions of communication and contribution in Open Source Software (OSS) and their links.





