A Review of Two Approaches to the Static Analysis of Multithreaded Java Programs

(1)

A Review of Two Approaches to the Static Analysis of Multithreaded Java Programs

Mikko Lahola

May 31, 2010

(2)

Abstract

Static analysis of multithreaded programs is considerably more difficult than analysis of sequential programs due to the amount of possible interleavings of threads.

This work reviews the available literature on the static analysis of multithreaded Java programs. The focus is on finding suitable generic analyzers, i.e. analyzers that do analysis over multiple multithreaded properties. The work discusses how threads, monitors and memory models are used to handle concurrency in Java, presents two methods that extend the abstract interpretation and the model checking static analysis approaches to concurrent programs for the language and reviews (based on the available literature) the methods’ usefulness in terms of results produced and scalability.

(3)

Chapter 1 Introduction

The role ofparallel computing as a usage pattern of computers is increasing. For many years parallel computing has been exclusive to rich universities and research groups with financial backing from governments or multibillion-dollar companies. The situation is changing due to advances in both hardware (multicore processors targeted at common desktops) and software creation methodology (parallel programming models). Parallel computing presents new opportunities in designing more efficient software but in general applications must be redesigned to take full advantage of multicore processors. The problem is that writing properly functioning parallel code is a considerably more complex task than writing serial code. [18]

One way of trying to ensure that a program works correctly is to try to analyze the possible behaviors of the program by inspecting its code. This is called static analysis of the program (see e.g. [20]). Static analysis of sequential programs, i.e. programs that do not contain concurrent behavior, already features open problems [14]. We are however focusing on problems arising from concurrency: a concurrent program has the ability to have multiple threads running in parallel - and the number of ways in which the threads can be interleaved can grow huge. The problems this presents to static analysis are two-fold [24]:

1. New properties related to concurrency can be analyzed, e.g. data races (where at least two threads access the same data at the same time, and one of the accesses is a write operation) and deadlocks (where all threads pend their execution on results from other threads).

2. Problems already at least partially solved for sequential programs - e.g. dataflow analysis (where the flow of data values along control flow paths is analyzed) - become harder due to the complex influence threads and their interleaved execution can have on shared data.

This work reviews the available literature to find methods suitable for the static analysis of concurrent Java [10] programs. However, methods present in literature often focus on analyzing an individual property (e.g. deadlocks). Instead of collating a large number of such methods we focus on the so called generic analyzers (see e.g. [9]), i.e.

(5)

frameworks that can handle multiple properties. In particular, we review the current feasibility of extending two methods that have had some success at analyzing sequential Java programs to also analyze concurrent programs. We consider abstract interpretation [7] (with sequential analysis available by e.g. Cibai [15]) and its extension methodology to concurrent programs [9]. We also consider model checking [5]

(with sequential analysis by e.g. Java PathFinder [4]) and its extension calledcontext switch bounded model checking [22, 26]. The basic problem in extending analyzes to handle concurrent programs is the undecidability result of [23].

The goals of the work are:

1. to discuss what has to be taken into account when applying static analysis on multithreaded Java programs,

2. to present, based on the literature, the ideas on how sequential abstract interpretation and model checking methods for Java can be extended to also handle multithreaded programs, and

3. to review the theoretical and empirical results given in literature for these methods.

The rest of the work is structured in the following way: Chapter 2 shows how parallel computing (or multithreading) is handled in Java and discusses the general requirements of a static analysis system targeted for the language. Chapter 3 presents the two particular approaches to the static analysis of multithreaded Java programs: abstract interpretation and context switch bounded model checking. The approaches are analyzed in Chapter 4, by considering their current usefulness in terms of results produced and scalability. Chapter 5 concludes.

(6)

Chapter 2 Background

In this chapter we show how parallel computing is handled in Java, looking at both the parallel computing related constructs in the language as well as the memory model that defines the concurrent execution semantics. Also, we look at the requirements of a static analysis system targeted for the language.

2.1 Multithreading in Java

Parallel computing is implemented in Java [10] with the usage of threads, i.e. Java programs can be multithreaded. The idea is that during the execution of a program there can be multiple threads present, each executing program code “independently”, i.e. otherwise independent from the execution of other threads but operating on values and objects residing in a shared main memory. Threads are given semantics that are independent of the actual implementation in hardware.

Threads are created dynamically during the execution of the program. The only way to create threads is to extend the class Thread, create instances of the extension and start them by executing thestart()method of them. The language provides multiple mechanisms for communication between threads. We consider here the basic method of synchronization. Synchronization can be applied to individual statements, groups of statements, or methods, in which case the synchronization is applied to the group of statements that form the body of the method. Basic synchronization is the most used method for inter-thread communications and we feel that the other methods defined in the java.util.concurrent package - such as barriers - are implementable and therefore analyzable based on basic synchronization constructs. For simplicity, we do not consider volatile reads and writes in this work.

Synchronization is implemented with monitors: “Each object in Java is associated with a monitor, which a thread can lock or unlock. Only one thread at a time may hold a lock on a monitor. Any other threads attempting to lock that monitor are blocked until they can obtain a lock on that monitor.” [10]

At Java bytecode level, the creation of threads is translated into bytecode instructions in the same manner as other object initializations and method calls (the new,

(7)

invokevirtual, invokespecial, etc. instructions) and locking and unlocking are implemented with special monitorenter and monitorexit instructions. Sequential execution related instructions are not affected with the exception that the values read by various reading instructions depend on a memory model.

2.2 The Java Memory Model

Programming languages use memory models to define how memory actions in a program will appear to execute to the programmer and specifically which values each read of a memory location can return. The design of the Java memory model has had to take into account two partially contradictory goals: (a) the model should disallow (mostly concurrent) behaviors that are hard for the programmers to understand (such as a read operation reading a value of a variable that seemingly would be written to the variable only after the read operation), but at the same time (b) allow compilers to perform common optimizations (many of which rely on the possibility of reordering statements in the code). [17]

One way of restricting behavior is guaranteeingsequential consistency [25]: all results of execution must be the same as if operations of all threads were executed in some sequential order and operations of an individual thread appear in the sequence speci- fied by program code. This is however very strict requirement and does not generally allow reordering of statements. The current Java memory model [17] adopts a model with weaker restrictions on execution. We consider here the simplified version of the model called the happens-before model (also found in [17]).

Definition 2.1 (Happens-before model). Program order means the total order in which statements are written in program code. Execution on an individual thread is sequentially consistent if and only if it follows the program order.

For two memory actions (reads, writes) x and y, x−→^sw y means that x synchronizes- with y. The Java specification [10] states that an unlock operation on a monitor synchronizes-with all subsequent (according to program order) locks on the monitor, an action that starts a thread synchronizes-with the first action of the started thread and the final action of a thread synchronizes-with any action on another thread that detects the termination of the thread. (Synchronizes-with contains also elements related to e.g. volatile writes but they are not considered here for simplicity.) The set of all synchronizes-with relations form the partial order called the synchronizes- with order.

For two memory actionsxandy,x−^hb→ymeans thatxhappens-beforey. Thehappens- before order [13] is a partial order containing the transitive closure of program order and synchronizes-with order. In other words, the order containsx−^hb→yif and only ify can be reached fromxby following a path containing program order and synchronizes- with edges.

Thehappens-before model (withoutvolatilereads and writes) then says that for each individual thread the execution of the thread is the same as that would be by program

(8)

order, given the values seen by the read operations of the thread during execution.

The values that can be seen by read operations are dictated by the happens-before consistency rules. A read r of a variable v is allowed to see the value from the write w to v if:

1. r −^hb→w is not in the happens-before order, and

2. there is no write w^′ to v such thatw−^hb→w^′ −^hb→r in the happens-before order.

In other words, a read of a variable can see all the values written to it that are not invalidated by elements of the happens-before order. In particular, a read cannot see a particular value if that value is written to the variable after the read or if the value has been overwritten by another write before the read. Therefore, the model restricts the possible values. The actual Java memory model [17] restricts the values even more by considering also the subtle effects of causality of memory operations. Such considerations are too complex to handle in the scope of this work.

2.3 Static Analysis

As stated in the introduction,static analysis [20] is a program analysis method where the possible behaviors of the program are analyzed by inspecting program code (as opposed to running the program). Static analysis is in general undecidable [14], i.e.

there cannot be an analysis method that would determine the possible behaviors of any given (Turing-complete) program with absolute precision. Therefore static analysis methods must apply some form of approximation. There are basically two choices:

analysis can overapproximate or underapproximate the possible behaviors. The choice to be made depends on whether analysis needs to be sound, i.e. whether all real possible behaviors are to be found by the analysis. This is required when analysis is used to verify safety properties, i.e. that the program cannot exhibit some unwanted behavior. Overapproximative methods are sound whereas underapproximative methods are not.

The static analysis of multithreaded programs has an additional undecidability result [23]. The result states that it is in general impossible to do analysis that is both context-sensitive and synchronization-sensitive for even the simplest of analysis problems. An analysis is context-sensitive if it takes into account the call stack of the program (and does not therefore allow, at the end of a procedure, execution to resume from a different call point than what was used to call the procedure). An analysis is synchronization-sensitive if it is precise towards the synchronization constructs used in the program, e.g. in Java not allowing executions which are not possible according to the memory model. Some general ideas on how to analyze multithreaded programs are surveyed in [24].

There are a few things to consider when applying static analysis to Java programs.

Firstly, a decision must be made whether analysis is done on Java source code or Java bytecode. Devising analysis on bytecode is simpler due to smaller variety of code structures to handle but the translation from source code to bytecode can destroy

(9)

some information that would be useful in analysis. The issue is examined in [16], and from their results we determine that the shortcomings of bytecode analysis (relative to source code analysis) can be mostly fixed with a careful implementation. Secondly, the main Java memory model is relatively complex and hard to analyze: it has already been fixed once due to problems in semantics [21]. As the happens-before model provides a safe approximation and is conceptually much simpler we can opt to having analysis systems use it.

We are considering static analysis methods that have proven implementations for the analysis of sequential programs. Because of sequential consistency within individual threads we can use sequential analysis systems as a base and extend them to concurrent programs by taking thread interleavings and the happens-before model into account.

(10)

Chapter 3 Static Analysis of Multithreaded Java Programs

In this chapter we show how static analysis of sequential Java programs can be extended to concurrent programs. We focus on abstract interpretation and model checking approaches.

3.1 Abstract Interpretation Approach

Abstract Interpretation

Abstract interpretation [7] is a study ofabstractions. The generic idea behind abstractions is tosoundly approximate complex structures with simpler ones [20]. For static program analysis this can correspond to domain abstraction, where the domain of a variable can be changed to a simpler one, preserving the all the data values of the variable but possibly adding some values that the variable did not have in the original domain. This can only add to the semantics of the program under analysis, making the analysis over abstracted domains sound w.r.t. the original domains.

The technique can be applied to the static analysis of complete programs by constructing data domains with complete lattices, specifying operations of the program with monotone (sound) functions over these lattices and calculating the least fixpoint of the functions to get the semantics of the program. A complete lattice is a partially ordered set with least-upper-bound (marked ⊔) and greatest-lower-bound (marked ⊓) defined over all of its subsets, and includes the least element and the greatest element. Then, with complete lattices Land M anabstraction function α :L→M and a concretization function γ :M →L are defined so that ∀L^′ ⊂L:L^′ ⊂γ(α(L^′)), i.e.

soundness is guaranteed when switching to a more abstract domain and back [20].

In building a static analysis system for multithreaded Java programs, both branching and interleaving of threads can cause exponential state space explosion and infinite execution paths can make the analysis take infinite time. To help scalability we can use the already mentioned abstract domains as well as a technique calledwidening [6].

Widening ensures convergence to a least fixpoint in finite number of iterations.

(11)

Solution

We researched the available literature to find generic analyzers that use abstract interpretation techniques to statically analyze multithreaded Java programs. A recent Ph.D. thesis [9] is spot-on in this sense as it deals extensively with the problems related to the handling of the happens-before memory model as well as giving an abstract interpretation framework and an implementation for the problem. An alternative solution [3] describes a technique where only the synchronization of threads is approximated (with abstractions), as an attack to the undecidability problem of [23].

We choose to discuss here the solution presented by [9], since it is conceptually simpler to present. We streamline the presentation even further and do not consider a few smaller restrictions given in the thesis, such as that a reading thread can only see the values produced by its launching thread that have been produced after the launch of the reading thread. To apply abstract interpretation on the programs, their semantics are represented in a fixpoint-applicable form, i.e. small-step semantics.

The following notation is used: the set Tld denotes thread identifiers, the set St denotes states, the set St⁺ denotes traces (of states), Ψ : Tld → St⁺ denotes a multithreaded state partitioned into traces of individual threads, andσ_a −→^∗ σ_b denotes an intra-thread step (i.e. an execution step affecting the local state of one thread).

We also interpret these structures as complete lattices and have abstractions for them.

These are represented with adding a line over the structure e.g. Tlddenotes abstracted version of Tld. Most of the abstractions are trivial, but we note that a trace is abstracted by applying an abstraction on each individual state of the trace separately.

The first part is to consider a single execution step of the program. Given the active thread, traces of all threads so far and the current state,stepcomputes all the possible successor states. These are gathered by either executing an intra-thread step or by reading a value according to visible (given later):

step : (Tld×Ψ×St)→ P(St), step(t, f, σa) = {σb}, for all σb for which (1) there is an intra-thread step to perform: σ_a −→^∗ σ_b, or

(2) the operation to execute at σa is a read operation on location l, in which case σ^′ is the same as σa except that anyv ∈visible(l, f) is written inl forσ^′ and σ^{′ ∗}−→σb, (where −→^∗ is the read operation).

The problem with step is that it can produce several states at each iteration. This can lead to exponential behavior, and so there is a corresponding abstracted version which produces just a single state. This is achieved by merging all the results of a read operation into one value. We assume the availability of abstracted intra-thread step operations (static analysis systems for sequential programs, e.g. Cibai [15], can be used to obtain these):

step : (Tld×Ψ×St)→St, step(t, f , σa) = σb, for the σb for which (1) if the operation to execute at σa is not a read operation: σa

−∗

→σb

(2) otherwise: σ^{′ ∗}−→σb, where σ^′ is the same as σa but with the same kind of write in memory as in the concrete case, with the written value being the least-upper-bound of the set of values given by visible(l, f).

(12)

Given the active thread and traces of all threads so far, S^o computes all possible traces where the active thread is run as far as possible but traces of other threads are fixed to the given ones. The traces are gathered by starting from an initial state (i.e. initial trace) of computation (σ0), and repeatedly applying F^o (reaching a least fixpoint of it), which at each iteration adds to the set of traces the traces which contain one state more, as given by step:

S^o : (Ψ×Tld)→ P(St⁺)

Sô(f, t) =lf p h{σ0},Fô({σ0}),Fô(Fô({σ0})), ...i, where Fô : P(St⁺)→ P(St⁺)

F^o(T) =T∪ {σ0 →...→σi :σ0 →...→σ_i−1 ∈T∧σi ∈step(t, f, σ_i−1)}

Here - again - the set of traces can grow exponentially. The abstracted version merges the existing trace with the trace with new state given bystep. Widening can be used to ensure that the least fixpoint is reached in a finite number of iterations:

S^o : (Ψ×Tld)→St⁺

Sô(f , t) =lf p h{σ0},Fô({σ0}),Fô(Fô({σ0})), ...i, where Fô : St⁺ →St⁺

F^o(τ) =τ ⊔ {σ0 →...→σi :σ0 →...→σi−1 =τ ∧σi =step(t, f , σi−1)}

Finally, S^k computes the full multithreaded semantics, i.e. all possible traces. The traces are gathered by starting from a map with an initial trace (just the initial state) for each thread, and repeatedly applying F^k. At the first iteration all threads have just the initial trace so S^o will compute all traces that are possible when the threads are run in isolation (not seeing the actions) of each other. At the second iteration a thread can see all the (already computed) actions from other threads, but each individual trace can only contain actions from one other thread. This continues so that at the i-th iteration a thread can see the actions from i-1 other threads on an individual trace.

S^k : Ψ → P(Ψ)

S^k(f0) =lf p h{f0},F^k({f0}),F^k(F^k({f0})), ...i, where F^k : P(Ψ)→ P(Ψ)

F^k(Φ) = Φ∪ {fi :∃fi−1 ∈Φ :∀t∈dom(fi−1) :fi(t)∈S^o(fi−1, t)}

Once again, abstraction is used to compute just one map of traces of threads, and widening can be used to converge in finite iterations:

S^k : Ψ →Ψ

S^k(f0) =lf p h{f0},F^k({f0}),F^k(F^k({f0})), ...i, where F^k : Ψ →Ψ

F^k(f) =f ⊔ {fi :∀t∈dom(f) :fi(t) = S^o(f , t)}

The last part to consider is visible. Given a memory location and traces of all threads so far, it gives values that are visible to a read operation on that location, i.e. values that can be read. This is clearly related to the happens-before memory model. It would be sound to just conclude that all values written to a given location are visible to a read, but for precision we try to constrain the values in the manner required by happens-before consistency. In particular, we require that for a value v to be visible to a read, it must have been written by some thread in the already

(13)

gathered trace (therefore disallowing a read on a thread to find values that would be written by the thread itself later on, and satisfying the first rule of happens-before consistency) and that there is no write operation on the trace between the write of v and the current state, i.e. v has not been overwritten by another write operation (satisfying the second rule). The model can be further refined by considering locking and unlocking of monitors (e.g. a value is not overwritten by a write if the write cannot acquire the required lock on the memory location) but we leave these refinements out for simplicity. We consider only the concrete version of visible, as the abstracted version just uses abstract domains in place of concrete ones. The set Loc denotes memory locations and the set Valdenotes values here:

visible : (Loc,Ψ)→ P(Val)

visible(l, f) ={v :t ∈dom(f)∧f(t) = (σ0 →...→σi) ∧ (∃j ∈[0..i] :σ_j is a write operation that writesv on location l ∧

(6 ∃σk ∈(σj →...→σi) :σk is a write operation that writesv∗ (6=v) on location l))}

3.2 Context Switch Bounding Approach

Model Checking with Context Switch Bounding

Model checking [5] is an automatic technique for verifying finite state concurrent systems. In program analysis it can be used as a static analysis technique for verifying properties on multithreaded programs. However, contrary to e.g. systems describing hardware, the analysis of software programs has to take into account that the programs can have infinite state space. Model checking is applied to program analysis by answering the model checking problem: given a structure (representing the program under analysis) and a property (representing the property that the program is analyzed against), decide if the property holds in the structure. The structure is often given as aKripke structure M = (S, R, L), whereS is the set of states of the program, Ris the transition relation and Lis a set labeling the states with (atomic proposition) properties that hold in them. The property is expressed as a logic formula, such as a linear temporal logic (LTL) formula. [5]

Model checking can be used to build a generic analyzer since the logic formulae are suited to describing various properties arising from multithreaded behavior, e.g. deadlocks [5]. However, we cannot simply encode (multithreaded) programs with infinite state space to Kripke structures as the set S would become infinite.

One way to handle this is to employbounded model checking [2], where only execution paths of length up to a fixed integer bound k are considered, and therefore the state space only needs to be unrolled to the bound. Context switch bounded model checking [22] (or just context bounded model checking) works similarly: only paths that contain at most k context switches, i.e. changes of the active thread, are considered.

Bounded model checking implies context bounded model checking as there cannot be more context switches than execution steps, but we focus on solutions that do context bounded model checking without bounded model checking. Generally bounding paths makes the analysis method unsound, i.e. it does not find all possible executions.

(14)

Solution

Context switch bounded model checking is commonly credited to the work of Qadeer and Rehof [22]. We present here a streamlined version of the theoretical algorithm given in their work. The problem of analyzing a multithreaded program against a property is given as a k-bounded reachability problem for a concurrent pushdown system.

Definition 3.1 (Pushdown systems). A pushdown system (PDS) is a 5-tupleP = (G,Γ,∆, gin, win), where G is a global state, Γ is the stack alphabet, Γ^∗ is a stack,

∆ ⊂ (G×Γ)×(G×Γ^∗) is the transition relation, gin ∈ G is the initial state and win ∈Γ^∗ is the initial stack. We define a transition system −→∆ over configurations c ∈ G×Γ^∗ (state with a stack) as follows: (∀w^′ ∈ Γ^∗ : hg, γw^′i −→∆ hg^′, ww^′i) ⇔ (hg, γi,hg^′, wi) ∈ ∆. −→^∗_∆ denotes the reflexive, transitive closure of −→∆ and a configuration cis reachable if cin −→^∗_∆c, where cin is the initial configuration.

A concurrent pushdown system (CPDS) is an n-tuple P = (G,Γ,∆0, ...,∆N, g_in, w_in), with one global state and transition relations ∆0, ...,∆N overGand Γ for each thread of the system. Configurations are of the corresponding form c=hg, w0, ..., wNi, with g ∈ G and w_i ∈ Γ^∗ and the transition system for them is hg, w0, ..., w_i, ..., w_Ni −→i

hg^′, w0, ..., w_i^′, ..., wNi ⇔ hg, wii −→∆_i hg^′, w_i^′i with the corresponding reflexive and transitive closure −→^∗_i.

For a positive natural number k, the k-bounded transition relation −−→^k is defined as follows: c −−→¹ c^′ ⇔ ∃ i : c −→^∗_i c^′, and c −−→^k⁺¹ c^′ ⇔ ∃ c^′′, i : c −−→^k c^′′ and c^′′ −→^∗_i c^′. Therefore a k-bounded transition can contain transitions with at mostk choices of i (thread), i.e. context switches. The k-bounded reachability problem for a concurrent pushdown system P is: given configurationsc0 and c1, is it the case thatc0

−−→k c1? To compute the reachable states, i.e. to solve the reachability problem, it is noted that the set of reachable configurations of a pushdown system is regular (see discussion in [22]). A subset S of pushdown configurations is called regular if and only if there exists a regular pushdown store automaton A so that S = L(A), where a regular pushdown store automaton is afinite automaton given as a 5-tupleA= (Q,Γ, δ, I, F), with states Q, alphabet Γ, transition relationδ⊂Q×Γ×Q, initial states I and final states F and so that G ⊂ Q and I ⊂ G and a language of configurations is defined as: L(A) ={hg, wi | A acceptsw when started in the state g}.

Then, having P = (Q,Γ,∆, gin, win) as a pushdown system, S as a set of configurations,P ost^∗_∆(S) ={c| ∃c^′ ∈S :c^′ −→^∗∆c}as the set of configurations reachable from S,A as a regular pushdown store automaton and S =L(A), a theorem (see reference in Theorem 1 of [22]) states that there exists a regular pushdown store automaton A^′ that can be constructed from P and A in polynomial time w.r.t. the sizes of P and A, and for which P ost^∗_∆(L(A)) =L(A^′), i.e. given a regular set of configurations the regular set of reachable configurations can be computed in polynomial time.

An algorithm utilizing these ideas is given in [22]. We show it as Algorithm 1. The algorithm starts by constructing a trivial regular pushdown store automaton from

(15)

the given initial configuration on line 2. Then the algorithm proceeds by using two auxiliary structures. “Worklist” is a structure containing pairs of a concurrent system configuration and a value denoting the amount of context switches used to reach the configuration. The order of elements in Worklist is not important. On calling Worklist.take() an an arbitrary element is returned (and removed) from the structure and on calling Worklist.put with an element the element is added to the structure.

Concurrent system configurations are represented by a global state and a regular pushdown store automaton for each thread (the accepted language of the automaton gives the corresponding stacks). “Reach” is used to collect reachable concurrent system configurations.

The algorithm works by taking a configuration from the Worklist (line 7) and in case reaching it has not exceeded the allowed context switch bound P ost^∗_∆ is run for each thread (line 10), the global states that were reached are collected (line 11), and for each of these a new configuration to Worklist is added. In the new configuration the global state and the store automaton (the local state) of the corresponding thread are updated (lines 12-13). The reached configurations are also collected (line 14).

The global state is visible to all threads and required for synchronization between the threads (it is a way of handling shared memory). The synchronization causes the context switch counter to be increased by one for the configuration. At the end of the algorithm Reach contains all reachable configurations to the context switch bound k.

Algorithm 1 Context switch bounded analysis.

Input: (G,Γ,∆0, ...,∆N, g_in, w_in) (CPDS), k (bound) Output: Reach (reachable configurations)

1: // initialization

2: Ain ←(Q,Γ, δ,{gin}, F) so that L(Ain) ={hgin, wini}

3: Worklist ←(hg, Ain, ..., Aini,0) // N copies ofAin

4: Reach ← hg, Ain, ..., Aini

5: // analysis

6: while Worklist not empty do

7: (hg, A0, ..., ANi, i)← Worklist.take()

8: if i < k then

9: for all j = 0.. N do

10: A^′_j ←Post^∗_∆_j(Aj)

11: for all g^′ ∈ {g^′ | ∃ w:hg^′, wi ∈L(A^′_j)} do

12: x← hg^′, A0, ..., A^′_j, ..., A_Ni

13: Worklist.put((x, i+ 1))

14: Reach ← Reach ∪ {x}

15: end for

16: end for

17: end if

18: end while

19: return Reach

The actual situation is not as straightforward as our representation might imply. The set of reachable configurations as computed by P ost^∗ on line 10 can be infinite be-

(16)

cause stack length is not bounded. This is handled in [22] (by reformulating what the regular pushdown store automata represent) but is left out for simplicity here.

Lastly, we can utilize the same approach that was used in the abstract interpretation approach to handle the happens-before memory model. In particular, a procedure such as visible (Section 3.1) can be used to determine visible values on read operations (during P ost^∗∆) and guarantee happens-before consistency. The only additional requirement of visible is access to the traces of all threads, which - in our view - can be incorporated into regular pushdown store automata.

We now discuss some of the improvements that have been built on the groundwork of Qadeer and Rehof.

An iterative implementation is presented in [19] targeted to the systematic testing of programs. Context switches are separated into two classes: preemptive (happening on an arbitrary location) and non-preemptive (happening when the running thread blocks on a read operation) switches. The paper makes the case that by bounding analysis only on preemptive switches interesting program behaviors can be inspected with very small bound, even zero. However, in the context of Java, the specification of the language [10] allows context switches to occur by e.g. time-slicing on processors, leading into a lot of preemptive switches.

In [12] it is noted that the design of Qadeer and Rehof only works on the assumption that the set of global states is finite and that therefore the design only works on finite- state data abstraction. An extension usingweighted pushdown systems is developed to rectify this. The implementation also uses summarizations of individual threads, i.e.

symbolic representations of sets of configurations and reachable sets of configurations in the same manner as summaries of procedures relating inputs to outputs are used to scale sequential program analysis.

Another symbolic representation for the reachable sets and an implementation of the design of Qadeer and Rehof with a modified technique called lazy splitting is given in [26]. Of particular interest is that the implementation has been done on the jMoped tool, and as such is targeted for the Java language. However, the paper does not go into much detail about the issues with the analysis of Java and instead focuses on proving the usefulness of lazy splitting.

In [1] it is noted that in the design of Qadeer and Rehof the amount of threads in the system is fixed. To support the dynamic creation of threads the paper suggests to change the global context switch bound to separate context switch bounds for each thread.

Lastly, [11] presents a technique to reduce concurrent programs under a context switch bound to sequential ones. This could allow extension of proven sequential analysis methods to concurrent ones but very few experimental results are presented.

(17)

Chapter 4 Analysis

This chapter presents an analysis of the two approaches given in the last chapter. We look at performance of the approaches as reported by the literature, make comparisons between them and consider alternative approaches.

Abstract Interpretation Approach

The Ph.D. thesis [9] that gave the abstract interpretation based method that we presented also contains an implementation of the said approach. The implementation is a tool called Checkmate, a framework to which various memory models, analysis properties and domain abstractions can be plugged into. Comprehensive testing of the base method would require testing several possible configurations of these elements and is not fully attempted in the thesis. The results given in the thesis consider academic case problems containing less than 200 bytecode statements and benchmark problems of some thousands of statements. For the small programs the system was too fast to quantify and found most of the counterexamples to the desired properties, or in some cases verified that no counterexamples exist. For the larger programs the system was also able to find counterexamples, but some domain abstractions showed problems in scalability. The thesis leaves the application of the system to industrial problems as a future challenge. We view that it is rather easy to make an abstract interpretation implementation scalable (e.g. polynomial w.r.t. program size) and precise in the sense that it is sound, i.e. that it reports all problems (but is not necessarily able to find the precise counterexample). The problem is in making the implementation precise in the sense that it does not report too many false counterexamples (due to imprecision of analysis). The results of the thesis do not fully answer what the impact of ensuring this kind of precision would have on the performance of the implementation. We conclude that the approach provides a flexible framework for the analysis of multiple properties, but that the performance on larger (Java) programs is largely unknown.

(18)

Context Switch Bounding Approach

The context switch bounded model checking algorithm that was shown in Chapter 3 has a time complexity result that for a bound k and a concurrent pushdown system P = (G,Γ,∆0, ...,∆N, gin, win) the algorithm decides the k-bounded reachability problem in time O(k³(N|G|)^k|P|⁵) [22]. The factor (N|G|)^k majors the others for large enough k. This factor can be understood by a quick inspection of the algorithm: each time a configuration is taken from the Worklist new configurations can be placed back at most |G| times for each thread, giving N|G|. Then, for such configurations the context switch counter is increased by one, and this can be done at most k times, giving (N|G|)^k. This highlights two problems with the method: the set of global states G must be finite and k cannot be very big (we assume that it is natural to demand that the amount of threads N is finite). As a counterpoint to the theoretical result, benchmarks run on a few practical programs [19] show that in practice, small bounds are often enough to expose bugs, i.e. that it is enough for a few context switches to happen at the right places. On a similar note, the same study finds that on some smaller cases a small bound can also cover most states of the given program. Additionally, iterative context switch bounded search was found to be superior to e.g. depth-first-search and (execution path) depth-bounding search.

Apart from these benchmarks not that many experimental results are found for other implementations of the approach. We conclude that the approach could prove to be a practical way of doing static analysis of multithreaded (Java) programs when soundness is not required. Still, more evidence of performance is needed.

Comparison of Approaches

The immediate difference between the two presented approaches would seem to be that the abstract interpretation approach is sound whereas context switch bounded model checking is not. However, the difference is not that clear cut: [9] notes that the abstract interpretation approach in the form shown in this work is context switch bounded analysis when the amount of steps in multithreaded semantics calculation is fixed instead of running to a fixpoint. The difference to the model checking approach is then that the model checking approach guarantees the analysis to be precise (at least when not utilizing symbolic representations of data) and sound up to the bound [22] whereas the abstract interpretation approach utilizes approximative abstractions already in the singlethread semantics calculation (and even in the intra-thread step calculation).

Another comparison can be made based on how the approaches handle the undecidability of an analysis that would be both context-sensitive and synchronization- sensitive [23] (see also Section 2.3). The abstract interpretation approach approxi- mates both of these (precise context-sensitivity is lost in abstractions in singlethread semantics and precise synchronization-sensitivity in abstractions in multithreaded semantics calculation). The context switch bounded model checking approach does not directly lose information related to calling contexts of procedures (although undecidability results related to sequential analysis [14] may apply), and therefore we could view it as a fully context-sensitive method. Synchronization-sensitivity is clearly lost due to missing all synchronizations after a given bound.

(19)

Other Approaches

The methods presented in this work are by no means the only possibilities in handling multithreaded Java programs. As stated in the introduction, lots of methods have been presented but usually they are targeted to a single analysis property. We give two examples of alternative approaches that could be used to handle multiple properties. A survey in the field of analysis of multithreaded programs [24] discusses the possibility of “analyzing” unwanted multithreaded properties by disallowing them to be written to programs. This would be done by enhancing the type system of the programming language. Utilizing this with Java would have to wait until a new version of the language would be designed. A more rigorous approach would be to use mathematical theorem proving methods [8] to analyze programs. However, this approach would be heavy-handed, and with the lighter methods such as model checking already having scalability problems the scalability of theorem provers would need a lot of improvement.

(20)

Chapter 5 Conclusions

Summary

Static analysis of concurrent Java programs was discussed. The most relevant parts of the Java programming language - when considering concurrent behavior - are threads, the handling of shared memory by synchronization with monitors and the (slightly overapproximated) happens-before memory model that defines the values readable from shared memory locations. Two generic (w.r.t. the property to analyze the programs against) approaches to multithreaded static analysis were reviewed. In the abstract interpretation approach, a sound analysis is achieved by overapproximating reachable states by firstly considering individual threads with very bounded visibility of the actions of other threads and then iterating analysis while relaxing the bound, leading into a fixpoint of no bound. The approach gives a flexible framework but large-scale performance, in particular the precision of the analysis, is unknown. In the context switch bounded model checking approach the analysis is cut on execution paths exceeding a given amount of changes of the active thread. The analysis is then unsound. The running time is generally exponential to the amount of allowed context switches but in practice a small bound can be enough to find relevant violations of properties. Both approaches are suited to handle the happens-before memory model.

Future Work

We have presented ideas on the static analysis of multithreaded Java programs, but not given an implementation. This is somewhat reflecting of the situation in the literature: several methods have been presented with minimal experimental results accompanying them. The technology is in its infancy, and the next step would be to have implementations with a lot of empiria behind them. An implementation utilizing several of the ideas given here could be of particular interest. This multi- pronged approach would enhance the type system of Java to disallow some unwanted behaviors and base analysis on context switch bounding, but also use abstractions and widening to help scalability and make the analysis sound if so required. Lastly, it is not clear to us whether the analysis of concurrent programs in general should be done with generic approaches or with approaches tuned to a particular property.

(21)

Bibliography

[1] M. F. Atig, A. Bouajjani, and S. Qadeer. ”Context-Bounded Analysis for Con- current Programs with Dynamic Creation of Threads”. In Proceedings of the 15th International Conference on Tools and Algorithms for the Construction and Analysis of Systems (TACAS), York, UK, 2009, pages 107–123.

[2] A. Biere, A. Cimatti, E. M. Clarke, and Y. Zhu. ”Symbolic Model Checking without BDDs”. InProceedings of the 5th International Conference on Tools and Algorithms for Construction and Analysis of Systems (TACAS), Amsterdam, The Netherlands, March 22-28, 1999, pages 193–207.

[3] A. Bouajjani, J. Esparza, and T. Touili. ”A Generic Approach to the Static Analysis of Concurrent Programs with Procedures”. In Proceedings of the 30th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL) New Orleans, Louisiana, USA, 2003, pages 62–73.

[4] G. Brat, K. Havelund, S. Park, and W. Visser. ”Java PathFinder - Second Gen- eration of a Java Model Checker”. InProceedings of the Workshop on Advances in Verification, Chicago, Illinois, July 2000.

[5] E. M. Clarke, O. Grumberg, and D. A. Peled. Model Checking. The MIT Press, 1999.

[6] A. Cortesi. ”Widening Operators for Abstract Interpretation”. In Proceedings of the 6th IEEE International Conference on Software Engineering and Formal Methods (SEFM), Cape Town, South Africa, 10-14 November, 2008, pages 31–40.

[7] P. Cousot and R. Cousot. ”Abstract Interpretation: A Unified Lattice Model for Static Analysis of Programs by Construction or Approximation of Fixpoints”.

In Proceedings of the 4th ACM SIGACT-SIGPLAN Symposium on Principles of Programming Languages (POPL), Los Angeles, California, USA, 1977, pages 238–252.

[8] D. Detlefs, G. Nelson, and J.B. Saxe. ”Simplify: A Theorem Prover for Program Checking”. Journal of the ACM, 52(3):365–473, 2005.

[9] P. Ferrara. ”Static Analysis via Abstract Interpretation of Multithreaded Pro- grams”. PhD thesis, ´Ecole Polytechnique of Paris (France) and University ”Ca’

Foscari” of Venice (Italy), May 2009.

(22)

[10] J. Gosling, B. Joy, G. Steele, and G. Bracha. The Java Language Specification, Third Edition. Addison-Wesley Longman, 2005.

[11] A. Lal and T. Reps. ”Reducing Concurrent Analysis Under a Context Bound to Sequential Analysis”. Formal Methods in System Design, 35(1):73–97, 2009.

[12] A. Lal, T. Touili, N. Kidd, and T. Reps. ”Interprocedural Analysis of Concurrent Programs Under a Context Bound”. In Proceedings of the Theory and Practice of Software, 14th International Conference on Tools and Algorithms for the Con- struction and Analysis of Systems (TACAS / ETAPS), Budapest, Hungary, 2008, pages 282–298.

[13] L. Lamport. ”Time, clocks, and the Ordering of Events in a Distributed System”.

Communications of the ACM, 21(7):558–565, 1978.

[14] W. Landi. ”Undecidability of Static Analysis”. ACM Letters on Programming Languages and Systems (LOPLAS), 1(4):323–337, 1992.

[15] F. Logozzo. ”Cibai: An Abstract Interpretation-Based Static Analyzer for Mod- ular Analysis and Verification of Java Classes”. InProceedings of the 8th Interna- tional Conference on Verification, Model Checking, and Abstract Interpretation (VMCAI), Nice, France, January, 2007, pages 283–298.

[16] F. Logozzo and M. F¨ahndrich. ”On the Relative Completeness of Bytecode Analysis versus Source Code Analysis”. In Proceedings of the 17th International Conference on Compiler Construction (CC), Held as Part of the Joint European Conferences on Theory and Practice of Software (ETAPS), Budapest, Hungary, 2008, pages 197–212.

[17] J. Manson, W. Pugh, and S.V. Adve. ”The Java Memory Model”. InProceedings of the 32nd ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL), Long Beach, California, USA, January, 2005, pages 378–

391.

[18] A. Marowka. ”Parallel Computing on Any Desktop”. Communications of the ACM, 50(9):74–78, 2007.

[19] M. Musuvathi and S. Qadeer. ”Iterative Context Bounding for Systematic Testing of Multithreaded Programs”. ACM SIGPLAN Notices, 42(6):446–455, 2007.

[20] F. Nielson, H. R. Nielson, and C. Hankin. Principles of Program Analysis.

Springer-Verlag New York, Inc., 1999.

[21] W. Pugh. ”The Java Memory Model is Fatally Flawed”. Concurrency - Practice and Experience, 12(6):445–455, 2000.

[22] S. Qadeer and J. Rehof. ”Context-Bounded Model Checking of Concurrent Soft- ware”. In Proceedings of the 11th International Conference on Tools and Algo- rithms for the Construction and Analysis of Systems (TACAS), held as part of the Joint European Conferences on Theory and Practice of Software (ETAPS) Edinburgh, UK, April, 2005, pages 93–107.

(23)

[23] G. Ramalingam. ”Context-Sensitive Synchronization-Sensitive Analysis is Un- decidable”. ACM Transactions on Programming Languages and Systems (TOPLAS), 22(2):416–430, 2000.

[24] M. C. Rinard. ”Analysis of Multithreaded Programs”. In Proceedings of the 8th International Static Analysis Symposium (SAS), Paris, France, July, 2001, pages 1–19.

[25] R.C. Steinke and G.J. Nutt. ”A Unified Theory of Shared Memory Consistency”.

Journal of the ACM, 51(5):800–849, 2004.

[26] D. Suwimonteerabuth, J. Esparza, and S. Schwoon. ”Symbolic Context-Bounded Analysis of Multithreaded Java Programs”. In Proceedings of the 15th Interna- tional Workshop on Model Checking Software (SPIN ’08), Los Angeles, Califor- nia, USA, 2008, pages 270–287.

A Review of Two Approaches to the Static Analysis of Multithreaded Java Programs