C OMPLEMENTARY CROSSOVER - Genetic Algorithms in Software Architecture Synthesis

4.2.1 Method

As asexual reproduction did not produce the desired outcome, and the random crossover is not very close to a real-life design situation, the idea of enhancing the crossover operator so that it would act in a more purposeful way seemed appealing. Thus, the complementary crossover and the complementary gene-selective crossover were implemented. Biologically, complementary crossover corresponds to the theory of genetic

compatibility [Zeh and Zeh, 1996], i.e., that some individuals are genetically more compatible with each other than others. Genetically compatible individuals are more likely to produce viable offspring, and partners are sought so that the individuals have complementing properties. Thus, their offspring would inherit as many desirable properties as possible. In software design, the complementary crossover corresponds to a situation where two architects exchange ideas on their designs, and attempt to combine the best parts of two competing solutions.

In simple complementary crossover the objective is to purposefully combine two parents which satisfy different quality requirements.

Modifiability and efficiency were chosen as the pair of competing quality attributes. In the beginning of each generation, the population was ranked based on both modifiability and efficiency. If an individual’s modifiability rank was higher than its efficiency rank, it was considered that the individual satisfied modifiability related requirements better (and vice versa in terms of efficiency). The modifiable individuals were labeled as mothers and the efficient individuals were labeled as fathers.

After the population was divided into mothers and fathers, the mutation operation and parent-selection for crossover were performed normally.

However, when crossover was actually applied, the individuals in the parent pool were inspected more thoroughly. The fathers were placed in a subpool of their own, and the mothers similarly to their own subpool. The crossover was then administered by choosing a pair of parents so that one came from the father pool and one from the mother pool. If one pool had more individuals than the other, these remaining individuals were not used for crossover. The actual crossover point, however, was still chosen randomly. The ideology was that these two individuals with different strengths would complement each other and produce offspring which fulfills both quality attributes.

As the crossover point in this simple complementary crossover was chosen randomly, there was the risk that the best parts, i.e., the parts of individuals that have the biggest impact in satisfying the quality requirements, might be lost. Thus, a more intricate version, the gene-selective complementary crossover, was implemented.

When using the gene-selective complementary crossover, the ranking and pooling of parents was done similarly to simple complementary crossover.

However, when the actual crossover was performed, the crossover point was no longer selected randomly. In order to find the best crossover point, the most modifiable section of the mother and the most efficient section of the father were sought. These sections were identified by using the maximum contiguous sub-sequence sum [Weiss, 1998]. Each supergene was calculated a value by how much it influenced the modifiability or

83 efficiency value. In other words, the operation defined by that particular supergene was inspected in terms of its impact on the different sub-fitnesses. The best sequence of supergenes was selected as the modifiable or efficient section.

The crossover point was then selected randomly from the indexes that were in between the found sections. Only one individual was produced, and that individual was thus combined so that it contained the best sections of both its mother and its father. The gene-selective complementary crossover is depicted in Figure 29. Here index l is the starting index of the most modifiable section of the mother, while k is the end index. For the father, m is the start index of the most efficient section, and p is the end index. The crossover point cp should now land between k and m in order to retain both sections for the child.

Figure 29. Gene-selective complementary crossover

4.2.2 Case studies

The fitness curves for both cases are portrayed in logarithmic scale. The fitness function used in this study (as well as for asexual reproduction) was able to give exponential reward for the use of message dispatcher, if it was used very heavily. The message dispatcher was rewarded in this way as it is not fully beneficial until it is used throughout the whole architecture, and not only between a few operations. Once the usage is intensive, it makes the architecture much more modifiable on a general level than individual design patterns. As the complementary crossover made such usage possible, the fitness values also developed exponentially, and logarithmic curves were required to evaluate the development of fitness values. Fitness values were calculated as averages of 20 runs.

l k

m p

k cp

l m p

mother

father

child

Figure 30 illustrates the fitness curves for ehome. The curve for the standard method remains quite unchanged at just above 1000. Some development did happen, but not so much that it would show on logarithmic scale. The curves for complementary crossover actually descend for the first 100 and so generations, after which they start ascending, and reach quite high values.

Figure 31 shows the respective fitness curves for robo. Here the standard curve behaves similarly as in the case of ehome, while the difference between standard and complementary crossover curves is drastic. Because the difference between the two cases was so big, tests with 750 generations were made. These tests (discussed further in publication [VII]) showed, that with this longer evolution, ehome was also able to reach values as high as robo already did at 250, and the shape of the curve resembled much more those of robo as well. The fitness curves for longer evolution with ehome are given in Figure 32. The fitness curves for robo with longer evolution (750 generations) did not differ significantly from those achieved with shorter evolution (250 generations).

Figure 30. Fitness curves for ehome, complementary crossover

Figure 31. Fitness curves for robo, complementary crossover

Figure 32. Fitness curves for ehome, complementary crossover, 750 generations

The architecture proposals achieved with complementary crossover illustrated quite well why the fitness curves behaved the way they did.

The main difference between solutions achieved with complementary crossovers and the standard crossover was the presence and level of usage of the message dispatcher. When the standard random crossover was used, no solution contained the message dispatcher for either case. However, when complementary crossover was used, 18 of the 20 solutions for simple complementary crossover in ehome already contained the dispatcher, and in solutions with gene-selective crossover for ehome, and both complementary crossovers for robo, the message dispatcher was present in all the solutions.

The example solutions for ehome and robo, given in Figures 33 and 34, respectively, portray typical solutions with gene-selective complementary crossover. The example solution for ehome has been achieved after 750 generations, while the example for robo is the result of a 250 generations long evolution. It seems ehome is slightly more difficult to deal with from the algorithm point of view, and it takes a longer time to effectively use the message dispatcher. This explains the descend of fitness curves in the beginning: when the message dispatcher is present in the architecture but very poorly used, the penalty for it is much larger than the reward, and thus the fitness curve descends.

Both solutions have very centralized use of the message dispatcher, and also several instances of the different design patterns. The longer evolution for ehome seems to have affected also the number of patterns, as all low-level design patterns have been brought to the system, while for robo there are no Adapters, and the amounts of Template Methods and Strategies are significantly lower than for ehome.

Figure 33. Example solution for ehome, complementary crossover

Figure 34. Example solution for robo, complementary crossover

To summarize, the simple complementary crossover combines parents which fulfill different quality requirements. The gene-selective complementary crossover similarly seeks complementary parents, but also inspects the parents and only produces one offspring with the best parts of the two parents. The complementary crossover is able to produce solutions with delayed reward, and the difference to simple random crossover is significant (to the better). This variation is discussed in more detail in publication [VII].

In document Genetic Algorithms in Software Architecture Synthesis (sivua 81-87)