Method of this study - A System to Support the Analysis of Antivirus Products' Virus Detection

In this work we present a system capable of automating several tasks used in analysing the virus detection of computer antivirus products.

However, to provide a theoretical background, we first present definitions and restrictions concerning computer viruses and antivirus product evaluation.

In constructive research, when we are building an artefact, the important question is: Can the artefact be built (see March and Smith 1995, p.258 and Järvinen 1999, p.59)? Therefore, in this dissertation we present the development process of the system as well as other computer-supported virus detection analysis processes used in the Virus Research Unit’s antivirus scanner analyses (see Helenius 1994a, 1995b, 1996b, 1997 and 1999a). Thus we prove by demonstration that computer-supported processes can be built. The subsequent question is: How good is the system? To answer it, assessments of the computer-supported processes are conducted. Although other characteristics also exist, we concentrate on efficiency, because we see it as the most critical characteristic.

2.1 Theoretical framework for antivirus product virus detection analysis

A theoretical framework is needed to understand computer antivirus product virus detection analysis. The framework is mainly constructed from experiential knowledge gathered during my research work at the Virus Research Unit. Some theories have been developed from previous papers and studies. For example, I discussed some problems and solutions associated with antivirus product evaluation in a conference paper at the EICAR conference 1996 (Helenius 1996). From the theoretical framework for computer antivirus detection analysis we can move to the construction of computer-supported processes.

2.2 Construction of computer-supported processes

First, we should note that some development phases of the processes were presented in conference papers (Helenius 1995a, 1998a and 1998b). However, I chose to rewrite the development phases here as a more detailed and consistent description.

The computer-supported processes were constructed by starting from a simple implementation and gradually extending to more complex implementations.

When one stage proved viable, more features and new processes were constructed. From the automatic virus replication processes, a system named the Automatic and Controlled Virus Code Execution System was constructed.

The system was first presented at the EICAR conference in 1995 (Helenius 1995a). Initially the system was used for automatic virus replication, but it also enabled the construction of other processes.

From the beginning the following general principles were established for the system because of safety and flexibility requirements:

1) The system must be isolated in such a way that a possibly escaped virus cannot cause harm to external computer systems.

2) The system must be designed, as far as possible, in such a way that malicious code executed in a controlled environment cannot harm the system.

3) The system should be designed to be flexible in order to accommodate future development.

4) The system should be designed to work as continuously as possible.

To meet the first condition, the system was isolated from external connections, and the integrity of the system's executable files was checked. To meet the second condition, I carefully prepared for possible vulnerabilities of the system. To fulfil the third requirement, the software components of the system were designed to be as flexible as possible. To meet the fourth condition, the problems that would have prevented the system from working continuously were solved.
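The integrity check mentioned for the first condition can be illustrated with a minimal sketch. It assumes a baseline of file digests recorded while the system is known to be clean and compares the current files against it. The choice of SHA-256 and the function names `file_digest`, `build_baseline` and `find_modified` are assumptions made for illustration; the text does not state which checksum method or tooling the original system used.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def build_baseline(paths):
    """Record a digest for each executable while the system is known to be clean."""
    return {str(p): file_digest(Path(p)) for p in paths}


def find_modified(baseline):
    """Return files that are missing or whose digest differs from the baseline."""
    modified = []
    for name, expected in baseline.items():
        p = Path(name)
        if not p.is_file() or file_digest(p) != expected:
            modified.append(name)
    return sorted(modified)
```

A non-empty result from `find_modified` would indicate that an executable of the system itself has changed and that the run should stop before further samples are processed.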

The development of the Automatic and Controlled Virus Code Execution System started from an initial idea for automatic file virus replication. The implementation idea is presented in Figure 1. At the same time this was the target state. The target state was achieved, but other tasks were also perceived to be possible with the help of the system, and this generated new ideas and therefore new target states. Furthermore, as the number of viruses infecting new types of objects grew, the need for automatic processes for these viruses also arose. When macro viruses appeared, automatic macro virus replication and automatic processes for evaluating macro virus detection were constructed.

The system’s capability to replicate macro viruses in a controlled environment was presented at the EICAR conference in 1998 (Helenius 1998a). Similarly, when self-e-mailing viruses constituted a threat, the need for

The system development resembled Floyd’s project model STEPS (Software Technology for Evolutionary Participative System Development; Floyd et al. 1989, p.57). Usage of the system sometimes generated unpredictable situations, and the resulting feedback was used to improve the system. While the system was used for antivirus product analysis, the improvements were implemented step by step until the system was stable. Sometimes programming errors caused problems that had to be fixed.

Figure 1: Implementation idea of the automatic and controlled virus code system. (Diagram not reproduced; its labels are: Network server, Monitoring PC, Victim PC, M:\SOURCE\...\VIRUS.EXE, M:\TARGET\...\GOAT1.EXE, Reset.)

A typical programming error was a timing error. For example, a virus active in a computer could impair the computer’s performance, and this had to be taken into account during system development: the infected computer needed enough time to execute all required operations. An especially serious example of a programming error was a situation in which a computer logged into the network with write rights to the wrong network directories after the virus code had been executed.
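One common way to avoid timing errors of this kind is to wait for a completion condition with a generous deadline instead of relying on a fixed delay. The following is a minimal sketch of that idea and not a description of how the original system handled timing. The function name `wait_until`, its parameters, and the injectable `clock` and `sleep` arguments are assumptions introduced for illustration.

```python
import time


def wait_until(done, timeout_s, poll_s=0.5, clock=time.monotonic, sleep=time.sleep):
    """Poll done() until it returns True or timeout_s seconds have elapsed.

    Returns True if the operation completed in time, False on timeout.
    clock and sleep are injectable so that the logic can be tested
    without real delays.
    """
    deadline = clock() + timeout_s
    while True:
        if done():
            return True
        if clock() >= deadline:
            return False
        sleep(poll_s)
```

With this structure a slowed-down victim computer only lengthens the wait, and a timeout can be treated as a separate failure case rather than being silently mistaken for a completed operation.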

Sometimes there were unforeseen situations, or a required part of the implementation was missing. For example, certain malicious software could change the content of the CMOS memory (see Appendix 1). This was fixed by automatically restoring the original content of the CMOS memory. Another example is a malicious program that destroyed the contents of the hard disk in such a way that the disk had to be low-level formatted. This was fixed by automatically low-level formatting the hard disk if normal recovery did not restore the original system.
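The escalation described above, in which normal recovery is tried first and a more drastic method is used only if it fails, can be sketched as follows. This is a simulation-level illustration under stated assumptions: `recover_victim`, the step names and the `is_clean` check are hypothetical, and the actual recovery operations of the system are not specified in the text.

```python
def recover_victim(steps, is_clean):
    """Try recovery steps in order, from the cheapest to the most drastic.

    steps    -- list of (name, action) pairs; action() performs one recovery method
    is_clean -- callable returning True when the victim matches its original state
    Returns the name of the step after which the victim was clean,
    or None if every step failed.
    """
    for name, action in steps:
        action()
        if is_clean():
            return name
    return None
```

A caller would list the steps in order of cost, for example normal recovery followed by a low-level format, and treat a `None` result as a condition that needs human attention.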

2.3 Assessment of the system

Since we do not have accurate technical details of competing systems or processes, we decided to briefly compare the efficiency of the developed processes with that of the manual processes required for the same tasks. Therefore I conducted controlled laboratory experiments measuring the efficiency of manual processes.

We define the difference between manual and automated processes as follows: manual processes do not have customised hardware implementations to automate all required operations, whereas automated processes do not need human assistance once initiated. The argument for assessing the manual processes myself is the expertise I have gained in the research area and the capability to construct test conditions as correctly as possible.

The manual processes were executed in such a way that no customised hardware parts were utilised. However, the manual replication included semi-automatic tasks that did not require hardware customisation and that were likely to be used in manual replication environments. These included using batch files for executing goat files (see Appendix 1), automatic recovery of the fixed disk, checksum calculation, obtaining the sample file from the network server and saving changed objects to the network server. The intention was to estimate the maximum human processing effort, not to measure the weariness that monotonous work can cause. Therefore the processes were short enough to exclude weariness.

The same computers that were used with manual processes were used with automatic processes. The argument for using the same computers was to eliminate the effects that different computers would have had on replication time. The sample files were obtained from a virus collection containing possible viruses. In other words, the sample files of the virus collection had not been proven to contain viruses. To summarise, in the assessment section of this thesis, the controlled laboratory experiment setting was imitated as closely as possible.

We will present the results of experimental manual virus replication processes, the results from automatic virus replication processes and then compare the results. The replication speed of manual processes was recorded by using a program that recorded the process starting time and the sample file name for each replication process. The replication speed of automatic processes could be gathered from log files created during usage of the system.
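How replication speed can be derived from such records is illustrated by the sketch below. It assumes a hypothetical log format with one line per process, consisting of a start time and a sample file name separated by a tab, and it takes the duration of each process to be the gap until the next process starts. Neither the format nor the function names come from the original recording program or the system's log files.

```python
from datetime import datetime


def parse_log(lines):
    """Parse 'YYYY-MM-DD HH:MM:SS<TAB>sample' lines into (start_time, sample) pairs."""
    records = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        stamp, sample = line.split("\t", 1)
        records.append((datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S"), sample))
    return records


def seconds_per_sample(records):
    """Duration of each process, taken as the gap to the next process's start.

    The last record has no successor, so it yields no duration.
    """
    return [
        (sample, (next_start - start).total_seconds())
        for (start, sample), (next_start, _) in zip(records, records[1:])
    ]


def mean_seconds(durations):
    """Average seconds per sample, or None when there is nothing to average."""
    if not durations:
        return None
    return sum(seconds for _, seconds in durations) / len(durations)
```

Applying the same calculation to the manual and the automatic records gives directly comparable figures, provided that both logs mark the start of a process in the same way.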

