1. Introduction

Internet traffic has been growing steadily for the past several years. For example, in the US alone, online banking traffic increased 47% from 2002 to 2005 (http://www.clickz.com/3481976, accessed 11 May 2010) and online shopping in the 2007 holiday season increased by 19% over 2006 (http://www.comscore.com/Press_Events/Press_Releases/2008/01/Holiday_E-Commerce_Increase, accessed 11 May 2010). This growth indicates a proportional increase in the demand experienced by web-based applications. Web-based applications used by customers usually provide an integrated compilation of information that is collected from multiple sources. The performance of these systems varies significantly based on their architecture, application logic, and capacity. Efficient and effective management of the system performance is necessary to meet and maintain satisfactory customer service, and to minimize the operating and capital cost. For a new system, it is important to analyse the performance prior to deployment (Praphamontripong et al, 2007) to make sure that the system performs to the expected level. For an existing system, it is important to analyse how the system performs under changing conditions. However, one cannot manage and/or control any system without having the capability to measure its performance and try various what-if scenarios (Puigjaner, 2003). Therefore, it is essential to have a tool that can be used to estimate system performance measures under varying load conditions and system configurations.

Several authors (Rolia and Sevcik, 1995; Ramesh and Perros, 2000a, 2000b; Reeser and Hariharan, 2002; Urgaonkar et al, 2005) have presented queueing network models for simple web-based applications. A queueing model of a very simple web server presented in Reeser and Hariharan (2002) incorporates dynamic server-side computing in a distributed environment. The model serves as a building block of a decision support tool for evaluating the behaviour of new systems prior to their deployment and for estimating the behaviour of existing systems under new workload scenarios. Ramesh and Perros (2000a, 2000b) present a queueing network model of web-based applications; however, in their models, servers can send and receive messages only to and from servers in an adjacent tier, which limits the routing possibilities. Gautam and Seshadri (2002) model the web server for an e-business application as a multi-stage, multi-class queueing network. They show that certain traffic patterns on the web cannot be easily managed by resorting to the use of parallel servers. However, these models do not adequately cover many of the complexities of web-based applications. These complexities include: resource locking, sequence- and state-dependent routing, failure and time-out of entities, parallel service operations, and resource availability management.

Discrete event simulation (DES) is an appropriate tool for investigating the performance and behaviour of web-based applications in distributed computing systems/networks under various input conditions (Pappu, 1997). Direct experimentation and study of these systems is typically cost prohibitive and often impractical. The complexity of web-based application systems precludes the possibility of adequate analytical models for most real-world scenarios. On the other hand, a simulation model can represent different parts of such systems realistically and can be used to understand their behaviour under a variety of conditions. Taylor and Robinson (2006), in a survey on the future of discrete-event simulation, conclude that there is a need for domain-specific simulation tools, especially in the area of services. This study addresses that gap.

Past work on simulating computer systems has typically focused on simulating at the computer hardware level (Keezer, 1997) or focused on modelling the system level software components (Reeser and Hariharan, 2002). Gralewitz et al (2004) model networking technologies using the COMNET III simulation tool. Altiok et al (2001) have developed a DES framework using ARENA/SIMAN that consists of a number of modules representing client and server nodes, network nodes and other critical components. The framework could be used for performance evaluation by predicting response times and identifying bottlenecks in the system.

This research addresses performance issues at a higher level, that is, the web-application level: interactions among the servers and the performance as seen by customers. We develop a generic modelling tool for representing complex web-based applications deployed in a distributed computing environment. Specifically, the research focuses on modelling the application server and its interaction with other components of a web-based application system. Furthermore, we are primarily interested in the flow of customer requests through the different elements that make up the physical components of the web application. The configuration of web-based applications, and the services offered through the web, change frequently. Therefore, the tool should be flexible enough to accommodate such changes in the system modelled. In addition, we seek to develop a tool that is generic and can easily represent various types of web-based applications. We present an example to demonstrate the implementation of the tool and analyses that demonstrate its applicability as a decision support tool for performance evaluation and planning.

2. Simulating web-based application systems

Figure 1 depicts the components of a typical web-based application system. End users send requests using their browser and these requests are transmitted to the appropriate application through the internet. The application itself resides in a computer, often called the application server. There are usually several application servers, each with multiple processing channels. The application software routes each user request to one of the application servers, which in turn sends information requests to other computers in the organization to collect the information needed to meet the user request. We refer to these other computers as External Servers. The application collects and processes the responses from the external servers and responds back to the user.

Figure 1 Web application systems.

The nature of the end user request dictates the initial sequence of the transactions from an application server to the external servers. These transactions can be processed sequentially, where the application server waits for a response from an external server before sending another transaction to an external server; or in parallel, if there are no precedence constraints, sending several transactions to external servers simultaneously and then aggregating the responses. This sequence of transactions may also be altered based on the response from an external server, for example if a transaction fails to get the requested information from the external server. Computing resources of the application servers are used for any data processing within the server as well as when waiting for responses from an external server. The response time to the end user (the key performance criterion) is the sum of the processing times inside the application server and the times it waits for responses from the external servers.

Many popular commercially available tools for simulation were designed with the intent of modelling the physical world. The capabilities of such tools are stretched while studying information flow in web-based applications, because such applications have special characteristics as described below.

1. Web-based applications have non-traditional demand patterns. Slow responses make users go away, resulting in lost demand as well as wasted resources. A quick failure is not always lost demand, as users often retry, for example by using the 'refresh' button.

2. The routing sequence of transactions is affected by outcomes in various processes. The simulation model must include the option for sequence- and state-dependent routing and must have the capability to add new routes when new types of transactions are added.

3. Application server resources are tied up waiting for a transaction to return from the external server. We refer to this as resource locking. This is further affected by limits on maximum waiting times programmed in the application server.

4. Resources needed to process transactions (e.g., memory, CPU and system services) are not instantly available for use. The operating system in the application server needs time to acquire these resources and make them available for use.

5. For a given user request, the application server can send multiple service calls in parallel to external servers, if there are no precedence constraints among the services. When the last of such transactions returns, the application server aggregates the responses and continues to the next service step.

An appropriate simulation tool should support hierarchical model construction and allow additional levels of detail as the model is developed. It should include a wide variety of built-in modelling constructs, allow complex constructs to be developed from the built-in ones, and provide a programming language to express the routing logic of web-based systems. It should support reading model parameters from external files, so that a generic model can simulate a wide variety of system configurations. Finally, it is helpful to have the capability to write simulation results to external files at different stages during a simulation. SimProcess from CACI (http://www.simprocess.com) was chosen for this research because it meets all these requirements.

Server capacity is described in terms of the CPU, memory, disk storage and the I/O channels. Operating system software and web systems middleware run on the server. This software can be configured to determine the server's capacity to handle various activities, sometimes described as the number of simultaneous threads/processes that the server can support. For our study, we define the capacity of an application server as the number of processes the application server can run simultaneously. Determining the capacity of a server in terms of such units requires a carefully controlled experiment.

3. Modelling tool and model constructs

We develop a DES tool to model the flow of information between various components of a web-based application system. The important model constructs described in this section are not available in typical simulation software, but are effective in modelling complex web-based application systems. Figure 2 shows a high-level view of the simulation model. The Arrival element generates the entities that represent user requests. Entities arrive at an application server (AppSrvr), capture a unit of the AppSrvr resource (AppRes) when it becomes available, and are then sent to the different external servers (ExtSrvr 1, ExtSrvr 2, …) during various steps of the service, based on a sequence determined by entity type. During each trip to an external server, another unit of AppRes is tied up, waiting for a response from the external server. This AppRes is released when a response is received. When all services are completed, the entity ends up as either a Success or a Failure, and finally the one unit of AppRes that was captured first is released.

Figure 2 High level overview of a generic simulation model.

3.1. Arrival element: generation of entities

There is one generate activity for each type of entity. For example, if an application server receives three different types of transactions, the Arrival element associated with that server has three generate activities. Different attributes, for example, entity type and arrival time, are assigned to the entities in the Assign Attributes element.

3.2. ExtSrvr element: external server and the TimeOut construct

When a transaction is sent to an external server, there is a pre-determined maximum time that the application server will wait for a response. This prevents all of the application server's resources from being locked up because of a problem and/or extreme congestion at an external server. If there is no response within this time, the transaction is considered timed out; the application server treats the transaction as failed and continues processing the entity along an alternate route dictated by this failure. However, the external server still continues to process the transaction discarded by the application server. The details of the TimeOut construct are described in the next paragraph and Figure 3.

Figure 3 TimeOut construct (inside ExtSrvr 1 element).

When an entity arrives from the AppSrvr, the TimeOutValue of the transaction is calculated as SimTime (current simulation time) plus the time limit for the external server. When the transaction gets a unit of ExtRes, delays of the transaction are calculated in the CalcDelays element. The ProcTimeDelay is sampled from the service time distribution of the external server for the transaction. Then the ActualDelay and RemainingDelay are calculated as:

ActualDelay = min(ProcTimeDelay, TimeOutValue - SimTime)

RemainingDelay = max(0, ProcTimeDelay - (TimeOutValue - SimTime))
Note that, when ProcTimeDelay is less than (TimeOutValue - SimTime), the transaction is successful and the RemainingDelay is zero. On the other hand, when ProcTimeDelay is greater than (TimeOutValue - SimTime), the transaction is timed out and fails, and the RemainingDelay takes a positive value. After the ActualDelay of the transaction, a copy is made to simulate the RemainingDelay of the transaction in the external server. The parent transaction releases the ExtRes unit and returns to the application server. The copy is given a higher priority than the parent transaction and all the other entities, so that it captures the unit of resource released by the parent transaction immediately and holds it until the service is completed in the external server.
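The CalcDelays logic can be summarized in a short sketch. The following is a minimal, illustrative Python version of the calculation described above; the variable names follow the text, but the function is a stand-in and not part of the SimProcess model itself.

```python
def calc_delays(sim_time, timeout_value, proc_time_delay):
    """Split the sampled external-server service time into the part the
    application server actually waits for (ActualDelay) and the part the
    external server keeps processing after a time-out (RemainingDelay)."""
    budget = timeout_value - sim_time                  # time left before time-out
    actual_delay = min(proc_time_delay, budget)
    remaining_delay = max(0.0, proc_time_delay - budget)
    timed_out = remaining_delay > 0.0                  # transaction fails at the AppSrvr
    return actual_delay, remaining_delay, timed_out
```

When timed_out is true, the parent transaction returns to the application server as failed after actual_delay, while the high-priority copy holds the ExtRes unit for remaining_delay.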

3.3. AppSrvr element: details inside the application server

An entity typically goes through several services. Each service comprises a number of processing steps. Routing information for the various types of entities is kept in an Excel file that is referred to during a simulation. An entity enters the AppSrvr element through the UpdateSvc element, where the first service (Start) is assigned to it. The entity is then sent to the Router element, which routes the entity to the Start element. The entity acquires a unit of AppRes inside the Start element and the processing of the entity starts. Transactions are sent to different external servers, based on the processing sequence, through the To ExtSrvr element. Upon completion of the required action in the external server, the transaction returns to the application server. The AppRes unit is released and the entity enters the Router element, where the next step of the service is determined. If a service step consists of parallel service operations, the entity is sent to the Parallel element (described in the next section). When all steps of a service are completed, the entity is sent to the UpdateSvc element where the service is updated. The next service is determined based on the result of the previous service and the service sequence. When all services are completed, the entity leaves the application server successfully through the END element. If an entity fails in any step of its processing, it leaves the application server through the EXIT element.

3.4. Parallel element: parallel construct for parallel service operations

The Parallel construct is designed to model parallel service operations. For a given customer request, if there are multiple services that do not have any precedence constraints, they can be performed in parallel. The parallel service operation reduces the total response time of entities in the system. Figure 4 presents the detail inside the Parallel element that simulates parallel service operations. When an entity requires parallel service operations, it is sent to the Parallel element, where it captures a unit of AppRes. The entity is then copied to reflect the number of parallel service operations to be performed, and each copy is assigned attributes based on the requirements of the respective service. The copies are sent to the Router element while the parent entity waits in front of the gate Hold Parents until all copy entities come back after completing their services. When a copy entity completes its service, it is sent from the Router to the Parallel element, the copy information (for example, success or failure) is collected, and the copy is destroyed. When all copy entities complete their services successfully, the parent entity is released from the gate Hold Parents and leaves the Parallel element as a successful entity. If any of the required copy entities fail, the parent entity is released from the gate Hold Parents and leaves the Parallel element as a failed entity. If a parent entity leaves the Parallel element as failed, the copy entities that still remain in the system are marked as late copies. All late copies are destroyed when they come back to the Parallel element after completing their service.
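The fan-out/fan-in book-keeping of the Parallel element can be outlined in a short sketch. This is an illustrative Python version under the assumptions noted in the comments, not the SimProcess implementation.

```python
from dataclasses import dataclass

@dataclass
class ParallelParent:
    """Book-keeping for one entity waiting at the Hold Parents gate."""
    required: int           # number of copies sent out
    returned: int = 0       # copies that have come back so far
    released: bool = False  # parent has already left the Parallel element

def copy_returns(parent: ParallelParent, success: bool) -> str:
    """Process one copy arriving back at the Parallel element."""
    if parent.released:
        return "destroy late copy"            # parent already left as failed
    parent.returned += 1
    if not success:
        parent.released = True                # any failure releases the parent at once
        return "release parent as failed"
    if parent.returned == parent.required:    # last copy back, all successful
        parent.released = True
        return "release parent as successful"
    return "destroy copy and keep waiting"
```

Once the parent has been released as failed, every copy that returns afterwards falls into the first branch and is destroyed as a late copy.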

Figure 4 Parallel construct (inside Parallel element).

3.5. Resource capturing and availability management

In a traditional service activity, when a resource is available, it can be captured by an entity instantly. In the case of computing resources, however, the system takes some time to acquire the requested resource before the entity can capture it. We refer to this as the Resource Acquisition Time (RAT). RAT is a system characteristic and depends on the current usage of the resource. If the current usage of the resource is high, the RAT is significant and may increase exponentially as the usage of the resource approaches its capacity. We use a separate model element to reflect such characteristics. The time to acquire a resource is calculated as a function of the current resource usage. This function may differ from system to system, and a person knowledgeable about the system should be able to approximate it. In this study, we use a linear function: we define a RATParameter and calculate RAT as RATParameter*U_t, where U_t is the percentage of the resource in use. The entity waits in a Wait (RAT) element and then captures the resource.
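As a concrete illustration, the linear RAT function can be written as below. The value RATParameter = 50.0 is the one used in the example of Section 4; whether U_t is expressed as a fraction or a percentage simply rescales RATParameter, so the sketch uses a fraction.

```python
def resource_acquisition_time(units_in_use: int, capacity: int,
                              rat_parameter: float = 50.0) -> float:
    """Linear RAT model: RAT = RATParameter * U_t, where U_t is the
    current utilization of the application server resource."""
    u_t = units_in_use / capacity      # utilization expressed as a fraction
    return rat_parameter * u_t
```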

Our discussions with experts in such application systems indicate that when an entity releases a resource unit, it can be captured by another entity instantly if requested within a short period of time. This time is usually small, and if the resource is not requested by any entity within it, the resource returns to its normal idle condition. To address this, we develop the idea of active and inactive idle resources. When a resource is not busy, it can be in one of two conditions: active or inactive. When a resource is released by an entity, it goes to the active resource pool and stays active for a specific amount of time (T_active). The resource becomes inactive if not requested by any entity within this time. Determining the value of T_active deserves detailed study, as it depends on the specific system characteristics. An entity can capture a resource instantly if an active resource is available, or it waits for a time equal to RAT before capturing an inactive resource. To reflect the active resource management, we define a release time variable, RelTime[n]. The index n goes from 1 to N_active, where N_active is the number of available active resources. The variable RelTime[n] represents the time at which the nth active resource will become inactive if not requested by any entity by that time.
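A minimal sketch of this active/inactive book-keeping is shown below, assuming RelTime[n] is held as a collection of expiry times; the paper does not prescribe a particular data structure, so the heap here is only one possible choice.

```python
import heapq

class ActiveResourcePool:
    """Tracks RelTime[n] for resource units that were recently released.
    A released unit stays 'active' for T_active time units; a request that
    arrives before its RelTime expires captures it with no acquisition delay."""

    def __init__(self, t_active: float):
        self.t_active = t_active
        self.rel_times = []                 # min-heap of RelTime values

    def release(self, now: float) -> None:
        heapq.heappush(self.rel_times, now + self.t_active)

    def try_capture_active(self, now: float) -> bool:
        # Discard units whose active window has already expired.
        while self.rel_times and self.rel_times[0] <= now:
            heapq.heappop(self.rel_times)
        if self.rel_times:
            heapq.heappop(self.rel_times)   # capture one active unit instantly
            return True
        return False                        # caller waits RAT for an inactive unit
```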

3.6. Router and UpdateSvc: routing

Developing good routing logic is critical since entity routing is dynamic, and entity and state dependent. The system updates all the necessary information about the different processing steps in a service within the Router element. Typical information includes the name of the external server that will perform the service, the processing time distribution and the success rate. The Router element sends entities to the appropriate external servers and receives them after processing. The processing step the entity will undergo next depends on the result of the previous step. This process is repeated until all steps of the current service are completed. When the service is completed, a Service Result is assigned to the entity depending on the success or failure of the entity. The service the entity will undergo next depends on this result. The entity is then sent to the UpdateSvc element, and the system updates the information for the next service. While updating the processing steps in the Router or updating the services in the UpdateSvc element, all information is read from two input Excel files. Details of the Excel files are explained in the next section.
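The success/failure routing amounts to a table lookup on the in-memory copy of a ProcData.xls sheet. The rows and step names below are hypothetical placeholders used only for illustration; the actual sheets for the example appear in Section 4.

```python
# Hypothetical in-memory form of one ProcData.xls sheet: each row holds a
# processing step, the external server that performs it, and the row numbers
# of the next step on success and on failure (columns 7 and 8 in the layout).
SERVICE_STEPS = {
    2: {"step": "StepA", "server": "ExtSrvr 1", "next_ok": 3, "next_fail": 0},
    3: {"step": "StepB", "server": "ExtSrvr 2", "next_ok": 4, "next_fail": 0},
    4: {"step": "StepC", "server": "ExtSrvr 1", "next_ok": 0, "next_fail": 0},
}

def next_step_row(current_row: int, succeeded: bool) -> int:
    """Return the row of the next processing step; 0 means the service ends here."""
    row = SERVICE_STEPS[current_row]
    return row["next_ok"] if succeeded else row["next_fail"]
```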

3.7. Defining and using the input and output files

The configurations and routings of entities in web-based application systems may change for many reasons, including changes in the technology, the configurations of hardware and/or software, service level agreements with external service providers, and the services provided to customers. Making changes in the simulation model of a complex system usually requires significant time and effort, and a person who is knowledgeable about the system being modelled and has expertise in the simulation software. Therefore, we wanted to develop a flexible tool such that the developed model can be adjusted easily and quickly for many changes in the system. To make the model flexible, input Excel files are used to feed the model with all input parameters required during a simulation. All input parameters, for example the arrival rate distribution, the sequence of services and processing steps, the processing time distributions, and the external servers that provide the services, are organized in the input Excel files. The necessary information is read at different stages of a simulation as required. An output Excel file is used to store necessary information during a simulation for easy offline analysis after the simulation is completed. This also keeps the modelling tool very modular, and new applications can be modelled very quickly.

The input and output files must be opened at the start of the simulation. In SimProcess, the input and output files are defined as stream variables. For example, to define an input file, SvcData.xls, and an output file, OutPut.xls, the following statements are used.

The necessary information is read from the input files. For example, to read the inter-arrival time distribution of Entity 1 from SvcData.xls, the following statement is used.

This statement reads the information from the Excel file SvcData.xls, Sheet Entity 1, row number 11 and the column number given by the variable Model.Load, and assigns the value to the string variable Model.DistEntity 1. This variable is then used in generating Entity 1 in the Arrival element. Output variables can be written to the open output Excel file in a similar fashion.
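The SimProcess statements themselves are not reproduced here. As a rough stand-in, the equivalent read can be expressed in Python with openpyxl, assuming the workbook has been saved in .xlsx form (openpyxl does not read the older .xls format) and that column 3 stands in for the value of Model.Load.

```python
from openpyxl import load_workbook

# Assumed .xlsx copy of SvcData.xls; sheet and cell references follow the text.
wb = load_workbook("SvcData.xlsx", data_only=True)
sheet = wb["Entity 1"]

load_column = 3                                  # stand-in for Model.Load
dist_entity1 = sheet.cell(row=11, column=load_column).value
print(dist_entity1)                              # the inter-arrival distribution string
```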

The input and output Excel files make the development and maintenance of the model, and the output analysis, easy, and hence make the model flexible. To modify the developed model to reflect changes in the system being modelled, we simply need to change the information in the input Excel files. Therefore, with well-prepared documentation, the developed model can be used and maintained easily by a person with little knowledge of simulation and/or simulation software.

4. An example

In this section we present a simple web-based application modelled using our tool. We consider a web-based application system, for example for a credit card account service, with two layers of servers: application servers and external servers (DSS, CM, GWay, MTC and RP). The system receives one type of customer request (ASUM), customers requesting their account summary. The external servers also receive service requests from sources other than the application server.

4.1. Problem description for the example

Table 1 presents the required services, the processing steps for the services, and their sequence for ASUM. Once the process starts within the application server, it goes to authorization at DSS. After authorization of a customer, the application server continues to the CardSvc service, where the information on each of the customer's cards is collected from the card database. In this example, we assume that the number of cards a customer may have is uniformly distributed from 1 to 6. After the card information is collected, the application continues to ASumSvc, where the account summary for each card is collected. Finally, there is the Customization service, where personalized information, for example special promotions, is provided. When all the information is collected, the application server responds back to the customer. Entities may fail at any step and, in that case, the entity exits the system rather than continuing to the next step. At ASumSvc, the application server gets summary statements for each card from GWay. This can be done either sequentially or in parallel. To implement the parallel service, we point the entities to a nested service called GetASum. This service (described in Table 2) calls another service, GWaySvc, that routes the entities to the external servers MTC and GWay to obtain the necessary summaries. Details of GWaySvc are described in Table 3.

Table 1 Services and the processing steps of the services for ASUM
Table 2 Detail of the nested service GetASum
Table 3 Detail of the GWaySvc

We assume that the capacity of an application server is 20. We use a RATParameter of 50.0 and a T_active of zero for our initial analysis and then vary them to see their effects on the system performance. The parameters for the external servers are as shown in Table 4. The capacities and loads are chosen such that the external servers are approximately 60% loaded by the external entities. This reflects the fact that the load from an application system is usually a small portion of the total load on the external servers, as they receive entities from several application systems. Our research was motivated by an application in the technology division of a Fortune 100 financial services company. The processing times for the different processes are on a similar scale to those of the real system. The processing time distributions were mostly exponential for the real-world application.

Table 4 Configurations of external servers and external loads

4.2. Designing the input and output files for the example

Two input Excel files are defined: SvcData.xls and ProcData.xls. The file SvcData.xls includes the information on the inter-arrival time distributions of the entities, the service requirements, and the servers. All information for the ASUM entity is stored in the Sheet named ASUM, as shown in Figure 5(a). A separate sheet would be defined for each entity type, for example Entity 1, Entity 2, etc., if there were more entity types. Row 15 of the Sheet ASUM contains the inter-arrival time distribution (defined as a string variable) for ASUM; the distributions at different load conditions are stored in different columns. Rows 2 through 7 contain information on the services in the order they will be performed. Column 1 contains the service name (defined as a string variable). Columns 7 and 8 contain the row numbers (defined as integer variables) holding the information on the next service to be performed if the entity succeeds or fails, respectively, at the current service. The Sheet ServerInfo contains the server information as shown in Figure 5(b). The server information is read at the start of a simulation run.

Figure 5 Input Excel file SvcData.xls. (a) Service requirements for ASUM; (b) Server information.

The file ProcData.xls includes the detailed information about the processing steps of the different services. A separate Sheet is defined for each service. Figure 6(a) shows the processing steps in ASumSvc, in the order they need to be performed. Columns 1, 2, 3 and 4 contain, respectively, the processing step (defined as a string variable), the processing time distribution (defined as a string variable), the success rate (defined as a real variable) and the name of the external server (defined as a string variable) that provides the service at the current step. Columns 7 and 8 contain the row numbers (defined as integer variables) holding the information on the next processing step if the entity succeeds or fails, respectively, at the current step.

Figure 6 Input Excel file ProcData.xls. (a) Processing steps for ASumSvc; (b) Processing steps for GetASum; (c) Processing steps for GWaySvc.

There is a nested service in step 2 of ASumSvc named GetASum; its processing steps are shown in Figure 6(b). In step 2, there are parallel service operations, indicated by a dummy processing step named Parallel; column 5 contains the number (defined as a string variable) of service operations to be performed in parallel. The entity is sent to the Parallel element, where one copy for each service operation is made, and the information on the services for the copies is read starting from row 8 of the same sheet. In our example, the same service, GWaySvc, is performed for all copies. In other problems, there may be different services that need to be performed in parallel. However, these services can also be performed sequentially, one card after another; in that case, GWaySvc is repeated for each card without making any copies. Figure 6(c) shows the processing steps for GWaySvc. We use an entity attribute as the depth parameter for services and processes, so that when a nested service or parallel service operation is completed, the system can return to the processing step where the service originated.
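One simple way to picture the depth attribute is as a return stack: entering a nested or parallel service pushes the originating service and step, and finishing it pops them so that processing resumes where it left off. The model itself stores this information as entity attributes; the stack below is only an illustration of the idea.

```python
# Illustrative return stack for nested and parallel services.
return_stack = []   # one (service, step_row) entry per nesting level

def enter_nested_service(current_service: str, current_step_row: int) -> None:
    return_stack.append((current_service, current_step_row))

def leave_nested_service() -> tuple:
    # Resume at the service and step where the nested service originated.
    return return_stack.pop()
```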

An output file named OutPut.xls is defined to collect different data during a simulation run. At the end of a simulation run, or at any step of an entity's processing during the run, the necessary data are written to this file.

4.3. Maintaining the model

The modelling tool is designed so that the developed model can be modified to implement many changes in the system by simply changing the corresponding information in the input files. For example, if the inter-arrival time distribution for entity ASUM changes, we simply change the distribution in row 21 of the corresponding Sheet in SvcData.xls. If the sequence of services changes for ASUM, we change their sequence accordingly in SvcData.xls. If a new service is to be added for ASUM, we insert the new service information in an appropriate row according to the service sequence. Similarly, if the sequence of steps changes or a new step is to be added for a service, we change the corresponding information in ProcData.xls. To implement multiple entity types, a separate sheet for each entity type is defined in SvcData.xls with all the information for that entity type. Information collected in the output Excel file can be analysed independently of the simulation. Therefore, the developed model is flexible and easy to maintain.

5. Results and analysis

Some important features of web application systems are parallel service operations, the RAT, the resource active time, the server selection rule, and the nonlinear relationship between load and response time. In this section, we present some analyses to demonstrate how the developed model can be used to explore the characteristics of the system and hence serve as a tool for performance management. For all of the experiments, the estimates of the response variables are from one long replication. To ensure reliable results, we ran the simulations long enough so that, once the simulation reaches steady state, we obtain approximately one hundred thousand observations for estimating the average of any output variable.

5.1. Effect of parallel, sequential and mixed service operations

Parallel service operations play an important role in the performance of web-based applications. The objective of using parallel service operations is to reduce the response time of entities. We present an experiment with our example system to compare the performance of the system with parallel and sequential service operations for cases with 8 and 10 application servers. The results of our experimentation are presented in Figure 7. Figure 7 also presents the response time for the same cases for mixed service operations (Mixed), which are described later in this section. Figure 7(a) and (b) present the average response times and Figure 7(c) and (d) present the standard deviation of the response time.

Figure 7 Comparison of Parallel (P), Sequential (S) and Mixed service operations. (a) Mean response time for the system with 8 application servers; (b) Mean response time for the system with 10 application servers; (c) Standard deviation of response time for the system with 8 application servers; (d) Standard deviation of response time for the system with 10 application servers.

Figure 7 shows that, at high levels of load, parallel service operations actually result in longer average response times than sequential service operations. It is also observed that, with parallel service operations, the system becomes unstable at a lower load level than with sequential service operations. During parallel service operations the application consumes additional resources, which increases the load on the application server. The application server also sends the service requests to the external servers in bulk, and sometimes to the same external server if all services are the same. This increases the variance of the turnaround times of entities from the external servers, which further increases the load on the application server because of resource locking. Therefore, it is advantageous to use parallel service operations at low to medium load and sequential service operations at high load. We use this as the strategy in the mixed service operations (Mixed) model. For our example system, we found that the response time increases at a higher rate with parallel service operations than with sequential service operations when the application server utilization exceeds approximately 70%. Therefore, for mixed service operations we use 70% server utilization as the switching point from parallel to sequential service operations. Figure 7 indicates that the mixed service operations combine the advantages of both parallel and sequential service operations. The standard deviation of the response time follows a similar trend to the average. Similar results were observed with 8 and 10 application servers. Hence, for the analyses in the next sections, we use mixed service operations with 10 application servers only.
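The mixed strategy reduces to a single switching rule, sketched below. The 70% threshold is the value found for this example system and would need to be re-estimated for other systems.

```python
def use_parallel_operations(app_server_utilization: float,
                            switch_point: float = 0.70) -> bool:
    """Mixed service operations: fan out service calls in parallel while the
    application servers are lightly loaded, and switch to sequential calls
    once utilization crosses the switching point."""
    return app_server_utilization < switch_point
```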

5.2. Effect of server selection rules

Web-based applications usually have multiple application servers and use a server selection rule to route arriving entities to a particular application server. In our example problem, we used the Least Busy Server (LBS) rule, that is, an arriving entity is assigned to the server that is least busy at the moment of arrival. To study the effect of different server selection rules on system performance, we tried two more rules: Random, where the entities are assigned randomly to a server, and Cyclic, where entities are assigned to servers in a cyclic order. The average and standard deviation of the response times are presented in Figure 8(a) and (b).
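The three rules are simple to state. The sketch below shows one way to express them, with each server represented as a record holding its current number of busy units; this representation is an assumption made for the illustration only.

```python
import random
from itertools import cycle

def least_busy_server(servers):
    """LBS: route the arriving entity to the server with the fewest busy units."""
    return min(servers, key=lambda s: s["busy"])

def random_server(servers):
    """Random: pick any server with equal probability."""
    return random.choice(servers)

def make_cyclic_selector(servers):
    """Cyclic: hand out servers in fixed round-robin order."""
    order = cycle(servers)
    return lambda: next(order)
```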

Figure 8 Comparison of server selection rules. (a) Mean response time; (b) Standard deviation of response time.

Figure 8 indicates that the system performance was best, in terms of both the average and the standard deviation, for the LBS rule. The Cyclic server selection rule comes in a close second, but the system performance with the Random server selection rule was significantly worse. For the analysis in the next section, we use the LBS server selection rule.

5.3. Effect of resource active time

In this section, we present an analysis of the effect of the resource active time (T_active) on the average response time. We analyse our system with three different values of T_active and the results are presented in Figure 9(a). To see how the variance is affected, we present the standard deviation of the response time in Figure 9(b). Figure 9 indicates that the response time decreases as T_active increases; the standard deviation of the response time follows a similar trend to the average. This is because, with an increase in T_active, the probability that an entity gets an active resource increases, and hence the entity does not have to wait for the RAT. However, the marginal benefit of increasing T_active diminishes as T_active grows. Maintaining a larger value of T_active involves additional server maintenance cost. Therefore, there is a trade-off between the cost of maintaining a large T_active and the benefits gained.

Figure 9 Effect of T_active on system response time. (a) Mean response time; (b) Standard deviation of response time.

6. Summary and conclusions

In this paper we describe the role DES plays in modelling and analysing performance issues in web-based applications that are dominated by high transaction volumes, unpredictable demand, and varying definitions of success or failure as the transactions flow through different resources. We present a DES modelling tool that facilitates modelling a complex web-based application system for performance evaluation. The tool is flexible and modular: it can be used to model a wide variety of systems at virtually any level of detail, and the resulting model can be easily adjusted to accommodate changes over time. The use of Excel files for data input makes the model easy to maintain by a person with limited simulation expertise. We demonstrated the implementation of our modelling tool with an example and presented some preliminary analyses and insights. These demonstrate the applicability of our modelling tool as a decision support tool for performance management of web-based application systems.

It is important that correct data be collected for the service times of entities at the different steps. It is also essential to define the application resource unit properly and to measure the capacity of a server accurately before the model can be used effectively to analyse such systems. A carefully designed experiment with a simulated demand pattern in an existing system, or in a completely simulated environment, may help in defining an appropriate unit and measuring the capacities of the servers. The performance of the Internet substantially affects the overall performance of any such system. Future directions for this research include understanding the effect of network performance on the response time of web-based applications.