Performance Fundamentals - 27th Edition (Spring 2010)


Computer platforms must be configured properly to support system performance requirements. There are many factors that contribute to user performance and productivity. Enterprise GIS solutions include distributed processing environments where user performance can be the product of contributions from several hardware platform environments. Many of these platform resources are shared with other users. Understanding distributed processing technology provides a fundamental framework for deploying a successful enterprise GIS.

The importance of working together to understand the technology is illustrated in Figure 5-1. There is a famous poem by John Godfrey Saxe that tells the story of six men who went to see an elephant, though all of them were blind. As they examine the elephant, each forms his own understanding of what the elephant looks like.



The first blind man approached the side of the elephant, which felt like a wall; the second felt the tusk, which was very like a spear; the third held the trunk, which felt like a snake; the fourth felt about the knee, which was like a tree; the fifth touched the ear, which felt like a fan; and the sixth grabbed the tail, which felt like a rope.

After examining the elephant, the six blind men met to discuss their findings. Each expressed his opinion about what an elephant looks like based on his experience. They were all partly right, and at the same time they were all wrong. Each had experienced only a part of what the elephant really looked like. The whole elephant is a combination of the parts each blind man visualized from his own discovery.

We understand technology from an abstract view, much like the blind men saw the elephant. Technology is changing very rapidly, and we all see and try to understand what these changes mean. At the same time, we too have a limited view based on our own experience. What we see is only a part of the whole picture. For this reason, it is important to listen to the experience of others and combine their experience with our own as we move technology forward. I have found that we all have information to contribute, and the questions we have can help others understand the technology in a better and more complete way. Modeling our experience, and expanding these models to incorporate the experience of others, can facilitate our learning process.

Understanding the Technology
ESRI has implemented distributed GIS solutions since the late 1980s. For many years, distributed processing environments were not well understood, and customers relied on the experience of technical experts to identify hardware requirements to support their implementation needs. Each technical expert had a different perspective on what hardware infrastructure might be required for a successful implementation, and recommendations were not consistent. Many hardware decisions were made based on the size of the project budget rather than a clear understanding of user requirements and the appropriate hardware technology.

We started developing simple system performance models in the early 1990s to document our understanding about distributed processing systems. These system performance models have been used by ESRI system design consultants to support distributed computing hardware solutions since 1992. These same performance models have also been used to identify potential performance problems with existing computing environments.

Our first performance models were developed to address platform sizing for GIS desktop applications with file and GIS database data sources. UNIX and Windows application computer servers were used to provide remote terminal access to GIS applications hosted in centralized data centers. A simple concurrent user model was used to support capacity planning.

Web mapping services were introduced in the late 1990s, and transaction-based sizing models were developed for capacity planning and proper hardware selection. Transaction rates were identified in terms of map displays per hour. The transaction-based capacity planning models proved to be much more accurate and predictable than the previous concurrent user models, although in many cases customers were more comfortable identifying sizing requirements in terms of peak concurrent user load than using peak map requests per hour.

The release of ArcGIS Server 9.2 in 2006 introduced some new challenges for the traditional sizing models, and an effort to review lessons learned and take a close look at the road ahead was in order. The result is a new approach to capacity planning that incorporates the best of the traditional client/server and Web services sizing models and provides an adaptive sizing methodology to support future enterprise GIS operations. The new capacity planning methodology is much easier to use and provides metrics to manage performance compliance during development, initial system implementation, and delivery.

This new capacity planning model was developed and shared with the objective of helping software developers, business partners, technical marketing specialists, and ESRI distributors better understand the performance and scalability of ESRI technology. The primary goal was to provide customers with the best possible GIS solutions to support their enterprise GIS operations.

This section presents a basic overview of the system performance fundamentals. The terms and relationships introduced in this section describe fundamental relationships we can all use to better understand performance. The sections on Software and Platform performance provide additional insight on the processing demands GIS software can place on our system environment, and identify the processing capabilities of current vendor hardware technology. An understanding of these performance fundamentals can provide a framework for building and maintaining more effective real-world GIS operations.

What Is Capacity Planning?
Figure 5-2 identifies some key factors that contribute to overall system performance. Proper hardware and architecture selection is part of the overall system performance equation. Software technology selection and application design is another key contributing factor. Network connectivity can also be a limiting constraint. The system design solution must provide sufficient platform and network capacity to satisfy peak user performance needs.



Proper system design requires the right software and hardware technology selection to meet user workflow performance needs. Enhancements in any of the system performance factors can improve user productivity and impact total system capacity. Performance cannot be guaranteed by proper hardware selection alone. The performance fundamentals described in this section can help identify appropriate hardware selection based on customer business requirements.

GIS technology can be both compute intensive and bandwidth intensive, which means the system design will fail if it does not have sufficient platform or network capacity to handle the required processing loads. System architecture design is all about designing a balanced system and managing implementation risk.

Most project managers clearly understand the importance and value of a project schedule in managing deployment risk associated with cost and schedule. The same basic principles can be applied to managing system performance risk. Figure 5-3 identifies how system architecture design can be applied to managing system performance risk.



System architecture design provides a framework for identifying a balanced system design and establishing reasonable software processing performance budgets. Performance expectations are established based on selected software processing complexity and vendor-published hardware processing capacity. System design performance expectations can be represented by established software processing performance targets. These performance targets can be translated into specific software performance milestones that can be validated during system deployment. Software processing complexity and/or hardware processing capacity can be reviewed and adjusted as necessary at each deployment milestone to ensure the system is delivered within the established performance budget.

Our understanding of GIS processing complexity and how this workload is supported by vendor platform technology is based on more than 20 years of experience. A balanced software and hardware investment, with capacity based on projected peak user workflow loads, can reduce cost and ensure system deployment success.

What Is System Performance?
Computer platforms are supported by several component technologies. Each component technology is important - the weakest component will limit overall platform performance. Some software applications require lots of data transfer and interactive graphics display, while others require heavy computing. Hardware vendors build computers with a balance of component resources that optimize performance for a broad range of customers. For compute-intensive software like GIS, the hardware platform processor core is typically the component technology that limits server performance.

In much the same way, distributed computing solutions (enterprise computing environments) are supported by several hardware platforms that contribute to overall system performance. Each hardware component contributes - the weakest component will limit overall system performance. Hardware platforms supporting a computing environment must be carefully selected to satisfy peak processing needs.

The primary objective of the system architecture design process is to provide the highest level of user performance for the available system hardware investment. Each hardware component must be selected with sufficient performance to support processing needs. Current technology can limit system design alternatives. Understanding the distributed processing loads at each hardware component level provides a foundation for establishing an optimum system solution.

Figure 5-4 provides an overview of the key performance terms and measures in a distributed processing environment. Each system component participates sequentially in the overall program execution. Processing is supported in platform memory, so sufficient memory must be available to support the software execution.



The total response time of a particular application display will be a collection of the processing and wait times from a variety of shared system components. A computer vendor optimizes the component configuration within the hardware to support the fastest computer response to an application query. The customer IT/systems department has the responsibility of optimizing the organization's hardware and network component investments to provide the highest system-level response at the user desktop. System performance can directly contribute to user productivity.

GIS users have experienced significant performance and productivity improvements over the past eight years. Time for computers to process a dynamic map display is over 10 times faster with 2009 hardware than what was possible with 2001 technology - this change has a significant impact on user productivity and the opportunities for use of GIS technology.

Software technology selection has also made a difference in display performance. The earlier scripted program technologies of the 1990s (ArcIMS) perform faster than the standard component object based software technology (ArcGIS Server) used today. Performance is a function of the amount of work (processing) required by the software and the performance of the hardware technology (how fast the processing workload can be performed). The ArcGIS Server 9.3.1 optimized map service delivered significant performance gains and an improved user experience, using a new graphics rendering engine to improve the quality and performance of standard Web mapping services - generating dynamic maps faster than the ArcIMS Image service. ArcGIS Server client use of pre-processed map cache offloads dynamic (real time) server processing loads and improves user productivity.

Selecting the right software technology pattern and the right hardware architecture is more important today than it ever was before. This chapter is about understanding the fundamental concepts about performance and scalability. The following chapter will look more closely at display traffic and network communication bandwidth contributions to overall system performance.

System Performance Fundamentals
The study of work performance is not new, and there is a considerable body of theory and ideas published on this topic. Understanding the fundamental terms and relationships that define work performance and applying these fundamentals to computer processing helps us better understand the technology and make more appropriate design choices. Figure 5-5 identifies some basic terminology used in defining a user workflow. During a user needs assessment the user workflow is often represented by a use case. The user is the person who interfaces with a software application through a computer display.



The most important system performance terms define the average work transaction (display), work throughput, system capacity, and system utilization identified in Figure 5-6. The processor core is the hardware that executes the computer program instructions, so knowing the number of processor cores available in a selected hardware platform configuration is important. Service time is a measure of the average work transaction processing time, and queue time is a measure of the time waiting to be processed (waiting in line for service). Response time is the overall measure of system performance, and includes all of the component service times plus any additional wait or travel times required to refresh the client display (complete a work transaction).



The relationships between these performance terms are quite simple, yet they are often misunderstood. The relationships among throughput, capacity, and utilization hold by definition. Throughput is the number of work transactions being processed per unit time, capacity is the maximum throughput that can be supported by a specific hardware configuration, and utilization is the ratio of the current throughput to the system capacity (expressed as a percentage of capacity). If you know the current throughput (users working on the system) and you measure the system utilization (average computer CPU utilization), then you can calculate the capacity of the server.
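As a minimal sketch of this relationship, the Python function below derives capacity from a single observed throughput and utilization pair. The function name and the sample values (120 displays per minute at 25 percent CPU utilization) are illustrative assumptions, not measurements taken from this document.

```python
def platform_capacity(throughput, utilization):
    """Return capacity in the same units as throughput.

    Capacity = Throughput / Utilization, with utilization given as a
    fraction of capacity (0.25 means 25 percent).
    """
    if not 0.0 < utilization <= 1.0:
        raise ValueError("utilization must be a fraction in (0, 1]")
    return throughput / utilization


# Hypothetical observation: 120 displays per minute at 25 percent CPU.
print(platform_capacity(120.0, 0.25))  # 480.0 displays per minute
```

Because capacity is a ratio of two measured values, the estimate is only as good as the utilization measurement, which can be hard to read accurately at very light loads.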

Work transaction service time is a key term used to measure software performance. The software program provides a set of instructions that must be executed by the computer to complete a work transaction. The processor core executes the instructions defined in the computer program to complete the work transaction. Transactions with more instructions represent more work for the computer, while transactions with fewer instructions represent less work for the computer.

The complexity of the computer program workflow can be defined by the amount of work (or processing time) required to complete an average work transaction. Service time is measured relative to a platform performance baseline. Faster platform processor cores execute program instructions in less time than slower processor cores. Service time can be computed using a simple formula based on the number of processor cores and the platform capacity.

Performance Processing Delays (calculating display response time)
Most of the performance factors used for system design capacity planning involve simple terms and relationships. A work transaction (display) is an average unit of work, throughput is a measure of the average work transactions completed over a period of time (displays per minute or transactions per hour), capacity is the maximum rate at which a platform can do work, and utilization is the percentage of capacity represented by a given throughput rate. You can calculate display service time if you know the platform throughput and corresponding utilization, calculated at any throughput level.

Calculating user display response time for shared system loads is a little bit more difficult. Only one user transaction can be serviced at a time on each processor core. If lots of user transaction requests arrive at the same time, some of the transactions must wait in line while the others are processed first. Waiting in line for processing contributes to system processing delays. User display response time must account for all the system delays, since the display is not complete until the final processing is done.

Fortunately, computing transaction service response time is a common problem for many business applications. The theory of queues or waiting in line has its origin in the work of A. K. Erlang, starting in 1909. There are a variety of different queuing models available for estimating queue time, and I went back to one of the textbooks used during my graduate school days to incorporate these models for use in system design capacity planning. The simplest models were for large populations of random arrival transactions, which should certainly be the case in a high capacity computer computation (we are dealing with thousands of random computer program instructions being executed within a relatively small period of time - i.e. minutes).

Figure 5-6 includes an overview of the model used in the Capacity Planning Tool for estimating component queue times. The second half of the model (single core section) was quite straightforward, and there is general agreement that this simple model would identify wait times in the case of a single service provider (single core platform or single network connection). The multi-core case was a little more complicated, and unfortunately it is the more common capacity planning case, since most server platform configurations are multi-core.

In the multi-core service provider case, it was important to include the probability of a service provider (processor core) being available to service the request (not busy) and then multiply this value by the single core factor. The more processor cores in the server, the more likely one of these cores will not be busy when the next service transaction arrives - thus this is a fraction that gets smaller for platforms with more server cores. There are some other constraints to consider. The total equation must be zero when there is no load on the system (queue time = 0 when utilization = 0) and reduce to the simple single core formula when the number of processor cores = 1. These considerations were all made in developing the queue time formula used in the Capacity Planning Tool.

The actual multiple service provider queuing formulas included in the reference textbook included a number of factorial and summation calculations which were far too complex for use in a simple capacity planning tool. The simplified formula provided above has been compared against several benchmark test results, and the computed response time was reasonably close to the measured test results (showed conservative response times - slightly higher than measured values).
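For readers who want to see what the textbook factorial and summation form looks like, the sketch below implements the standard M/M/c (Erlang C) mean queue time. This is not the simplified Capacity Planning Tool formula, which is not reproduced in this text. It assumes random (Poisson) arrivals and exponentially distributed service times, and the function names are illustrative. It does satisfy the two constraints described above: queue time is zero at zero utilization, and the result reduces to the single core formula, service time x utilization / (1 - utilization), when there is one core.

```python
from math import factorial


def erlang_c_wait_probability(cores, utilization):
    """Probability that an arriving transaction finds every core busy."""
    offered_load = cores * utilization  # average number of busy cores
    busy_term = offered_load ** cores / factorial(cores) / (1.0 - utilization)
    idle_terms = sum(offered_load ** k / factorial(k) for k in range(cores))
    return busy_term / (idle_terms + busy_term)


def queue_time(service_time, utilization, cores):
    """Mean queue time for the textbook M/M/c model (not the CPT formula)."""
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in [0, 1)")
    p_wait = erlang_c_wait_probability(cores, utilization)
    return p_wait * service_time / (cores * (1.0 - utilization))


# Hypothetical 1 second service time at 50 percent utilization.
for cores in (1, 2, 4):
    print(cores, round(queue_time(1.0, 0.5, cores), 3))
```

At the same utilization, queue time falls as cores are added, which matches the observation that an arriving request is more likely to find a free core on a larger platform.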

It is important to recognize that the accuracy of the queue time calculation impacts only the expected user response time, and does not reduce the accuracy of the platform capacity calculations provided by the earlier simple relationships. For many years, ESRI capacity planning models did not include estimates for user response time. Workflow response time is important, since it directly impacts user productivity and workflow validity. If display response times are too slow, the peak throughput estimates would not be achieved and the capacity estimates would not be conservative. Including user response time in the capacity planning models provides more accurate and conservative platform specifications, and gives customers a better understanding of user performance and productivity.

So in summary, queue time is any time the software program instructions must wait in line to be processed. Queue time is based on a statistical analysis of the transaction (processing request) arrival time distribution. Simply stated, this is the probability of having to wait in line when arriving for a service. For very large populations with random arrival times, the probability distribution for having to wait for service is predictable.

Response time is the sum of the total service times (processing times) and queue times (wait times) required to refresh the requested user display. Response time is important, since it directly contributes to user productivity {productivity = 60 sec / (response time + think time)}. As queue time increases, response time will increase and productivity will decrease.
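The productivity relationship can be sketched in a few lines of Python. The function name and the sample times are illustrative assumptions; the example simply shows productivity dropping when queue time is added to a fixed service time and think time.

```python
def productivity_dpm(response_time_sec, think_time_sec):
    """Displays per minute = 60 sec / (response time + think time)."""
    return 60.0 / (response_time_sec + think_time_sec)


service_time = 1.0  # hypothetical seconds of processing per display
think_time = 5.0    # hypothetical seconds of user input per display

for queue in (0.0, 2.0):
    response_time = service_time + queue
    print(queue, productivity_dpm(response_time, think_time))
# 0.0 seconds of queue time gives 10.0 DPM; 2.0 seconds gives 7.5 DPM
```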

User Productivity
Figure 5-7 shares the relationship of several key terms that define user productivity. Productivity is a measure of user activity in doing work. For GIS capacity planning purposes, the maximum user productivity is 10 displays per minute. Web clients often work at a slower pace, and for GIS capacity planning purposes we use a maximum of 6 displays per minute for a Web client. These are general rules of thumb, based on a belief that at some point computers will be fast enough that GIS user productivity will no longer be limited by computer processing speed.



Productivity
User productivity is measured in terms of displays (work transactions) per minute. Display cycle time is the average time between each user display request. A user productivity of 10 displays per minute would generate an average of 1 display every 6 seconds.

Think Time
Think time is the average user input time. Computed think time is cycle time minus response time (time after the display appears on the screen to when the user requests a new display). Minimum user think time is the minimum think time required for a valid user workflow (user needs time to think before requesting another display). Margin is the slack time between computed and minimum think time.
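A small sketch of these definitions, assuming a hypothetical workflow of 10 displays per minute with a 1.5 second response time and a 3 second minimum think time (the function name and values are illustrative only):

```python
def think_time_margin(productivity_dpm, response_time_sec, min_think_sec):
    """Return (cycle time, computed think time, margin), all in seconds."""
    cycle_time = 60.0 / productivity_dpm          # time between display requests
    computed_think = cycle_time - response_time_sec
    margin = computed_think - min_think_sec       # negative means invalid workflow
    return cycle_time, computed_think, margin


print(think_time_margin(10.0, 1.5, 3.0))  # (6.0, 4.5, 1.5)
```

A negative margin indicates that the computed think time has dropped below the minimum think time.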

Batch Process
Batch process can be defined as any workflow with zero (0) think time (no user input between display transactions). Batch productivity is calculated based on computed response time (60 seconds / response time = batch DPM). Batch process also does not have queue time (no random arrival time between display requests). Displays are requested sequentially following each refresh. Batch processes deployed on a single platform with local data source tend to consume a processor core.

User Workflow Performance Factors
Figure 5-8 shows how the fundamental performance terms and factors interact together within a user workflow transaction. For a given display executed on a given platform, display service time is a constant value. In a shared server environment, queue time increases with increasing user loads (increasing server utilization). As queue time increases, display response time increases. For a fixed user productivity (10 displays per minute), computed user think time will decrease with increasing display response time. At some point, computed user think time will be less than minimum think time (invalid user workflow).



Figure 5-9 shows an invalid workflow (computed think time less than minimum think time). Under this condition, user productivity must be reduced, increasing display cycle time to equal minimum think time plus response time. The Capacity Planning Tool Design tab ADJUST function automatically reduces user productivity until computed user think time = minimum think time, which results in a valid adjusted workflow with reduced user productivity.
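The adjustment can be sketched as follows. This is not the Capacity Planning Tool's ADJUST implementation; it is a simplified illustration that holds response time fixed, whereas in a shared system response time would itself change as the load drops. The function name and sample values are assumptions.

```python
def adjust_productivity(productivity_dpm, response_time_sec, min_think_sec):
    """Return a productivity (DPM) that respects the minimum think time."""
    cycle_time = 60.0 / productivity_dpm
    computed_think = cycle_time - response_time_sec
    if computed_think >= min_think_sec:
        return productivity_dpm  # workflow is already valid
    # Stretch the cycle time to minimum think time plus response time.
    return 60.0 / (min_think_sec + response_time_sec)


# Hypothetical: 10 DPM requested, 4 sec response time, 3 sec minimum think time.
print(round(adjust_productivity(10.0, 4.0, 3.0), 2))  # 8.57
```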



Platform Capacity
Figure 5-10 shows the relationship between platform utilization, throughput, and capacity. The chart shows performance of a four (4) core test platform running a series of batch tests. Platform utilization is shown by the vertical bars and the left vertical axis, and throughput is shown by the RED line and the right vertical axis.



There is a direct relationship between utilization and throughput, since utilization is defined relative to peak throughput or capacity. Platform capacity is a constant value, and can be calculated from any throughput level (you do not need to conduct a peak load test to identify 100 percent platform capacity). Platform capacity can be calculated based on measured platform utilization for any throughput value (Capacity = Throughput / Utilization).

Computing Platform Service Times
Once you identify platform capacity, you can identify service time. The work transaction service time (transaction processing time per core) is the total number of platform processor cores divided by the platform capacity (capacity measured in work transactions per second).
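A short sketch of the calculation, combining the capacity relationship with the service time definition. The function name and the sample measurement (a 4 core platform delivering 120 displays per minute at 25 percent utilization) are hypothetical.

```python
def service_time_sec(cores, throughput_dpm, utilization):
    """Service time in seconds per transaction per core."""
    capacity_dpm = throughput_dpm / utilization  # Capacity = Throughput / Utilization
    capacity_per_sec = capacity_dpm / 60.0       # work transactions per second
    return cores / capacity_per_sec


# Hypothetical measurement: 4 cores, 120 DPM at 25 percent utilization.
print(service_time_sec(4, 120.0, 0.25))  # 0.5
```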

The spreadsheet in the upper left of the chart in Figure 5-10 shows service time calculations for 1, 2, 3, 4, and 5 batch process loads respectively. Capacity and service time calculated for each throughput were all very close (this assumes you can accurately measure utilization at the lighter test loads). Knowing how to calculate service time at any defined platform throughput is important for establishing proper workflow performance targets and verifying capacity planning targets are satisfied when monitoring real system workflow loads.

Display Response Time
Figure 5-11 shows the relationship between platform utilization and display response time. During initial processing loads, response time and service time are the same. As processing loads increase, inbound work transactions start to arrive at the same time. Each processor core can only process one program instruction at a time, so if two different work transaction requests arrive for processing at the same time, one must wait. This wait time (Queue time) increases as utilization increases, and as throughput approaches platform capacity queue time will increase to unacceptable values (throughput will never reach full capacity). Display response time will continue to increase as platform throughput and utilization increase toward full capacity levels.



The relationship between service time and response time is shown in the graphic. Response time is service time plus queue time. Initially response time is equal to service time, and during higher system loads response time will increase as queue time increases - display service time will stay the same for all loads. Transaction queue time is a function of display service time, utilization, and the number of platform cores.



Network Communication Performance Factors
Performance models used to support network communications follow the same type of terms and relationships identified above for server platforms, with the only difference being the names of these same terms. Figure 5-12 shows the terms and relationships used for network capacity planning.



The most important terms include definition of the average work transaction (display), work throughput (traffic), system capacity (bandwidth), and system utilization (network utilization). The network connection (switch port, router port, network interface card, host bus adapter, etc.) is the hardware that processes the network traffic, so knowing the number of network interface cards available in a selected hardware platform configuration, or the number of WAN or Internet connections and their aggregate bandwidth, is important. Service time (network transport time) is a measure of the average work transaction processing time, and queue time is a measure of the time waiting to be processed (waiting in line for service).
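By analogy with the platform terms, the network relationships can be sketched as below. The sketch assumes that transport time is simply the data moved per display divided by the available bandwidth, ignoring latency and protocol overhead; the function name, units, and sample values are illustrative assumptions.

```python
def network_load(displays_per_minute, megabits_per_display, bandwidth_mbps):
    """Return (traffic in Mbps, utilization fraction, transport time in sec)."""
    traffic_mbps = displays_per_minute * megabits_per_display / 60.0  # throughput
    utilization = traffic_mbps / bandwidth_mbps                       # share of capacity
    transport_time_sec = megabits_per_display / bandwidth_mbps        # service time
    return traffic_mbps, utilization, transport_time_sec


# Hypothetical: 600 displays per minute, 2 Mb per display, 100 Mbps connection.
traffic, utilization, transport = network_load(600.0, 2.0, 100.0)
print(traffic, utilization, transport)
```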

Work Transaction (display) Response time
Response time is the total time required to complete the work transaction. The software program instructions (work procedure) are executed sequentially, thus each instruction in the program must be executed before the next step in the program can be completed (results of the first step in the procedure often must be known before completing the second step, etc.). Response time includes all of the processing times and queue times experienced in completing an average work transaction. Figure 5-13 provides a system performance summary that shows service times, network transport time, and associated component service times displayed as a stacked bar chart for each workflow. Display response time is represented at the top of each workflow stack, representing the sum of all service and queue times for each workflow display.
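The stacked total can be sketched as a simple sum over the components a display request passes through. The component names and times below are hypothetical placeholders, not values read from Figure 5-13.

```python
def response_time(components):
    """Sum service and queue time over all components visited in sequence."""
    return sum(service + queue for service, queue in components.values())


# Hypothetical (service time, queue time) pairs in seconds per component.
workflow = {
    "client": (0.20, 0.00),
    "network": (0.10, 0.02),
    "web server": (0.05, 0.01),
    "map server": (0.40, 0.10),
    "database": (0.05, 0.01),
}

print(round(response_time(workflow), 2))  # 0.94
```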



This Workflow Performance Summary chart is included in the Capacity Planning Tool (CPT), showing performance of each of the workflows identified in the CPT requirements module. The chart above shows results of a series of batch load tests performed on a 4 core server platform. The chart shows service times for each system component (10 server tier and network with their associated queue time, plus network latency and client display processing time).

Response time is close to the total processing time for the first batch process, and the queue time grows on the server and network components as utilization increases. Queue time is more than processing time when the system is running more than two batch processes per core (total of 8 batch processes on the 4 core server).

Capacity Planning
The models supporting ESRI capacity planning today are based on the performance fundamentals introduced in this section. Platform capacity is determined by the software processing time (platform service time) and the number of platform cores, and is expressed in terms of peak displays per minute. Platform capacity (DPM) can be translated to supported concurrent users by dividing by the user productivity (DPM/client).
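As a minimal sketch of this translation, assuming a hypothetical 4 core platform with a 0.5 second service time and users working at 10 displays per minute (function names are illustrative):

```python
def platform_capacity_dpm(cores, service_time_sec):
    """Peak displays per minute for a platform at full utilization."""
    return cores * 60.0 / service_time_sec


def concurrent_users(capacity_dpm, productivity_dpm):
    """Users supported at the given per-user productivity (DPM per client)."""
    return capacity_dpm / productivity_dpm


capacity = platform_capacity_dpm(4, 0.5)
print(capacity)                          # 480.0
print(concurrent_users(capacity, 10.0))  # 48.0
```

These figures describe the platform at full capacity. Since queue time grows as utilization approaches 100 percent, a practical design would likely plan for a peak load somewhat below this number.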

The performance fundamentals discussed in this chapter do not change with changing technology, and an understanding of these fundamentals will provide a solid foundation for understanding system performance and scalability. Software and hardware technology will continue to change, and the terms and relationships identified in this section can help us normalize these changes and help us understand what is required to support our system performance needs.

The next section will discuss Network Communications, providing some insight on how to build GIS solutions that support remote user performance and Enterprise GIS scalability.


System Design Strategies 26th edition - An Esri® Technical Reference Document • 2009 (final PDF release)