Performance Fundamentals 28th Edition (Fall 2010)


Computer platforms must be configured properly to support system performance  requirements. There are many factors that contribute to user performance and productivity. Enterprise GIS solutions include distributed processing environments where user performance can be the  product of contributions from several hardware platform environments. Many of these platform resources are shared with other users. Understanding distributed processing technology provides a fundamental framework for deploying a successful enterprise GIS.

The importance of working together to understand the technology is illustrated in Figure 5-1. A famous poem by John Godfrey Saxe tells the story of six blind men who went to see an elephant. As each man examines the elephant, he forms his own understanding of what the elephant looks like.



The first blind man approached the side of the elephant, which felt like a wall; the second felt the tusk, which was very like a spear; the third held the trunk, which felt like a snake; the fourth felt about the knee, which was like a tree; the fifth touched the ear, which felt like a fan; and the sixth grabbed the tail, which felt like a rope.

After examining the elephant, the six men met to discuss their findings. Each expressed his opinion about what an elephant looks like based on his experience. They were all partly right, and at the same time they were all wrong. They each experienced only a part of what the elephant really looked like. The whole elephant is a combination of the parts each blind man visualized from his own discovery.

We understand technology from an abstract view much like the blind men saw the elephant. Technology is changing very rapidly, and we all see and try to understand what these changes mean. At the same time, we too have a limited view based on our own experience. What we see is only a part of the whole picture. For this reason, it is important to listen to the experience of others and combine their experience with our own as we move technology forward. I have found that we all have information to contribute and the questions we have can help others understand the technology in a better and more complete way. Modeling our experience, and expanding these models to incorporate the experience of others, can facilitate our learning process.

Understanding the Technology
ESRI has implemented distributed GIS solutions since the late 1980s. For many years, distributed processing environments were not well  understood, and customers relied on the experience of technical experts  to identify hardware requirements to support their implementation needs. Each technical expert had a different perspective on what hardware infrastructure might be required for a successful implementation, and  recommendations were not consistent. Many hardware decisions were made based on the size of the project budget rather than a clear  understanding of user requirements and the appropriate hardware  technology.

We started developing simple system performance models in the early 1990s to document our understanding  about distributed processing systems. These system performance models have been used by ESRI system design consultants to support distributed  computing hardware solutions since 1992. These same performance models have also been used to identify potential performance problems with  existing computing environments.

Our first performance models were developed to address platform sizing for GIS desktop  applications with file and GIS database data sources. UNIX and Windows application compute servers were used to provide remote terminal access  to GIS applications hosted in centralized data centers. A simple concurrent user model was used to support capacity planning.

Web mapping services were introduced in the late 1990s, and  transaction-based sizing models were developed for capacity planning and  proper hardware selection. Transaction rates were identified in terms of map displays per hour. The transaction-based capacity planning models proved to be much more accurate and predictable than the previous  concurrent user models, although in many cases customers were more  comfortable identifying sizing requirements in terms of peak concurrent  user load than using peak map requests per hour.

The release of ArcGIS Server 9.2 in 2006 introduced some new challenges for  the traditional sizing models, and an effort to review lessons learned  and take a close look at the road ahead was in order. The result is a new approach to capacity planning that incorporates the best of the  traditional client/server and Web services sizing models and provides an  adaptive sizing methodology to support future enterprise GIS  operations. The new capacity planning methodology is much easier to use and provides metrics to manage performance compliance during  development, initial system implementation, and delivery.

This new capacity planning model was developed and shared with the objective  of helping software developers, business partners, technical marketing  specialists, and ESRI distributors better understand the performance and  scalability of ESRI technology. The primary goal was to provide customers with the best possible GIS solutions to support their  enterprise GIS operations.

This section presents a basic overview of the system performance fundamentals. The terms and relationships introduced in this section describe fundamental  relationships we can all use to better understand performance. The sections on Software and Platform performance provide additional insight  on the processing demands GIS software can place on our system  environment, and identify the processing capabilities of current vendor  hardware technology. An understanding of these performance fundamentals can provide a framework for building and maintaining more effective  real-world GIS operations.

What Is Capacity Planning?
Figure 5-2 identifies some key factors that contribute to overall system  performance. Proper hardware and architecture selection is part of the overall system performance equation. Software technology selection and application design is another key contributing factor. Network connectivity can also be a limiting constraint. The system design solution must provide sufficient platform and network capacity to  satisfy peak user performance needs.



Proper system design requires the right software and hardware technology  selection to meet user workflow performance needs. Enhancements in any of the system performance factors can improve user productivity and  impact total system capacity. Performance cannot be guaranteed by proper hardware selection alone. The performance fundamentals described in this section can help identify appropriate hardware selection based  on customer business requirements.

GIS technology can be both compute intensive and bandwidth intensive, which means the  system design will fail if it does not have sufficient platform or  network capacity to handle the required processing loads. System architecture design is all about designing a balanced system and  managing implementation risk.

Most project managers clearly understand the importance and value of a project schedule in managing deployment risk associated with cost and schedule. The same basic principles can be applied to managing system performance risk. Figure 5-3 identifies how system architecture design can be applied to managing system performance risk.



System architecture design provides a framework for identifying a balanced system design and establishing reasonable software processing performance budgets. Performance expectations are established based on selected software processing complexity and vendor published hardware processing capacity. System design performance expectations can be represented by established software processing performance targets. These performance targets can be translated into specific software performance milestones which can be validated during system deployment. Software processing complexity and/or hardware processing capacity can be reviewed and adjusted as necessary at each deployment milestone to ensure the system is delivered within the established performance budget.

Our understanding of GIS processing complexity and how this workload is  supported by vendor platform technology is based on more than 20 years  of experience. A balanced software and hardware investment, with capacity based on projected peak user workflow loads, can reduce cost  and ensure system deployment success.

What Is System Performance?
Computer platforms are supported by several component technologies. Each component technology is important: the weakest component will limit overall platform performance. Some software applications require lots of data transfer and interactive graphics display, while others require heavy computing. Hardware vendors build computers with a balance of component resources that optimize performance for a broad range of customers. For compute-intensive software like GIS, the platform processor core is typically the component that limits server performance.

In much the same way, distributed computing solutions (enterprise computing environments) are supported by  several hardware platforms that contribute to overall system  performance. Each hardware component contributes - the weakest component will limit overall system performance. Hardware platforms supporting a computing environment must be carefully selected to satisfy  peak processing needs.

The primary objective of the system architecture design process is to provide the highest level of  user performance for the available system hardware investment. Each hardware component must be selected with sufficient performance to  support processing needs. Current technology can limit system design alternatives. Understanding the distributed processing loads at each hardware component level provides a foundation for establishing an  optimum system solution.

Figure 5-4 provides an overview of the key performance terms and measures in a distributed processing environment. Each system component participates sequentially in the overall program execution. Processing is supported in platform memory, so sufficient memory must be available to support the software execution.



The total response time of a particular application display will be a  collection of the processing and wait times from a variety of shared  system components. A computer vendor optimizes the component configuration within the hardware to support the fastest computer  response to an application query. The customer IT/systems department has the responsibility of optimizing the organization's hardware and  network component investments to provide the highest system-level  response at the user desktop. System performance can directly contribute to user productivity.

GIS users have experienced significant performance and productivity improvements over the past decade. Computers process a dynamic map display more than 30 times faster with 2010 hardware than was possible with 1999 technology. This change has a significant impact on user productivity and the opportunities for use of GIS technology.

Software technology selection has also made a difference in display performance. The earlier scripted program technologies of the 1990s (ArcIMS) perform faster than the standard component object based software technology (ArcGIS Server) used today. Performance is a function of the amount of work (processing) required by the software and the performance of the hardware technology (how fast the processing workload can be performed). The ArcGIS Server 9.3.1 optimized map service delivered significant performance gains and an improved user experience: it uses a new graphics rendering engine to improve the quality and performance of standard Web mapping services, generating dynamic maps faster than the ArcIMS Image service. ArcGIS Server client use of a preprocessed map cache offloads dynamic (real-time) server processing loads and improves user productivity.

Selecting the right software technology pattern and the right hardware architecture is more important  today than it ever was before. This chapter is about understanding the fundamental concepts about performance and scalability. The following chapter will look more closely at display traffic and network  communication bandwidth contributions to overall system performance.

System Performance Fundamentals
The study of work performance is not new, and a considerable body of theory has been published on this topic. Understanding the fundamental terms and relationships that define work performance and applying these fundamentals to computer processing helps us better understand the technology and make more appropriate design choices. Figure 5-5 identifies some basic terminology used in defining a user workflow. During a user needs assessment the user workflow is often represented by a use case. The user is the person who interfaces with a software application through a computer display.



The most important system performance terms define the average work transaction (display), work throughput, system capacity, and system utilization identified in Figure 5-6. The processor core is the hardware that executes the computer program instructions, so knowing the number of processor cores available in a selected hardware platform configuration is important. Service time is a measure of the average work transaction processing time, and queue time is a measure of the time waiting to be processed (waiting in line for service). Response time is the overall measure of system performance, and includes all of the component service times plus any additional wait or travel times required to refresh the client display (complete a work transaction).



The relationships between these performance terms are quite simple, yet they are often misunderstood. The relationships among throughput, capacity, and utilization follow directly from how these terms are defined. Throughput is the number of work transactions being processed per unit time, capacity is the maximum throughput that can be supported by a specific hardware configuration, and utilization is the ratio of the current throughput to the system capacity (expressed as a percentage of capacity). If you know the current throughput (users working on the system) and you measure the system utilization (average computer CPU utilization), then you can calculate the capacity of the server.
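The relationship described above can be written down directly. The following is a minimal Python sketch; the function names and the sample measurement (3,000 displays per hour at 25 percent CPU) are illustrative assumptions, not values from the Capacity Planning Tool.

```python
def platform_capacity(throughput: float, utilization: float) -> float:
    """Capacity = throughput / utilization, with utilization as a fraction (0-1]."""
    if not 0.0 < utilization <= 1.0:
        raise ValueError("utilization must be a fraction in (0, 1]")
    return throughput / utilization


def platform_utilization(throughput: float, capacity: float) -> float:
    """Utilization = throughput / capacity (same throughput units for both)."""
    return throughput / capacity


# Hypothetical measurement: 3,000 displays per hour at 25 percent CPU utilization.
capacity_dph = platform_capacity(3000, 0.25)
print(f"Estimated capacity: {capacity_dph:.0f} displays per hour")
```

Capacity is returned in the same units as the throughput that was passed in, so the caller decides whether to work in displays per minute or transactions per hour.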

Work transaction service time is a key term used to measure software  performance. The software program provides a set of instructions that must be executed by the computer to complete a work transaction. The processor core executes the instructions defined in the computer program  to complete the work transaction. Transactions with more instructions represent more work for the computer, while transactions with fewer  instructions represent less work for the computer.

The complexity of the computer program workflow can be defined by the amount of work (or processing time) required to complete an average work transaction. Service time is measured relative to a platform performance baseline. Faster platform processor cores execute program instructions in less time than slower processor cores. Service time can be computed using a simple formula based on the number of processor cores and platform capacity.

Performance Processing Delays (calculating display response time)
Most of the performance factors used for system design capacity planning involve simple terms and relationships. A work transaction (display) is an average unit of work, throughput is a measure of the average work transactions completed over a period of time (displays per minute or transactions per hour), capacity is the maximum rate at which a platform can do work, and utilization is the percentage of capacity represented by a given throughput rate. You can calculate display service time if you know the platform throughput and corresponding utilization, calculated at any throughput level.

Calculating user display response time for shared system loads is a little bit more  difficult. Only one user transaction can be serviced at a time on each processor core. If lots of user transaction requests arrive at the same time, some of the transactions must wait in line while the others are  processed first. Waiting in line for processing contributes to system processing delays. User display response time must account for all the system delays, since the display is not complete until the final  processing is done.

Fortunately, computing transaction service response time is a common problem for many business  applications. The theory of queues or waiting in line has its origin in the work of A. K. Erlang, starting in 1909. There are a variety of different queuing models available for estimating queue time, and I went  back to one of the textbooks used during my graduate school days to  incorporate these models for use in system design capacity planning. The simplest models were for large populations of random arrival  transactions, which should certainly be the case in a high capacity  computer computation (we are dealing with thousands of random computer  program instructions being executed within a relatively small period of  time - i.e. minutes).

Figure 5-6 includes an overview of the model used in the Capacity Planning Tool for estimating component queue times. The second half of the model (single core section) was quite straightforward, and there is general agreement that this simple model would identify wait times in the case of a single service provider (single core platform or single network connection). The multi-core case was a little more complicated, and unfortunately it is the more common capacity planning case, since most server platform configurations are multi-core.
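For the single service provider case, the standard textbook result for random arrivals (the M/M/1 queue) gives queue time as service time multiplied by utilization / (1 - utilization). The Python sketch below assumes that standard result; it is not taken from the Capacity Planning Tool itself, and the function name and sample service time are hypothetical.

```python
def single_core_queue_time(service_time: float, utilization: float) -> float:
    """Average wait in an M/M/1 queue: ST * U / (1 - U), with U as a fraction."""
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be a fraction in [0, 1)")
    return service_time * utilization / (1.0 - utilization)


# Hypothetical 0.5 second display service time at increasing utilization.
for u in (0.0, 0.25, 0.5, 0.75, 0.9):
    qt = single_core_queue_time(0.5, u)
    print(f"utilization {u:4.0%}  queue time {qt:5.2f} s  response time {0.5 + qt:5.2f} s")
```

Queue time is zero on an idle system and grows without bound as utilization approaches 100 percent, which is why throughput never quite reaches full capacity.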

In the multi-core service provider case, it was important to include the probability of a service provider (processor core) being available to service the request (not busy) and then multiply this value by the single core factor. The more processor cores in the server, the more likely one of these cores will not be busy when the next service transaction arrives, so this is a fraction that gets smaller for platforms with more server cores. There are some other constraints to consider. The total equation must be zero when there is no load on the system (queue time = 0 when utilization = 0) and reduce to the simple single core formula when the number of processor cores = 1. These considerations were all made in developing the queue time formula used in the Capacity Planning Tool.

The actual multiple service provider queuing formulas included in the reference textbook included a number of factorial and summation calculations which were far too complex for use in a simple capacity planning tool. The simplified formula provided above has been compared against several benchmark test results, and the computed response time was reasonably close to the measured test results (showed conservative response times, slightly higher than measured values).
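For comparison, the full textbook calculation for multiple service providers can be sketched in a few lines of Python. This is the standard Erlang C (M/M/c) result, shown here as an assumption about the kind of formula the reference textbook contains; it is not the simplified formula used in the Capacity Planning Tool, and the function names are hypothetical.

```python
from math import factorial


def erlang_c_wait_probability(cores: int, utilization: float) -> float:
    """Probability that an arriving transaction finds every core busy (Erlang C)."""
    offered_load = cores * utilization  # average number of busy cores
    top = offered_load**cores / factorial(cores) / (1.0 - utilization)
    bottom = sum(offered_load**k / factorial(k) for k in range(cores)) + top
    return top / bottom


def multi_core_queue_time(service_time: float, utilization: float, cores: int) -> float:
    """Average M/M/c wait: P(wait) * ST / (cores * (1 - U)), with U as a fraction."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be a fraction in [0, 1)")
    p_wait = erlang_c_wait_probability(cores, utilization)
    return p_wait * service_time / (cores * (1.0 - utilization))


# Hypothetical 0.5 second service time at 50 percent utilization.
for n in (1, 2, 4, 8):
    print(f"{n} core(s): queue time {multi_core_queue_time(0.5, 0.5, n):.3f} s")
```

The textbook form satisfies the same constraints described for the simplified formula: queue time is zero at zero utilization, the result reduces to the single core formula when the number of cores is 1, and queue time shrinks as cores are added at the same utilization.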

It is important to recognize that the accuracy of the queue time calculation impacts only the expected user response time, and does not reduce the accuracy of the platform capacity calculations provided by the earlier simple relationships. For many years, ESRI capacity planning models did not include estimates for user response time. Workflow response time is important, since it directly impacts user productivity and workflow validity. If display response times are too slow, the peak throughput estimates would not be achieved and the capacity estimates would not be conservative. Including user response time in the capacity planning models provides more accurate and conservative platform specifications, and gives customers a better understanding of user performance and productivity.

So in summary, queue time is any time the software program instructions must wait in line to be processed. Queue time is based on a statistical analysis of the transaction (processing request) arrival time distribution. Simply stated, this is the probability of having to wait in line when arriving for a service. For very large populations with random arrival times, the probability distribution for having to wait for service is predictable.

Response time is the sum of the total service times (processing times) and queue times (wait times) required to refresh the requested user display. Response time is important, since it directly contributes to user productivity {productivity = 60 sec / (response time + think time)}. As queue time increases, response time will increase and productivity will decrease.
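The two relationships in this summary can be sketched in Python as follows. The function names and the sample component times are hypothetical; only the formulas come from the text above.

```python
def response_time(service_times: list[float], queue_times: list[float]) -> float:
    """Response time = sum of all component service times and queue times (seconds)."""
    return sum(service_times) + sum(queue_times)


def productivity_dpm(response_time_sec: float, think_time_sec: float) -> float:
    """Productivity (displays per minute) = 60 sec / (response time + think time)."""
    return 60.0 / (response_time_sec + think_time_sec)


# Hypothetical display: two components with service times, one with queue time.
rt = response_time([0.25, 0.5], [0.25])
print(f"response time {rt:.2f} s, productivity {productivity_dpm(rt, 5.0):.1f} DPM")
```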

User Productivity
Figure 5-7 shares the relationship of several key terms that define user  productivity. Productivity is a measure of user activity in doing work. For GIS capacity planning purposes, the maximum user productivity is 10 displays per minute. Web clients often work at a slower pace, and for GIS capacity planning purposes we use a maximum of 6 displays per  minute for a Web client. These are general rules of thumb, based on a belief that at some point computers will be fast enough that GIS user  productivity will no longer be limited by computer processing speed.



Productivity
User productivity is measured in terms of displays (work transactions) per  minute. Display cycle time is the average time between each user display request. A user productivity of 10 displays per minute would generate an average of 1 display every 6 seconds.

Think Time
Think time is the average user input time. Computed think time is cycle time minus response time (time after the display appears on the screen to  when the user requests a new display). Minimum user think time is the minimum think time required for a valid user workflow (user needs time  to think before requesting another display). Margin is the slack time between computed and minimum think time.
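The cycle time, think time, and margin definitions above can be expressed as a short Python sketch. The function names and sample values are illustrative assumptions.

```python
def cycle_time(productivity_dpm: float) -> float:
    """Display cycle time in seconds for a given productivity (displays per minute)."""
    return 60.0 / productivity_dpm


def computed_think_time(productivity_dpm: float, response_time_sec: float) -> float:
    """Computed think time = cycle time - response time."""
    return cycle_time(productivity_dpm) - response_time_sec


def think_margin(productivity_dpm: float, response_time_sec: float,
                 min_think_time_sec: float) -> float:
    """Margin = computed think time - minimum think time (negative means invalid)."""
    return computed_think_time(productivity_dpm, response_time_sec) - min_think_time_sec


# Hypothetical workflow: 10 DPM, 1.5 s response time, 3 s minimum think time.
print(cycle_time(10), computed_think_time(10, 1.5), think_margin(10, 1.5, 3.0))
```

A negative margin signals that the workflow cannot sustain the stated productivity at that response time.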

Batch Process
Batch process can be defined as any workflow with zero (0) think time (no  user input between display transactions). Batch productivity is calculated based on computed response time (60 seconds / response time =  batch DPM). Batch process also does not have queue time (no random arrival time between display requests). Displays are requested sequentially following each refresh. Batch processes deployed on a single platform with local data source tend to consume a processor core.
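Batch productivity is the user productivity formula with think time set to zero. A minimal Python sketch, with a hypothetical function name and sample value:

```python
def batch_productivity_dpm(response_time_sec: float) -> float:
    """Batch displays per minute = 60 seconds / response time (zero think time)."""
    if response_time_sec <= 0:
        raise ValueError("response time must be positive")
    return 60.0 / response_time_sec


# Hypothetical batch process with a 0.5 second display response time.
print(f"{batch_productivity_dpm(0.5):.0f} batch DPM")
```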

User Workflow Performance Factors
Figure 5-8 shows how the fundamental performance terms and factors interact  together within a user workflow transaction. For a given display executed on a given platform, display service time is a constant value. In a shared server environment, queue time increases with increasing user loads (increasing server utilization). As queue time increases, display response time increases. For a fixed user productivity (10 displays per minute), computed user think time will decrease with  increasing display response time. At some point, computed user think time will be less than minimum think time (invalid user workflow).



Figure 5-9 shows an invalid workflow (computed think time less than minimum think time). Under this condition, user productivity must be reduced, increasing display cycle time to equal minimum think time plus response time. The Capacity Planning Tool Design tab ADJUST function automatically reduces user productivity until computed user think time = minimum think time, which results in a valid adjusted workflow with reduced user productivity.
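The adjustment described above can be sketched in Python. This is not the Capacity Planning Tool's ADJUST implementation; it is a simplified illustration that holds response time fixed, whereas in a shared system lower productivity would also reduce load and response time. The function name and sample values are hypothetical.

```python
def adjust_productivity(productivity_dpm: float, response_time_sec: float,
                        min_think_time_sec: float) -> float:
    """Return the highest valid productivity not exceeding the requested value."""
    cycle = 60.0 / productivity_dpm
    computed_think = cycle - response_time_sec
    if computed_think >= min_think_time_sec:
        return productivity_dpm  # workflow is already valid
    # Stretch the cycle so that computed think time equals minimum think time.
    return 60.0 / (response_time_sec + min_think_time_sec)


# Hypothetical workflow: 10 DPM requested, 4 s response time, 3 s minimum think time.
print(f"adjusted productivity: {adjust_productivity(10, 4.0, 3.0):.2f} DPM")
```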



Platform Capacity
Figure 5-10 shows the relationship between platform utilization, throughput,  and capacity. The chart shows performance of a four (4) core test platform running a series of batch tests. Platform utilization is shown by the vertical bars and the left vertical axis, and throughput is  shown by the RED line and the right vertical axis.



There is a direct relationship between utilization and throughput, since utilization is defined relative to peak throughput or capacity. Platform capacity is a constant value, and can be calculated from any throughput level (you do not need to conduct a peak load test to identify 100 percent platform capacity). Platform capacity can be calculated based on measured platform utilization for any throughput value (Capacity = Throughput / Utilization).

Computing Platform Service Times
Once you identify platform capacity, you can identify service time. The work transaction service time (transaction processing time per core) is the total number of platform processor cores divided by the platform capacity (capacity measured in work transactions per second).
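The service time calculation can be sketched in Python as follows. The second function combines it with Capacity = Throughput / Utilization so that service time can be estimated from a single measured load level. Function names and sample numbers are illustrative assumptions.

```python
def service_time_sec(cores: int, capacity_per_sec: float) -> float:
    """Service time = number of cores / capacity (work transactions per second)."""
    return cores / capacity_per_sec


def service_time_from_load(cores: int, throughput_per_sec: float,
                           utilization: float) -> float:
    """Service time from one measured load: cores * utilization / throughput."""
    return cores * utilization / throughput_per_sec


# Hypothetical 4 core platform: capacity of 8 displays per second, or equivalently
# a measured load of 2 displays per second at 25 percent utilization.
print(service_time_sec(4, 8.0), service_time_from_load(4, 2.0, 0.25))
```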

The spreadsheet in the upper left of the chart in Figure 5-10 shows service  time calculations for a series of 10 increasing Web load profiles. Capacity and service time calculated for each throughput were all very close (this assumes you can accurately measure utilization at the  lighter test loads). Knowing how to calculate service time at any defined platform throughput is important for establishing proper  workflow performance targets and verifying capacity planning targets are  satisfied when monitoring real system workflow loads.

Display Response Time
Figure 5-11 shows the relationship between platform utilization and display  response time. During initial processing loads, response time and service time are the same. As processing loads increase, inbound work transactions start to arrive at the same time. Each processor core can only process one program instruction at a time, so if two different work  transaction requests arrive for processing at the same time, one must  wait. This wait time (Queue time) increases as utilization increases, and as throughput approaches platform capacity queue time will increase  to unacceptable values (throughput will never reach full capacity). Display response time will continue to increase as platform throughput and utilization increase toward full capacity levels.



The relationship between service time and response time is demonstrated on the graphic. Response time is service time plus queue time. Initially response time is equal to service time. During higher system loads, response time will increase as queue time increases, while display service time stays the same for all loads. Transaction queue time is a function of display service time, utilization, and the number of platform cores.

Network Communication Performance Factors
Performance models used to support network communications follow the same type of  terms and relationships identified above for server platforms, with the  only difference being the names of these same terms. Figure 5-12 shows the terms and relationships used for network capacity planning.



The most important terms include definition of the average work transaction (display), work throughput (traffic), system capacity (bandwidth), and system utilization (network utilization). The network connection (switch port, router port, network interface card, host bus adapter, etc.) is the hardware that processes the network traffic, so knowing the number of NICs available in a selected hardware platform configuration or the number of WAN or Internet connections and aggregate bandwidth is important. Service time (network transport time) is a measure of the average work transaction processing time, and queue time is a measure of the time waiting to be processed (waiting in line for service).
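Because the network terms mirror the platform terms, the same arithmetic applies with traffic in place of throughput and bandwidth in place of capacity. The Python sketch below is an illustration under stated assumptions: traffic per display is expressed in megabits, bandwidth in megabits per second, and the function names and sample values are hypothetical.

```python
def network_traffic_mbps(displays_per_minute: float, megabits_per_display: float) -> float:
    """Average traffic (Mbps) generated by a given display throughput."""
    return displays_per_minute * megabits_per_display / 60.0


def network_utilization(traffic_mbps: float, bandwidth_mbps: float) -> float:
    """Network utilization = traffic / bandwidth (as a fraction)."""
    return traffic_mbps / bandwidth_mbps


def transport_time_sec(megabits_per_display: float, bandwidth_mbps: float) -> float:
    """Network transport (service) time for one display, ignoring latency and queuing."""
    return megabits_per_display / bandwidth_mbps


# Hypothetical load: 60 displays per minute at 2 Mb per display over a 10 Mbps link.
traffic = network_traffic_mbps(60, 2.0)
print(traffic, network_utilization(traffic, 10.0), transport_time_sec(2.0, 10.0))
```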

Work Transaction (display) Response time
Response time is the total time required to complete the work transaction. The software program instructions (work procedure) are executed sequentially, thus each instruction in the program must be executed before the next step in the program can be completed (results of the first step in the procedure often must be known before completing the second step, etc.). Response time includes all of the processing times and queue times experienced in completing an average work transaction. Figure 5-13 provides a system performance summary that shows service times, network transport time, and associated component service times displayed as a stacked bar chart for a series of batch process workflows. Display response time is represented at the top of each workflow stack, representing the sum of all service and queue times for each workflow display.



This Workflow Performance Summary chart is included in the Capacity Planning Tool (CPT) and shows the performance of each of the workflows identified in the CPT requirements module. The chart above shows results of a series of batch processes performed on a 2 core server platform. It shows service times for each system component (10 server tier and network with their associated queue time, plus network latency and client display processing time).

Response time is close to the total processing time for the first batch process, and the queue time grows on the server and network components as utilization increases. Queue time increases to more than processing time when the system is running more than two batch processes per core (total of 4 batch processes on the 2 core server). For batch processes, the server reaches full capacity with N+1 batch processes. Response time increases linearly after that point as concurrent batch processes compete for limited core processing resources.

Capacity Planning
The models supporting ESRI capacity planning today are based on the performance fundamentals introduced in this section. Platform capacity is determined by the software processing time (platform service time) and the number of platform cores, and is expressed in terms of peak displays per minute. Platform capacity (DPM) can be translated to supported concurrent users by dividing by the user productivity (DPM/client).
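These two steps can be sketched in Python. The capacity formula is the service time relationship rearranged (capacity per second = cores / service time) and scaled to a minute; the function names and sample platform are hypothetical.

```python
def platform_capacity_dpm(cores: int, service_time_sec: float) -> float:
    """Peak displays per minute = cores * 60 / service time (seconds per display)."""
    return cores * 60.0 / service_time_sec


def supported_concurrent_users(capacity_dpm: float, productivity_dpm: float) -> float:
    """Concurrent users = platform capacity (DPM) / user productivity (DPM per client)."""
    return capacity_dpm / productivity_dpm


# Hypothetical 4 core platform with a 0.5 s display service time and 10 DPM users.
capacity = platform_capacity_dpm(4, 0.5)
print(f"{capacity:.0f} DPM supports {supported_concurrent_users(capacity, 10):.0f} users")
```

This is the theoretical peak at 100 percent utilization; since queue time grows sharply near full capacity, a practical design would plan for a lower utilization target.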

The performance fundamentals discussed in this chapter do not change with changing technology, and an  understanding of these fundamentals will provide a solid foundation for  understanding system performance and scalability. Software and hardware technology will continue to change, and the terms and relationships  identified in this section can help us normalize these changes and help  us understand what is required to support our system performance needs.

The next section will discuss Network Communications, providing some  insight on how to build GIS solutions that support remote user  performance and Enterprise GIS scalability.

Previous Editions
[Spring 2010 Performance Fundamentals 27th Edition]

System Design Strategies 26th edition - An Esri ® Technical Reference Document • 2009 (final PDF release)