SPEC News: SPEC95 Q&A

SPEC95 Questions And Answers

By Kaivalya Dixit, IBM, Austin, Texas and
Jeff Reilly, Intel Corporation, Santa Clara, Calif.

Published September, 1995; see disclaimer.

Q1: What is SPEC95?
Q2: What is a benchmark?
Q3: What does the "C" in CINT95 and CFP95 stand for?
Q4: What components do CINT95 and CFP95 measure?
Q5: What components do CINT95 and CFP95 not measure?
Q6: What is included in the SPEC95 package?
Q7: What does the user of SPEC95 have to provide?
Q8: What are the basic steps in running the benchmarks?
Q9: What source code is provided? What exactly makes up these suites?
Q10: What metrics can be measured?
Q11: What is the difference between a "base" metric and a "non-base" metric?
Q12: What is the difference between a "rate" and a "non-rate" metric?
Q13: Why and/or when should I use SPEC95?
Q14: Which metric should you use?
Q15: SPEC92 is already an available product. Why create SPEC95?
Q16: What happens to SPEC92 after SPEC95 is released?
Q17: Is there a way to translate SPEC92 results to SPEC95 results?
Q18: What criteria were used to select the benchmarks?
Q19: Weren't some of the SPEC95 benchmarks in SPEC92?
Q20: Why were some of the benchmarks not carried over?
Q21: What is the reference machine?
Q22: How long does it take to run the suite?
Q23: What if the tools cannot be run or built on a system?
Q24: Is there any place that results will be available?
Q25: How do I contact SPEC?
Q26: How do I order the suite?

Q1: What is SPEC95?

A1: SPEC95 is a software benchmark product produced by SPEC. It was designed to provide comparable measures of performance for comparing compute-intensive workloads on different computer systems. SPEC95 contains two suites of benchmarks:

CINT95: for measuring and comparing compute-intensive integer performance.
CFP95: for measuring and comparing compute-intensive floating point performance.

Q2: What is a benchmark?

A2: The definition from Webster's II Dictionary states: "A standard of measurement or evaluation." SPEC is a non-profit corporation formed to establish and maintain computer benchmarks for measuring component and system level computer performance.

Q3: What does the "C" in CINT95 and CFP95 stand for?

A3: In its product line, SPEC uses "C" to denote a "component" benchmark and "S" to denote a "system" benchmark. CINT95 and CFP95 are component benchmarks.

Q4: What components do CINT95 and CFP95 measure?

A4: Being compute-intensive benchmarks, these benchmarks emphasize the performance of the computer's processor, the memory architecture and the compiler. It is important to remember the contribution of the latter two components; performance is more than just the processor.

Q5: What components do CINT95 and CFP95 not measure?

A5: The CINT95 and CFP95 benchmarks do not stress other computer components such as I/O (disk drives), networking or graphics. Note that it may be possible to configure a system in such a way that one or more of these components impact the performance of CINT95 and CFP95. However, that is not the intent of the suites.

Q6: What is included in the SPEC95 package?

A6: SPEC provides the following in its SPEC95 package:

SPEC95 tools for compiling, running and validating the benchmarks, compiled for a variety of operating systems.
Source code for the SPEC95 tools, to allow the tools to be built for systems not covered by the precompiled tools.
Source code for the benchmarks.
Tools for generating performance reports.
Run and reporting rules defining how the benchmarks should be used to produce standard results.
SPEC95 documentation.

The initial offering of SPEC95 will have tools for most UNIX operating systems. Additional products for other operating systems (Windows NT, VMS, etc.) will be released later as products if SPEC detects enough demand. All of this will be shipped on a single CD.

Q7: What does the user of SPEC95 have to provide?

A7: The user must have a computer system running a UNIX environment with a compiler installed and a way of reading the SPEC95 media (CD player). Approximately 150MB will be needed on a hard drive to install SPEC95. It is also assumed that the system has at least 64MB of RAM to ensure that the benchmarks remain compute intensive and do not page. (SPEC is assuming this will be the standard amount of desktop memory during the life of this suite.)

Q8: What are the basic steps in running the benchmarks?

A8: Installation and use are covered in detail in the SPEC95 User Documentation. The basic steps are:

Install SPEC95 from media.
Run the installation scripts specifying your operating system.
Compile the tools, if executables are not provided in SPEC95.
Determine what metric you wish to run.
Create a configuration file for that metric. In this file, you specify compiler flags and other system dependent information.
Run the SPEC tools to build (compile), run and validate the benchmarks.
If the build, run and validation steps are successful, generate a report based on the run times and metric equations.

Q9: What source code is provided? What exactly makes up these suites?

A9: CINT95 and CFP95 are based on compute-intensive applications provided as source code. CINT95 contains eight applications, written in C, used as benchmarks:

Name    Remarks
099.go    Artificial Intelligence, plays the game of Go.
124.m88ksim Motorola 88K chip simulator, runs test program.
126.gcc   New Version of GCC, builds SPARC code.
129.compress  Compresses and uncompresses file in memory.
130.li    Lisp interpreter.
132.ijpeg Graphic compression and decompression.
134.perl  Manipulate strings (anagrams) and prime numbers in Perl.
147.vortex  A database program.

CFP95 contains 10 applications, written in FORTRAN, used as benchmarks:

Name    Remarks
101.tomcatv Tomcatv is a mesh generation program.
102.swim  Shallow Water Model with 513 x 513 grid.
103.su2cor  Quantum physics. Monte Carlo simulation.
104.hydro2d Astrophysics. Hydrodynamical Navier Stokes equations.
107.mgrid Multi-grid solver in 3D potential field.
110.applu Parabolic/elliptic partial differential equations.
125.turb3d  Simulate isotropic, homogeneous turbulence in a cube.
141.apsi  Solve temp., wind, velocity and dist. of pollutants.
145.fpppp Quantum chemistry.
146.wave5 Plasma physics. Electromagnetic particle simulation.

Q10: What metrics can be measured?

A10: The CINT95 and CFP95 suites can be used to measure and calculate the following metrics:

CINT95:

SPECint95: The geometric mean of eight normalized ratios (one for each integer benchmark) when compiled with aggressive optimization for each benchmark.
SPECint_base95: The geometric mean of eight normalized ratios (one for each integer benchmark) when compiled with conservative optimization for each benchmark.
SPECint_rate95: The geometric mean of eight normalized throughput ratios (one for each integer benchmark) when compiled with aggressive optimization for each benchmark.
SPECint_rate_base95: The geometric mean of eight normalized throughput ratios (one for each integer benchmark) when compiled with conservative optimization for each benchmark.

CFP95:

SPECfp95: The geometric mean of 10 normalized ratios (one for each floating point benchmark) when compiled with aggressive optimization for each benchmark.
SPECfp_base95: The geometric mean of 10 normalized ratios (one for each floating point benchmark) when compiled with conservative optimization for each benchmark.
SPECfp_rate95: The geometric mean of 10 normalized throughput ratios (one for each floating point benchmark) when compiled with aggressive optimization for each benchmark.
SPECfp_rate_base95: The geometric mean of 10 normalized throughput ratios (one for each floating point benchmark) when compiled with conservative optimization for each benchmark.

The ratio for each benchmark is calculated using a SPEC-determined reference time and the run time of the benchmark.
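
As a rough illustration of how such a metric is assembled, the sketch below takes each benchmark's ratio as the reference time divided by the measured run time, then takes the geometric mean of the ratios. The times are invented placeholders, not SPEC reference times or published results, and the function names are hypothetical. A real metric would use all eight integer (or all 10 floating point) benchmarks.

```python
import math

def benchmark_ratio(reference_time, run_time):
    """Normalized ratio: reference time divided by measured run time."""
    if reference_time <= 0 or run_time <= 0:
        raise ValueError("times must be positive")
    return reference_time / run_time

def geometric_mean(values):
    """Geometric mean, computed via logarithms."""
    values = list(values)
    if not values:
        raise ValueError("need at least one ratio")
    return math.exp(sum(math.log(v) for v in values) / len(values))

# Placeholder (reference_time, run_time) pairs in seconds; NOT real SPEC data.
example_times = {
    "099.go": (4000.0, 1000.0),
    "126.gcc": (2000.0, 1000.0),
    "130.li": (1000.0, 1000.0),
}

ratios = [benchmark_ratio(ref, run) for ref, run in example_times.values()]
print(round(geometric_mean(ratios), 3))
```

With these placeholder ratios of 4, 2 and 1, the geometric mean is 2. A geometric mean keeps one unusually fast benchmark from dominating the result the way it would in an arithmetic mean.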

Q11: What is the difference between a "base" metric and a "non-base" metric?

A11: In order to provide comparisons across different computer hardware, SPEC had to provide the benchmarks as source code. Thus, in order to run the benchmarks, they must be compiled. There was agreement that the benchmarks should be compiled the way users compile programs. But how do users compile programs? On one side, people may experiment with many different compilers and compiler flags to achieve the best performance. On the other side, people may just compile with the basic options suggested by the compiler vendor. SPEC recognizes that it cannot exactly match how everyone uses compilers, but two reference points are possible. The base metrics (e.g., "SPECint_base95") are required for all reported results and have set guidelines for compilation (e.g., the same flags must be used in the same order for all benchmarks). The non-base metrics (e.g., "SPECint95") are optional and have less strict requirements (e.g., different compiler options may be used on each benchmark).

A full description of the distinctions can be found in the SPEC95 Run and Reporting rules available with SPEC95.
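
The one base guideline mentioned above (the same flags in the same order for every benchmark) can be sketched as a simple check. This is only an illustration of that single guideline, not the SPEC tools or the full Run and Reporting Rules; the function name and the flag values are hypothetical examples.

```python
def uses_same_flags_everywhere(flags_by_benchmark):
    """Return True if every benchmark has the same flags in the same order."""
    flag_lists = list(flags_by_benchmark.values())
    return all(flags == flag_lists[0] for flags in flag_lists)

# Hypothetical flag choices, for illustration only.
base_style_config = {
    "126.gcc": ["-O2"],
    "130.li": ["-O2"],
}
non_base_style_config = {
    "126.gcc": ["-O3", "-funroll-loops"],
    "130.li": ["-O2"],
}

print(uses_same_flags_everywhere(base_style_config))      # True
print(uses_same_flags_everywhere(non_base_style_config))  # False
```

Because the comparison is between lists, flag order matters: the same flags in a different order would not pass the check.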

Q12: What is the difference between a "rate" and a "non-rate" metric?

A12: There are several different ways to measure computer performance. One way is to measure how fast the computer completes a single task; this is a speed measure. Another way is to measure how many tasks a computer can accomplish in a certain amount of time; this is called a throughput, capacity or rate measure.

The SPEC speed metrics (e.g., SPECint95) are used for comparing the ability of a computer to complete single tasks. The SPEC rate metrics (e.g., SPECint_rate95) measure the throughput or rate of a machine carrying out a number of tasks.
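
To make the distinction concrete, here is a small sketch with two hypothetical machines. It is not the SPEC rate formula, only the general idea of tasks completed per unit of time; the machine descriptions and numbers are invented.

```python
def throughput_per_hour(tasks_completed, elapsed_seconds):
    """Tasks completed per hour over the measured interval."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return tasks_completed * 3600.0 / elapsed_seconds

# Hypothetical machine A: one task at a time, 100 seconds per task.
speed_a = 100.0
rate_a = throughput_per_hour(1, 100.0)

# Hypothetical machine B: 150 seconds per task, but four tasks run at once.
speed_b = 150.0
rate_b = throughput_per_hour(4, 150.0)

print(speed_a < speed_b)  # A finishes a single task sooner
print(rate_a, rate_b)     # B completes more tasks per hour
```

In this example machine A wins on speed while machine B wins on throughput, which is why the two kinds of metric can rank systems differently.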

Q13: Why and/or when should I use SPEC95?

A13: Typically, the best measure of a system is your own application with your own workload. Unfortunately, it is often very difficult and expensive to get a wide base of reliable, repeatable and comparable measurements on different systems for your own application with your own workload. This may be due to time, money or other constraints. That's why benchmarks exist -- to act as a reference point for comparison. It's the same reason that EPA gas mileage exists although probably no driver in America gets exactly the EPA gas mileage. If you understand what benchmarks measure, they're useful. It's important to note that CINT95 and CFP95 are CPU-focused and not system-focused benchmarks. These CPU benchmarks focus on only one portion of the factors that contribute to application performance. An application whose bottleneck is, say, graphics or networking will not be represented by these benchmarks. Understanding your own needs helps determine the relevance of the benchmarks.

Q14: Which metric should you use?

A14: As mentioned, this depends on your needs. SPEC provides the benchmarks and results as tools for you to use. You need to determine how you use a computer or what your performance requirements are, and then choose the appropriate SPEC benchmark or metrics.

For example, a single user running a compute-intensive integer program may only be interested in SPECint95 or SPECint_base95. On the other hand, a person who maintains a machine used by multiple scientists running floating point simulations may be more concerned with SPECfp_rate95 or SPECfp_rate_base95.

Q15: SPEC92 is already an available product. Why create SPEC95, and will it show anything different from SPEC92?

A15: Technology is always improving. As the technology improves, the benchmarks need to improve as well. SPEC needed to address the following issues:

Runtime -- Several of the SPEC92 benchmarks were running in less than a minute on the leading edge processors/systems. Given the SPEC measurement tools, small changes or fluctuations in the measurements were having significant impacts on the percentage improvements being seen. SPEC chose to make the benchmarks longer to take into account future performance and prevent this from being an issue for the life of the suite.
Application size -- Many comments received by SPEC indicated that applications had grown in complexity and size and that SPEC92 was becoming less representative of what was being run on current systems. One of the criteria used in selecting benchmarks was seeking some programs with larger resource requirements to provide a mix with some of the smaller programs.
Application type -- SPEC felt that there were additional application areas that should be included to increase the variety and representation within the suites. Areas such as imaging and database were added.
Portability -- SPEC found that compute-intensive performance was important beyond the UNIX workstation arena where SPEC was founded. Thus, it was important that the benchmarks and the tools running the benchmarks be as independent of the operating system as possible. While the first release of SPEC95 will be geared toward UNIX, SPEC has consciously chosen programs and tools that are dependent only upon POSIX or ANSI standard development environments. SPEC will produce additional releases for other operating systems (e.g. WIN/NT) based on demand.
Moving target -- The initial hope for benchmarks is that improvements in the benchmark performance will be generally applicable to other situations. However, as the competition develops, it is feared that improvements in the test performance become specific to that test only. By updating the benchmarks frequently, it is hoped that general improvements will be encouraged and test-specific optimizations will become less effective.
Education -- As the computer industry grows, benchmark results are being quoted more often. With the release of a new suite, this is SPEC's new opportunity to discuss and clarify how and why the suite was developed.

Q16: What happens to SPEC92 after SPEC95 is released?

A16: SPEC will begin obsoleting SPEC92. The results published by SPEC will be marked as obsolete and by June 1996, SPEC will stop publishing SPEC92 results and selling the SPEC92 suites.

Q17: Is there a way to translate SPEC92 results to SPEC95 results or vice versa?

A17: There is no formula for converting SPEC92 results to SPEC95 results. They are different products. There may be high correlation between SPEC92 and SPEC95 (i.e., machines with higher SPEC92 results may have higher SPEC95 results) but there is no universal formula for all systems. SPEC is strongly encouraging SPEC licensees to publish SPEC95 numbers on older platforms to provide a historical perspective.

Q18: What criteria were used to select the benchmarks?

A18: In the process of selecting applications to use as benchmarks, SPEC considered the following criteria:

Portability to all SPEC hardware architectures (32- and 64-bit including Alpha, Intel Architecture, PA-RISC, Rxx00, Sparc, etc.)
Portability to various operating systems, particularly UNIX, NT and VMS.
Benchmarks should not include measurable I/O.
Benchmarks should not include networking or graphics.
Benchmarks should run in 64MB RAM without swapping. (SPEC is assuming this will be a minimal memory requirement for the life of SPEC95; and the emphasis is on compute-intensive performance and not disk activity).
Benchmarks should run at least five minutes on a DEC 200MHz Alpha system.
Benchmarks should not spend more than five percent of time in non-SPEC provided code.

Q19: Weren't some of the SPEC95 benchmarks in SPEC92 and how are they different?

A19: Some of the benchmarks from SPEC92 are included in SPEC95. However, all benchmarks were given different workloads or modified to improve their coding style or resource utilization. These include:

CINT95: 126.gcc, 129.compress, 130.li
CFP95: 101.tomcatv, 102.swim, 103.su2cor, 104.hydro2d, 145.fpppp, 146.wave5

The benchmarks are given different identifying numbers to distinguish them from versions in previous suites and indicate they are not comparable.

Q20: Why were some of the benchmarks not carried over?

A20: Some benchmarks were not carried over because it was not possible to create a longer running workload or to create a more robust workload, or the benchmark was too susceptible to benchmark specific compiler optimization.

Q21: What is the reference machine?

A21: SPEC used the SPARCstation 10/40 (40MHz SuperSPARC with no L2 cache) as a reference machine to normalize the performance metrics used in the SPEC95 suites. Each benchmark is run and measured on this machine to establish a reference time for that benchmark. These times are then used in the SPEC calculations. It took approximately 48 hours to run a SPEC conforming execution of CINT95 and CFP95 on this machine.

Q22: How long does it take to run the suite?

A22: This depends on the suite and the machine that is running the benchmarks. As mentioned above, on the reference machine it takes two days for a SPEC-conforming run (at least three iterations of each benchmark to ensure reproducibility).

Q23: What if the tools cannot be run or built on a system? Can they be run manually?

A23: To generate SPEC-compliant results, the tools used must be approved by SPEC. If several attempts at using the SPEC tools are not successful for the operating system for which you purchased SPEC95, you should contact SPEC for technical support. SPEC will work with you to correct the problem and/or investigate SPEC compliant alternatives.

Q24: What if I don't want to run the benchmarks? Is there any place that results will be available?

A24: There are several current alternatives:

Every quarter, SPEC publishes the SPEC Newsletter which contains results submitted by SPEC members and licensees. Subscription information is available from SPEC.
SPEC provides information to the Performance Database Server found at: http://performance.netlib.org/performance/html/spec.html This typically lags three months behind the SPEC Newsletter.

SPEC is working on establishing its own Internet presence but final details are not yet available.

Q25: How do I contact SPEC?

A25: For SPEC's mail address, phone numbers, and specific online contact points, see the SPEC Contact Info Page.

Q26: How do I order the suite?

A26: The suite will be available from SPEC within 45 days of the announcement on Aug. 21. For information, contact SPEC at the address above. Media and pricing are as follows:

The primary distribution medium of SPEC95 (CINT95 and CFP95) will be on a CD-ROM.

All SPEC95 orders placed on or before December 31, 1995 will include a complimentary one year subscription to SPEC Newsletter.

Product         New             New             Current         Current
                Customer        University      SPEC92          University
                                                Licensee        Licensee

SPEC95 CD-ROM   $600            $300            $300            $150

Kaivalya Dixit is the president of SPEC and works for IBM in Austin, Tx. Jeff Reilly is the Release Manager for SPEC95 and is a Project Lead at Intel Corporation in Santa Clara, Calif.

Standard Performance Evaluation Corporation
