SPEC CPU95 Digital Unix Switch Disclosure
Compaq Computer Corporation
Revised 29 September 1998

This switch disclosure contains all switches used in Digital Unix SPEC
submissions in September 1998. Switches are given in alphabetical order
rather than by product or benchmark (it is hoped that this order will be
useful to the reader). The collating sequence ignores upper/lower case,
hyphens, and the presence of "no" for negation. That is, if you are
looking for "-nomumble", try looking under "-mumble".

Note: some switches in this disclosure statement are not used directly,
but are generated by other switches (e.g. "-fast").

-ag=a (KAP Fortran)
    Perform aggressive optimization; pad common blocks and
    subroutine-local memory to avoid cache line collisions.
-ag=b (KAP Fortran)
    Redefine the array indices of arrays when doing so would improve
    cache utilization.
-ag=f (KAP Fortran)
    Pad arrays as in -ag=b while relaxing the restriction that arrays
    cannot be passed as actual parameters to other procedures.
-ag=g (KAP Fortran)
    Extends padding for -ag=abf to pad one-dimensional arrays as well as
    the leading dimension of multi-dimensional arrays.
-ansi_alias (DEC C)
    Directs the compiler to assume the ANSI C aliasing rules.
-ansi_args (DEC C)
    Tells the compiler that the source code follows all ANSI rules about
    arguments; that is, whether the type of an argument matches the type
    of the parameter in the called function, or whether a function
    prototype is present so the compiler can automatically perform the
    expected type conversion.
-arch ev56 (DEC C, Digital Fortran)
    Generate code that may include byte/word manipulation instructions.
-arch ev6 (DEC C, Digital Fortran)
    Generate code that may include instructions that are supported on
    the Alpha 21264 chip.
-noarclimit (KAP Fortran)
    Specifies unlimited data dependence analysis.
-assume trusted_short_alignment (DEC C)
    Specifies that any short accessed through a pointer is naturally
    aligned.
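As an illustration of what "naturally aligned" means in the -assume trusted_short_alignment entry, the following minimal C sketch tests whether an address is a multiple of sizeof(short). The helper name is_naturally_aligned_short is hypothetical and is not part of DEC C; the sketch also assumes a flat address space in which converting a pointer to uintptr_t yields its numeric address.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (an assumption for illustration, not a DEC C API).
 * Returns 1 when the address in p is a multiple of sizeof(short), which
 * is the usual meaning of "naturally aligned" for a short, and 0
 * otherwise. Assumes uintptr_t holds the numeric address of p.
 */
int is_naturally_aligned_short(const void *p)
{
    return ((uintptr_t)p % sizeof(short)) == 0;
}
```

Code that reads a short through a pointer for which this check could fail (for example, a pointer into the middle of a packed byte buffer) does not satisfy the assumption that the switch asks the compiler to make.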
-assume whole_program (DEC C)
    Specifies that there are no occurrences of the address-of operator
    (&) being applied outside the current compilation unit to extern
    variables that are declared inside the current compilation unit.
    This flag is often suitable for use with the -ifo flag, which
    presents a group of source files to the compiler as a single
    compilation unit.
cc.alt (compiler)
    This command invokes the alternative DEC C compiler, which is
    intended to deliver faster runtime performance.
-conc (KAP Fortran)
    Restructure the source code for parallel processing.
-D_INTRINSICS (DEC C)
    Declares certain functions to be intrinsic. When a function is
    intrinsic, the compiler is free to generate faster code that
    provides the same function behavior (but may not actually call the
    function).
-D_INLINE_INTRINSICS (DEC C)
    Directs the compiler to inline some of the intrinsic functions,
    avoiding the overhead of a function call.
-D_FASTMATH (DEC C)
    Redefines the names of certain common math routines so that faster
    but slightly less accurate functions are used.
-fast (DEC C)
    Provides a single method for turning on a collection of
    optimizations for increased performance, namely:
        -ansi_alias
        -ansi_args
        -assume trusted_short_alignment
        -D_INTRINSICS
        -D_INLINE_INTRINSICS
        -D_FASTMATH
        -float
        -fp_reorder
        -ifo
        -O3
        -readonly_strings
-float (DEC C)
    Tells the compiler that it is not necessary to promote expressions
    of type float to type double.
-fp_reorder (DEC C)
    Allows floating-point operations to be reordered during
    optimization.
-fkapargs='...' (KAP Fortran)
    Pass the switches between apostrophes to the KAP optimizer.
-fuse (KAP Fortran)
    Perform loop fusion.
-fuselevel=1 or 2 (KAP Fortran)
    Perform more aggressive fusion (0=default).
-gen_feedback (DEC C)
    Generate accurate profile information for compiler feedback.
-granularity byte (Digital Fortran)
    Ensures that data of byte size can be accessed from different
    threads sharing data in memory.
-heaplimit=500 (KAP Fortran)
    KAP may require large amounts of memory in order to process your
    source code. The heaplimit switch specifies the maximum size, in
    megabytes, to which the KAP heap can grow.
-ifo (DEC C)
    Performs inter-file optimizations.
-inline speed (DEC C, Digital Fortran)
    Provides inline expansion of function calls even when doing so may
    significantly increase the size of the program.
-ipa (KAP Fortran)
    Do interprocedural analysis.
-ipa_optimize=2 (KAP Fortran)
    Enables a group of interprocedural analysis options useful for
    optimizing large codes, including -ipa, -ipa_loop_level=3,
    -ipa_depth=10, -heaplimit=500 and -noarclimit.
-ipa_depth=10 (KAP Fortran)
    The ipa_depth switch sets the maximum level of subprogram nesting
    which kapf will attempt to analyze.
-ipa_loop_level=3 (KAP Fortran)
    The ipa_loop_level switch enables the user to limit inlining to just
    those functions which are referenced in nested loops, where the
    effects of reduced function call overhead or enhanced optimizations
    will be multiplied.
kf77 (compiler)
    This command invokes the KAP Fortran 77 high-level optimizer and
    then invokes the Fortran 77 compiler.
kf90 (compiler)
    This command invokes the KAP Fortran 90 high-level optimizer and
    then invokes the Fortran 90 compiler.
ld (linker)
    This command invokes the linker.
-lexc (ld)
    Include the exception handling library.
-lsys5 (ld)
    Links with the system 5 version of system services.
-mc=1500 (KAP Fortran)
    The minconcurrent switch sets the level of work in a loop above
    which KAP executes the loop in parallel.
-merge (prof -pixie)
    Sums the .Counts files and writes the result into a new file.
-non_shared (ld)
    Directs the linker to produce a static executable. The output object
    created by the linker will not use any shared objects during
    execution.
-O0 through -O5 (Digital Fortran)
    Fortran's general optimization level.
        O0  disable all optimizations
        O1  local optimizations and common subexpressions
        O2  global optimizations such as code motion, strength
            reduction, lifetime analysis, and code scheduling
        O3  additional global optimizations that may cost more space,
            such as loop unrolling and code replication
        O4  inline expansion
        O5  software pipelining and loop transformation
-O0 through -O4 (DEC C)
    DEC C's general optimization level.
        O0  disable all optimizations
        O1  local optimizations and common subexpressions;
            global optimizations such as code motion, strength
            reduction, lifetime analysis, and code scheduling
        O2  additional global optimizations that may cost more space,
            such as loop unrolling and code replication
        O3  inline expansion
        O4  software pipelining
-o=0 (KAP Fortran)
    KAP's general optimization level. Setting it to 0 disables
    optimizations such as DO loop interchanging and lifetime analysis.
om (postlink optimizer)
-om (Digital Fortran, DEC C)
    This command (or switch if added to a Fortran or C compile command)
    invokes the om post-link time optimizer which does optimizations
    such as "nop" removal, .lita removal, and global pointer
    repositioning.
-om_ireorg_feedback (om)
    Uses the pixie-produced information in file.Counts and file.Addrs to
    reorganize the instructions to reduce cache thrashing.
-om_split_procedures (om)
    Allows om to break procedures into more than one piece.
-pids (DEC C, pixie)
    Enables the addition of the process-id to the filename of the basic
    block counts file (.Counts). This facilitates collecting information
    from multiple invocations of the pixie output file.
-pipeline (DEC C, Digital Fortran)
    Enables software pipelining, that is, "wrap around" of loop
    iterations to reduce latency.
pixie (profiling)
    Add profiling code to a program.
prof -pixie (profiling)
    Analyze profile data.
-prof_dir /tmp/prof (DEC C)
    Specifies a location to which the profiling data files (.Counts and
    .Addrs) are written.
-prof_gen (DEC C)
    Generates an executable image that has profiling code added to it.
-prof_use_feedback (DEC C)
    Uses profiling feedback to improve runtime performance.
-prof_use_om_feedback (DEC C)
    Uses profiling feedback to rearrange the resulting image to reduce
    cache conflicts of the program text. This flag uses the -om postlink
    optimizer.
protect_headers_setup.sh (DEC C installation option)
    Ensures that the compiler's assumptions about pointer sizes and data
    alignments are not in conflict with the default values that were in
    effect when the system libraries were created.
-r (ld)
    Retain relocation entries in the output file.
-r=1 (KAP Fortran)
    Roundoff option specifying how much change in serial roundoff error
    is tolerable. The range of values allowed is 0 through 3, from least
    to most differences allowed.
-readonly_strings (DEC C)
    Makes string literals read-only for improved performance.
setenv DECFORT_CC cc.alt (Fortran Driver)
    The Fortran driver does not invoke the linker or om directly;
    instead it calls the C driver which does these tasks. Setting the
    environment variable DECFORT_CC tells Fortran where to find the C
    driver, in this case /usr/lib/cmplrs/cc.alt.
-so=0 (KAP Fortran)
    Scalar optimizations. Setting this switch to 0 disables
    optimizations such as code floating out of loops, loop unrolling,
    and loop peeling.
-speculate all (Digital Fortran, DEC C)
    Enables speculative code scheduling for all routines in the program.
    Speculation occurs when a conditionally executed instruction is
    moved to a position before a test instruction so that the moved
    instruction is then executed unconditionally.
-std1 (DEC C)
    Strictly enforces the ANSI C standard and all its prohibitions.
-taso (ld)
    Directs the linker to load the executable file in the lower 31-bit
    addressable virtual address range.
tcp_ttl (Digital Unix)
    Sets the time to live for TCP/IP sockets; default is 60 seconds.
-transform_loops (Digital Fortran)
    Enables a group of loop transformation optimizations that apply to
    array references within loops, including loop blocking,
    distribution, fusion, and interchange.
-unsigned (DEC C)
    Causes all char declarations to be unsigned char declarations.
-ur= (KAP Fortran) [alternate spelling: -unroll]
    The maximum number of iterations to unroll inner loops.
-ur2= (KAP Fortran) [alternate spelling: -unroll2]
    The maximum work allowed in an unrolled loop. Work is estimated by
    counting operands and operators in a loop.
-ur3= (KAP Fortran) [alternate spelling: -unroll3]
    The lower limit for unrolling. If there are fewer than n units of
    work in the loop, the loop will not be unrolled.
-xtaso_short (DEC C)
    Directs the compiler to allocate 32-bit pointers by default. You can
    still use 64-bit pointers, but only by the use of pragmas.
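As a reading aid for the collating sequence described at the top of this disclosure (case, hyphens, and a negating "no" are ignored), the following C sketch shows one way a lookup key could be derived from a switch name. The function collation_key and its is_negated parameter are hypothetical and are not part of any Compaq tool. The caller must state whether the switch is a negated form, because a leading "no" is not always a negation: -noarclimit is listed under "arclimit", while -non_shared is listed under "n".

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (an assumption for illustration only).
 * Writes a lookup key for the switch `name` into `out`:
 *   - every hyphen is dropped,
 *   - letters are folded to lower case,
 *   - a leading "no" is dropped only when is_negated is nonzero.
 * The key is truncated to fit out_size and is always NUL-terminated
 * when out_size > 0. Returns the length of the key.
 */
size_t collation_key(const char *name, int is_negated,
                     char *out, size_t out_size)
{
    size_t n = 0;
    const char *p;

    if (out_size == 0)
        return 0;

    for (p = name; *p != '\0' && n + 1 < out_size; ++p) {
        if (*p == '-')
            continue;                       /* hyphens are ignored */
        out[n++] = (char)tolower((unsigned char)*p);
    }
    out[n] = '\0';

    if (is_negated && n >= 2 && out[0] == 'n' && out[1] == 'o') {
        /* shift the key left by two, including the terminator */
        memmove(out, out + 2, n - 2 + 1);
        n -= 2;
    }
    return n;
}
```

With this sketch, "-nomumble" marked as negated and "-mumble" both produce the key "mumble", which matches the lookup advice given in the introduction.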