Switch Disclosure for Digital Alpha-based Systems for SPEC (07/22/96)

The following are the Digital Unix AXP C compiler, DEC Fortran, and KAP
Preprocessor switches/qualifiers used for SPEC95:

-aligned_data
    Ensure external symbols and commons are at least quadword aligned.

-Dname=def
-Dname
    Define the name to the C macro preprocessor, as if by #define. If no
    definition is given, the name is defined as "1". Used to set common
    portability switches or invoke available intrinsic or built-in
    functions.

-std
    Have the compiler produce warnings for language constructs that are
    not standard in the language. -std0 enforces the K & R standard with
    some ANSI extensions, -std1 enforces the ANSI C standard, and -std
    enforces the ANSI C standard with popular extensions. The -std flag
    causes the macro __STDC__=0 to be passed to the preprocessor, -std1
    causes the macro __STDC__=1 to be passed, and -std0 causes __STDC__
    to be undefined. The default is -std0.

-migrate
    This switch changes the cc compiler location to that of the DEC C
    compiler, which supports the language processing rules and language
    extensions common to C on other Digital operating systems.

-O2 (-O for cc)
    Invoke the global optimizer.

-O3
    Do all above optimizations, including global register allocation
    and procedure inlining. A ucode object file is created for each C
    source file and left in a .u file.

-O4 (-O for f77)
    Enables inline expansion of small procedures and all -O3
    optimizations.

-O5
    Enables software pipelining, which in certain cases improves
    run-time performance.

-align commons
    Aligns all COMMON block entities on natural boundaries, up to
    4 bytes. The default is -align nocommons.

-align dcommons
    Aligns all COMMON block entities on natural boundaries, up to
    8 bytes. The default is -align nocommons. For optimal performance
    on AXP systems, specify -align records (the default) and
    -align dcommons.

-align records
    Aligns all RECORD fields on the natural boundaries.
    This is the default unless you specify -vms (which sets the
    -align norecords option).

-arch ev56
    Optimizes for the ev56 architecture.

-assume noaccuracy_sensitive
    Reorders floating-point operations, based on algebraic identities
    (inverses, associativity, and distribution), to improve
    performance. The default is -assume accuracy_sensitive.

-automatic
    Places local variables on the run-time stack. The default is
    -noautomatic.

-fkapargs='${KAP_FLAGS}'
    Passes KAP switches to the KAP preprocessor for kf77 usage.

-inline speed
    Inlines all of the routines in the -inline manual category, plus
    any additional calls that the compiler determines will improve
    run-time performance, even where it may significantly increase the
    size of the program. This is the default for optimization levels
    -O, -O4, and -O5.

-non_shared
    Does not produce a dynamic executable. The linker will search
    regular archive library (.a) files to resolve undefined
    references; .so files are not searched. Object files (.o suffix)
    from archives are included in the executable produced. This option
    is only available on DEC OSF/1 systems. The default is
    -call_shared.

-math_library fast
    Use the available fast math library routines.

-ifo
    Provides improved optimization and code generation across file
    boundaries that would not be available if the files were compiled
    separately. For purposes of code generation, the compiler treats
    the files as one application.

-speculate all
    Enables speculative code scheduling for all routines in the
    program. Speculation occurs when a conditionally executed
    instruction is moved to a position before a test instruction so
    that the moved instruction is then executed unconditionally.

-unsigned
    Causes all char declarations to be unsigned char declarations.

-assume whole_program
(formerly: -Wf,-noptrs_to_globals -assume noptrs_to_globals)
    Declares that extern variables declared in the current compilation
    unit do not have the address-of operator (&) applied outside of
    the current compilation unit. This effectively means that access
    through a pointer cannot reference most extern variables, so the
    compiler can perform better optimizations. This switch is commonly
    used when the whole program is given to the compiler in a single
    compilation using -ifo.

-32data
    Binds the long datatype to 32 bits.

-xtaso_short
    Directs the compiler to allocate 32-bit pointers by default. You
    can still use 64-bit pointers, but only by the use of pragmas.
    System libraries are protected (that is, remain 64-bit) if the
    script /usr/lib/cmplrs/cc.alt/protect_headers.sh is run.

-W...
    Passes command line options to other phases of the compilation
    process.

The following are the KAP switches/qualifiers used:

-ag=a
    Perform aggressive optimization: pad common blocks and
    subroutine-local memory to avoid cache line collisions.

-ag=b
    Redefines the array indices of arrays when doing so would improve
    cache utilization.

-ag=c
    Permits KAP to inline routines that contain static (SAVE or DATA)
    variables by promoting the static variables to members of a COMMON
    that is introduced into the program.

-conc
    Directs KAP to restructure the source code for parallel
    processing.

-hli=
    The -hoist_loop_invariants option controls code hoisting of
    loop-invariant expressions from loops. Note that this switch is
    independent of the switches that control the floating of
    invariant-IFs out of loops, -each_invariant_if_growth and
    -max_invariant_if_growth. The possible settings for this option
    are the following:
    0  Turns off the hoisting of invariant code from loops.
    1  Hoists (floats) all loop-invariant expressions that are not
       under the control of an IF-structure within the given loop
       nest. This is the default setting.
    2  Same as 1, except that a zero-trip IF statement is not created
       to guard the loop to protect array references that are
       potentially out of bounds when floated outside the loop. This
       can give a slight performance increase at the expense of a
       possible run-time error.
    3  Float all loop-invariant expressions from the loop structure.

-arl=
    Address resolution option to inform KAP-C what level of data
    aliasing may be present in the program.
    4  No aliases for objects.

-fuse
    Perform loop fusion.

-heaplimit
    KAP may require large amounts of memory in order to process your
    source code. The heaplimit switch specifies the maximum size, in
    megabytes, to which the KAP heap can grow.

-ind=
    Inline depth qualifier that sets the maximum level of subprogram
    nesting which KAP will attempt to inline.

-inl
    Allow inlining of subroutines.

-inll=
    Restrict inlining to those routines referenced at the deepest
    levels of nested loops.

-inlc[=
    Inline subroutines or functions and include the text of the
    routine in the transformed code file.

-ipa_loop_level=
    The ipa_loop_level switch lets the user limit inlining to
    functions that are referenced in nested loops, where the effects
    of reduced function call overhead or enhanced optimizations will
    be multiplied. The switch can be abbreviated -ipall.

-ipa_depth=
    The ipa_depth switch sets the maximum level of subprogram nesting
    which kapf will attempt to inline.

-ipa_optimize=2
    Enables a group of interprocedural analysis options, including
    -ipa_loop_level=3, -ipa_depth=10, -heaplimit=500 and -noarclimit.

-lc=
    Directs KAP to replace sections of code with calls to standard
    numerical library routines which have the same functionality.

-mc=
    The minconcurrent switch sets the level of work in a loop above
    which KAP executes the loop in parallel.

-noarclimit
    Specifies unlimited data dependence analysis.

-o=
    The optimize option sets the loop optimization and code analysis
    levels, ranging from 0 (minimum optimization) to 5 (maximum
    optimization). For example:
        -o=5
    Each optimization level is cumulative, and optimization level 5
    includes all optimizations from levels 0 through 4 plus the
    following additional optimization:
    5  Array expansion and loop fusion are enabled.

-r=
    Set roundoff option to specify the change in serial roundoff
    error that is tolerable.

-ur=
    Sets the number of times to unroll scalar inner loops.

-ur2=
    Sets the weighting factor on the level of work per unrolled
    iteration.

-tune=ev5
    Optimize for EV5 architecture.

-nat
    Use natural alignment, so that, for example, REAL*8 entities will
    always start on double-word boundaries. Natural alignment
    specifies that variables and arrays in COMMON blocks will start on
    boundaries which correspond to their size.

Fortran:

kf77
    This command invokes KAPF transparently and then passes the kap'd
    output to f77.

-fkapargs=' -ag=a'
    -Aggressive=a means that kapf will pad COMMON blocks in an attempt
    to avoid cache line collisions. This assumes the following:
      - All COMMON blocks will be visible to kapf in the course of
        processing the entire file.
      - If the same COMMON block has two different layouts, these two
        layouts are fully independent and do not pass values between
        each other.

-fast
    Sets the following command options that can improve run-time
    performance: -assume noaccuracy_sensitive, -O4, -align dcommons,
    and -math_library fast:

    -O4
        Enables inline expansion of small procedures and all -O3
        optimizations.
    -align dcommons
        Aligns all COMMON block entities on natural boundaries, up to
        8 bytes. The default is -align nocommons. For optimal
        performance on AXP systems, specify -align records (the
        default) and -align dcommons.
    -math_library fast
        Specifies that the compiler is to select the version of the
        math library routine which provides the highest execution
        performance for mathematical intrinsic functions EXP and SQRT.

Linker:

The following are the linker switches/qualifiers used:

-lm
    Link with the math library (libm.a).

-non_shared
    Produce a static executable.
    The output object created will not use any shared objects during
    execution.

-trapuv
    Initializes dynamic memory to trap.

-lsys5
    Links with the System V version of system services.

-om
    Performs code optimization after linking, including "nop" removal,
    .lita removal, simple peephole optimizations on basic blocks,
    procedure reordering based on the static call graph, and common
    reallocation. -om is only supported for -non_shared programs.
    Options can be passed directly to om using -WL:

    -WL,-om_compress_lita
        Compresses non-unique .lita entries rather than removing them.
    -WL,-om_dead_code
        Removes dead code (unreachable instructions) generated after
        applying optimizations.
    -WL,-om_no_align_labels
        Turns off alignment of labels. Normally om will quadword align
        the targets of all branches to improve loop performance.
    -WL,-om_Gcommon,
        Sets the size threshold of "common" symbols. All "common"
        symbols whose size is less than or equal to this threshold
        will be allocated close to each other.
    -WL,-gen_feedback
        Use to generate an alternate om feedback file.
    -WL,-om_ireorg_feedback
        Uses feedback information to rearrange code based on basic
        block frequencies.

-tL -h/usr/lib/cmplrs/cc.alt -B
    Uses the version of om contained in the cc.alt directory.

Operating system:

vm-page-free-min 5760
    Sets a goal of 45 MB of free pages.