| Sun SPEC CPU2000 Flag Descriptions
Sun ONE Studio 8
Last updated: 09-Mar-2003
 Note: This flags file is alphabetized by command or switch name, 
without regard to upper/lower case, without regard to the presence
or absence of a leading "-", and without regard to the software
component that uses the command or switch.   The component is mentioned in 
(parentheses) immediately after the name of the command or switch.  
 It is hoped that this order of presentation will make it easier to look
up commands or switches even if the reader does not already know what
software component they belong to.
 
 -Abcopy (optimizer) 
Increase the probability that the compiler will perform 
memcpy/memset transformations.
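
As an illustration of the kind of code this switch concerns (a sketch only; the function names are hypothetical and nothing here is taken from the compiler's documentation), the hand-written loops below have the same effect as single calls to memset and memcpy, provided the buffers do not overlap:

```c
#include <stddef.h>
#include <string.h>

/* Hand-written clear and copy loops: candidates for the transformation. */
void clear_and_copy_loops(unsigned char *dst, const unsigned char *src,
                          unsigned char *scratch, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        scratch[i] = 0;      /* same effect as memset(scratch, 0, n) */
    for (i = 0; i < n; i++)
        dst[i] = src[i];     /* same effect as memcpy(dst, src, n)
                                when dst and src do not overlap */
}

/* The same work expressed as library calls. */
void clear_and_copy_calls(unsigned char *dst, const unsigned char *src,
                          unsigned char *scratch, size_t n)
{
    memset(scratch, 0, n);
    memcpy(dst, src, n);
}
```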
 -Addint:sf=<n> (optimizer) 
When considering whether to interchange loops, 
set memory store operation weight to n.
A higher value of n indicates a greater performance
cost for stores.
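
To make the term concrete, here is a minimal sketch of loop interchange in C (the function names and array sizes are hypothetical). Both nests compute the same result; they differ only in the order in which the stores to dst reach memory, which is the cost this weight is meant to model:

```c
#define IX_ROWS 4
#define IX_COLS 6

/* Original nest: the inner loop walks down a column, so consecutive
   stores to dst are IX_COLS elements apart (C arrays are row-major). */
void scale_column_order(double dst[IX_ROWS][IX_COLS],
                        double src[IX_ROWS][IX_COLS])
{
    int i, j;
    for (j = 0; j < IX_COLS; j++)
        for (i = 0; i < IX_ROWS; i++)
            dst[i][j] = 2.0 * src[i][j];
}

/* Interchanged nest: the inner loop walks along a row, so consecutive
   stores to dst are adjacent in memory. */
void scale_row_order(double dst[IX_ROWS][IX_COLS],
                     double src[IX_ROWS][IX_COLS])
{
    int i, j;
    for (i = 0; i < IX_ROWS; i++)
        for (j = 0; j < IX_COLS; j++)
            dst[i][j] = 2.0 * src[i][j];
}
```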
 -Ainline[:cp=<n>][:cs=<n>][:inc=<n>][:irs=<n>][:mi][:recursion=1] (optimizer)
Control the optimizer's loop inliner:
 
   cp=<n> The minimum call site frequency count required
       for a routine to be considered for inlining.
   cs=<n> Set inline callee size limit to n.  The unit 
       roughly corresponds to the number of instructions.
   inc=<n> The inliner is allowed to increase the
       size of the program by up to n%.
   irs=<n> Allow routines to increase by up to n.  The
       unit roughly corresponds to the number of instructions.
   mi Perform maximum inlining (without considering code 
       size increase).
   recursion=1 Allow routines that are called recursively to still be
       eligible for inlining.
 -Apf:llist=<n>:noinnerllist (optimizer) 
Do speculative prefetching for linked-list data structures:
 llist=<n> perform prefetching n 
    iterations ahead
 noinnerllist do not attempt prefetching for innermost loops.
 -Apf:pdl=1 (optimizer) 
Do prefetching for one-level indirect memory references.
 -array_pad_rows,<n> (Fortran)
Enable padding of arrays by n.
 -Ashort_ldst (optimizer) 
Convert multiple short memory operations into single long
memory operations.
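
The idea can be sketched in C as follows (hypothetical function names; an illustration of the concept, not of the code the optimizer actually emits). Four one-byte loads and stores move the same data as one 32-bit load and one 32-bit store. The sketch uses memcpy through a uint32_t so that it stays valid C for any alignment; on SPARC a single 32-bit access additionally requires 4-byte alignment.

```c
#include <stdint.h>
#include <string.h>

/* Four separate one-byte loads and four one-byte stores. */
void copy4_bytewise(unsigned char *dst, const unsigned char *src)
{
    dst[0] = src[0];
    dst[1] = src[1];
    dst[2] = src[2];
    dst[3] = src[3];
}

/* The same copy expressed as one 32-bit load and one 32-bit store. */
void copy4_wordwise(unsigned char *dst, const unsigned char *src)
{
    uint32_t word;
    memcpy(&word, src, sizeof word);   /* one long load  */
    memcpy(dst, &word, sizeof word);   /* one long store */
}
```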
 -Atile:skewp[:b<n>] (optimizer) 
Perform loop tiling which is enabled by loop skewing.  Loop skewing is a 
transformation that transforms a non-fully interchangeable loop nest
to a fully interchangeable loop nest.  The optional b<n>
sets the tiling block size to n.
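
A small worked example may help with the terminology (a sketch with hypothetical names and sizes; it shows skewing followed by interchange, the property that makes tiling legal, rather than tiling itself). In the original nest, iteration (i, j) reads a value written by iteration (i-1, j+1), so the loops cannot simply be swapped. Substituting jj = j + i removes the negative component of that dependence, after which either loop order gives the same result:

```c
#define SK_ROWS 5
#define SK_COLS 6

static int sk_max(int a, int b) { return a > b ? a : b; }
static int sk_min(int a, int b) { return a < b ? a : b; }

/* Original nest.  The dependence distance is (1, -1), so the nest is
   not fully interchangeable. */
void wavefront_original(int a[SK_ROWS][SK_COLS])
{
    int i, j;
    for (i = 1; i < SK_ROWS; i++)
        for (j = 0; j < SK_COLS - 1; j++)
            a[i][j] = a[i - 1][j + 1] + 1;
}

/* Skewed by jj = j + i, which turns the distance into (1, 0).  The
   loops have then been interchanged so that jj is the outer loop. */
void wavefront_skewed(int a[SK_ROWS][SK_COLS])
{
    int i, jj;
    for (jj = 1; jj <= (SK_ROWS - 1) + (SK_COLS - 2); jj++)
        for (i = sk_max(1, jj - (SK_COLS - 2));
             i <= sk_min(SK_ROWS - 1, jj); i++) {
            int j = jj - i;          /* recover the original index */
            a[i][j] = a[i - 1][j + 1] + 1;
        }
}
```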
 -Aujam:inner=g (optimizer) 
Increase the probability that small-trip-count inner loops will
be fully unrolled.
 autoup=<n> (Unix)
When the file system flush daemon fsflush runs, it
will write to disk all modified file buffers that are more than
n seconds old.
 cc (C compiler)
Invoke the Sun ONE Studio 8 C compiler.
 CC (C++ compiler)
Invoke the Sun ONE Studio 8 C++ compiler.
 -crit (optimizer) 
Enable optimization of critical control paths.
 -dalign (C, C++, Fortran)
Assume data is naturally aligned.
 -Dalloca=__builtin_alloca (Portability: SPEC Tools)
Portability switch, used for 176.gcc:  allow use of compiler's internal 
builtin alloca.
 -depend (Fortran)
Synonym for -xdepend.
 -DHOST_WORDS_BIG_ENDIAN (Portability: SPEC Tools)
Portability switch, used for 176.gcc: controls how bytes are numbered within 
a word.
 disablecomponent (System Management Services)
This command is used prior to booting the system for a 1-cpu test.
The tester uses disablecomponent to add all other CPUs 
to the "blacklist",
which is a list of components that cannot be used at boot time.
 -D__MATHERR_ERRNO_DONTCARE (C)
Allows the compiler to assume that your code does not rely on setting
of the errno variable.
 -DSPEC_CPU2000_SOLARIS (Portability: SPEC Tools)
Portability switch, used for 253.perlbmk: selects header files and
code paths compatible with Solaris.
 -DSUN (Portability: SPEC Tools)
Portability switch, used for 186.crafty: selects header files and code paths
compatible with Solaris.
 -DSYS_HAS_CALLOC_PROTO (Portability: SPEC Tools)
Portability switch, used for 254.gap: allows use of the designated prototype.
 -DSYS_HAS_IOCTL_PROTO (Portability: SPEC Tools)
Portability switch, used for 254.gap: allows use of the designated prototype.
 -DSYS_HAS_SIGNAL_PROTO (Portability: SPEC Tools)
Portability switch, used for 254.gap: allows use of the designated prototype.
 -DSYS_HAS_TIME_PROTO (Portability: SPEC Tools)
Portability switch, used for 254.gap: allows use of the designated prototype.
 -DSYS_IS_USG (Portability: SPEC Tools)
Portability switch, used for 254.gap: selects code compatible with 
USG-based systems.
 -e (Portability, Fortran)
Portability switch, used for 178.galgel: allows source lines to be 
up to 132 characters long.
 f90 (Fortran compiler)
Invoke the Sun ONE Studio 8 Fortran 90 compiler.
 -fast (C)
A convenience option, this switch selects the following switches that
are defined elsewhere in this page:
      -D__MATHERR_ERRNO_DONTCARE 
     -dalign
     -fns 
     -fsimple=2 
     -fsingle 
     -ftrap=%none 
     -xalias_level=basic 
     -xbuiltin=%all 
     -xdepend 
     -xlibmil 
     -xO5 
     -xprefetch=auto,explicit 
     -xtarget=native
 -fast (C++)
A convenience option, this switch selects the following switches that
are defined elsewhere in this page:
      -dalign
     -fns
     -fsimple=2 
     -ftrap=%none 
     -xbuiltin=%all 
     -xlibmil 
     -xlibmopt 
     -xO5 
     -xtarget=native
 -fast (Fortran)
A convenience option, this switch selects the following switches that
are defined elsewhere in this page:
      -dalign 
     -depend
     -fns
     -fsimple=2 
     -ftrap=common 
     -xlibmil 
     -xlibmopt 
     -xO5 
     -xpad=local 
     -xprefetch=auto,explicit 
     -xtarget=native
     -xvector=yes
 -fixed (Portability, Fortran)
Portability switch, used for 178.galgel: assume fixed-format source input.
 -fns (C, C++, Fortran)
Selects faster (but nonstandard) handling of floating point 
arithmetic exceptions and gradual underflow.
 -fsimple=<n> (C, C++, Fortran)
Controls simplifying assumptions for floating point arithmetic:
 
   -fsimple=0 permits no simplifying assumptions. 
      Preserves strict IEEE 754 conformance.
   
   -fsimple=1 allows the optimizer to assume:
   
      The IEEE 754 default rounding/trapping modes do not change
         after process initialization.
      Computations producing no visible result other than potential
         floating-point exceptions may be deleted.
      Computations with Infinity or NaNs as operands need not
         propagate NaNs to their results. For example, x*0 may be replaced
         by 0.
      Computations do not depend on sign of zero.
   -fsimple=2 permits more aggressive floating point 
      optimizations that may cause
      programs to produce different numeric results due to changes in
      rounding. Even with -fsimple=2, the optimizer 
      still is not permitted to introduce a floating point exception 
      in a program that otherwise produces none.
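
The IEEE 754 properties behind these levels can be checked with a few small C functions (a sketch with hypothetical function names; it demonstrates the arithmetic rules themselves and makes no claim about which expressions the Sun optimizer actually rewrites at each level):

```c
#include <math.h>

/* Under strict IEEE 754 rules x*0 is NaN when x is an infinity or a
   NaN, which is why folding x*0 to 0 needs at least -fsimple=1. */
double times_zero(double x)
{
    return x * 0.0;
}

/* The sign of zero is observable: 1/(+0) is +Infinity and 1/(-0) is
   -Infinity. */
double reciprocal(double x)
{
    return 1.0 / x;
}

/* Floating point addition is not associative, so reordering a sum can
   change the rounded result.  Changes of this kind are what the
   -fsimple=2 description refers to. */
double sum_left(double a, double b, double c)
{
    return (a + b) + c;
}

double sum_right(double a, double b, double c)
{
    return a + (b + c);
}
```

With a = 1e30, b = -1e30 and c = 1.0, sum_left yields 1.0 while sum_right yields 0.0, because 1.0 is lost when it is added to -1e30.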
 -fsingle (C)
Evaluate float expressions as single precision.
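
The difference between the two evaluation widths can be seen in a short sketch (hypothetical function names; the explicit casts stand in for the promotion to double that this switch avoids):

```c
/* Sum evaluated and rounded in single precision. */
float add_as_single(float a, float b)
{
    return a + b;
}

/* The same operands widened to double before the addition. */
double add_as_double(float a, float b)
{
    return (double)a + (double)b;
}
```

For a = 16777216.0f (2 to the power 24) and b = 1.0f, the single precision sum rounds back to 16777216, while the double precision sum is exactly 16777217.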
 -ftrap=common (C, C++, Fortran)
Sets the IEEE 754 trapping mode to common exceptions (invalid, division
by zero, and overflow).
 -ftrap=%none (C, C++, Fortran)
Turns off all IEEE 754 trapping modes.
 LD_LIBRARY_PATH=<directories> (linker)
LD_LIBRARY_PATH controls the search order for both the compile-time
and run-time linkers.  Usually, it can be defaulted; but testers may
sometimes choose to explicitly set it (as documented in the notes in the 
submission), in order to ensure that the correct versions of libraries
are picked up.
 LD_PRELOAD=mpss.so.1 (Unix)
Allow use of the mpss.so.1 shared object, which provides a means
by which preferred stack and/or heap page sizes can be selected.
 -library=iostream (Portability, C++)
Portability switch, used for 252.eon: allow use of the classic iostream 
library.
 -ll2amm (linker)
Include a library containing chip specific memory routines.
 -lm (linker)
Include the math library.
 -lmopt (linker)
Include the optimized math library.  This option usually generates
faster code, but may produce slightly different results.  Usually
these results will differ only in the last bit.
 MPSSHEAP=<n> (Unix)
Specify the preferred page size for heap.  The specified page size is
applied to all created processes.
 MPSSSTACK=<n> (Unix)
Specify the preferred page size for stack.  The specified page size is
applied to all created processes.
 -noex (C++)
Do not allow C++ exceptions.  A throw specification on a function is 
accepted but ignored; the compiler does not generate exception code.
 -O (Fortran)
A synonym for -xO3.
 -Qdepgraph-early_cross_call=1 (code generator)
There are several scheduling passes in the compiler.  This option
allows early passes to move instructions across call instructions.
 -Qeps:enabled=1 (code generator)
Use enhanced pipeline scheduling (EPS) and selective scheduling
algorithms for instruction scheduling.
 -Qeps:rp_filtering_margin=100 (code generator)
Turn off register pressure heuristics in EPS.
 -Qeps:ws=<n> (code generator)
Set the EPS window size, that is, the number of instructions it will
consider across all paths when trying to find independent instructions
to schedule a parallel group.  Larger values may result in better 
run time, at the cost of increased compile time.
 -Qgsched-T<n> (code generator)
Sets the aggressiveness of the trace formation, where n 
is 4, 5, or 6.  The higher the value of n, the lower 
the branch probability needed to include a basic block in a trace.
 -Qicache-chbab=1 (code generator)
Turn on optimization to reduce branch after branch penalty: nops
will be inserted to prevent one branch from occupying the delay slot of
another branch.
 -Qipa:valueprediction (code generator)
Use profile feedback data to predict values and attempt to
generate faster code along these control paths, even at the
expense of possibly slower code along paths leading to different
values. Correct code is generated for all paths.
 -Qiselect-funcalign=<n> (code generator)
Do function entry alignment at n-byte boundaries.
 -Qiselect-sw_pf_tbl_th=<n> (code generator)
Peels the most frequent test branches/cases off a switch until
the branch probability reaches less than 1/n. This is effective
only when profile feedback is used.
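
At the source level, peeling a case off a switch looks roughly like this (a sketch; the function names are hypothetical, and the premise that case 0 dominates is an assumption standing in for real profile data):

```c
/* Ordinary switch: every value goes through the same dispatch. */
int cost_switch(int op)
{
    switch (op) {
    case 0:  return 1;
    case 1:  return 4;
    case 2:  return 9;
    default: return 0;
    }
}

/* Peeled form: the case assumed to be most frequent is tested first,
   and only the remaining cases go through the switch. */
int cost_peeled(int op)
{
    if (op == 0)
        return 1;
    switch (op) {
    case 1:  return 4;
    case 2:  return 9;
    default: return 0;
    }
}
```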
 -Qlp=<n>[-av=<n>][-t=<n>][-fa=<n>][-fl=<n>] (code generator)
Control irregular loop prefetching:
 
   lp=<n> Turns the module on (1) or off (0) (default is on for F90; 
       off for C/C++)
   -av=<n> Sets the prefetch look ahead distance, in bytes.  Default is 256.
   -t=<n> Sets the number of attempts at prefetching.  If not
       specified, t=2 if -xprefetch_level=3 has been 
       set; otherwise, defaults to t=1.
   -fa=<n> 1=Force user settings to override internally computed values.
   -fl=<n> 1=Force the optimization to be turned on for all languages.
 -Qms_pipe+intdivusefp (code generator)
In pipelined loops, use floating point divide instructions
for signed integer division.
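
Why a floating point divide can stand in for a signed integer divide is sketched below (hypothetical function names; the sketch assumes a 32-bit int and IEEE 754 double, and says nothing about the instruction sequence the code generator actually produces). Every 32-bit int converts to double exactly, and truncating the rounded double quotient gives the same value as integer division:

```c
/* Signed division using the integer divide. */
int div_int(int a, int b)
{
    return a / b;
}

/* The same quotient using a floating point divide.  As with integer
   division, b must be nonzero and the pair (INT_MIN, -1) is excluded. */
int div_via_fp(int a, int b)
{
    return (int)((double)a / (double)b);
}
```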
 -Qms_pipe+prefolim=<n> (code generator)
Set the number of outstanding prefetches in pipelined loops to n.
 -Qms_pipe+unoovf (code generator)
Assert (to the pipeliner) that unsigned int computations will not overflow.
 -Qms_pipe-prefst (code generator)
Turn off prefetching for stores in the pipeliner.
 -Qoption cg -switch[,-switch...]  (C++, Fortran)
Send the listed switch(es) to the code generator.  See the definitions
of the individual switches elsewhere in this page (alphabetically 
ordered).
 -Qoption f90comp -switch[,-switch...]  (Fortran)
Send the listed switch(es) to the Fortran 90 front end.  See the definitions
of the individual switches elsewhere in this page (alphabetically 
ordered).
 -Qoption iropt -switch[,-switch...]  (C++, Fortran)
Send the listed switch(es) to the global optimizer.  See the definitions
of the individual switches elsewhere in this page (alphabetically 
ordered).
 -Qpeep-Sh0 (code generator)
Reduce the probability that the compiler will hoist sethi instructions 
out of loops.
 RM_SOURCES = lapak.f90 (SPEC tools)
This option allows building the benchmark 178.galgel without its
copy of the LAPACK sources (lapak.f90); instead, the LAPACK entry 
points in the sunperf library are used.
 rm -rf ./feedback.profile ./SunWS_cache (Unix)
Remove any profile feedback information from previous runs.
 -stackvar (Fortran)
Allocate routine local variables on the stack.
 submit=echo 'pbind -b...' > dobmk; sh dobmk (SPEC tools, Unix)
This SPEC config file feature is used to cause individual jobs to be
bound to specific processors:
 
   submit= causes the SPEC tools to use this line 
       when submitting jobs; 
   echo ...> dobmk causes the generated commands 
       to be written to a file, namely dobmk
   pbind -b causes this copy's processes to be bound to 
       the CPU specified by the expression that follows it.  See the 
       config file used in the submission for the exact syntax, which
       tends to be cumbersome because of the need to carefully quote
       parts of the expression.  When all expressions are evaluated,
       each CPU ends up with exactly one copy of each benchmark.
   sh dobmk actually runs the benchmark
 tune_t_fsflushr=<n> (Unix)
Controls the number of seconds between runs of the file system
flush daemon, fsflush.
 ulimit -s unlimited (Unix)
Allow stack size to grow without limit.
 -W2,-switch[,-switch...] (C)
Send the listed switch(es) to the global optimizer.  See the definitions
of the individual switches elsewhere in this page (alphabetically 
ordered).
 -Wc,-switch[,-switch...] (C)
Send the listed switch(es) to the code generator.  See the definitions
of the individual switches elsewhere in this page (alphabetically 
ordered).
 -xO<n> (C, C++, Fortran)
Specify optimization level n:
 
   -xO1 Does only basic local optimizations (peephole).
   -xO2 Does basic local and 
       global optimizations, such as induction variable 
       elimination, common subexpression elimination, constant 
       propagation, register allocation, and basic block merging.
   
   -xO3 Adds global 
        optimizations at the function level, loop unrolling, 
        and software pipelining.
   
   -xO4 Adds automatic 
        inlining of functions in the same file.
   
   -xO5 Uses optimization 
        algorithms that may take significantly more compilation 
        time or that do not have as high a probability of improving 
        execution time, such as speculative code motion.
 -xalias_level=[basic|std|strong] (C)
Allows the compiler to perform type-based alias analysis at the
specified alias level:
 
   basic assume that memory references 
        that involve different C basic types do not alias each 
        other.
   std assume aliasing rules described in 
       the ISO 1999 C standard.
   strong in addition to the restrictions
        at the std level, assume that pointers of 
        type char * are used only to access an object of 
        type char; and assume that there are no interior pointers.
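
What the basic level permits can be illustrated with one short function (a sketch; the function name is hypothetical). Because int and float are different basic types, the compiler may assume that ip and fp point to different objects, so it is free to return the constant 1 without reloading *ip after the store through fp:

```c
/* If ip and fp really did refer to the same memory, the result under
   type-based alias analysis would be unpredictable; callers must pass
   distinct objects. */
int store_both(int *ip, float *fp)
{
    *ip = 1;
    *fp = 2.0f;     /* assumed not to modify *ip */
    return *ip;
}
```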
 -xalias_level=compatible (C++)
Allows the compiler to assume that layout-incompatible types
are not aliased.
 -xarch=v8plusb (C, C++, Fortran)
Allow the compiler to use instructions from architecture level v8plusb
(UltraSPARC III, 32-bit mode).
 -xbuiltin=%all (C, C++)
Substitute intrinsic functions or inline system functions where 
profitable for performance.
 -xchip=ultra3 (C, C++, Fortran)
Specify that the target processor will be an UltraSPARC III.
 -xdepend (C, Fortran)
Analyze loops for inter-iteration data dependencies, and do loop
restructuring.
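
The distinction the analysis draws can be sketched with two loops (hypothetical function names). In the first, each iteration touches only its own elements, so iterations may be reordered as long as a and b do not overlap. In the second, each iteration reads the value written by the previous one, so the loop carries a dependence from iteration to iteration:

```c
/* No inter-iteration dependence: iteration i uses only element i. */
void add_constant(double *a, const double *b, int n, double c)
{
    int i;
    for (i = 0; i < n; i++)
        a[i] = b[i] + c;
}

/* Inter-iteration dependence: iteration i reads a[i - 1], which was
   written by iteration i - 1. */
void prefix_sum(double *a, int n)
{
    int i;
    for (i = 1; i < n; i++)
        a[i] = a[i] + a[i - 1];
}
```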
 -xinline= (C, C++, Fortran)
Turn off inlining.
 -xipo[=2] (C, C++, Fortran)
Perform optimizations across all object files in the link step:
 0=off
 1=on
 2=performs whole-program detection and analysis
 -xlibmil (C, C++, Fortran)
Use inline expansion for math library, libm.
 -xlibmopt (C++, Fortran)
Select the optimized math library.
 -xlic_lib=sunperf (C, C++, Fortran)
Link with the Sun-supplied, licensed sunperf library.
 -xlinkopt (C, C++, Fortran)
Perform link-time optimizations, such as branch optimization
and cache coloring.
 -xpad=common[:<n>] (Fortran)
Pad common block variables, for better use of cache. n
specifies the amount of padding to apply. If no parameter is
specified then the compiler selects one automatically.
 -xpad=local (Fortran)
Pad local variables, for better use of cache.
 -xpagesize=<n> (C, Fortran)
Set the preferred page size for running the program.
 -xprefetch=auto,explicit (C, C++, Fortran)
Allow generation of prefetch instructions.  -xprefetch=yes is a 
synonym for -xprefetch=auto,explicit.
 -xprefetch=latx:<n> (C, C++, Fortran)
Adjust the compiler's assumptions about prefetch latency by
the specified factor.  Typically values in the range of 
0.5 to 2.0 will be useful.  A lower number might indicate
that data will usually be cache resident; a higher number
might indicate a relatively larger gap between the processor
speed and the memory speed (compared to the assumptions built
into the compiler).
 -xprefetch=no%auto (C, C++, Fortran)
Turn off prefetch instruction generation.
 -xprefetch_level=<n> (C, C++, Fortran)
Control the level of searching that the compiler does for prefetch
opportunities by setting n to 1, 2, or 3, where higher
numbers mean to do more searching.  The default is 2.
 -xprofile=collect:./feedback (C, C++, Fortran)
Collect profile data for feedback-directed optimization, and store it in
a subdirectory of the current directory, named ./feedback.
 -xprofile=use:./feedback (C, C++, Fortran)
Use data collected for profile feedback.  Look for it in 
a subdirectory of the current directory, named ./feedback.
 -xrestrict (C)
Treat pointer-valued function parameters as restricted pointers.
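
The effect is comparable to writing the C99 restrict qualifier on each pointer parameter by hand, as in this sketch (the function name is hypothetical). The qualifier is a promise that dst and src do not overlap, which spares the compiler from having to allow for a store to dst[i] changing a later src element:

```c
/* Explicit C99 form of the assumption that -xrestrict applies to
   pointer-valued parameters. */
void scale(double *restrict dst, const double *restrict src,
           int n, double k)
{
    int i;
    for (i = 0; i < n; i++)
        dst[i] = k * src[i];
}
```

The promise is the caller's responsibility: calling such a function with overlapping arrays gives undefined results.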
 -xsafe=mem (C, C++, Fortran)
Enables the use of non-faulting loads when used in conjunction
with -xarch=v8plus.  Assumes that no memory-based traps will occur.
 -xtarget=native (C, C++, Fortran)
Selects options appropriate for the system where the compile is
taking place, including architecture, chip, and cache sizes.  (These
can also be controlled separately, via -xarch, -xchip, and -xcache, 
respectively.)
 -xvector (C, Fortran)
Allow the compiler to transform math library calls within loops 
into calls to the vector math library.
 |