FLAG DESCRIPTIONS
SUN C AND FORTRAN WS6

-dalign
    Assume double-type data is double aligned

-[no]depend
    [Disable] Enable all dependence based transformations

-dn
    Specify static binding

-fns
    Select non-standard floating point mode

-fsimple[=n]
    Allows the optimizer to make simplifying assumptions concerning
    floating-point arithmetic.

    If n is present, it must be 0, 1, or 2. The defaults are:

    o  With no -fsimple[=n], the compiler uses -fsimple=0.
    o  With only -fsimple, no =n, the compiler uses -fsimple=1.

    -fsimple=0
        Permits no simplifying assumptions. Preserves strict IEEE 754
        conformance.

    -fsimple=1
        Allows conservative simplifications. The resulting code does
        not strictly conform to IEEE 754, but numeric results of most
        programs are unchanged.

        With -fsimple=1, the optimizer can assume the following:

        o  The IEEE 754 default rounding/trapping modes do not change
           after process initialization.
        o  Computations producing no visible result other than
           potential floating-point exceptions may be deleted.
        o  Computations with Infinity or NaNs as operands need not
           propagate NaNs to their results. For example, x*0 may be
           replaced by 0.
        o  Computations do not depend on sign of zero.

        With -fsimple=1, the optimizer is not allowed to optimize
        completely without regard to roundoff or exceptions. In
        particular, a floating-point computation cannot be replaced by
        one that produces different results with rounding modes held
        constant at run time. -fast implies -fsimple=1.

    -fsimple=2
        Permits aggressive floating point optimizations that may cause
        many programs to produce different numeric results due to
        changes in rounding. For example, permits the optimizer to
        replace all computations of x/y in a given loop with x*z, where
        x/y is guaranteed to be evaluated at least once in the loop,
        z=1/y, and the values of y and z are known to have constant
        values during execution of the loop.
        Even with -fsimple=2, the optimizer still is not permitted to
        introduce a floating point exception in a program that
        otherwise produces none.

-libmil
    Use inline expansion templates for libm

-s
    Strip symbol table from the executable file

-xlibmopt
    This chooses the math library that is optimized for speed at the
    expense of some accuracy.

-xO4
    Generate optimized code. See -O4 below.

-xO5
    Generate optimized code. See -O5 below.

-pad
    Pad local variables or common blocks, or both, for efficient use of
    the cache

-[x]pad=local[:]
    Pad local variables only, for better use of cache. n specifies the
    amount of padding to apply. If no parameter is specified then the
    compiler selects one automatically.

-[x]pad=common[:]
    Pad common block variables, for better use of cache. n specifies
    the amount of padding to apply. If no parameter is specified then
    the compiler selects one automatically.

-unroll=
    Suggestion to optimizer to unroll loops n times

-xarch=
    Limit the set of instructions the compiler may use to
    (generic,v7,v8a,v8,v8plus,v8plusa,v9,v9a,v9b)

-xcache=
    Define the cache properties for use by the optimizer

-xchip=
    Define the instruction scheduling properties for use by the
    optimizer

-xcrossfile
    enable cross-file inlining.

-xprofile=use
    Use data collected for profile feedback

-xprofile=collect
    Collect profile data for feedback directed optimizations.

-xparallel
    Use parallel processing to improve performance

-xreduction
    Parallelize loops containing reductions

-xsafe=mem
    Enables the use of non-faulting loads when used in conjunction with
    -xarch=v8plus. Assumes that no memory based traps will occur.

-fast
    Fast execution. Select the combination of compilation options that
    optimizes for speed of execution without excessive compilation
    time.
    This is a convenience option, and it chooses:

    o  The -native best machine characteristics option
       (-xarch=native, -xchip=native, -xcache=native)
    o  Optimization level: -O5
    o  A set of inline expansion templates (-libmil)
    o  The -fsimple=2 option
    o  The -dalign option (SPARC only)
    o  The -xlibmopt option (SPARC only)
    o  The -xdepend option (FORTRAN only)
    o  Options to turn off all trapping (-fns -ftrap=%none)

-xO5
    Besides what -xO4 does, enables speculative code motion.

-xO4
    Besides what -O3 does, this option does automatic inlining of
    functions in the same file. The code usually runs faster, but for
    some code, -O4 makes it run more slowly. -g suppresses automatic
    inlining. In general, -O4 results in larger code.

-xO3
    Performs like -xO2, but also optimizes references or definitions
    for external variables. Loop unrolling and software pipelining are
    also performed. The -xO3 level does not trace the effects of
    pointer assignments. When compiling either device drivers, or
    programs that modify external variables from within signal
    handlers, you may need to use the volatile type qualifier to
    protect the object from optimization. In general, the -xO3 level
    results in increased code size.

-xO2
    Does basic local and global optimization. This is induction
    variable elimination, local and global common subexpression
    elimination, algebraic simplification, copy propagation, constant
    propagation, loop-invariant optimization, register allocation,
    basic block merging, tail recursion elimination, dead code
    elimination, tail call elimination and complex expression
    expansion.

    The -xO2 level does not assign global, external, or indirect
    references or definitions to registers. It treats these references
    and definitions as if they were declared "volatile." In general,
    the -xO2 level results in minimum code size.

-xO1
    Does basic local optimization (peephole).
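The -xO3 remark about signal handlers can be made concrete with a small C sketch. It is only an illustration: the names got_signal, on_signal and wait_for_flag are hypothetical and do not come from the compiler documentation. The point is that a variable written in a handler and polled in ordinary code is declared volatile, so the optimizer cannot keep it in a register across the loop.

```c
#include <assert.h>
#include <signal.h>

/* Written by the handler, read by normal code. The volatile qualifier
   forces a fresh load from memory on every test of the flag. */
static volatile sig_atomic_t got_signal = 0;

/* Hypothetical handler: records that the signal arrived. */
static void on_signal(int signo)
{
    (void)signo;
    got_signal = 1;
}

/* Installs the handler, raises SIGINT in this process, then polls the
   flag. Returns the final value of the flag. */
int wait_for_flag(void)
{
    void (*previous)(int) = signal(SIGINT, on_signal);
    assert(previous != SIG_ERR);

    got_signal = 0;
    raise(SIGINT);      /* the handler runs before raise() returns */

    while (!got_signal) {
        /* Without volatile, an optimizer that moves external
           variables into registers could hoist the load out of this
           loop and never see the handler's store. */
    }

    signal(SIGINT, previous);   /* restore the earlier disposition */
    return (int)got_signal;
}
```

Calling wait_for_flag() from a test program is enough to exercise the pattern; removing the volatile qualifier is the change that may expose the problem at higher optimization levels.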
-xvector
    enable vectorization of loops with calls to math routines

-xprefetch
    enable generation of prefetch instructions

-xstackvar
    allocate routine local variables on stack (Fortran)

-xrestrict[=f1,...,f2,%all, %none]
    Treat pointer-valued function parameters as restricted pointers.
    This command-line option can be used on its own, but is best used
    with optimization. The default is %none. Specifying -xrestrict is
    equivalent to specifying -xrestrict=%all.

-xregs=syst
    Allows use of the system reserved registers %g6 and %g7, and %g5 if
    not already allowed by -xarch= value.

-Xc
    Compile assuming strict ANSI C conformance

-Xa
    Compile assuming ANSI C conformance, allow K & R extensions
    (default mode)

-Xt
    Compile assuming K & R conformance, allow ANSI C

-Qoption
    Pass flags along to compiler phase:

    cg          Code generator
    f77pass1    Fortran first pass
    iropt       Internal representation optimizer

-W,
    Pass flags along to compiler phase:

    2           Second pass
    c           code generator

-Qoption cg -Qms_pipe+nfll=
    specifies n as the latency of non-floating point load instructions.

-Qoption cg -Qms_pipe-off
    Turn off the software pipeliner.

-Qoption iropt -O4+ansi_alias
    Assume (more restrictive) ANSI C semantics for pointer aliasing

-Qoption iropt -O4+scalarrep
    disable scalar replacement optimization

-Qoption iropt -O4+algassoc
    enable floating point reassociation

-Qoption iropt -O4+unroll
    enable aggressive loop unrolling.

-Qoption iropt -Si
    Sets n as the limit of general integer virtual registers for
    register allocation optimization. Default is 30.
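What -xrestrict promises to the compiler is easiest to see on a loop with two pointer parameters. The sketch below is an assumption-laden illustration: axpy and demo_axpy are hypothetical names, and the code itself is ordinary C that behaves the same with or without the flag as long as callers never pass overlapping storage.

```c
#include <assert.h>
#include <stddef.h>

/* With -xrestrict=%all, or with this function named in the -xrestrict
   list, the compiler may assume src and dst never refer to the same
   storage, so it can reorder or pipeline the loads and stores.
   Callers are responsible for keeping that promise. */
void axpy(size_t n, double a, const double *src, double *dst)
{
    size_t i;

    /* Cheap partial check: catches exact aliasing only, not partial
       overlap between the two arrays. */
    assert((const double *)dst != src);

    for (i = 0; i < n; i++)
        dst[i] += a * src[i];
}

/* Hypothetical driver: two distinct arrays, so the call is safe under
   the restricted-pointer assumption. Returns 1 on the expected
   result. */
int demo_axpy(void)
{
    double x[4] = {1.0, 2.0, 3.0, 4.0};
    double y[4] = {10.0, 20.0, 30.0, 40.0};

    axpy(4, 2.0, x, y);
    return y[0] == 12.0 && y[3] == 48.0;
}
```

A call such as axpy(3, 2.0, y, y + 1) would break the assumption; code containing calls like that should leave the function out of the -xrestrict list.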
-Qoption iropt -O4+bcopy
    enable vectorization of copy and memset loops

-Qoption iropt -O4+data_access
    enable optimizations based on data access patterns

-Qoption iropt -reroll=1
    enable automatic loop rerolling of completely unrolled loop nests

-Qoption iropt -O4+invccexp
    See -W2,-O4+invccexp below

-Qoption iropt -O4+pde
    See -W2,-O4+pde below

-Qoption iropt -whole
    See -W2,-whole below

-W2,-whole
    do whole program optimizations

-W2,-fsimple=2
    perform aggressive floating point simplification and optimizations.

-W2,-Mp
    Procedures with entry counts equal to or greater than n become
    candidates for inlining.

-W2,-Mt
    The maximum size of a routine body eligible for inlining is limited
    to n triples.

-W2,-Mr
    maximum code increase due to inlining is limited to n triples

-W2,-Ma
    enable inlining of routines with frame size up to n

-W2,-Mm
    maximum module increase limit for inlining

-W2,-O4+pde
    enable aggressive dead code elimination

-W2,-O4+cond_elim
    enable aggressive optimizations of conditional branches

-W2,-O4+bopt
    enable aggressive optimizations of all branches

-W2,-O4+bmerge
    enable branch merge optimizations

-W2,-O4+invccexp
    enable hoisting of invariant branches

-W2,-ANSI_S
    use ANSI semantics for routines with hidden control flow
    (e.g. setjmp)

-W2,-ldstr
    enable hoisting of load and store instructions

-W2,-crit
    enable optimization of critical control paths

-W2,-O4+ipa
    perform interprocedural optimizations

-W2,-O4+heap
    keep track of malloc like memory allocation calls

-Wc,-Qicache-L1-bsize=4-bbits=7
    Do L1 instruction cache alignment. The -L1 selects loop boundaries.
    The -bsize=4 selects the alignment boundary == 16 bytes. The
    -bbits=7 selects the bad alignments, not the last three of the 4
    instructions per 16 bytes. This is really one option and not really
    partitionable. The default is to not do any I-cache alignment.

-Wc,-Qiselect-funcalign=
    do function entry alignment at n-byte boundaries.
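The rounding effect that -fsimple=2 accepts (replacing x/y in a loop by x*z with z=1/y) can be reproduced by hand. The following C sketch is illustrative only: count_differences and demo_reciprocal are hypothetical names, the rewrite is written out manually rather than produced by the compiler, and the comments assume IEEE 754 double arithmetic without extended intermediate precision.

```c
#include <assert.h>
#include <stddef.h>

/* Counts the elements for which the division the source asks for and
   the multiply-by-reciprocal form disagree. */
size_t count_differences(size_t n, const double *x, double y)
{
    size_t i, differences = 0;
    double z;

    assert(y != 0.0);
    z = 1.0 / y;                    /* one division, hoisted by hand */

    for (i = 0; i < n; i++) {
        double quotient = x[i] / y; /* one rounding */
        double product  = x[i] * z; /* z already rounded, then rounded again */
        if (quotient != product)
            differences++;
    }
    return differences;
}

/* Hypothetical driver. With y = 10.0, the inputs 1.0, 2.0 and 4.0
   agree because scaling by a power of two is exact, while 3.0/10.0
   and 3.0*(1.0/10.0) differ in the last bit. */
size_t demo_reciprocal(void)
{
    const double x[4] = {1.0, 2.0, 3.0, 4.0};
    return count_differences(4, x, 10.0);
}
```

Programs whose output is compared bit for bit against reference results are the ones most likely to notice this kind of change.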
-Wc,-Qms_pipe+unoovf
    do software pipelining for loops with unsigned counters

-Wc,-Qiselect-sw_pf_tbl_th=
    Peels the most frequent test branches/cases off a switch until the
    branch probability reaches less than 1/n. This is effective only
    when profile feedback is used.

-Wc,-Qdepgraph-early_cross_call=1
    Enable early cross-call instruction scheduling.

-lfast
    Link in the fast system libraries.

Kernel Parameters
-----------------

consistent_coloring
    Consistent Coloring controls the page coloring policy. It can be
    set to one of the following:

    0: (default) dynamic (uses various vaddr bits)
    1: static (virtual=paddr)
    2: bin hopping