===============================================
HP-UX Flag Descriptions for CPU2000 - July 2002
===============================================

-----------------------------------------------------------------
Common Flags for HP-UX F90 Compiler, C Compiler and aCC Compiler
Compiler specific flags are mentioned below or in other notes
-----------------------------------------------------------------

+Ofaster            Selects the +Ofast option at optimization level +O4.
                    Must be used with +Oprofile=use (+P) or else the
                    optimization level will drop to +O3.

+Ofast              Select a combination of compilation options for
                    optimum execution speed with reasonable build times.
                    Currently: +O2, +Olibcalls, +Onolimit, +Onofltacc,
                    +FPD, +DSnative (on IPF), and +Oshortdata.

+Olevel             Invoke optimizations selected by level. These can be
                    preceded by either +O or -O. Defined values for
                    level are:

                    0  Perform no optimizations.
                    1  Perform optimizations within basic blocks only.
                       This is the default.
                    2  Perform level 1 and global optimizations.
                       Same as -O and +O.
                    3  Perform level 2 as well as interprocedural
                       global optimizations.
                    4  Perform level 3 as well as doing link time
                       optimizations. (C and aCC only)

                    NOTE: +Oprocelim is the general default at all
                    levels, unless the user specifies +ild, +ildrelink,
                    or -b.

                    NOTE: +O4 is only supported with +Oprofile=use (+P).
                    Otherwise, attempts to activate +O4 will cause the
                    compiler to automatically drop to +O3.

+Oentrysched        Perform instruction scheduling on a subprogram's
                    entry and exit code sequences. This option can be
                    used at optimization level 1 and higher. The default
                    is +Onoentrysched.

+Olibcalls          Use low-call-overhead versions of select library
                    routines. This option can be used at any level. At
                    optimization level 0 or 1, the default is
                    +Onolibcalls; at optimization level 2 or higher, the
                    default is +Olibcalls.

+O[no]recovery      Generate [do not generate] recovery code for control
                    speculation. The default is +Onorecovery.

+O[no]fltacc        Disable [enable] floating-point optimizations that
                    can result in numerical differences. +Onofltacc also
                    generates Fused Multiply-Add (FMA) instructions, as
                    does compiling your program at optimization level 2
                    or higher. FMA instructions can improve performance
                    of floating-point applications and are available
                    only on PA-RISC 2.0 systems or later. If you do not
                    specify either +Ofltacc or +Onofltacc at
                    optimization level 2 or higher, the optimizer will
                    generate FMA instructions but will not perform any
                    expression-reordering optimizations. If +Oall is
                    used, it implies +Oaggressive, which in turn implies
                    +Onofltacc. This can be overridden if +Ofltacc
                    follows +Oall on the command line.

+Onoinitcheck       Disable initialization of any local, scalar,
                    automatic variable that is found to be
                    uninitialized. This option can be used at
                    optimization level 2 and higher. The default is to
                    enable initialization if the variable is
                    uninitialized with respect to every path leading to
                    its use.

+FPflag             Specify how the environment for floating-point
(e.g. +FPD)         operations should be initialized at program
                    start-up. By default, all behaviors are disabled.
                    The following flags are supported (upper case flag
                    enables; lower case flag disables):

                    D (d)  Enable sudden underflow (flush to zero) of
                           denormalized values.

+Onolimit           Do not suppress optimizations that significantly
                    increase compile time or consume enormous amounts of
                    memory.

+Oloop_unroll=n     Unroll [do not unroll] program loops by a factor of
                    n. For example, specifying +Oloop_unroll=4 requests
                    the optimizer to replicate the loop body four times.
                    This option can be used at optimization level 2 or
                    higher. The default is +Oloop_unroll=4. This option
                    is only valid on PA-RISC systems.

+O[no]loop_block    Enable [disable] loop blocking for data cache
                    optimizations. Available at optimization level 3.

+O[no]inline        Request [disable] inlining and cloning. This option
                    can be used at optimization level 3 and higher. The
                    default is +Oinline.

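Illustration (not part of the HP documentation): the C sketch below shows,
at source level, what "replicate the loop body four times" means for the
+Oloop_unroll=4 entry above. The function names sum_rolled and
sum_unrolled4 are hypothetical, and the code the optimizer actually
generates may look quite different; this is only a hand-written picture of
the transformation, including the cleanup loop needed when the trip count
is not a multiple of four.

```c
#include <stddef.h>

/* Hypothetical example: plain loop, one element per iteration. */
long sum_rolled(const int *a, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += a[i];
    return total;
}

/* Same computation with the loop body replicated four times by hand. */
long sum_unrolled4(const int *a, size_t n)
{
    long total = 0;
    size_t i = 0;

    /* Main loop: four elements per iteration. */
    for (; i + 4 <= n; i += 4) {
        total += a[i];
        total += a[i + 1];
        total += a[i + 2];
        total += a[i + 3];
    }

    /* Cleanup loop: the remaining 0 to 3 elements. */
    for (; i < n; i++)
        total += a[i];

    return total;
}
```

Both functions return the same sum for any n. The unrolled form executes
fewer loop-control branches per element at the cost of larger code, which
is the trade-off the unroll factor controls.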

+O[no]inline=function1[,function2...]
                    Enable [disable] optimizer inlining for the named
                    functions. This optimization can occur at
                    optimization levels 3 and 4. The default is
                    +Oinline.

+Oinlinebudget=n    aCC(1)/cc(1)
+Oinline_budget=n   f90(1)
                    Perform more aggressive inlining, where n specifies
                    the degree of aggressiveness, as follows:

                    100     Default level of inlining.
                    > 100   More aggressive inlining at the expense of
                            compilation time and code size. The maximum
                            for n is 1000000.
                    2 - 99  Less aggressive inlining. The optimizer
                            gives more weight to compilation time and
                            code size when determining whether to
                            inline.
                    1       Inline only if it reduces code size.

                    This option can be used at optimization level 3 or
                    higher.

+O[no]ptrs_to_globals[=name1,name2,...,nameN]
                    Tell the optimizer whether global variables are
                    modified [are not modified] through pointers. This
                    optimization can occur at levels 2, 3, and 4. The
                    default is +Optrs_to_globals.

+O[no]type_safety=[off|limited|ansi|strong]
                    Enable [disable] aliasing across types.

                    off      The default. Specifies that aliasing can
                             occur freely across types. This is a
                             synonym for the +Onoptrs_ansi and
                             +Onoptrs_strongly_typed options in cc.
                    limited  Code follows ANSI aliasing rules, and
                             unnamed objects should be treated as if
                             they had an unknown type.
                    ansi     Code follows ANSI aliasing rules, and
                             unnamed objects should be treated the same
                             as named objects. This option is a synonym
                             for the +Optrs_ansi option in cc.
                    strong   Code follows ANSI aliasing rules, except
                             that accesses through lvalues of a
                             character type are not permitted to touch
                             objects of other types. This assumes that
                             field addresses are not taken. This option
                             is a synonym for the +Optrs_strongly_typed
                             option in cc.

+O[no]dataprefetch  Insert [do not insert] instructions within innermost
                    loops to explicitly prefetch data from memory into
                    the data cache. Data prefetch instructions can
                    improve cache performance. This option can be used
                    at optimization level 2 or higher.
                    On HP-UX version 11i and later, +Odataprefetch is
                    the same as +Odataprefetch=indirect and
                    +Onodataprefetch is the same as +Odataprefetch=none.
                    At +O2 and higher, the default is +Odataprefetch.

+O[no]procelim      Enable [disable] the elimination of functions that
                    are not referenced by the application. Only
                    functions with the hidden export class may be
                    eliminated. The default is +Oprocelim.

+Oshortdata[=size]  All objects of size bytes or smaller will be placed
                    in the short data area, and references to such data
                    will assume it resides in the short data area. Valid
                    values of size are 0, or a decimal number between 8
                    and 4,194,304 (4MB). If no size is specified, all
                    data is placed in the short data area. If size is 0,
                    no data will be placed in the short data area, and
                    all data references will use long offsets. The
                    default is +Oshortdata=8.

+DSmodel            Perform instruction scheduling appropriate for a
(e.g. +DSnative)    specific implementation of the architecture. On IPF
                    the defined values for model are:

                    blended   Tune for best performance on a combination
                              of processors (i.e., Itanium or Itanium 2
                              processor).
                    itanium   Tune for best performance on an Itanium
                              processor.
                    itanium2  Tune for best performance on an Itanium 2
                              processor.
                    native    Tune for best performance on the processor
                              on which the compiler is running.

                    The default model for HP-UX 11i v1.5 is blended.

-exec               Indicates that any object files created will be used
                    to create an executable file. Constants with a
                    protected or hidden export class are placed in the
                    read-only data section. This option also implies
                    -Bprotected and -dynamic.

-dynamic            Produces dynamically bound executables. See
                    -minshared for partially statically bound
                    executables. The default behavior is dynamic.

-minshared          Indicates that the result of the current compilation
                    is going into an executable file that will make
                    minimal use of shared libraries. This option is only
                    supported on HP-UX version 11i and later.

-Bprotected[=symbol[,symbol...]]
                    The named symbols, or all symbols if no symbols are
                    specified, are assigned the protected export class.
                    That means these symbols will not be preempted by
                    symbols from other load modules, so the compiler may
                    bypass the linkage table for both code and data
                    references and bind them to locally defined code and
                    data symbols.

-Bprotected_data    Marks only data symbols as having the protected
                    export class.

-Wl,-asearch
(e.g. -Wl,-aarchive_shared)
                    (ld option -a search) Specifies the library search
                    order, that is, whether shared or archive libraries
                    are searched with the -l option. The value of search
                    should be one of archive, shared, archive_shared,
                    shared_archive, or default. This option can appear
                    more than once, interspersed among -l options, to
                    control the searching for each library. The default
                    is to use the shared version of a library if one is
                    available, or the archive version if not. If either
                    archive or shared is active, only the specified
                    library type is accepted. If archive_shared is
                    active, the archive form is preferred, but the
                    shared form is allowed. If shared_archive is active,
                    the shared form is preferred, but the archive form
                    is allowed.

[Profile Feedback Related Options]

+Oprofile=collect   Instrument the application for profile-based
+I                  optimization. +I is equivalent to +Oprofile=collect.
                    See ld(1), +P, and +pgm for more details. The +I
                    option is incompatible with the -G, +P, and -S
                    options. It is incompatible with the -g option only
                    during compile time.

+Oprofile=use       Optimize the application based on profile data found
+P                  in the database file flow.data, produced by
                    compilation with +I. +P is equivalent to
                    +Oprofile=use or +Oprofile=use:filename. See ld(1),
                    +I, and +df for more details. The +P option is
                    incompatible with the +I and -S options. It is
                    incompatible with the -g option only during compile
                    time.

-----------------------------------------------
Specific Flags for HP-UX F90 Compiler
-----------------------------------------------

+cat                Concatenates all source files of the same source
                    form together, then compiles the concatenated source
                    all at once. This enables inlining at +O3 within the
                    concatenated file.

-----------------------------------------------
Specific Flags for HP-UX C and aCC Compiler
-----------------------------------------------

-Ae                 Turns on ANSI C c89 mode. This option allows
                    compilation of c89-compatible C source programs,
                    just as the C compiler does.

-AOe                In addition to specifying the extended ANSI C
                    language dialect as per -Ae (the default), allows
                    the optimizer to aggressively exploit the assumption
                    that the source code conforms to the ANSI
                    programming language C standard ISO 9899:1990 plus
                    the extensions. At present, the effect is to make
                    +Otype_safety=ansi the default (it can be
                    overridden). As new independently controllable
                    optimizations are developed that depend on the
                    extended ANSI C standard, the flags that enable
                    those optimizations may also become the default
                    under -AOe.

+inline_level [i]num
                    This option controls how C/C++ inlining hints
                    influence aCC or cc. Specify num as 0, 1, 2, or 3.

                    num  Meaning
                    0    No inlining is done (same effect as the +d
                         option).
                    1    Only small functions are inlined.
                    2    Only large functions are not inlined.
                    3    Inlining hints are respected in all cases,
                         except when the called function is recursive or
                         when it has a variable number of arguments.

                    The default level depends on +Olevel as shown in the
                    following table:

                    level  num
                    0      1
                    1      1
                    2      2
                    3      2
                    4      2

                    If i is also specified, then implicit inlining is
                    invoked for "small" functions without the inline
                    keyword.

                    NOTE: This option controls functions declared with
                    the inline keyword or within the class declaration
                    and is effective at all optimization levels.
                    The options +Oinline and +Oinlinebudget control the
                    high level optimizer that recognizes other
                    opportunities in the same source file (+O3) or
                    amongst all source files (+O4).

-----------------------------------------------
Other descriptions
-----------------------------------------------

-llapack            Link in highly tuned math library functions found in
                    the LAPACK library. B6061AA (HP MLIB) is an optional
                    HP product which contains the LAPACK library.

effmem.o            Replacement for malloc/free that assumes ANSI
                    compliance, improves spatial locality, and minimizes
                    memory usage by not maintaining a free list.

fastmem.o           Replacement for malloc/free that assumes ANSI
                    compliance.

-----------------------------------------------
Descriptions of Portability Flags
-----------------------------------------------

+source={fixed|free|default}
                    Accept source files in fixed format (+source=fixed)
                    or free format (+source=free). The default,
                    +source=default, is free for .f90 files and fixed
                    for .f and .F source files.

-----------------------------------------------
Descriptions of Kernel Tunables
-----------------------------------------------

(Unless otherwise noted, units are in bytes)

dbc_max_pct         Maximum dynamic buffer cache size as a percent of
                    system memory
dbc_min_pct         Minimum dynamic buffer cache size as a percent of
                    system memory
maxdsiz             Maximum data size
maxdsiz_64bit       Maximum data size for 64 bit applications
maxssiz             Maximum stack size
maxssiz_64bit       Maximum stack size for 64 bit applications
maxtsiz             Maximum text segment size
maxtsiz_64bit       Maximum text segment size for 64 bit applications
vps_ceiling         Maximum system-selected page size (in Kbytes)
vps_pagesize        Default user page size (in Kbytes)
swapmem_on          Swap to memory flag