IBM AIX Flag Disclosure
SPEC CPU2000
Last Revised 04 June, 2003

Source Level Portability Options
================================

-DHOST_WORDS_BIG_ENDIAN (176.gcc)
    Host system is big-endian.

-DAIX (186.crafty)
    Sets some basic parameters, such as endianness, OS type, and ANSI
    language extensions, to be compatible with an AIX system.

-DNDEBUG (252.eon)
    SPEC default for the C++ compiler, but also needed explicitly by
    some linkers. Defining this disables any assert macros used for
    debugging.

-DNEED_EXPLICIT_SPECIALIZATION (252.eon)
    Supply function definitions with explicit types in two cases where
    templatized versions fail to compile.

-DSPEC_CPU2000_AIX (253.perlbmk)
    Compile the SPEC CPU2000 modified perl for an AIX system.

-DSYS_IS_BSD (254.gap)
    Compile gap for a BSD-like system.

-DSYS_STRING_H (254.gap)
    Do not explicitly include string.h.

-DSYS_HAS_TIME_PROTO (254.gap)
    Do not supply prototypes for the time(), times() and getrusage()
    functions.

-DSYS_HAS_MALLOC_PROTO (254.gap)
    Do not supply prototypes for malloc() and free().

-DSYS_HAS_CALLOC_PROTO (254.gap)
    Do not supply a prototype for calloc().

-DHAVE_SIGNED_CHAR (300.twolf)
    System allows signed char type.

Compiler Invocation
===================

xlc
    Invokes the compiler for C source files with a default language
    level of ansi and specifies that it allow type-based aliasing.

cc
    Invokes the compiler for C source files with a default language
    level of extended and specifies that it provide compatibility with
    older IBM compilers and allow placement of string literals or
    constant values in read/write storage. cc does not conform to the
    ISO/ANSI C standard.

Compiler Options
================

-ma
    Use built-in alloca() function.

-O
    Performs optimizations that the compiler developers considered the
    best combination for compilation speed and runtime performance.

-O3
    Performs some memory- and compile-time-intensive optimizations in
    addition to those executed with -O.
    The -O3-specific optimizations have the potential to slightly alter
    the semantics of a user's program. Optimizations may include, but
    are not limited to: aggressive code motion and scheduling on
    computations that have the potential to raise an exception; relaxed
    conformance to IEEE rules in cases where the difference in the
    results is not important to an application; rewriting of
    floating-point expressions.

-O4
    Equivalent to -O3 -qipa -qhot with automatic generation of
    architecture (-qarch=) and tuning (-qtune=) options ideal for that
    platform. The -qipa level defaults to level=1.

-O5
    Equivalent to -O3 -qipa=level=2 -qhot with automatic generation of
    architecture (-qarch=) and tuning (-qtune=) options ideal for that
    platform.

-D_ILS_MACROS
    Defined in /usr/include/ctype.h to use the macro version of the
    classification functions (e.g. isupper()).

-Q, -qinline
    The -Q option without any list inlines all appropriate procedures,
    subject to limits on the number of inlined calls and on the
    resulting increase in code size. -qinline is an alias for -Q.

-Q=xxx
    Inline all functions that contain fewer than xxx lines of abstract
    code units.

-q64
    Selects 64-bit compiler mode.

-qalign=natural
    The compiler maps structure members to their natural boundaries.

-qansialias
    Use type-based aliasing during optimization.

-qarch=ppc
    Produces object code containing instructions that will run on any
    of the 32-bit PowerPC hardware platforms.

-qarch=pwr3
    Produces object code containing instructions that will run on
    Power3 processors.

-qarch=rs64b
    Produces object code containing instructions that will run on
    RS64-II processors.

-qdatalocal
    Changes the default to assume that all variables are local.

-qessl
    Specifies that, if either -lessl or -lesslsmp is also specified,
    Engineering and Scientific Subroutine Library (ESSL) routines
    should be used in place of some Fortran 90 intrinsic procedures
    when there is a safe opportunity to do so.

-qfixed
    Indicates that the input source program is in fixed form. Allows
    fixed-format Fortran 77 programs to be compiled using the xlf90
    compiler invocation.

-qhot
    Performs high-order transformations on loops during optimization.

-qipa=level=1
    Turns on interprocedural analysis with inlining, limited alias
    analysis, and limited call-site tailoring. This is the default
    level of -qipa.

-qipa=level=2
    Turns on interprocedural analysis with inlining, cloning, full
    alias analysis, constant propagation, call-site tailoring, and dead
    code removal.

-qinline
    Alias for -Q. See -Q.

-qipa=partition=large
    Specifies the size of the regions within the program to analyze.
    Larger partitions contain more procedures, which results in better
    interprocedural analysis but requires more storage to optimize.

-qlanglvl=ansi
    Compilation conforms to the ANSI standard.

-qpdf1/pdf2
    Profile directed feedback optimization.

-qsuffix=f=f90
    Sets the suffix for source files to be .f90. The .f90 suffix is
    required by xlf90 to compile Fortran 90 programs.

-qtune=604
    Instruction selection, scheduling, and other implementation-
    dependent performance enhancements for the PowerPC 604/604e
    processor.

-qtune=pwr3
    Instruction selection, scheduling, and other implementation-
    dependent performance enhancements for the Power3 processor.

-qtune=rs64b
    Instruction selection, scheduling, and other implementation-
    dependent performance enhancements for the RS64-II processor.

-qunroll=n
    Unrolls inner loops in the program by a factor of n.

Linker Options
==============

-Ldir
    The linker looks for libraries in the directory specified by
    "dir".

-lessl
    Link the Engineering and Scientific Subroutine Library (ESSL).

-lmass
    Link the mathematical acceleration subsystem libraries (MASS),
    which contain libraries of tuned mathematical intrinsic functions.
    See http://techsupport.services.ibm.com/server/mass?fetch=home.html

-lhmu
    Link fast malloc libraries. These libraries are part of the memdbg
    package that is included with IBM C compilers.

-lpdf
    Routines used in the first pass of the profile directed feedback
    process. Routines from this library are not used in building the
    final executable. In newer compilers, -qpdf1 does this
    automatically, so using this in conjunction with -qpdf1 is
    redundant.

-blpdata
    Sets the bit in the file's XCOFF header indicating that this
    executable will request the use of large pages when they are
    available on the system and when the user has an appropriate
    privilege.

-bnso
    Brings referenced library procedures into the object file.

-bI:/lib/syscalls.exp
    Create statically linked object files (syscalls.exp supplies the
    names of the routines that can be imported).

FDPR:
-----

The fdpr (feedback directed program restructuring) program optimizes
the executable image of a program by collecting information on the
behavior of the program while the program is used for some typical
workload, and then creating a new version. It is available on AIX
Version 4 and 5 systems as part of the Performance Toolbox for AIX.

Options:

-o OutFile
    Specifies the name of the output file from the optimizer.

-p ProgramName
    The name of the executable program to optimize.

-v
    Selects verbose output during processing/compilation.

-x Command
    Specifies the command used for invoking the instrumented program.
    All the arguments after the -x flag are used for the invocation.

-R2
    Employ a program-reordering technique in which the original
    structure of the program, including traceback entries, is
    preserved.

-R3
    Employ global reordering techniques that do not preserve debug
    information.

System Configuration Settings:
------------------------------

chuser capabilities=CAP_BYPASS_RAC_VMM,CAP_PROPAGATE $USER
    Allows $USER (non-root ID) to access the large pages that are
    available.

bosboot -a
    Creates a boot image used on the next system reboot.

shutdown -r
    Halt the operating system and reboot.

Large Page Settings using:

vmtune or vmtune64 command options:

-g
    Sets the page size for the large page.
    Example: -g 16777216 for 16M page

-L
    Sets the number of large pages.
    Example: -L 200 allows 200 large pages

-y1
    Enables memory affinity in the Virtual Memory Manager (VMM).

vmo command options:

-r
    Apply changes at the next system boot.

-o lgpg_regions=#
    Specifies the number of large pages to reserve.
    Example: #=32 allows 32 large pages to be reserved

-o lgpg_size=#
    Specifies the size in bytes of the hardware-supported large pages.
    Example: #=16777216 is a 16M page size

-o memory_affinity=1
    Enable the VMM to restrict the memory frames attached to the
    executing MCM.

Environment Variable MEMORY_AFFINITY=MCM
    Turn on memory affinity, which has been enabled with the vmo
    command.
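
As a rough check on the large page settings above, the memory set aside
for large pages is the page size multiplied by the number of pages. The
Python sketch below works through the example values given in this
section. The function name large_page_pool_bytes is hypothetical and is
not part of any AIX tool; the sketch assumes only that the reserved pool
equals page size times page count.

```python
# Sketch only: arithmetic on the example values from this section.
# large_page_pool_bytes is a hypothetical helper, not an AIX command or API.

MIB = 1024 * 1024  # bytes per mebibyte


def large_page_pool_bytes(page_size_bytes: int, page_count: int) -> int:
    """Return the total bytes reserved for page_count large pages."""
    return page_size_bytes * page_count


# Example values quoted in the text above.
page_size = 16777216   # -g 16777216 / -o lgpg_size=16777216
vmo_pages = 32         # -o lgpg_regions=32
vmtune_pages = 200     # -L 200

page_mib = page_size // MIB
vmo_pool_mib = large_page_pool_bytes(page_size, vmo_pages) // MIB
vmtune_pool_mib = large_page_pool_bytes(page_size, vmtune_pages) // MIB

print(f"large page size: {page_mib} MiB")
print(f"vmo example pool: {vmo_pool_mib} MiB")
print(f"vmtune example pool: {vmtune_pool_mib} MiB")
```

With these example values, the 16777216-byte page is 16 MiB, the vmo
example reserves 512 MiB, and the vmtune example reserves 3200 MiB.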