Description of compiler flags for Intel C Compiler 5.0
------------------------------------------------------

/O1         Optimizes for speed, but disables some optimizations that
            increase code size for a small speed benefit. Includes inline
            expansion (except for intrinsic functions), global
            optimizations, and string pooling optimizations.
/O2         Optimizes for speed. The -O2 option includes the -O1
            optimizations and, in addition, enables inlining of intrinsics
            and more speed optimizations.
/O3         Builds on the -O1 and -O2 optimizations by enabling high-level
            optimization. This level does not guarantee higher performance
            unless loop and memory access transformations take place. In
            conjunction with -QaxK/-QxK and -QaxW/-QxW, this switch causes
            the compiler to perform more aggressive data dependency
            analysis than for -O2. This may result in longer compilation
            times.
/Oa[-]      Assume [do not assume] no aliasing in the program.
/Qax{i|M|K|W}
            Generates code specialized for the processor extensions
            specified by the option's suffix while also generating generic
            IA-32 code. The suffix includes one or more of the following
            characters:
              i  Pentium Pro and Pentium II processor instructions
              M  MMX(TM) instructions
              K  Streaming SIMD Extensions (implies i and M above)
              W  Pentium 4 processor with Streaming SIMD Extensions 2
                 (implies i, M and K above)
/Qx{i|M|K|W}
            Generates specialized code to run exclusively on processors
            supporting the extensions indicated by the suffix, as described
            above.
/Ob{0|1|2}  Controls the compiler's inline expansion.
              0: disables inlining.
              1: disables inlining unless /Qip or /Ob2 is specified.
              2: enables inlining of any function; however, the compiler
                 decides which functions are inlined. This option enables
                 interprocedural optimizations and has the same effect as
                 specifying the /Qip option.
/Qip        Enables single-file interprocedural (IP) optimizations (within
            files; same as /Ob2).
/Qipo       Enables multi-file IP optimizations, which include:
              - inline function expansion
              - interprocedural constant propagation
              - dead code elimination
              - propagation of function characteristics
              - passing arguments in registers
              - loop-invariant code motion
/Qprof_gen  Instruments the program for profiling, the first phase of
            two-phase profile-guided optimization.
/Qprof_use  Instructs the compiler to produce a profile-optimized
            executable and merges the available dynamic information (.dyn)
            files into a pgopti.dpi file. If you perform multiple
            executions of the instrumented program, -Qprof_use merges the
            dynamic information files again and overwrites the previous
            pgopti.dpi file. Without any other options, the current
            directory is searched for .dyn files.
/Qrcd       Improves the performance of code that requires
            floating-point-to-integer conversions. The system default
            floating-point rounding mode is round-to-nearest, so values are
            rounded during floating-point calculations. However, the C
            language requires floating-point values to be truncated when
            they are converted to an integer. To do this, the compiler must
            change the rounding mode to truncation before each
            floating-point-to-integer conversion and change it back
            afterwards. The -Qrcd option disables this change of rounding
            mode for all floating-point calculations, including
            floating-point-to-integer conversions. Turning on this option
            can improve performance, but floating-point conversions to
            integer will not conform to C semantics.
/GX         Enables the full C++ exception handling unwind semantics.
/GR         Enables C++ Run-Time Type Information (RTTI).
shlW32M.lib: MicroQuill SmartHeap Library 5.0, available from
            http://www.microquill.com/

Description of compiler flags for Intel FORTRAN Compiler 5.0
------------------------------------------------------------

/O1         Optimizes for speed, but disables some optimizations that
            increase code size for a small speed benefit. Includes inline
            expansion (except for intrinsic functions), global
            optimizations, and string pooling optimizations.
/O2         Optimizes for speed. The -O2 option includes the -O1
            optimizations and, in addition, enables inlining of intrinsics
            and more speed optimizations.
/O3         Builds on the -O1 and -O2 optimizations by enabling high-level
            optimization. This level does not guarantee higher performance
            unless loop and memory access transformations take place. In
            conjunction with -QaxK/-QxK and -QaxW/-QxW, this switch causes
            the compiler to perform more aggressive data dependency
            analysis than for -O2. This may result in longer compilation
            times.
/Qax{i|M|K|W}
            Generates code specialized for the processor extensions
            specified by the option's suffix while also generating generic
            IA-32 code. The suffix includes one or more of the following
            characters:
              i  Pentium Pro and Pentium II processor instructions
              M  MMX(TM) instructions
              K  Streaming SIMD Extensions (implies i and M above)
              W  Pentium 4 processor with Streaming SIMD Extensions 2
                 (implies i, M and K above)
/Qx{i|M|K|W}
            Generates specialized code to run exclusively on processors
            supporting the extensions indicated by the suffix, as described
            above.
/Qip        Enables single-file interprocedural (IP) optimizations (within
            files; same as /Ob2).
/Qipo       Enables multi-file IP optimizations, which include:
              - inline function expansion
              - interprocedural constant propagation
              - dead code elimination
              - propagation of function characteristics
              - passing arguments in registers
              - loop-invariant code motion
/Qprof_gen  Instruments the program for profiling, the first phase of
            two-phase profile-guided optimization.
/Qprof_use  Instructs the compiler to produce a profile-optimized
            executable and merges the available dynamic information (.dyn)
            files into a pgopti.dpi file. If you perform multiple
            executions of the instrumented program, -Qprof_use merges the
            dynamic information files again and overwrites the previous
            pgopti.dpi file. Without any other options, the current
            directory is searched for .dyn files.
/Qrcd       Enables fast float-to-int conversion.

Other Notes:
------------

"/" and "-" are both allowable starting tokens for flags passed to the
compiler, i.e. -QxK and /QxK are identical switches.

Portability options for CPU2000:
--------------------------------

176.gcc:
    -Dalloca=_alloca : Uses the built-in, optimized alloca.
    /Fn : 176.gcc uses alloca, and this option tells the linker to
        pre-allocate n bytes of stack. The default amount of stack
        allocated is not enough, and without it 176.gcc crashes with a
        run-time error.

178.galgel:
    -FI : Fixed-format F90 source code.
    -F32000000 : As with 176.gcc, pre-allocates a 32MB stack.

186.crafty:
    -DNT_i386 : Specifies a Windows NT Intel processor-based system, which
        makes the compiler use "long long" as the 64-bit variable type that
        186.crafty needs.

253.perlbmk:
    -DSPEC_CPU2000_NTOS : Causes the code changes for porting to Windows
        to be included.
    -DPERLDLL : On Windows, we need a single perl.exe rather than a
        perl.exe plus a perl.dll. This pre-define enables the changes
        necessary to get a single, UNIX-style executable, without the
        indirect calls that can cause a 10% performance degradation.
        This allows the Windows-based executable to be as close as
        possible to the Unix-based one.
    /MT : Use the static multi-threaded library; otherwise the benchmark
        will not compile.

254.gap:
    -DSYS_HAS_CALLOC_PROTO
    -DSYS_HAS_MALLOC_PROTO : These two pre-defines indicate that
        prototypes for calloc and malloc already exist.