Hi,
I am using the collect tool from Sun Studio 12. It works on my system with the 64-bit Sun JDK 1.6,
but when I try to collect hardware counter data with the 32-bit JVM it fails, and an error is shown when I open the collected data in the "Analyzer" tool.
The error message is "test.1.er Collector Error: Initializing Hardware counter profiling failed"
OS version
$ uname -a
Linux m02 2.6.18-92.el5.src-PAPI #1 SMP Tue Jan 27 10:57:40 CET 2009 x86_64 x86_64 x86_64 GNU/Linux
Java version (32-bit version that is not working)
]$ java -version
java version "1.6.0_10"
Java(TM) SE Runtime Environment (build 1.6.0_10-b33)
Java HotSpot(TM) Server VM (build 11.0-b15, mixed mode)
The other Java version, which works fine (64-bit version)
]$ java -version
java version "1.6.0_13"
Java(TM) SE Runtime Environment (build 1.6.0_13-b03)
Java HotSpot(TM) 64-Bit Server VM (build 11.3-b02, mixed mode)
collect command output
]$ collect
NOTE: Linux-64-bit, 32 CPUs, CentOS_5.2 system "mi02" is supported by the Performance tools.
NOTE: The J2SE[tm] version 1.6.0_10 found at java (picked by PATH) is supported by the Performance tools.
usage: collect <args> target <target-args>
Sun Analyzer 7.7 Linux_i386 2009/06/03 [64-bit]
-p <interval> specify clock-profiling
clock profiling interval range on this system is from
10.000 to 1000.000 millisec.; resolution is 10.000 millisec.
-h <ctr_def>...[,<ctr_n_def>]
specify HW counter profiling for up to 4 HW counters
see below for more details
-s <threshold> specify synchronization wait tracing
-r <option> specify thread analyzer experiment; see man page
-H {on|off} specify heap tracing
-M {version|off} specify an MPI experiment
Supported MPI versions: CT8.2, CT8.1, CT8, CT7, CT7.1, OPENMPI, MPICH2, MVAPICH2
-m {on|off} specify MPI tracing
-j {on|off|path} specify Java profiling
-J <java-args> specify arguments to Java for Java profiling
-t <duration> specify time over which to record data
-x specify leaving the target waiting for a debugger attach
-n dry run -- don't run target or collect performance data
-y <signal>[,r] specify delayed initialization and pause/resume signal
When set, the target starts in paused mode;
if the optional r is provided, it starts in resumed mode
-F {on|off|all|=<regex>} specify following descendant processes
-A {on|off|copy} specify archiving of load-objects; default is on
-S <interval> specify periodic sampling interval (secs.)
-L <size> specify experiment size limit (MB.)
-l <signal> specify signal for samples
-o <expt> specify experiment name
-d <directory> specify experiment directory
-g <groupname> specify experiment group
-O <file> redirect all of collect's output to file
-v print expanded log of processing
-C <label> specify comment label (up to 10 may appear)
-R show the README file and exit
-V print version number and exit
Default experiment:
expt_name = test.3.er
clock profiling enabled, 10.000 millisec.
descendant processes will not be followed
periodic sampling, 1 secs.
experiment size limit 2000 MB.
experiment archiving: on
data descriptor: "p:10000;S:1;L:2000;A:1;"
host: `m02', cpuver = 2501, ncpus = 32, clock frequency 1200 MHz.
memory: 16518000 pages @ 4096 bytes = 64523 MB.
Specifying HW counters on `AMD Family 10h':
<ctr_def> == [+]<ctr>[~<attr>=<val>]...[~<attrN>=<valN>][/<reg#>][,<interval>]
<+>
for memory-related counters, attempt to backtrack to find
the triggering instruction and the virtual and physical
addresses of the memory reference
<ctr>
counter name, must be selected from the available counters
listed below. On most systems, if a counter is not listed
below, it may still be specified by its numeric value
<attr>=<val>
optional attribute where <val> can be in decimal or hex
format, and <attr> can be one of:
'umask'
'os'
'edge'
'pc'
'inv'
'cmask'
<reg#>
forces use of a specific hardware register. If not specified,
collect will attempt to place the counter into the first
available register and as a result, may be unable to place
subsequent counters due to register conflicts.
<interval> == {on|hi|lo|<value>}
`on' selects the default rate, listed below
`hi' specifies an interval ~10 times shorter than `on'
`lo' specifies an interval ~10 times longer than `on'
Aliased HW counters available for profiling:
cycles[/{0|1|2|3}],99999989 (`CPU Cycles', alias for BU_cpu_clk_unhalted; CPU-cycles)
insts[/{0|1|2|3}],9999991 (`Instructions Executed', alias for FR_retired_x86_instr_w_excp_intr; events)
ic[/{0|1|2|3}],100003 (`I$ Refs', alias for IC_fetch; events)
icm[/{0|1|2|3}],100003 (`I$ Misses', alias for IC_miss; events)
itlbh[/{0|1|2|3}],100003 (`ITLB Hits', alias for IC_itlb_L1_miss_L2_hit; events)
itlbm[/{0|1|2|3}],100003 (`ITLB Misses', alias for IC_itlb_L1_miss_L2_miss; events)
eci[/{0|1|2|3}],1000003 (`E$ Instr. Refs', alias for BU_internal_L2_req~umask=0x1; events)
ecim[/{0|1|2|3}],10007 (`E$ Instr. Misses', alias for BU_fill_req_missed_L2~umask=0x1; events)
dc[/{0|1|2|3}],1000003 (`D$ Refs', alias for DC_access; load events)
dcm[/{0|1|2|3}],100003 (`D$ Misses', alias for DC_miss; load events)
dtlbh[/{0|1|2|3}],100003 (`DTLB Hits', alias for DC_dtlb_L1_miss_L2_hit; load-store events)
dtlbm[/{0|1|2|3}],100003 (`DTLB Misses', alias for DC_dtlb_L1_miss_L2_miss; load-store events)
ecd[/{0|1|2|3}],1000003 (`E$ Data Refs', alias for BU_internal_L2_req~umask=0x2; load-store events)
ecdm[/{0|1|2|3}],10007 (`E$ Data Misses', alias for BU_fill_req_missed_L2~umask=0x2; load-store events)
fpadd[/{0|1|2|3}],1000003 (`FP Adds', alias for FP_dispatched_fpu_ops~umask=0x1; events)
fpmul[/{0|1|2|3}],1000003 (`FP Muls', alias for FP_dispatched_fpu_ops~umask=0x2; events)
fpustall[/{0|1|2|3}],1000003 (`FPU Stall Cycles', alias for FR_dispatch_stall_fpu_full; CPU-cycles)
memstall[/{0|1|2|3}],1000003 (`Memory Unit Stall Cycles', alias for FR_dispatch_stall_ls_full; CPU-cycles)
PAPI_l1_dcm[/{0|1|2|3}],100003 (`Level 1 data cache misses'; load-store events)
PAPI_l1_icm[/{0|1|2|3}],100003 (`Level 1 instruction cache misses'; events)
PAPI_l2_dcm[/{0|1|2|3}],100003 (`Level 2 data cache misses'; load-store events)
........
.......
Raw HW counters available for profiling:
...... (I removed the raw counter list to shorten this post)
See section 3.15 of the "BIOS and Kernel Developer's Guide (BKDG) For AMD Family 10h Processors,"
AMD publication #31116
See the collect.1 man page for more information
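
For readers unfamiliar with the <ctr_def> grammar printed above, here is a minimal Python sketch that composes such a string. The function name build_ctr_def and its parameters are hypothetical and are not part of Sun Studio; the attribute names are the ones collect lists for this machine, and the sketch does not check that the counter name itself is available.

```python
# Hypothetical helper: composes a <ctr_def> string following the grammar
# printed by collect on this host:
#   [+]<ctr>[~<attr>=<val>]...[/<reg#>][,<interval>]
# It only formats the string; it does not run collect or validate <ctr>.

ALLOWED_ATTRS = {"umask", "os", "edge", "pc", "inv", "cmask"}


def build_ctr_def(ctr, attrs=None, reg=None, interval=None, backtrack=False):
    """Return a counter definition string for the -h option of collect."""
    parts = []
    if backtrack:
        # Leading '+' requests backtracking for memory-related counters.
        parts.append("+")
    parts.append(ctr)
    for name, value in (attrs or {}).items():
        if name not in ALLOWED_ATTRS:
            raise ValueError("unknown attribute: %s" % name)
        # <val> may be decimal or hex; it is passed through unchanged.
        parts.append("~%s=%s" % (name, value))
    if reg is not None:
        # Forces use of a specific hardware register.
        parts.append("/%s" % reg)
    if interval is not None:
        # One of on, hi, lo, or a numeric value.
        parts.append(",%s" % interval)
    return "".join(parts)


if __name__ == "__main__":
    print(build_ctr_def("cycles", interval="on"))
    print(build_ctr_def("BU_internal_L2_req", attrs={"umask": "0x1"}, reg=0))
```

The resulting string is what would follow -h on the collect command line.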
analyzer tool version
$ analyzer -v
analyzer: Sun Analyzer 7.7 Linux_i386 2009/06/03
Any idea what is wrong with my configuration? Can collect be used for hardware counter profiling of Java applications running on a 32-bit JVM?
Thanks a lot
Edited by: allo6 on Aug 20, 2009 3:25 AM