VERY odd issue with malloc, siginfo: SIGSEGV SEGV_ACCERR

807567, May 13 2009 (edited Jun 9 2009)
I am running a C++ program that runs fine on two other Sun boxes (running Solaris 10) but, naturally, crashes on the Sun box (also running Solaris 10) that will be our production server. The binaries were copied from one of the other boxes, so the program was not recompiled.

I have not been able to reproduce this error on any other box, but the box it fails on is the production box, so I have to get this fixed.

I am not using LD_PRELOAD, and the mdb output shows that malloc is coming from libc.

Now, here is the REALLY odd part:
I have two variables, defined as long. When they are set to certain integers (mostly odd numbers, it seems), the malloc fails; when I set them to other values (such as even numbers or very low odd numbers), the malloc works fine.

Here is a portion of the code:
typedef struct
{
 char           proc_Name[19];
 char           proc_Status[3];
 char           proc_exec_hst_ind[3];
 long           sleep_seconds;
 long           commit_rcd_cnt;
 long           max_rcd_cnt;
 long           min_threshold_cnt;   // <--- these 2 variables control
 long           max_threshold_cnt;   // <--- whether malloc fails or not
 parmRcd       *parms;               // <--- malloc is performed on this field
} processRcd;
 
 processRcd     ProcRcd;
...  
  ProcRcd.max_threshold_cnt = 999998;   
//  ProcRcd.min_threshold_cnt = 999999;    // VJD FAILS
//  ProcRcd.min_threshold_cnt = 999998;    // VJD PASSES
//  ProcRcd.min_threshold_cnt = 999997;    // VJD FAILS
//  ProcRcd.min_threshold_cnt = 999996;    // VJD PASSES
//  ProcRcd.min_threshold_cnt = 50001;    // VJD FAILS
//  ProcRcd.min_threshold_cnt = 50000;    // VJD PASSES
//  ProcRcd.min_threshold_cnt = 501;    // VJD FAILS
//  ProcRcd.min_threshold_cnt = 500;    // VJD PASSES
//  ProcRcd.min_threshold_cnt = 47;    // VJD FAILS
//  ProcRcd.min_threshold_cnt = 42;    // VJD PASSES
  ProcRcd.min_threshold_cnt = 41;    // VJD FAILS
//  ProcRcd.min_threshold_cnt =  9;    // VJD PASSES

  userlog("VJD before calloc, parms addr = %x", ProcRcd.parms);
  ProcRcd.parms = (parmRcd *)calloc(numberOfParms, sizeof(parmRcd)); 

  userlog("VJD after calloc, parms addr = %x ", ProcRcd.parms);
  ...
I have also tried substituting malloc for calloc (same results) and moving the calloc above the block of assignments (same results).
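For comparison, here is a minimal sketch of the allocation step with the result checked and the address printed with %p instead of %x. The 62-byte parmRcd stand-in and the allocParms/freeParms helpers are assumptions for illustration only (the size comes from the sizeof=62 value in the log entries), and std::printf stands in for userlog. It is not a fix for the crash: a fault inside malloc's own bookkeeping often points to memory that was overwritten before the call, but that is an assumption here, not something confirmed for this program.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstdlib>

// Hypothetical stand-in for the real parmRcd; only its 62-byte size
// is taken from the log output in the post.
struct parmRcd {
    char raw[62];
};

// Trimmed-down record: just the fields relevant to the allocation.
struct processRcd {
    long     min_threshold_cnt;
    long     max_threshold_cnt;
    parmRcd *parms;
};

// Hypothetical helper: allocate zeroed storage for `count` parmRcd entries.
// Returns true on success; on failure rcd.parms is left as nullptr.
bool allocParms(processRcd &rcd, std::size_t count)
{
    assert(count > 0);  // a zero-sized request may legally return a null pointer
    rcd.parms = static_cast<parmRcd *>(std::calloc(count, sizeof(parmRcd)));
    // %p expects a void*; passing a pointer to %x is not portable.
    std::printf("after calloc, parms addr = %p\n", static_cast<void *>(rcd.parms));
    return rcd.parms != nullptr;
}

// Release the storage and reset the pointer so a stale address is never reused.
void freeParms(processRcd &rcd)
{
    std::free(rcd.parms);
    rcd.parms = nullptr;
}
```

Calling allocParms(ProcRcd, numberOfParms) and testing the return value before touching ProcRcd.parms would at least separate "calloc returned NULL" from "calloc itself crashed".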

Here is the mdb output:
::status
debugging core file of CIS_dsClient (32-bit) from nc1omzzpa05
file: /home/ncps/oracle/CIS_dsClient
initial argv: CIS_dsClient CIS_DATASEND
threading model: multi-threaded
status: process terminated by SIGSEGV (Segmentation Fault)
::stack
libc.so.1`t_splay+0x170(ffbff5b4, f4, 861, 0, fddbe3c0, ff3c52c0)
libc.so.1`realfree+0x8c(ffbff4b8, f5, e7974, 0, 0, ffbff4b0)
libc.so.1`_malloc_unlocked+0x260(ffbff3c0, 1f4, ffbff3b8, ffbff3c0, fddc1910, 0)
libc.so.1`malloc+0x4c(f8, 1, e8070, fe0c9f34, fddbe3c0, fddc85b8)
libc.so.1`calloc+0x58(3e, 3e, f8, 0, 328, 0)
getParms+0xfc(ffbff3c4, ffbff3c8, ffbff390, 0, 328, 0)
__1cIprocParmHrefresh6M_v_+0x68(ffbff38c, 228, fe371228, 0, 328, 0)
main+0x884(2, ffbff95c, ffbff968, 30800, fdff07c0, 0)
_start+0x108(0, 0, 0, 0, 0, 0)


The end of the truss output:
open64("/home/ncps/logs/ulogs/ULOG.051309", O_WRONLY|O_APPEND|O_CREAT, 0666) = 11
umask(022) = 0
write(11, " 1 5 5 8 0 4 . n c 1 o m".., 103) = 103
close(11) = 0
Incurred fault #6, FLTBOUNDS %pc = 0xFDCD7188
siginfo: SIGSEGV SEGV_MAPERR addr=0x0000000B <--- usually I get a SEGV_ACCERR
Received signal #11, SIGSEGV [default]
siginfo: SIGSEGV SEGV_MAPERR addr=0x0000000B


Log file entries:
155804.nc1omzzpa05!CIS_dsClient.13585.1.0: VJD at start, parms addr = ffbff390
155804.nc1omzzpa05!CIS_dsClient.13585.1.0: VJD before assgnmts parms addr = ffbff390
155804.nc1omzzpa05!CIS_dsClient.13585.1.0: VJD before calloc, parms addr = ffbff390, numP=4, sizeof=62
155854.nc1omzzpa05!BBL.18642.1.0: LIBTUX_CAT:216: WARN: Process 13585 died; removing from BB
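
The request size can be cross-checked against the stack trace: the log reports numP=4 and sizeof=62, and the malloc frame in the ::stack output lists f8 as its first value. A small sketch of that arithmetic (the helper name requestBytes is hypothetical):

```cpp
#include <cstddef>

// Hypothetical helper: total bytes requested for `count` elements of
// `elemSize` bytes each (no overflow check, for illustration only).
constexpr std::size_t requestBytes(std::size_t count, std::size_t elemSize)
{
    return count * elemSize;
}

// 4 records of 62 bytes each is 248 bytes, which is 0xf8 in hex,
// the same value shown in the malloc frame of the ::stack output.
static_assert(requestBytes(4, 62) == 0xf8, "4 * 62 should equal 0xf8");
```

If that reading of the frame is right, the size being requested matches what the program logged, so the request itself looks sane.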

Any assistance would be much appreciated!

Thank you

Edited by: Valerie101 on May 13, 2009 9:37 AM

Edited by: Valerie101 on May 13, 2009 9:38 AM
Locked on Jul 7 2009
Added on May 13 2009