Stack full or something else?

Gedare Bloom gedare at gwmail.gwu.edu
Mon Nov 15 12:34:31 CST 2010


João,

You can reserve extra stack space for your tasks like this:
  #define CONFIGURE_EXTRA_TASK_STACKS (RTEMS_MINIMUM_STACK_SIZE * ??)
Replace ?? with a multiplier to get a multiple of the minimum stack size, or
just give an arbitrary number of bytes instead.
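
As a rough sketch of where that define goes, the fragment below places it in
an application configuration, ahead of the include of <rtems/confdefs.h>. The
multiplier 4 is a made-up value for illustration only, not a recommendation,
and the rest of your existing configuration is omitted. As far as I
understand the confdefs documentation, this option adds to the memory
reserved for task stacks; it does not by itself enlarge any one task's stack.

```c
#include <rtems.h>

/* Sketch of an application configuration fragment (other options omitted). */

/*
 * Extra bytes to reserve for task stacks beyond what confdefs.h calculates.
 * The multiplier 4 is an illustrative assumption; size it for your tasks.
 */
#define CONFIGURE_EXTRA_TASK_STACKS (RTEMS_MINIMUM_STACK_SIZE * 4)

/* The confdefs.h include must come after all CONFIGURE_* defines. */
#define CONFIGURE_INIT
#include <rtems/confdefs.h>
```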

Otherwise you can increase the stack_size argument to rtems_task_create for
a particular task.
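
Since your application uses the POSIX API, the per-thread counterpart would
be the stack size attribute passed to pthread_create. Here is a minimal
sketch, assuming a placeholder thread body; worker and spawn_with_stack are
names made up for this example, and on RTEMS the memory for the larger stacks
presumably still has to be accounted for in the configuration.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical thread body; stands in for the memory-heavy task. */
static void *worker(void *arg)
{
    (void)arg;
    return NULL;
}

/*
 * Create a joinable thread with an explicit stack size.
 * Returns 0 on success, or the error number from the failing pthread call.
 */
int spawn_with_stack(pthread_t *thread, size_t stack_bytes)
{
    pthread_attr_t attr;
    int rc;

    assert(thread != NULL);

    rc = pthread_attr_init(&attr);
    if (rc != 0)
        return rc;

    /* Fails with EINVAL if stack_bytes is below PTHREAD_STACK_MIN. */
    rc = pthread_attr_setstacksize(&attr, stack_bytes);
    if (rc == 0)
        rc = pthread_create(thread, &attr, worker, NULL);

    /* The attribute object is no longer needed once the thread exists. */
    pthread_attr_destroy(&attr);
    return rc;
}
```

Calling spawn_with_stack(&t, 256 * 1024) followed by pthread_join(t, NULL)
would run the placeholder body on a 256 KiB stack; that size is only an
example.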

You might also try defining CONFIGURE_STACK_CHECKER_ENABLED in your
configuration.  It adds some extra checks on stack bounds and might narrow
down where the problem is occurring.
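
To give an idea of what a bounds check of this kind can look like, here is a
small host-side sketch of the general pattern-fill technique: paint the stack
area with a known byte before use, then look at how much of the pattern
survives. This is not the RTEMS implementation; the function names, the
0xA5 fill byte and the guard size are all assumptions made for the example,
and it assumes a stack that grows downward, as on SPARC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_PATTERN 0xA5u /* arbitrary fill byte chosen for this sketch */

/* Fill a stack area with the pattern before the task starts running. */
void stack_paint(uint8_t *area, size_t size)
{
    for (size_t i = 0; i < size; i++)
        area[i] = STACK_PATTERN;
}

/*
 * Count the bytes at the low end of the area that still hold the pattern.
 * On a downward-growing stack the low end is used last, so this is the
 * remaining headroom.
 */
size_t stack_unused(const uint8_t *area, size_t size)
{
    size_t n = 0;

    while (n < size && area[n] == STACK_PATTERN)
        n++;
    return n;
}

/* Nonzero if any byte of the guard region at the low end was overwritten. */
int stack_overflowed(const uint8_t *area, size_t size, size_t guard)
{
    assert(guard <= size);
    return stack_unused(area, guard) != guard;
}
```

A check like this cannot see a write that happens to store the pattern byte
itself, and it only reports damage after the fact, so it narrows the window
rather than pinpointing the faulting instruction.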

-Gedare


On Mon, Nov 15, 2010 at 1:13 PM, João Rasta <freakforever at gmail.com> wrote:

> Hi Joel,
>
> Good point. I have a task that is heavy on memory operations (multi-size
> array operations and so on). It also uses a lot of doubles, so it could
> well be corrupting its stack. Too bad the corruption is 'silent' one way
> or another.
>
> To eliminate the stack overflow issue, I guess it is not enough to increase
> the initial task stack size. Is there a way to control the stack sizes of
> the tasks created afterwards?
>
>
> Best,
> JM
>
>
>
> On Mon, Nov 15, 2010 at 5:43 PM, Joel Sherrill <joel.sherrill at oarcorp.com> wrote:
>
>> On 11/15/2010 11:24 AM, João Rasta wrote:
>>
>>> After some hard debugging with gdb I found out that this error is
>>> occurring at _Semaphore_Translate_core_mutex_return_code(), but I still
>>> don't know why it happens. Here's the disassembly of the code where the
>>> error is generated:
>>>
>>> 0x40040d88 <rtems_semaphore_obtain+192>: ld      [ %l3 ], %g1
>>> 0x40040d8c <rtems_semaphore_obtain+196>: call    0x400410c8 <_Semaphore_Translate_core_mutex_return_code>
>>> 0x40040d90 <rtems_semaphore_obtain+200>: ld      [ %g1 + 0x34 ], %o0
>>> 0x40040e20 <rtems_semaphore_obtain+344>: b       0x40040d8c <rtems_semaphore_obtain+196>
>>> 0x40040e24 <rtems_semaphore_obtain+348>: ld      [ %l3 ], %g1
>>> 0x40040ed8 <rtems_semaphore_obtain+528>: b       0x40040d8c <rtems_semaphore_obtain+196>
>>> 0x40040edc <rtems_semaphore_obtain+532>: ld      [ %l3 ], %g1
>>>
>>> And Stack copy:
>>>
>>> Thread [3] (Suspended: Signal 'SIGSEGV' received. Description: Segmentation fault.)
>>>    3 rtems_semaphore_obtain() c:\opt\rtems-4.10-mingw\src\rtems-4.10\cpukit\rtems\src\semobtain.c:90 0x40040edc
>>>    2 <symbol is not available> 0x00000008
>>>    1 <symbol is not available> 0x0000000c
>>>
>> If this is the backtrace, somehow the stack pointer has gotten
>> corrupted.
>>
>>> The last value I could get from %g1 is 0.
>>>
>> If this task is not blowing its stack, then we are left with a couple
>> of guesses:
>>
>> + another task is blowing its stack and corrupting memory
>> that impacts this task.
>> + a stray write is corrupting something.
>>
>> When did you see the %g1 have 0?  What was the last instruction?
>> Where was it loading from?
>>
>>> Here's how I'm configuring semaphores:
>>>
>>> #define CONFIGURE_MAXIMUM_POSIX_SEMAPHORES            15
>>>
>> This only affects POSIX semaphores (sem_XXX).
>>
>>> Any hints on what can be failing? Should I set
>>> CONFIGURE_MAXIMUM_SEMAPHORES as well?
>>>
>> I doubt it, since you appear to be completing a semaphore_obtain.
>> That means (probably) that a semaphore_create worked.
>>
>> Do a backtrace in gdb.  I suspect you will find you are coming from a
>> subsystem like termios.
>>
>> -joel
>>
>>>
>>> Best,
>>> JM
>>>
>>>
>>>
>>>
>>>
>>> On Fri, Nov 12, 2010 at 7:34 PM, Joel Sherrill
>>> <joel.sherrill at oarcorp.com> wrote:
>>>
>>>    On 11/12/2010 01:05 PM, João Rasta wrote:
>>>
>>>        Hi Daron,
>>>
>>>        It is running on a LEON-3. The application exits raising the
>>>        following exception:
>>>
>>>        IU in error mode (tt = 0x07)
>>>
>>>        which is a memory access to an unaligned address. The last
>>>        instruction is this:
>>>
>>>        4003bd74  d0006034   ld  [%g1 + 0x34], %o0
>>>
>>>    What is the value of g1?  Is it unaligned?
>>>
>>>        I don't understand why I get this error. The code where this
>>>        error is being reported is compiled independently and then put
>>>        in a library, but it uses the same cross-compiler as the main
>>>        source code. I don't think I'm doing anything wrong while
>>>        compiling the library files; I use the same compilation flags.
>>>
>>>        What could I be missing that would cause unaligned memory accesses?
>>>
>>>
>>>        Best,
>>>        JM
>>>
>>>
>>>
>>>        On Fri, Nov 12, 2010 at 6:43 PM, Daron Chabot
>>>        <daron.chabot at gmail.com> wrote:
>>>
>>>
>>>           On Fri, Nov 12, 2010 at 11:50 AM, João Rasta
>>>           <freakforever at gmail.com> wrote:
>>>
>>>               Hi,
>>>
>>>               I have an RTEMS POSIX API application which comes to a
>>>               point where, if a small function (with some doubles passed
>>>               as arguments) is called, the application exits with an
>>>               error. At first I thought of increasing the stack space. I
>>>               did this with CONFIGURE_POSIX_INIT_THREAD_STACK_SIZE, but
>>>               the problem remains even if I remove all the contents of
>>>               the function.
>>>
>>>               1) Am I setting up the stack size correctly? I think I am,
>>>               but I ask just in case.
>>>
>>>               2) Is there any other explanation for why a function call
>>>               crashes the application besides a full stack? Again, I
>>>               erased the function contents.
>>>
>>>
>>>           What architecture is this running on ?
>>>
>>>           What is the application exit error (message and/or return code)?
>>>
>>>           It looks like all POSIX threads are created as floating-point
>>>           tasks (FP state saved across context switches), so there
>>>           "shouldn't" be a problem on that aspect...
>>>
>>>
>>>
>>>               Best,
>>>               JM
>>>
>>>
>>>
>>>               _______________________________________________
>>>               rtems-users mailing list
>>>               rtems-users at rtems.org
>>>               http://www.rtems.org/mailman/listinfo/rtems-users
>>>
>>>
>>>
>>>
>>>
>>>    --
>>>    Joel Sherrill, Ph.D.             Director of Research & Development
>>>    joel.sherrill at OARcorp.com        On-Line Applications Research
>>>    Ask me about RTEMS: a free RTOS  Huntsville AL 35805
>>>      Support Available             (256) 722-9985
>>>
>>>
>>>
>>>