SEVERE Bug in mc68360 _ISR_Handler???
- Date: Tue, 17 Jul 2001 13:47:47 +0200
- From: Thomas.Doerfler at imd-systems.de (Thomas Doerfler)
- Subject: SEVERE Bug in mc68360 _ISR_Handler???
I am writing to this list to get some help concerning the behaviour
of the _ISR_Handler used for the MC68360 in rtems-4.5.0. I think
there is a very small chance that lower-level interrupts get lost (or
delayed forever) when a higher-level interrupt arrives at a critical
point.
This mail is going to be a bit long, but the issue is rather
I have designed a system based on the MC68360 (and the gen68360 BSP)
which makes heavy use of Ethernet and TCP/IP. Ethernet runs on the
built-in SCC1, and all CPM interrupt sources are handled on IRQ
level 4. I use the PIT as the system clock timer, running on IRQ
level 6 (so it is higher than the CPM IRQ level).
All in all the system works fine, but on very rare occasions the
system's communication interfaces got stuck. Last week I succeeded in
finding out why. I built a test environment and sent UDP packets to
the system at almost the full Ethernet bandwidth, adding a flood ping
to the network load. In that environment it took between 1 and 4
hours until the system got stuck, and then I found that the
"In-Service" bit of SCC1 in the CPM Interrupt Controller was set
although the core was not executing the corresponding interrupt
function.
This bit gets set whenever the CPM Interrupt Controller sends the
SCC1 vector number to the CPU and must be cleared in software. As
long as this bit is set, no other CPM interrupts will be issued.
NOTE: Even the SCC1 interrupt request will no longer be asserted
until this bit gets cleared.
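
To make the blocking behaviour concrete, here is a minimal sketch in C
of the in-service logic as described above. It is a simplified model
and not the real register interface: the type cpm_model, the bit
positions SRC_SCC1 and SRC_OTHER and all function names are
hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified model of the behaviour described in this
 * mail. It is NOT the real MC68360 register layout. */
typedef struct {
    uint32_t pending;     /* sources currently requesting service      */
    uint32_t in_service;  /* sources whose vector has been fetched and */
                          /* not yet cleared by software               */
} cpm_model;

#define SRC_SCC1  (1u << 0)  /* illustrative bit positions only */
#define SRC_OTHER (1u << 1)

/* The model asserts its request line to the CPU only while no
 * in-service bit is set. */
static bool cpm_request_asserted(const cpm_model *c)
{
    return c->in_service == 0 && c->pending != 0;
}

/* CPU vector fetch: moves one pending source to "in service" and
 * returns it, or returns 0 if no request is asserted. */
static uint32_t cpm_vector_fetch(cpm_model *c)
{
    if (!cpm_request_asserted(c))
        return 0;
    uint32_t src = c->pending & (~c->pending + 1u); /* lowest set bit */
    assert(src != 0);
    c->pending &= ~src;
    c->in_service |= src;
    return src;
}

/* What the driver's interrupt handler must do before it returns. */
static void cpm_clear_in_service(cpm_model *c, uint32_t src)
{
    c->in_service &= ~src;
}
```

In this model, once cpm_vector_fetch() has returned SRC_SCC1, further
requests (including new SCC1 requests) stay invisible to the CPU until
cpm_clear_in_service() is called. That is the stuck state I observed.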
The code of the SCC1 interrupt handler
"m360Enet_interrupt_handler (rtems_vector_number v)"
is correct: whenever this handler gets called, the In-Service bit is
definitely cleared. So my assumption is that:
1) a SCC1 interrupt gets asserted,
2) then the CPU performs the corresponding vector fetch
3) but in rare conditions the corresponding handler will not get
By the way: I lowered the PIT IRQ request level to 3, then the system
STRUCTURE OF _ISR_HANDLER
For the MC68360 target, the following Preprocessor options are
The function "_ISR_Handler" in exec/score/cpu/m68k/cpu_asm.S performs
the following basic steps:
A) Increment _Thread_Dispatch_disable_level
B) disable all interrupts
C) If _ISR_Nest_level==0: switch from task stack to interrupt stack
D) Increment _ISR_Nest_level
E) reenable higher interrupts
F) call user interrupt handler
G) disable all interrupts
H) Decrement _ISR_Nest_level
I) If _ISR_Nest_level==0: switch back from int stack to task stack
J) reenable higher interrupts
K) Decrement _Thread_Dispatch_disable_level
L) If _Thread_Dispatch_disable_level==0 and Context switch needed:
switch to new context (using _Thread_Dispatch)
M) return to interrupted code
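
For readers who prefer code, steps A) to M) can be sketched in C
roughly as follows. This is only a model of the control flow, not the
actual assembly from cpu_asm.S: the variables are plain C stand-ins
for the RTEMS globals, the stack switch is reduced to a flag, and the
helpers disable_all_interrupts(), reenable_higher_interrupts(),
thread_dispatch() and wake_task_handler() are hypothetical stubs.

```c
#include <assert.h>
#include <stdbool.h>

/* Plain C stand-ins for the RTEMS globals named in the steps above. */
static int  thread_dispatch_disable_level;
static int  isr_nest_level;
static bool context_switch_necessary;
static bool on_interrupt_stack;
static int  dispatch_count;   /* how often step L) actually dispatched */

typedef void (*isr_fn)(int vector);

/* Hypothetical stubs: on real hardware these change the interrupt
 * mask in the status register. */
static void disable_all_interrupts(void) {}
static void reenable_higher_interrupts(void) {}

/* Stand-in for _Thread_Dispatch: only counts the call. */
static void thread_dispatch(void)
{
    dispatch_count++;
    context_switch_necessary = false;
}

static void isr_handler_model(isr_fn user_handler, int vector)
{
    thread_dispatch_disable_level++;            /* A */
    disable_all_interrupts();                   /* B */
    if (isr_nest_level == 0)                    /* C */
        on_interrupt_stack = true;
    isr_nest_level++;                           /* D */
    reenable_higher_interrupts();               /* E */
    user_handler(vector);                       /* F */
    disable_all_interrupts();                   /* G */
    isr_nest_level--;                           /* H */
    assert(isr_nest_level >= 0);
    if (isr_nest_level == 0)                    /* I */
        on_interrupt_stack = false;
    reenable_higher_interrupts();               /* J */
    thread_dispatch_disable_level--;            /* K */
    if (thread_dispatch_disable_level == 0 &&   /* L */
        context_switch_necessary)
        thread_dispatch();
    /* M: return to interrupted code */
}

/* Example user handler that wakes a task, i.e. requests a switch. */
static void wake_task_handler(int vector)
{
    (void)vector;
    context_switch_necessary = true;
}
```

Note that nothing in this model protects the window between entry to
the handler and step A): a higher-level interrupt arriving there sees
both counters still at their old values.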
ASSUMED BUG SEQUENCE
I assume that the following events may lose the Level 4 SCC1
1) An SCC1 IRQ4 occurs, the CPU performs a vector fetch, the CPM
Interrupt controller supplies the corresponding vector and sets the
2) The CPU enters _ISR_Handler for Level 4/SCC1 Interrupt.
3) Before any real code gets executed, the PIT times out, issuing a
Level 6 interrupt, so the CPU stores its basic context on the current
(task) stack and re-enters _ISR_Handler for Level 6/PIT. Please note
that _ISR_Nest_level and _Thread_Dispatch_disable_level have not yet
been incremented for the SCC1 interrupt.
4) The PIT Interrupt Handler executes and requests a context switch
(wakes up some task or so).
5) The general _ISR_Handler for Level 6/PIT then finds that it was
the only instance of _ISR_Handler running (because
_Thread_Dispatch_disable_level was 0) and therefore performs a
context switch according to step L). This causes the newly woken task
to be executed, not the SCC1 interrupt handler.
So what do we have now:
- the SCC1 driver's interrupt handler has not yet been executed
- the physical SCC1 interrupt request signal is not applied to the
CPU, because it is locked out due to the still-set "SCC1 In-Service"
- Any further CPM interrupts are blocked
- the CPU executes the woken task, not knowing that it should resume
executing the SCC1 interrupt function
The SCC1 interrupt function might resume when RTEMS switches back to
the suspended task, but this does not seem to happen.
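
The sequence above can be replayed in a small C simulation. This is a
sketch under my own simplifying assumptions: sim_state, pit_isr(),
scc1_isr() and simulate() are hypothetical names, the context switch
is reduced to a flag, and only the two counters relevant to the race
are modelled. It compares a PIT interrupt arriving before step A) of
the SCC1 handler with one arriving after it.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct {
    bool scc1_in_service;        /* set by the vector fetch             */
    bool scc1_handler_ran;       /* driver handler reached (step F)     */
    bool switched_away;          /* a dispatch left the interrupted     */
                                 /* context (step L)                    */
    bool context_switch_needed;
    int  dispatch_disable_level;
} sim_state;

/* Steps K) and L): leave the handler and dispatch if we are outermost. */
static void isr_exit(sim_state *s)
{
    s->dispatch_disable_level--;
    assert(s->dispatch_disable_level >= 0);
    if (s->dispatch_disable_level == 0 && s->context_switch_needed) {
        s->context_switch_needed = false;
        s->switched_away = true;
    }
}

/* Level 6 PIT interrupt whose handler wakes a task. */
static void pit_isr(sim_state *s)
{
    s->dispatch_disable_level++;       /* A */
    s->context_switch_needed = true;   /* F: tick handler wakes a task */
    isr_exit(s);                       /* K, L */
}

/* Level 4 SCC1 interrupt, pre-empted by the PIT either before or
 * after step A). */
static void scc1_isr(sim_state *s, bool pit_before_step_a)
{
    s->scc1_in_service = true;         /* vector fetch by the CPU */
    if (pit_before_step_a) {
        pit_isr(s);
        if (s->switched_away)
            return;  /* rest of this ISR waits for the old context */
    }
    s->dispatch_disable_level++;       /* A */
    if (!pit_before_step_a)
        pit_isr(s);                    /* nested: dispatch is deferred */
    s->scc1_handler_ran = true;        /* F */
    s->scc1_in_service = false;        /* driver clears In-Service bit */
    isr_exit(s);                       /* K, L */
}

static sim_state simulate(bool pit_before_step_a)
{
    sim_state s = { false, false, false, false, 0 };
    scc1_isr(&s, pit_before_step_a);
    return s;
}
```

With simulate(true) the model ends with the In-Service bit still set
and the SCC1 handler not yet run, while simulate(false) runs the
handler and clears the bit before the dispatch takes place.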
At the head of the _ISR_Handler code, a comment states:
* With this approach, lower priority interrupts may
* execute twice if a higher priority interrupt is
* acknowledged before _Thread_Dispatch_disable is
* incremented and the higher priority interrupt
* performs a context switch after executing. The lower
* priority interrupt will execute (1) at the end of the
* higher priority interrupt in the new context if
* permitted by the new interrupt level mask, and (2) when
* the original context regains the cpu.
The statement itself was very surprising to me. And from my point of
view, case (1) does not hold for hardware that negates the interrupt
request as soon as the CPU has performed the vector fetch (which is
absolutely legal according to the M68K architecture).
It may take a LONG time until case (2) occurs. In my situation I
assume that this doesn't occur at all :-((
1) I don't understand, why the suspended context does not get
2) I don't have a better solution for _ISR_Handler. Any ideas?
3) I can't believe that I would be the first one to find this problem?
4) I don't know, whether I am on the right track at all...
So here we are. I hope I could make my ideas clear in this mail. Any
IMD Ingenieurbuero fuer Microcomputertechnik
Thomas Doerfler Herbststrasse 8
D-82178 Puchheim Germany
email: Thomas.Doerfler at imd-systems.de
PGP public key available at: http://www.imd-systems.de/pgp_key.htm