Interrupt storm

In operating systems, an interrupt storm is an event during which a processor receives an inordinate number of interrupts that consume the majority of the processor's time. Interrupt storms are typically caused by hardware devices that do not support interrupt rate limiting.

Background

Because interrupt processing is typically a non-preemptible task in time-sharing operating systems, an interrupt storm makes the system respond sluggishly, or can even make it appear to be completely frozen. This state is commonly known as livelock. In such a state, the system spends so much time processing interrupts that it completes no other work. It therefore appears to be processing nothing at all, because it produces no output to the user, the network, or anywhere else. An interrupt storm is sometimes mistaken for thrashing, since the two have similar symptoms but different causes.

An interrupt storm can have many different causes, including misconfigured or faulty hardware devices, faulty device drivers, or flaws in the operating system. Most modern hardware implements methods for reducing or eliminating the possibility of an interrupt storm. For example, many Ethernet controllers implement interrupt "rate limiting", which causes the controller to wait a programmable minimum amount of time between successive interrupts.
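The effect of such a minimum interval can be illustrated with a small simulation. This is only a sketch under simplifying assumptions: the class and function names are hypothetical, time is measured in integer microseconds, and the model raises an interrupt only when an event arrives (a real controller would also use a timer to flush events left pending).

```python
class RateLimitedInterruptSource:
    """Hypothetical model of a controller that enforces a minimum gap between interrupts."""

    def __init__(self, min_interval_us):
        self.min_interval_us = min_interval_us
        self.last_interrupt_us = None
        self.pending_events = 0

    def on_event(self, now_us):
        """Record one event; return how many events are delivered with an interrupt (0 if held back)."""
        self.pending_events += 1
        too_soon = (
            self.last_interrupt_us is not None
            and now_us - self.last_interrupt_us < self.min_interval_us
        )
        if too_soon:
            return 0  # keep buffering until the minimum interval has elapsed
        self.last_interrupt_us = now_us
        delivered = self.pending_events
        self.pending_events = 0
        return delivered


def count_interrupts(event_times_us, min_interval_us):
    """Count the interrupts raised for a sequence of event timestamps."""
    source = RateLimitedInterruptSource(min_interval_us)
    return sum(1 for t in event_times_us if source.on_event(t) > 0)
```

With one event every 10 microseconds, a 100-microsecond minimum interval reduces the interrupt count by a factor of ten, because each interrupt then delivers a batch of buffered events.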

The most common interrupt storm is caused by a faulty driver under an APIC (Advanced Programmable Interrupt Controller), when a device "behind" another signals an interrupt to the APIC. The OS then asks each driver registered on that interrupt whether the interrupt came from its hardware. A faulty driver may always answer "yes" but then do nothing further, because its hardware did not actually interrupt. The device that originally raised the interrupt is never serviced, so it interrupts again and the cycle begins anew. The system locks up completely under such a storm. This was (and remains) a problem with the SoundBlaster Live! series of sound cards on some motherboards; only a kernel debugger can break the storm, by unloading the faulty driver.
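The cycle described above can be reproduced in a toy model. The sketch below assumes a level-triggered shared line and an OS that stops at the first driver claiming the interrupt, as in the description; all names are hypothetical, and the dispatch cap exists only so that the simulated storm terminates.

```python
class Device:
    """A device that may be asserting the shared interrupt line."""

    def __init__(self):
        self.asserting = False


class Driver:
    """A driver bound to one device on the shared line."""

    def __init__(self, device, always_claims=False):
        self.device = device
        self.always_claims = always_claims  # True models the faulty driver

    def handle(self):
        """Return True if this driver claims the interrupt."""
        if self.device.asserting:
            self.device.asserting = False  # servicing the device deasserts the line
            return True
        return self.always_claims  # a faulty driver claims without servicing anything


def dispatch_until_quiet(drivers, max_dispatches=1000):
    """Dispatch the shared interrupt while any device still asserts it.

    Returns the number of dispatches, capped at max_dispatches.
    """
    dispatches = 0
    while any(d.device.asserting for d in drivers) and dispatches < max_dispatches:
        dispatches += 1
        for driver in drivers:
            if driver.handle():
                break  # the first claimant ends this dispatch
    return dispatches
```

With well-behaved drivers a single dispatch services the interrupting device. If a driver earlier in the chain always claims the interrupt, the real source is never reached and the loop runs until the cap, which corresponds to the storm.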

Many OSes implement a polling mode that disables interrupts for devices which generate too many of them. In this mode, the OS periodically queries the hardware for pending tasks. As the number of interrupts increases and the efficiency of interrupt mode diminishes, an OS may switch the interrupting device from interrupt mode to polling mode. Likewise, when polling mode becomes less efficient than interrupt mode, the OS switches the device back to interrupt mode. The implementation of interrupt rate limiting in hardware almost negates the need for such polling modes.
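A minimal sketch of such a mode switch follows. The two thresholds are hypothetical values chosen for illustration, not figures from any particular OS; using separate high and low thresholds (hysteresis) is one way to keep the device from flapping between modes when the event rate hovers near a single cutoff.

```python
def choose_mode(current_mode, events_per_sec, high_water=50_000, low_water=5_000):
    """Pick "interrupt" or "polling" for a device from its observed event rate.

    high_water and low_water are illustrative assumptions.
    """
    if current_mode == "interrupt" and events_per_sec > high_water:
        return "polling"  # too many interrupts: poll the device instead
    if current_mode == "polling" and events_per_sec < low_water:
        return "interrupt"  # polling now mostly finds nothing: go back to interrupts
    return current_mode


def trace_modes(rates, mode="interrupt"):
    """Apply choose_mode to a sequence of rate samples and return the mode after each."""
    modes = []
    for rate in rates:
        mode = choose_mode(mode, rate)
        modes.append(mode)
    return modes
```

For the samples 1000, 60000, 20000 and 4000 events per second, the device stays in interrupt mode, switches to polling, stays in polling (20000 is still above the low threshold), and finally returns to interrupt mode.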

History

Perhaps the first interrupt storm occurred during Apollo 11's lunar descent in 1969.

Considerations

Interrupt rate limiting must be carefully configured for optimum results. For example, an Ethernet controller with interrupt rate limiting buffers the packets it receives from the network between interrupts. If the interrupt rate is set too low, so that the interval between interrupts is too long, the controller's buffer will overflow and packets will be dropped. The rate must take into account how fast the buffer may fill between interrupts, and the latency between the interrupt and the transfer of the buffer to the system.
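The sizing constraint in the last sentence can be written as simple arithmetic: the buffer has to hold every packet that can arrive during one interrupt interval plus the interrupt latency. The function below is a sketch with hypothetical parameter names; it assumes a constant worst-case packet rate and uses integer microseconds to avoid floating-point rounding.

```python
def min_buffer_packets(packets_per_sec, interrupt_interval_us, interrupt_latency_us):
    """Smallest buffer (in packets) that does not overflow at the given packet rate.

    The window is the time from one interrupt until the buffer has been
    handed to the system after the next one.
    """
    window_us = interrupt_interval_us + interrupt_latency_us
    # Ceiling division in integer arithmetic.
    return -(-packets_per_sec * window_us // 1_000_000)


def overflows(buffer_packets, packets_per_sec, interrupt_interval_us, interrupt_latency_us):
    """True if the buffer is too small for the chosen interval and latency."""
    needed = min_buffer_packets(packets_per_sec, interrupt_interval_us, interrupt_latency_us)
    return buffer_packets < needed
```

For example, at one million packets per second with a 100-microsecond interval and 20 microseconds of latency, the buffer needs room for 120 packets; a 64-packet buffer would require a shorter interval.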

Interrupt mitigation

There are hardware-based and software-based approaches to the problem. FreeBSD detects interrupt storms and masks the problematic interrupt for some time. Another possible scheme is the one used by NAPI in the Linux kernel:
* The system (driver) starts in the interrupt-enabled state
* The interrupt handler disables the interrupt and lets a thread/task handle the event(s) (an example of an event is an incoming Ethernet packet)
* The task polls the device, processes some number of events and enables the interrupt
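The three steps above can be sketched as a small state machine. This is a toy model, not the Linux NAPI interface: the class, its attributes and the per-poll budget are hypothetical, and the sketch assumes the interrupt is re-enabled only once the queue has been drained.

```python
from collections import deque


class NapiStyleDriver:
    """Toy model of the interrupt-then-poll scheme (all names are hypothetical)."""

    def __init__(self, budget):
        self.budget = budget          # maximum events handled per poll
        self.queue = deque()          # events waiting in the device
        self.irq_enabled = True       # the driver starts with interrupts enabled
        self.poll_scheduled = False
        self.interrupts_taken = 0
        self.processed = []

    def device_event(self, event):
        """The device receives an event and interrupts only if allowed to."""
        self.queue.append(event)
        if self.irq_enabled:
            self.interrupt_handler()

    def interrupt_handler(self):
        """Disable the interrupt and defer the real work to the polling task."""
        self.interrupts_taken += 1
        self.irq_enabled = False
        self.poll_scheduled = True

    def poll(self):
        """Process up to `budget` events; re-enable the interrupt when drained."""
        for _ in range(min(self.budget, len(self.queue))):
            self.processed.append(self.queue.popleft())
        if self.queue:
            self.poll_scheduled = True   # more work pending: keep polling
        else:
            self.poll_scheduled = False
            self.irq_enabled = True      # queue empty: back to interrupt mode
```

A burst of ten events then costs a single interrupt: the first event triggers the handler, the remaining nine are queued silently, and repeated calls to `poll()` drain them in batches before the interrupt is switched back on.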

Another interesting approach relies on hardware support: the device generates an interrupt only when the event queue state changes from "empty" to "not empty".

Device
* If there are no free DMA descriptors at the RX FIFO tail, drop the event
* Add event to the tail and mark the FIFO entry as occupied
* If entry (tail−1) is free (cleared), generate interrupt (level interrupt)
* Increment tail pointer

CPU (interrupt handler)
* Acknowledge the interrupt (if hardware requires acknowledge)
* Handle all (or part) of the valid DMA descriptors at the head
* Return from interrupt
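The device and CPU steps above can be modelled with a ring of descriptors. The sketch below is an illustration under assumptions: the class and method names are hypothetical, a free descriptor is represented by `None`, the interrupt is merely counted rather than held as a level, and the ring needs at least two entries so that entry (tail−1) is distinct from the tail.

```python
class EventRing:
    """Toy RX descriptor ring that interrupts on the empty to not-empty transition."""

    def __init__(self, size):
        if size < 2:
            raise ValueError("ring needs at least two descriptors")
        self.slots = [None] * size  # None marks a free (cleared) descriptor
        self.head = 0               # next descriptor the CPU will handle
        self.tail = 0               # next descriptor the device will fill
        self.interrupts = 0
        self.dropped = 0

    def device_push(self, event):
        """Device side: enqueue one event, interrupting only if the queue was empty."""
        if self.slots[self.tail] is not None:
            self.dropped += 1       # no free descriptor at the tail: drop the event
            return False
        self.slots[self.tail] = event
        previous = (self.tail - 1) % len(self.slots)
        if self.slots[previous] is None:
            self.interrupts += 1    # entry (tail-1) is free, so the queue was empty
        self.tail = (self.tail + 1) % len(self.slots)
        return True

    def cpu_handle(self):
        """CPU side: handle and clear every valid descriptor starting at the head."""
        handled = []
        while self.slots[self.head] is not None:
            handled.append(self.slots[self.head])
            self.slots[self.head] = None
            self.head = (self.head + 1) % len(self.slots)
        return handled
```

Pushing five events into a four-entry ring raises one interrupt and drops the fifth event. After the handler drains the ring, the next event finds the previous descriptor free and raises a second interrupt.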

See also

* Broadcast radiation
* Inter-processor interrupt
* Interrupt handler
* Non-maskable interrupt
* Programmable Interrupt Controller


Wikimedia Foundation. 2010.
