Basics
<h2>Functions</h2>
<pre><code class="language-c">// Disable/enable execution of interrupt bottom halves
__local_bh_disable_ip(_RET_IP_, SOFTIRQ_OFFSET); // increments the softirq count; do_softirq() checks in_interrupt() and bails out, so softirqs cannot nest
__local_bh_enable(SOFTIRQ_OFFSET); // decrements the softirq count
// Disable/enable kernel preemption on this CPU (spinlocks disable preemption internally)
preempt_disable() // increments the preempt count; on x86, __preempt_count is a per-CPU variable
preempt_enable() // decrements the preempt count and immediately checks whether a higher-priority task needs to run
// Excerpted from the web:
local_irq_disable() // disable local interrupt delivery; on x86 this is essentially cli
local_irq_enable() // enable local interrupt delivery; on x86 this is essentially sti
local_irq_save(flags) // save the current local interrupt state, then disable delivery; on x86 essentially pushf + pop to stash EFLAGS into the variable flags, followed by cli
local_irq_restore(flags) // restore the saved interrupt state; on x86 essentially push + popf to put EFLAGS back
</code></pre>
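<p>As a hedged usage sketch (the device structure and field names below are made up), the classic pattern for data shared between process context and a bottom half is a <code>_bh</code> spinlock, which combines local_bh_disable() with the lock:</p>
<pre><code class="language-c">// Hypothetical driver fragment, only to illustrate the APIs listed above.
#include &lt;linux/spinlock.h&gt;

struct my_dev {
	spinlock_t lock;
	unsigned long rx_packets; /* also updated from a softirq/tasklet */
};

static void my_reset_stats(struct my_dev *dev)
{
	/* spin_lock_bh() = local_bh_disable() + spin_lock(): the bottom
	 * half on this CPU cannot run and race with us while we hold it. */
	spin_lock_bh(&amp;dev-&gt;lock);
	dev-&gt;rx_packets = 0;
	spin_unlock_bh(&amp;dev-&gt;lock);
}</code></pre>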
<h2>Questions</h2>
<p>1. You cannot sleep in a hard interrupt handler, because there is no context to schedule back into. You cannot sleep in a softirq either: softirqs also run on the return path from a hard interrupt (by then the hardirq count has been balanced back out), and that still counts as interrupt context. If a softirq is instead executed by the ksoftirqd kernel thread, can it sleep there?</p>
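<p>A common way to sidestep the question in code that may be called from either context is to consult the same counters; a minimal sketch (the helper name is made up), assuming the usual kmalloc() semantics:</p>
<pre><code class="language-c">// Pick a non-sleeping allocation when in_interrupt() reports atomic context.
#include &lt;linux/preempt.h&gt;
#include &lt;linux/slab.h&gt;

static void *my_alloc(size_t size)
{
	/* GFP_KERNEL may sleep; GFP_ATOMIC never does. */
	gfp_t flags = in_interrupt() ? GFP_ATOMIC : GFP_KERNEL;

	return kmalloc(size, flags);
}</code></pre>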
<h2>Basics</h2>
<p>The following is excerpted from the web:</p>
<pre><code class="language-c">#define hardirq_count() (preempt_count() &amp; HARDIRQ_MASK)
#define softirq_count() (preempt_count() &amp; SOFTIRQ_MASK)
#define irq_count() (preempt_count() &amp; (HARDIRQ_MASK | SOFTIRQ_MASK | NMI_MASK))
#define in_irq() (hardirq_count()) // are we currently in hardirq context?
#define in_softirq() (softirq_count()) // are we currently in softirq context?
#define in_interrupt() (irq_count()) // are we in any interrupt context (hardirq or softirq, top or bottom half)?
/*
* PREEMPT_MASK: 0x000000ff
* SOFTIRQ_MASK: 0x0000ff00
* HARDIRQ_MASK: 0x03ff0000
* NMI_MASK: 0x04000000
*/
// preempt_count is effectively split into four fields: the preemption count, the softirq count, the hardirq count and the NMI flag.
// Preemption count: bits 0-7
// Softirq count:    bits 8-15
// Hardirq count:    bits 16-25
// NMI flag:         bit 26
</code></pre>
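<p>To make the bit layout concrete, here is a small stand-alone sketch (plain user-space C, masks copied from the comment above) that decodes a raw <code>preempt_count</code> snapshot:</p>
<pre><code class="language-c">#include &lt;stdio.h&gt;

#define PREEMPT_MASK 0x000000ff
#define SOFTIRQ_MASK 0x0000ff00
#define HARDIRQ_MASK 0x03ff0000
#define NMI_MASK     0x04000000

int main(void)
{
	unsigned int pc = 0x00010101; /* example: hardirq=1, softirq=1, preempt=1 */

	printf(&quot;preempt=%u softirq=%u hardirq=%u nmi=%u\n&quot;,
	       pc &amp; PREEMPT_MASK,
	       (pc &amp; SOFTIRQ_MASK) &gt;&gt; 8,
	       (pc &amp; HARDIRQ_MASK) &gt;&gt; 16,
	       (pc &amp; NMI_MASK) &gt;&gt; 26);
	return 0;
}</code></pre>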
<h2>Hard interrupts</h2>
<p>Hard interrupt handlers do not nest (interrupts stay disabled while a handler runs), which keeps the kernel stack from overflowing.</p>
<p>> Reference: <a href="https://zhuanlan.zhihu.com/p/113002536">https://zhuanlan.zhihu.com/p/113002536</a></p>
<h2>Softirqs</h2>
<p>When softirqs run (see the sketch after this list for how work reaches this path):
1. When a hard interrupt completes: checked and executed in irq_exit().
2. In the ksoftirqd kernel thread.</p>
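<p>A minimal, hypothetical example of how work reaches this path (handler and tasklet names are made up): the top half schedules a tasklet, which is built on the TASKLET_SOFTIRQ softirq, and the tasklet then runs either right after irq_exit() or in ksoftirqd:</p>
<pre><code class="language-c">#include &lt;linux/interrupt.h&gt;

static void my_rx_tasklet_fn(unsigned long data)
{
	/* Softirq context: runs after irq_exit() or in ksoftirqd/N.
	 * Must not sleep. */
}

static DECLARE_TASKLET(my_rx_tasklet, my_rx_tasklet_fn, 0);

static irqreturn_t my_rx_interrupt(int irq, void *dev_id)
{
	/* Top half: acknowledge the device, then defer the heavy work. */
	tasklet_schedule(&amp;my_rx_tasklet);
	return IRQ_HANDLED;
}</code></pre>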
<h2>Exceptions</h2>
<p>Reportedly, exceptions cannot be masked.</p>
<p>> Reference: <a href="https://zhuanlan.zhihu.com/p/336775510">https://zhuanlan.zhihu.com/p/336775510</a></p>
<h2>Behavior</h2>
<p>1. In the Linux kernel, what happens if a hardware interrupt fires after local_irq_disable()? Is the interrupt event lost?
No. An interrupt raised while local delivery is disabled stays pending in the interrupt controller; once the kernel re-enables interrupts, the pending interrupt is delivered and handled.</p>
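<p>For completeness, the usual pattern with local_irq_save()/local_irq_restore() (the data being protected here is hypothetical); unlike plain local_irq_disable(), it also works when the caller may already have interrupts off:</p>
<pre><code class="language-c">#include &lt;linux/irqflags.h&gt;

static unsigned long my_event_count;

static void my_count_event(void)
{
	unsigned long flags;

	local_irq_save(flags);    /* stash EFLAGS.IF, then cli on x86 */
	my_event_count++;         /* no hardirq can interrupt us on this CPU */
	local_irq_restore(flags); /* put the saved IF state back */
}</code></pre>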
<h2>Open questions</h2>
<p>Softirqs cannot nest on a single CPU, but the same softirq type can run concurrently on different CPUs.
Nesting is prevented because the <code>in_interrupt()</code> condition is checked before the bottom half is executed:</p>
<pre><code class="language-c">// file: kernel/softirq.c
asmlinkage __visible void do_softirq(void)
{
	__u32 pending;
	unsigned long flags;

	if (in_interrupt()) // bail out if any interrupt count is set (including the softirq count), so bottom halves never nest
		return;

	local_irq_save(flags);

	pending = local_softirq_pending();

	if (pending &amp;&amp; !ksoftirqd_running(pending))
		do_softirq_own_stack();

	local_irq_restore(flags);
}</code></pre>
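<p>For context, the producer side that sets these pending bits looks roughly like this in 4.19 (quoted from kernel/softirq.c from memory and lightly abridged, so treat it as a sketch):</p>
<pre><code class="language-c">// file: kernel/softirq.c (abridged)
void __raise_softirq_irqoff(unsigned int nr)
{
	trace_softirq_raise(nr);
	or_softirq_pending(1UL &lt;&lt; nr); /* set the per-CPU pending bit */
}

inline void raise_softirq_irqoff(unsigned int nr)
{
	__raise_softirq_irqoff(nr);

	/*
	 * If we are not in interrupt context, irq_exit() will not run the
	 * softirq for us on this path, so wake ksoftirqd instead.
	 */
	if (!in_interrupt())
		wakeup_softirqd();
}

void raise_softirq(unsigned int nr)
{
	unsigned long flags;

	local_irq_save(flags);
	raise_softirq_irqoff(nr);
	local_irq_restore(flags);
}</code></pre>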
<h2>Interrupt path analysis</h2>
<p>Taking 4.19 as the example:</p>
<pre><code class="language-c">// file: arch/x86/kernel/irq.c
/*
 * do_IRQ handles all normal device IRQ's (the special
 * SMP cross-CPU interrupts have their own specific
 * handlers).
 */
__visible unsigned int __irq_entry do_IRQ(struct pt_regs *regs)
{
	struct pt_regs *old_regs = set_irq_regs(regs);
	struct irq_desc *desc;

	/* high bit used in ret_from_ code */
	unsigned vector = ~regs-&gt;orig_ax;

	entering_irq(); // essentially preempt_count_add(HARDIRQ_OFFSET)

	/* entering_irq() tells RCU that we're not quiescent. Check it. */
	RCU_LOCKDEP_WARN(!rcu_is_watching(), &quot;IRQ failed to wake up RCU&quot;);

	desc = __this_cpu_read(vector_irq[vector]);

	if (!handle_irq(desc, regs)) {
		ack_APIC_irq();

		if (desc != VECTOR_RETRIGGERED &amp;&amp; desc != VECTOR_SHUTDOWN) {
			pr_emerg_ratelimited(&quot;%s: %d.%d No irq handler for vector\n&quot;,
					     __func__, smp_processor_id(),
					     vector);
		} else {
			__this_cpu_write(vector_irq[vector], VECTOR_UNUSED);
		}
	}

	exiting_irq(); // essentially preempt_count_sub(HARDIRQ_OFFSET), then checks for and runs pending softirqs (still on the hard-interrupt return path)

	set_irq_regs(old_regs);
	return 1;
}

// file: arch/x86/include/asm/apic.h
static inline void entering_irq(void)
{
	irq_enter();
	kvm_set_cpu_l1tf_flush_l1d();
}

// file: kernel/softirq.c
/*
 * Enter an interrupt context.
 */
void irq_enter(void)
{
	rcu_irq_enter();
	if (is_idle_task(current) &amp;&amp; !in_interrupt()) { // the interrupt landed on the idle task and is not nested: let the nohz idle code catch up via tick_irq_enter()
		/*
		 * Prevent raise_softirq from needlessly waking up ksoftirqd
		 * here, as softirq will be serviced on return from interrupt.
		 */
		local_bh_disable();
		tick_irq_enter();
		_local_bh_enable();
	}

	__irq_enter();
}

// file: include/linux/hardirq.h
/*
 * It is safe to do non-atomic ops on -&gt;hardirq_context,
 * because NMI handlers may not preempt and the ops are
 * always balanced, so the interrupted value of -&gt;hardirq_context
 * will always be restored.
 */
#define __irq_enter()					\
	do {						\
		account_irq_enter_time(current);	\
		preempt_count_add(HARDIRQ_OFFSET);	\
		trace_hardirq_enter();			\
	} while (0)</code></pre>
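<p>For reference, the handler that handle_irq() ultimately invokes is whatever the driver registered with request_irq(); a hypothetical registration (irq number, device name and handler are made up) looks like:</p>
<pre><code class="language-c">#include &lt;linux/interrupt.h&gt;

static irqreturn_t my_dev_isr(int irq, void *dev_id)
{
	/* Hardirq context: do_IRQ() has already added HARDIRQ_OFFSET. */
	return IRQ_HANDLED;
}

static int my_setup_irq(int irq, void *dev)
{
	/* dev is handed back to the handler as dev_id. */
	return request_irq(irq, my_dev_isr, IRQF_SHARED, &quot;my_dev&quot;, dev);
}</code></pre>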
<p>The more interesting part is <code>exiting_irq</code>:</p>
<pre><code class="language-c">// file: include/linux/hardirq.h
static inline void exiting_irq(void)
{
	irq_exit();
}

// file: kernel/softirq.c
/*
 * Exit an interrupt context. Process softirqs if needed and possible:
 */
void irq_exit(void)
{
#ifndef __ARCH_IRQ_EXIT_IRQS_DISABLED
	local_irq_disable();
#else
	lockdep_assert_irqs_disabled();
#endif
	account_irq_exit_time(current);
	preempt_count_sub(HARDIRQ_OFFSET); // drop the hardirq count first
	if (!in_interrupt() &amp;&amp; local_softirq_pending()) // then check for and run softirqs; note this is still on the hard-interrupt return path, and if a softirq count is already set this branch is skipped
		invoke_softirq();

	tick_irq_exit();
	rcu_irq_exit();
	trace_hardirq_exit(); /* must be last! */
}

static inline void invoke_softirq(void)
{
	if (ksoftirqd_running(local_softirq_pending()))
		return;

	if (!force_irqthreads) {
#ifdef CONFIG_HAVE_IRQ_EXIT_ON_IRQ_STACK
		/*
		 * We can safely execute softirq on the current stack if
		 * it is the irq stack, because it should be near empty
		 * at this stage.
		 */
		__do_softirq(); // run the softirqs
#else
		/*
		 * Otherwise, irq_exit() is called on the task stack that can
		 * be potentially deep already. So call softirq in its own stack
		 * to prevent from any overrun.
		 */
		do_softirq_own_stack();
#endif
	} else {
		wakeup_softirqd();
	}
}</code></pre>
<p>Now for the softirq (bottom half) itself. This is the core of the bottom half; note that it does not check the <code>in_interrupt</code> condition again:</p>
<pre><code class="language-c">// file: kernel/softirq.c
asmlinkage __visible void __softirq_entry __do_softirq(void)
{
	unsigned long end = jiffies + MAX_SOFTIRQ_TIME;
	unsigned long old_flags = current-&gt;flags;
	int max_restart = MAX_SOFTIRQ_RESTART;
	struct softirq_action *h;
	bool in_hardirq;
	__u32 pending;
	int softirq_bit;

	/*
	 * Mask out PF_MEMALLOC s current task context is borrowed for the
	 * softirq. A softirq handled such as network RX might set PF_MEMALLOC
	 * again if the socket is related to swap
	 */
	current-&gt;flags &amp;= ~PF_MEMALLOC;

	pending = local_softirq_pending();
	account_irq_enter_time(current);

	__local_bh_disable_ip(_RET_IP_, SOFTIRQ_OFFSET); // increment the softirq count so softirqs cannot nest; on exit there is also a check (with a warning) for imbalance
	in_hardirq = lockdep_softirq_start();

restart:
	/* Reset the pending bitmask before enabling irqs */
	set_softirq_pending(0); // the pending mask is cleared, so every softirq collected in `pending` must be processed in this pass

	local_irq_enable(); // enable interrupts

	h = softirq_vec;

	while ((softirq_bit = ffs(pending))) {
		unsigned int vec_nr;
		int prev_count;

		h += softirq_bit - 1;

		vec_nr = h - softirq_vec;
		prev_count = preempt_count();

		kstat_incr_softirqs_this_cpu(vec_nr);

		trace_softirq_entry(vec_nr);
		h-&gt;action(h);
		trace_softirq_exit(vec_nr);
		if (unlikely(prev_count != preempt_count())) { // after the handler runs, the preempt/softirq/hardirq counts must be unchanged (balanced)
			pr_err(&quot;huh, entered softirq %u %s %p with preempt_count %08x, exited with %08x?\n&quot;,
			       vec_nr, softirq_to_name[vec_nr], h-&gt;action,
			       prev_count, preempt_count());
			preempt_count_set(prev_count); // restore the previous count
		}
		h++;
		pending &gt;&gt;= softirq_bit;
	}

	rcu_bh_qs();
	local_irq_disable(); // disable interrupts again

	pending = local_softirq_pending();
	if (pending) {
		if (time_before(jiffies, end) &amp;&amp; !need_resched() &amp;&amp;
		    --max_restart) // from the second pass on, also respect the time budget, need_resched() and the restart limit
			goto restart;

		wakeup_softirqd(); // if softirqs are still pending, wake ksoftirqd; this does not run it immediately, it only makes the thread runnable
	}

	lockdep_softirq_end(in_hardirq);

	account_irq_exit_time(current);
	__local_bh_enable(SOFTIRQ_OFFSET); // decrement the softirq count
	WARN_ON_ONCE(in_interrupt()); // there is no hardirq or NMI count here, and since softirqs cannot nest the softirq count has just been dropped, so this should never fire
	current_restore_flags(old_flags, PF_MEMALLOC);
}</code></pre>
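<p>The <code>h-&gt;action</code> callbacks iterated above are installed with open_softirq(); the networking stack, for example, registers its RX/TX handlers roughly like this (abridged from memory, so treat it as a sketch):</p>
<pre><code class="language-c">// file: net/core/dev.c (abridged), inside net_dev_init()
open_softirq(NET_TX_SOFTIRQ, net_tx_action);
open_softirq(NET_RX_SOFTIRQ, net_rx_action);

// file: kernel/softirq.c - open_softirq() simply fills in softirq_vec[nr].action
void open_softirq(int nr, void (*action)(struct softirq_action *))
{
	softirq_vec[nr].action = action;
}</code></pre>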