CPU#0 stuck for 61s work_struct
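Date: 2010-08-10
Source: Internet
I ran into a problem while bringing up an SD card as the rootfs for Android. The kernel is 2.6.34 on ARM, and the card driver was ported from 2.6.26, where it worked correctly.
The card queue thread ran for more than 61 s and triggered the error below. What I found online says this error means there is an endless loop somewhere, and indeed card_queue_thread went 61 s without sleeping.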
[ 67.650000] BUG: soft lockup - CPU#0 stuck for 61s! [kcardd:117]
[ 67.650000] Modules linked in:
[ 67.650000]
[ 67.650000] Pid: 117, comm: kcardd
[ 67.650000] CPU: 0 Not tainted (2.6.34 #36)
[ 67.650000] PC is at card_queue_thread+0xd4/0x150
[ 67.650000] LR is at 0xa881
[ 67.650000] pc : [<c0254650>] lr : [<0000a881>] psr: 20000013
[ 67.650000] sp : cfe31fb8 ip : cfd6c6c4 fp : cfe31ff4
[ 67.650000] r10: cfd75a40 r9 : 00000000 r8 : 00000001
[ 67.650000] r7 : cfe31fbc r6 : c0517b9c r5 : 00000000 r4 : cfd5bf44
[ 67.650000] r3 : 00000000 r2 : 00000000 r1 : 00000003 r0 : cfd1cf00
[ 67.650000] Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment kernel
[ 67.650000] Control: 10c53c7d Table: 8fe6c059 DAC: 00000017
...
[ 67.650000] [<c0027608>] (show_regs+0x0/0x50) from [<c0070058>] (softlockup_tick+0xf8/0x150)
[ 67.650000] r4:cfe31f70 r3:c045e618
[ 67.650000] [<c006ff60>] (softlockup_tick+0x0/0x150) from [<c004c36c>] (run_local_timers+0x1c/0x20)
[ 67.650000] [<c004c350>] (run_local_timers+0x0/0x20) from [<c004c3a0>] (update_process_times+0x30/0x54)
[ 67.650000] [<c004c370>] (update_process_times+0x0/0x54) from [<c0063a58>] (T.214+0x40/0xc4)
[ 67.650000] r5:00000000 r4:c0480090
[ 67.650000] [<c0063a18>] (T.214+0x0/0xc4) from [<c0063af4>] (tick_handle_periodic+0x18/0x10
[ 67.650000] r9:00000000 r8:00000001 r7:0000000a r6:00000000 r4:c043d5c0
[ 67.650000] r3:c0063adc
[ 67.650000] [<c0063adc>] (tick_handle_periodic+0x0/0x10 from [<c0030050>] (my_timer_interrupt+0x54/0x5c)
[ 67.650000] [<c002fffc>] (my_timer_interrupt+0x0/0x5c) from [<c0070cb4>] (handle_IRQ_event+0x58/0x120)
[ 67.650000] [<c0070c5c>] (handle_IRQ_event+0x0/0x120) from [<c0072a14>] (handle_level_irq+0x7c/0x10c)
[ 67.650000] r7:cfe31fbc r6:00000000 r5:0000000a r4:c0440b84
[ 67.650000] [<c0072998>] (handle_level_irq+0x0/0x10c) from [<c0025048>] (asm_do_IRQ+0x48/0x94)
[ 67.650000] r5:0000000a r4:c044f5c8
[ 67.650000] [<c0025000>] (asm_do_IRQ+0x0/0x94) from [<c0025b70>] (__irq_svc+0x30/0xc0)
[ 67.650000] Exception stack(0xcfe31f70 to 0xcfe31fb
[ 67.650000] 1f60: cfd1cf00 00000003 00000000 00000000
[ 67.650000] 1f80: cfd5bf44 00000000 c0517b9c cfe31fbc 00000001 00000000 cfd75a40 cfe31ff4
[ 67.650000] 1fa0: cfd6c6c4 cfe31fb8 0000a881 c0254650 20000013 ffffffff
[ 67.650000] r6:00000001 r5:f1109a40 r4:ffffffff r3:20000013
[ 67.650000] [<c025457c>] (card_queue_thread+0x0/0x150) from [<c00442b8>] (do_exit+0x0/0x69c)
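As a reading aid for the register dump, the psr value 20000013 can be decoded by hand. This is a minimal user-space sketch, assuming the standard ARM CPSR layout (N/Z/C/V in bits 31..28, I/F/T in bits 7/6/5, mode in bits 4..0). The macro and helper names are made up for illustration and are not kernel APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed ARM CPSR bit layout; names are illustrative only. */
#define PSR_N_BIT     (1u << 31)
#define PSR_Z_BIT     (1u << 30)
#define PSR_C_BIT     (1u << 29)
#define PSR_V_BIT     (1u << 28)
#define PSR_I_BIT     (1u << 7)   /* set means IRQs are masked */
#define PSR_F_BIT     (1u << 6)   /* set means FIQs are masked */
#define PSR_T_BIT     (1u << 5)   /* set means Thumb state */
#define PSR_MODE_MASK 0x1fu
#define PSR_MODE_SVC  0x13u

/* IRQs are enabled when the I bit is clear. */
static inline int psr_irqs_enabled(uint32_t psr)
{
    return (psr & PSR_I_BIT) == 0;
}

/* FIQs are enabled when the F bit is clear. */
static inline int psr_fiqs_enabled(uint32_t psr)
{
    return (psr & PSR_F_BIT) == 0;
}

/* Processor mode is the low five bits. */
static inline uint32_t psr_mode(uint32_t psr)
{
    return psr & PSR_MODE_MASK;
}

/* Build a flag string in the same style as the dump, e.g. "nzCv":
 * upper case means the flag is set, lower case means it is clear. */
static inline void psr_flag_string(uint32_t psr, char out[5])
{
    assert(out != NULL);
    out[0] = (psr & PSR_N_BIT) ? 'N' : 'n';
    out[1] = (psr & PSR_Z_BIT) ? 'Z' : 'z';
    out[2] = (psr & PSR_C_BIT) ? 'C' : 'c';
    out[3] = (psr & PSR_V_BIT) ? 'V' : 'v';
    out[4] = '\0';
}
```

For 0x20000013 these helpers give the flag string "nzCv", IRQs and FIQs enabled, and mode 0x13 (SVC), which matches the Flags line in the dump. With IRQs on, the timer tick still fires, and the backtrace shows softlockup_tick being reached from that timer interrupt.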
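The error above shows up occasionally at boot. Once it occurs, queue_flags remains QUEUE_FLAG_PLUGGED from then on.
Tracing into blk-core.c, I found that blk_unplug_timeout was executed, but blk_unplug_work, the function attached to the work_struct, never ran.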
void blk_unplug_timeout(unsigned long data)
{
    struct request_queue *q = (struct request_queue *)data;

    trace_block_unplug_timer(q);
-   kblockd_schedule_work(q, &q->unplug_work);
+   blk_unplug_work(&q->unplug_work);
}
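If I replace kblockd_schedule_work(q, &q->unplug_work); with a direct call to blk_unplug_work, the function behind unplug_work, everything works.
Can anyone tell me where the problem might be?
One more thing I don't understand: q->unplug_work.data should originally be 0, but by the second call to blk_unplug_timeout it had become 0xcfc51f80. I thought data only holds the four values below? Maybe my understanding is wrong.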
#define WORK_STRUCT_PENDING 0 /* T if work item pending execution */
#define WORK_STRUCT_STATIC 1 /* static initializer (debugobjects) */
#define WORK_STRUCT_FLAG_MASK (3UL)
#define WORK_STRUCT_WQ_DATA_MASK (~WORK_STRUCT_FLAG_MASK)
If my understanding is correct, then what overwrote it?
Thanks for any guidance.
Author: fei1700 Posted: 2010-08-10
Let me first correct my own misunderstanding.
The data field of work_struct is a combination of a struct cpu_workqueue_struct * and the WORK_STRUCT_FLAG_MASK bits.
/*
 * Set the workqueue on which a work item is to be run
 * - Must *only* be called if the pending flag is set
 */
static inline void set_wq_data(struct work_struct *work,
                               struct cpu_workqueue_struct *cwq)
{
    unsigned long new;

    BUG_ON(!work_pending(work));

    new = (unsigned long) cwq | (1UL << WORK_STRUCT_PENDING);
    new |= WORK_STRUCT_FLAG_MASK & *work_data_bits(work);
    atomic_long_set(&work->data, new);
}
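To make the packing concrete, here is a small user-space sketch that mirrors the bit arithmetic of set_wq_data and splits a data word back into its two parts. The flag macros are copied from the definitions quoted in the question. The helper names (pack_work_data, work_data_cwq, work_data_pending) are hypothetical and exist only for this illustration, and a plain unsigned long stands in for the atomic work->data field.

```c
#include <assert.h>

/* Flag layout as quoted in the question (2.6.34). */
#define WORK_STRUCT_PENDING      0   /* bit number: work item pending execution */
#define WORK_STRUCT_STATIC       1   /* bit number: static initializer (debugobjects) */
#define WORK_STRUCT_FLAG_MASK    (3UL)
#define WORK_STRUCT_WQ_DATA_MASK (~WORK_STRUCT_FLAG_MASK)

/* Hypothetical helper: same arithmetic as set_wq_data, on plain integers.
 * cwq_addr stands in for the struct cpu_workqueue_struct pointer. */
static inline unsigned long pack_work_data(unsigned long cwq_addr,
                                           unsigned long old_data)
{
    unsigned long new_data;

    /* The low two bits must be free, so the address has to be 4-byte aligned. */
    assert((cwq_addr & WORK_STRUCT_FLAG_MASK) == 0);

    new_data = cwq_addr | (1UL << WORK_STRUCT_PENDING);
    new_data |= WORK_STRUCT_FLAG_MASK & old_data;  /* keep existing flag bits */
    return new_data;
}

/* Hypothetical helper: the pointer part of a data word. */
static inline unsigned long work_data_cwq(unsigned long data)
{
    return data & WORK_STRUCT_WQ_DATA_MASK;
}

/* Hypothetical helper: 1 if the pending bit is set, else 0. */
static inline int work_data_pending(unsigned long data)
{
    return (int)((data >> WORK_STRUCT_PENDING) & 1UL);
}
```

Applied to the value from the question, work_data_cwq(0xcfc51f80) returns 0xcfc51f80 and work_data_pending(0xcfc51f80) returns 0, because the low two bits are clear. So the word decodes as a cwq address with no flag bits set. One possible reading, which is an assumption and not confirmed in this thread, is that the item had been queued once and its pending bit was cleared afterwards while the pointer part stayed in place.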
Author: fei1700 Posted: 2010-08-10