从视线估计到分心检测:算法优化实战指南

引言:视线估计只是第一步

视线估计输出的是3D角度向量,但分心检测需要回答:

  • 驾驶员在看哪里?(视线区域分类)
  • 是否偏离前方道路?(分心判断)
  • 持续了多久?(时间窗口)
  • 置信度如何?(可靠性评估)

本文详解从视线估计到分心检测的完整流程。


一、视线区域分类

1.1 视线区域定义

Euro NCAP 2026要求的视线区域

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
                ┌─────────────────┐
│ Center Mirror │ 中央后视镜
└────────┬────────┘

┌──────────┐ │ ┌──────────┐
Left │ │ │ Right
│ Mirror │ │ │ Mirror │
└──────────┘ │ └──────────┘

┌────────────────────────┼────────────────────────┐
│ │ │
Left Screen │ Forward │
│ │ (前方道路) │
│ │ │
├────────────────────────┼────────────────────────┤
│ │ │
│ Center Console │ Dashboard │
│ (中控屏) │ (仪表盘) │
│ │ │
└────────────────────────┴────────────────────────┘

7个核心区域

区域 说明 角度范围(yaw/pitch)
Forward 前方道路 yaw: ±15°, pitch: ±10°
Left Mirror 左侧后视镜 yaw: -45° ~ -30°, pitch: -5° ~ +5°
Right Mirror 右侧后视镜 yaw: +30° ~ +45°, pitch: -5° ~ +5°
Center Mirror 中央后视镜 yaw: ±5°, pitch: +20° ~ +35°
Dashboard 仪表盘 yaw: ±20°, pitch: -30° ~ -15°
Center Console 中控屏 yaw: ±30°, pitch: -35° ~ -20°
Other 其他区域 上述之外

1.2 视线区域分类方法

方法一:基于角度阈值

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
def classify_gaze_zone(gaze_vector):
"""
根据yaw/pitch角度分类视线区域
gaze_vector: [pitch, yaw, roll] (度)
返回: zone_name
"""
pitch, yaw, _ = gaze_vector

# 前方道路
if abs(yaw) < 15 and abs(pitch) < 10:
return "Forward"

# 左侧后视镜
if -45 < yaw < -30 and -5 < pitch < 5:
return "Left Mirror"

# 右侧后视镜
if 30 < yaw < 45 and -5 < pitch < 5:
return "Right Mirror"

# 中央后视镜
if abs(yaw) < 5 and 20 < pitch < 35:
return "Center Mirror"

# 仪表盘
if abs(yaw) < 20 and -30 < pitch < -15:
return "Dashboard"

# 中控屏
if abs(yaw) < 30 and -35 < pitch < -20:
return "Center Console"

return "Other"

方法二:基于神经网络分类

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
import torch
import torch.nn as nn

class GazeZoneClassifier(nn.Module):
def __init__(self, gaze_dim=3, num_zones=7):
super().__init__()
self.classifier = nn.Sequential(
nn.Linear(gaze_dim, 64),
nn.ReLU(),
nn.Dropout(0.3),
nn.Linear(64, 32),
nn.ReLU(),
nn.Linear(32, num_zones)
)

def forward(self, gaze_vector):
"""
gaze_vector: [B, 3] - pitch, yaw, roll
返回: [B, 7] - 7类区域的概率
"""
return self.classifier(gaze_vector)

方法三:端到端多任务学习

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
class MultiTaskGazeNet(nn.Module):
"""
同时预测视线向量和区域类别
"""
def __init__(self):
super().__init__()
self.backbone = GazeCapsNet() # 预训练视线估计模型

# 共享特征
self.shared_features = self.backbone.features

# 视线回归头
self.gaze_head = nn.Linear(512, 3)

# 区域分类头
self.zone_head = nn.Linear(512, 7)

def forward(self, x):
features = self.shared_features(x)

gaze = self.gaze_head(features)
zone = self.zone_head(features)

return gaze, zone

1.3 方法对比

方法 优点 缺点 适用场景
角度阈值 简单、可解释 边界处易误判 快速原型
神经网络 学习边界 需标注数据 生产环境
多任务 共享特征、精度高 训练复杂 高精度需求

二、时序滤波

2.1 为什么需要时序滤波

问题

  • 单帧视线估计有噪声(±2-3°)
  • 瞬间眨眼、转头会产生误判
  • Euro NCAP要求2秒以上才报警

解决方案:时序滤波平滑噪声

2.2 滑动窗口平均

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
from collections import deque
import numpy as np

class SlidingWindowFilter:
def __init__(self, window_size=10):
self.window_size = window_size
self.gaze_history = deque(maxlen=window_size)

def update(self, gaze_vector):
"""
添加新的视线向量并返回平滑结果
"""
self.gaze_history.append(gaze_vector)

# 加权平均(最近帧权重更高)
weights = np.exp(np.linspace(0, 1, len(self.gaze_history)))
weights = weights / weights.sum()

smoothed = np.average(self.gaze_history, axis=0, weights=weights)

return smoothed

2.3 卡尔曼滤波

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
from filterpy.kalman import KalmanFilter

class GazeKalmanFilter:
def __init__(self):
# 状态: [pitch, yaw, roll, pitch_vel, yaw_vel, roll_vel]
self.kf = KalmanFilter(dim_x=6, dim_z=3)

# 状态转移矩阵(匀速模型)
self.kf.F = np.array([
[1, 0, 0, 1, 0, 0],
[0, 1, 0, 0, 1, 0],
[0, 0, 1, 0, 0, 1],
[0, 0, 0, 1, 0, 0],
[0, 0, 0, 0, 1, 0],
[0, 0, 0, 0, 0, 1]
])

# 观测矩阵
self.kf.H = np.array([
[1, 0, 0, 0, 0, 0],
[0, 1, 0, 0, 0, 0],
[0, 0, 1, 0, 0, 0]
])

# 观测噪声
self.kf.R = np.eye(3) * 3.0 # 3°观测噪声

# 过程噪声
self.kf.Q = np.eye(6) * 0.1

def update(self, gaze_vector):
"""
更新卡尔曼滤波器
"""
self.kf.predict()
self.kf.update(gaze_vector)

return self.kf.x[:3] # 返回平滑后的角度

2.4 一致性检验

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
class ConsistencyChecker:
"""
检测视线跳变(可能是检测错误)
"""
def __init__(self, max_jump=15, min_consistent=3):
self.max_jump = max_jump # 最大跳变角度
self.min_consistent = min_consistent # 最小一致性帧数
self.history = []

def check(self, gaze_vector):
"""
检查视线向量是否合理
返回: (is_valid, confidence)
"""
if len(self.history) == 0:
self.history.append(gaze_vector)
return True, 1.0

# 计算跳变
prev = self.history[-1]
jump = np.linalg.norm(np.array(gaze_vector) - np.array(prev))

if jump > self.max_jump:
# 跳变过大,可能是检测错误
return False, 0.5

# 更新历史
self.history.append(gaze_vector)
if len(self.history) > self.min_consistent:
self.history.pop(0)

# 计算一致性
variance = np.var(self.history, axis=0).mean()
confidence = 1.0 / (1.0 + variance)

return True, confidence

三、分心判断逻辑

3.1 Euro NCAP分心定义

分心类型

类型 定义 报警条件
视觉分心 视线离开前方道路 累计>2秒
手动分心 手离开方向盘 -
认知分心 视线在前方但注意力分散 需多模态判断

3.2 分心判断实现

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
import time
from enum import Enum

class GazeState(Enum):
FORWARD = "forward" # 注视前方
DEVIATED = "deviated" # 视线偏离
DISTRACTED = "distracted" # 分心(偏离>2秒)

class DistractionDetector:
def __init__(self,
forward_threshold=15, # 前方区域阈值(度)
time_threshold=2.0, # 分心时间阈值(秒)
grace_period=0.5): # 宽限期(秒)

self.forward_threshold = forward_threshold
self.time_threshold = time_threshold
self.grace_period = grace_period

self.state = GazeState.FORWARD
self.deviation_start = None
self.total_deviation_time = 0

def update(self, gaze_vector, confidence=1.0):
"""
更新分心状态
gaze_vector: [pitch, yaw, roll] (度)
confidence: 视线估计置信度
返回: (state, deviation_time)
"""
yaw, pitch = gaze_vector[1], gaze_vector[0]
current_time = time.time()

# 判断是否在前方区域
is_forward = (abs(yaw) < self.forward_threshold and
abs(pitch) < self.forward_threshold * 0.67)

if is_forward:
# 视线回到前方
if self.state != GazeState.FORWARD:
self.state = GazeState.FORWARD
self.deviation_start = None
self.total_deviation_time = 0
else:
# 视线偏离
if self.state == GazeState.FORWARD:
# 开始偏离
self.state = GazeState.DEVIATED
self.deviation_start = current_time
else:
# 继续偏离
deviation_time = current_time - self.deviation_start
self.total_deviation_time = deviation_time * confidence

# 超过阈值,判定分心
if self.total_deviation_time > self.time_threshold:
self.state = GazeState.DISTRACTED

return self.state, self.total_deviation_time

3.3 报警逻辑

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
class WarningSystem:
def __init__(self):
self.detector = DistractionDetector()
self.warning_level = 0 # 0: 无, 1: 轻度, 2: 重度
self.last_warning_time = 0

def update(self, gaze_vector, confidence):
state, deviation_time = self.detector.update(gaze_vector, confidence)

if state == GazeState.DISTRACTED:
if deviation_time > 4.0:
# 重度分心(>4秒)
self.warning_level = 2
self.trigger_warning("严重分心!请立即注视前方!")
elif deviation_time > 2.0:
# 轻度分心(2-4秒)
self.warning_level = 1
self.trigger_warning("请注意前方道路")
else:
self.warning_level = 0

return self.warning_level

def trigger_warning(self, message):
"""
触发警告(声音/震动/视觉提示)
"""
current_time = time.time()

# 避免频繁报警(最小间隔1秒)
if current_time - self.last_warning_time > 1.0:
print(f"[WARNING] {message}")
self.last_warning_time = current_time

# 这里可以集成实际的报警接口
# - 声音报警
# - 座椅震动
# - HUD显示

四、置信度评估

4.1 影响置信度的因素

因素 影响 处理方式
眼部遮挡 瞳孔检测失败 降低置信度
极端头位 超出训练分布 降低置信度
光照异常 过曝/欠曝 降低置信度
墨镜 IR透光率低 置信度设为0

4.2 置信度计算

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
def compute_confidence(eye_features, head_pose, lighting):
"""
计算视线估计置信度
"""
confidence = 1.0

# 1. 眼部遮挡检测
occlusion_ratio = detect_eye_occlusion(eye_features)
confidence *= (1.0 - occlusion_ratio)

# 2. 头位检查
yaw, pitch, roll = head_pose
if abs(yaw) > 45 or abs(pitch) > 30:
confidence *= 0.5 # 极端头位降权

# 3. 光照检查
if lighting < 10 or lighting > 10000:
confidence *= 0.7 # 异常光照降权

# 4. 墨镜检测
if detect_sunglasses(eye_features):
confidence = 0.0 # 墨镜场景不可信

return confidence

def detect_eye_occlusion(eye_features):
"""
检测眼部遮挡程度
返回: 遮挡比例 [0, 1]
"""
# 使用眼部关键点检测遮挡
# 简化实现:检测眼睛开合度
eye_aspect_ratio = compute_ear(eye_features)

if eye_aspect_ratio < 0.2:
return 1.0 # 完全闭合
elif eye_aspect_ratio < 0.3:
return 0.5 # 部分遮挡
else:
return 0.0 # 无遮挡

def detect_sunglasses(eye_features):
"""
检测是否佩戴墨镜
"""
# IR图像中瞳孔亮度异常低
iris_brightness = compute_iris_brightness(eye_features)

if iris_brightness < 20: # 阈值需根据设备调整
return True
return False

4.3 低置信度处理策略

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
class LowConfidenceHandler:
def __init__(self):
self.low_confidence_count = 0
self.max_low_confidence_frames = 30 # 1秒(30fps)

def handle(self, confidence):
"""
处理低置信度情况
"""
if confidence < 0.3:
self.low_confidence_count += 1

if self.low_confidence_count > self.max_low_confidence_frames:
# 持续低置信度,提示驾驶员
return "DEGRADED_MODE", "请摘下墨镜或调整坐姿"
else:
self.low_confidence_count = 0

return "NORMAL", None

五、Euro NCAP合规设计

5.1 测试场景覆盖

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
class EuroNCAPTestSuite:
"""
Euro NCAP DMS测试用例
"""
TEST_CASES = [
# (名称, 头位, 视线方向, 期望结果)
("Forward_Still", "forward", "forward", "no_warning"),
("Forward_Moving", "forward", "forward", "no_warning"),
("LeftMirror_Check", "forward", "left_mirror", "no_warning"),
("RightMirror_Check", "forward", "right_mirror", "no_warning"),
("CenterMirror_Check", "forward", "center_mirror", "no_warning"),
("Dashboard_Glance", "forward", "dashboard", "no_warning"),
("SideWindow_Long", "forward", "side_window", "warning_2s"),
("Passenger_Long", "forward", "passenger", "warning_2s"),
("Phone_Use", "forward", "lap", "warning_immediate"),
("Head_Turn_Left", "left", "left", "no_warning"),
("Head_Turn_Right", "right", "right", "no_warning"),
]

def run_test(self, test_name, gaze_sequence):
"""
运行测试用例
gaze_sequence: 视线序列 [(gaze_vector, time), ...]
"""
test_case = self.TEST_CASES[test_name]
detector = DistractionDetector()

for gaze_vector, t in gaze_sequence:
state, deviation_time = detector.update(gaze_vector)

# 检查是否符合期望
if test_case[3] == "no_warning":
assert state != GazeState.DISTRACTED
elif test_case[3] == "warning_2s":
if deviation_time > 2.0:
assert state == GazeState.DISTRACTED

5.2 报警延迟要求

场景 最大延迟
视觉分心 2秒
疲劳(闭眼) 1秒
使用手机 即时
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
def check_response_time(gaze_sequence, expected_warning_time):
"""
检查报警延迟是否符合要求
"""
detector = DistractionDetector()
actual_warning_time = None

for i, (gaze_vector, t) in enumerate(gaze_sequence):
state, _ = detector.update(gaze_vector)

if state == GazeState.DISTRACTED:
actual_warning_time = t
break

# 允许±0.5秒误差
if actual_warning_time is not None:
assert abs(actual_warning_time - expected_warning_time) < 0.5

六、完整流水线

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
class DistractionPipeline:
"""
完整的分心检测流水线
"""
def __init__(self,
gaze_model_path,
zone_classifier_path=None):

# 加载模型
self.gaze_estimator = GazeCapsNet.from_pretrained(gaze_model_path)

if zone_classifier_path:
self.zone_classifier = GazeZoneClassifier.from_pretrained(zone_classifier_path)
else:
self.zone_classifier = None

# 初始化组件
self.face_detector = SCRFD() # 人脸检测
self.temporal_filter = GazeKalmanFilter()
self.consistency_checker = ConsistencyChecker()
self.distraction_detector = DistractionDetector()
self.warning_system = WarningSystem()

def process_frame(self, frame):
"""
处理单帧图像
"""
# 1. 人脸检测
faces = self.face_detector.detect(frame)
if len(faces) == 0:
return None, 0.0, "NO_FACE"

# 2. 视线估计
face_crop = self.crop_face(frame, faces[0])
gaze_raw = self.gaze_estimator.predict(face_crop)

# 3. 时序滤波
gaze_smoothed = self.temporal_filter.update(gaze_raw)

# 4. 一致性检查
is_valid, confidence = self.consistency_checker.check(gaze_smoothed)
if not is_valid:
confidence *= 0.5

# 5. 视线区域分类
if self.zone_classifier:
zone = self.zone_classifier.predict(gaze_smoothed)
else:
zone = classify_gaze_zone(gaze_smoothed)

# 6. 分心判断
warning_level = self.warning_system.update(gaze_smoothed, confidence)

return gaze_smoothed, confidence, zone, warning_level

七、总结

7.1 关键优化点

环节 优化方法 效果
视线区域分类 多任务学习 边界准确率+15%
时序滤波 卡尔曼滤波 噪声降低50%
置信度评估 多因素融合 误报率降低30%
分心判断 时间窗口+宽限期 符合Euro NCAP

7.2 实施建议

  1. 先标定后部署:根据车型调整视线区域角度范围
  2. 数据驱动阈值:根据实际数据调整分心时间阈值
  3. A/B测试验证:验证不同策略的用户接受度
  4. 持续迭代优化:收集edge case数据,持续改进模型

参考文献

  1. Euro NCAP. “Driver Monitoring Test Procedure.” Technical Bulletin SD 202, 2025.
  2. Victor, T., et al. “Sensitivity of eye-movement measures to in-vehicle task difficulty.” Transportation Research Part F, 2005.
  3. Zhang, X., et al. “ETH-XGaze: A Large Scale Dataset.” ECCV, 2020.

本文是IMS分心检测系列文章之一,上一篇:GazeCapsNet详解


从视线估计到分心检测:算法优化实战指南
https://dapalm.com/2026/03/13/2026-03-13-从视线估计到分心检测-算法优化实战指南/
作者
Mars
发布于
2026年3月13日
许可协议