Selection Bias: The Fatal Flaw of Correlational Evidence
People who actively enroll in health management programs tend to have stronger health awareness, higher education levels, and more regular lifestyles. In other words, participants are often already healthier before they join. A raw comparison of participants and non-participants therefore mixes the program's effect with this pre-existing difference.
How PSM (Propensity Score Matching) Works in Three Steps
Calculate Each Patient's Propensity Score
For every patient in both the treatment and control groups, fit a logistic regression on covariates such as age, sex, BMI, baseline risk score, and smoking status to estimate the probability that a patient with those characteristics receives the intervention. That probability is the propensity score.
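As a rough illustration of this step, the sketch below fits a logistic regression by plain gradient descent using only the Python standard library. The data are synthetic, the two standardized covariates (`age`, `bmi`) stand in for the full covariate list, and the function names `fit_logistic` and `propensity_scores` are hypothetical. A production pipeline would more likely rely on an established statistics library.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, t, lr=0.5, epochs=300):
    """Fit logistic regression by batch gradient descent.
    Returns weights as [intercept, coef_1, ..., coef_p]."""
    n, p = len(X), len(X[0])
    w = [0.0] * (p + 1)
    for _ in range(epochs):
        grad = [0.0] * (p + 1)
        for xi, ti in zip(X, t):
            pred = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
            err = pred - ti
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w

def propensity_scores(X, w):
    """Predicted probability of receiving the intervention for each row."""
    return [sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))) for xi in X]

# Synthetic data (assumption): older patients are more likely to enroll,
# patients with higher BMI less likely. Covariates are standardized.
rng = random.Random(0)
X, t = [], []
for _ in range(500):
    age, bmi = rng.gauss(0, 1), rng.gauss(0, 1)
    p_enroll = sigmoid(0.8 * age - 0.5 * bmi)
    X.append([age, bmi])
    t.append(1 if rng.random() < p_enroll else 0)

w = fit_logistic(X, t)
scores = propensity_scores(X, w)
```

Each entry of `scores` is a value strictly between 0 and 1, and it is the only quantity the matching step needs from the model.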
Match by Score to Remove Baseline Differences
For each patient in the treatment group, find the control patient with the closest propensity score to form a matched pair (nearest-neighbor matching). Set a caliper, typically 0.2 times the standard deviation of the propensity score, so that poorly fitting pairs are rejected. After matching, check match quality: the absolute standardized mean difference (SMD) should be below 0.1 for every covariate.
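The matching and balance check can be sketched as follows. This is a minimal illustration, not a reference implementation: it assumes greedy 1:1 matching without replacement, processed in the order treated patients appear, and it computes the SMD with a pooled standard deviation. The text does not specify any of these choices, and the names `match_nearest` and `smd` are hypothetical.

```python
import math
import statistics

def match_nearest(scores, treated, caliper_mult=0.2):
    """Greedy 1:1 nearest-neighbor matching without replacement.
    Returns a list of (treated_index, control_index) pairs."""
    caliper = caliper_mult * statistics.stdev(scores)
    available = [i for i, g in enumerate(treated) if g == 0]
    pairs = []
    for i, g in enumerate(treated):
        if g != 1 or not available:
            continue
        j = min(available, key=lambda c: abs(scores[c] - scores[i]))
        if abs(scores[j] - scores[i]) <= caliper:
            pairs.append((i, j))
            available.remove(j)  # each control is used at most once
    return pairs

def smd(values_t, values_c):
    """Standardized mean difference using the pooled standard deviation."""
    pooled = math.sqrt(
        (statistics.variance(values_t) + statistics.variance(values_c)) / 2
    )
    if pooled == 0:
        return 0.0
    return (statistics.mean(values_t) - statistics.mean(values_c)) / pooled

# Illustrative propensity scores (made up): three treated, four controls.
scores = [0.80, 0.62, 0.95, 0.78, 0.60, 0.30, 0.10]
treated = [1, 1, 1, 0, 0, 0, 0]
pairs = match_nearest(scores, treated)
match_rate = len(pairs) / sum(treated)
```

In this toy data the treated patient with score 0.95 has no control within the caliper and is left unmatched. After matching, `smd` would be applied to each covariate over the matched pairs and its absolute value compared with the 0.1 threshold.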
Estimate the Causal Effect (ATT)
On the successfully matched sample, compare health outcomes between the treatment group and its matched controls to obtain the Average Treatment Effect on the Treated (ATT). Use bootstrap resampling to compute confidence intervals.
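A minimal sketch of this estimate is shown below. It assumes the ATT is the mean outcome difference within matched pairs and that the bootstrap resamples whole pairs with a percentile interval. The text does not say which bootstrap scheme is used, so treat this as one simple option. The outcome values are invented for illustration, and `att` and `bootstrap_att` are hypothetical names.

```python
import random
import statistics

def att(pair_outcomes):
    """Mean of (treated outcome - matched control outcome) over pairs."""
    return statistics.mean([y_t - y_c for y_t, y_c in pair_outcomes])

def bootstrap_att(pair_outcomes, n_boot=2000, seed=0):
    """Percentile bootstrap over matched pairs.
    Returns (point estimate, (ci_low, ci_high), bootstrap SE)."""
    rng = random.Random(seed)
    n = len(pair_outcomes)
    estimates = []
    for _ in range(n_boot):
        sample = [pair_outcomes[rng.randrange(n)] for _ in range(n)]
        estimates.append(att(sample))
    estimates.sort()
    ci_low = estimates[int(0.025 * n_boot)]
    ci_high = estimates[int(0.975 * n_boot) - 1]
    return att(pair_outcomes), (ci_low, ci_high), statistics.stdev(estimates)

# Invented example: change in an outcome for (treated, matched control) pairs.
pair_outcomes = [(-8, -2), (-5, -1), (-10, -3), (-4, -4), (-7, 0), (-6, -2)]
estimate, ci, se = bootstrap_att(pair_outcomes)
```

Resampling pairs rather than individual patients keeps each treated patient together with its matched control, which preserves the matched structure in every bootstrap replicate.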
Reading PSM Output Metrics
| Metric | Meaning | Criterion |
|---|---|---|
| ATT | Average Treatment Effect on the Treated: the average change in the health outcome produced by the intervention, compared with no intervention, among those who received it. | Expected direction and a CI that excludes zero = significant effect |
| 95% CI | 95% confidence interval of the ATT; reflects the uncertainty of the estimate. | Interval excludes zero = statistically significant |
| SMD (post-match) | Standardized mean difference; measures how comparable the two groups are after matching. | < 0.1 = good match quality |
| Match rate | Proportion of treatment-group patients successfully matched. | > 80% = good overlap |
| Bootstrap SE | Bootstrap standard error; reflects how stable the ATT estimate is under resampling. | Smaller = more stable |
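The criteria in the table can be applied mechanically. The helper below is a hypothetical sketch (the name `assess_psm` and the returned keys are not from any report format described here) that encodes the three thresholds from the table. The input numbers are illustrative only.

```python
def assess_psm(ci, max_abs_smd, match_rate):
    """Check PSM diagnostics against the thresholds in the table above."""
    ci_low, ci_high = ci
    return {
        "significant": ci_low > 0 or ci_high < 0,  # CI excludes zero
        "balanced": max_abs_smd < 0.1,             # largest |SMD| across covariates
        "good_overlap": match_rate > 0.80,         # share of treated patients matched
    }

# Illustrative values, not real results.
report = assess_psm(ci=(-6.1, -2.4), max_abs_smd=0.06, match_rate=0.87)
```

Bootstrap SE is left out because the table gives it no fixed cutoff; it is best compared across analyses of the same outcome.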
PSM Assumptions and Limitations
Conditional Ignorability
After conditioning on all covariates in the model, treatment assignment is independent of the potential outcomes. In practice, this means every factor that influences treatment assignment and also affects the outcome must be included in the propensity score model. If an important confounder is unmeasured (for example, a patient's health awareness), PSM cannot adjust for it.
Overlap (Common Support)
The treatment and control groups must overlap in their propensity score distributions: for every treated patient, the control group must contain patients with similar characteristics who can serve as matches. If the two groups differ too much, PSM cannot produce valid matches.
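One coarse way to inspect overlap before matching is to compare the score ranges of the two groups. The sketch below assumes a simple min-max definition of the common support region, which is only one of several definitions in use. The name `common_support` and the scores are hypothetical.

```python
def common_support(scores, treated):
    """Return the min-max overlap region of propensity scores and the
    share of treated patients whose score falls inside it."""
    t_scores = [s for s, g in zip(scores, treated) if g == 1]
    c_scores = [s for s, g in zip(scores, treated) if g == 0]
    low = max(min(t_scores), min(c_scores))
    high = min(max(t_scores), max(c_scores))
    share_inside = sum(low <= s <= high for s in t_scores) / len(t_scores)
    return (low, high), share_inside

# Illustrative scores: four treated patients, four controls.
scores = [0.35, 0.55, 0.70, 0.92, 0.10, 0.30, 0.50, 0.75]
treated = [1, 1, 1, 1, 0, 0, 0, 0]
region, share_inside = common_support(scores, treated)
```

Here the treated patient with score 0.92 lies above every control score. A low share inside the region is an early warning that the match rate will also be low.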
Cannot Control for Unobserved Confounding
PSM can only adjust for observed covariates that are present in the data. Compared with a randomized controlled trial (RCT), PSM is a second-best method for causal inference. ReHealth Core states this limitation explicitly in all reports and recommends using PSM evidence as supplementary evidence, not as a replacement for RCTs.