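I am trying to obtain the log-rank p-value for a Kaplan-Meier survival plot that compares cancer patients expressing low levels of the gene BRCA1 with patients expressing high levels of BRCA1 (BRCA1 = BReast CAncer gene 1).
So I downloaded the cancer data from GEO (GSE1456): a DNA array linked to the patients' cancer data, together with its description.
In this array, BRCA1 expression ranges from 2 to 7 (arbitrary units), so I want to split the data at 5.74 (low/high expression):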
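split = 5.74  # expression cutoff; probe values range from 2 to 7
probe = '204531_s_at'  # code name for the probe that measures BRCA1
low = df[probe] < split    # samples exactly at the cutoff fall in neither group
high = df[probe] > split
T1 = df.loc[low, 'SURV_RELAPSE']   # relapse-free survival time, low expression
T2 = df.loc[high, 'SURV_RELAPSE']  # relapse-free survival time, high expression
E1 = df.loc[low, 'RELAPSE']        # 0 = no cancer relapse, 1 = cancer relapse
E2 = df.loc[high, 'RELAPSE']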
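A quick sanity check on the split can help when comparing against another tool: the group sizes and event counts should match on both sides before the p-values are compared. The following is a minimal sketch in plain Python, assuming the rows are available as dictionaries; the helper names `split_by_expression` and `summarize` and the toy records are hypothetical and not part of the GSE1456 data.

```python
def split_by_expression(records, probe, cutoff):
    """Split records into low/high groups using strict comparisons."""
    low = [r for r in records if r[probe] < cutoff]
    high = [r for r in records if r[probe] > cutoff]
    return low, high


def summarize(group):
    """Count samples and relapse events (RELAPSE == 1) in a group."""
    return {"n": len(group), "events": sum(r["RELAPSE"] for r in group)}


# Hypothetical toy records, for illustration only.
records = [
    {"204531_s_at": 4.9, "SURV_RELAPSE": 100.3, "RELAPSE": 0},
    {"204531_s_at": 6.1, "SURV_RELAPSE": 45.8, "RELAPSE": 1},
    {"204531_s_at": 5.74, "SURV_RELAPSE": 60.0, "RELAPSE": 0},
    {"204531_s_at": 3.2, "SURV_RELAPSE": 16.7, "RELAPSE": 1},
]

low, high = split_by_expression(records, "204531_s_at", 5.74)
low_summary = summarize(low)
high_summary = summarize(high)
# Samples exactly at the cutoff are in neither group with < and >.
dropped = len(records) - len(low) - len(high)

print("low:", low_summary)
print("high:", high_summary)
print("dropped at cutoff:", dropped)
```

If the other tool reports different group sizes for the same cutoff, the two analyses are not comparing the same patients, and the p-values are not expected to agree.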
I use lifelines to plot the Kaplan-Meier curves:
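import matplotlib.pyplot as plt
from lifelines import KaplanMeierFitter

ax = plt.subplot(111)  # shared axes so both curves are drawn on one plot
kmf = KaplanMeierFitter()
kmf.fit(T1, event_observed=E1, label='low')
kmf.plot_survival_function(show_censors=True, ax=ax)
kmf.fit(T2, event_observed=E2, label='high')
kmf.plot_survival_function(show_censors=True, ax=ax)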
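For reference, the curve drawn for each group is the Kaplan-Meier (product-limit) estimate. The sketch below computes it by hand in plain Python to show how censored samples enter: they count toward the number at risk until their last follow-up time, but they never cause a drop in the curve. The function name `kaplan_meier` and the toy data are hypothetical; this is an illustration of the estimator, not lifelines' implementation.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times  -- follow-up time per sample
    events -- 1 if the event (relapse) was observed, 0 if censored
    """
    surv = 1.0
    curve = []
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    for t in event_times:
        # Everyone still under observation just before t is at risk,
        # including samples that are censored later on.
        at_risk = sum(1 for ti in times if ti >= t)
        observed = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        surv *= 1.0 - observed / at_risk
        curve.append((t, surv))
    return curve


# Hypothetical toy data: 1 = relapse, 0 = censored.
times = [2, 3, 3, 5, 8]
events = [1, 0, 1, 1, 0]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(t, round(s, 3))
```

Comparing such a hand-computed curve with the plotted one is a cheap way to confirm that the time and event columns were passed in the intended roles.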
Finally, I compute the log-rank value:
from lifelines.statistics import logrank_test

results = logrank_test(T1, T2, event_observed_A=E1, event_observed_B=E2)
results.print_summary()
print('p_value: ', results.p_value)
which gives p_value: 0.3932833438047245
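One way to cross-check a reported p-value independently of any library is to compute the two-group log-rank statistic by hand. The sketch below is a plain-Python version of the standard unweighted log-rank test (observed minus expected events in group A, summed over event times, with the hypergeometric variance). The function name `logrank_p` and the toy data are hypothetical, and whether lifelines or KM-plotter use exactly this variant is an assumption that should be checked against their documentation.

```python
import math


def logrank_p(times_a, events_a, times_b, events_b):
    """Unweighted two-group log-rank test; returns (chi2 statistic, p-value)."""
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e == 1})

    o_minus_e = 0.0  # sum of (observed - expected) events in group A
    variance = 0.0
    for t in event_times:
        n_a = sum(1 for ti, _, g in data if ti >= t and g == 0)  # at risk, A
        n_b = sum(1 for ti, _, g in data if ti >= t and g == 1)  # at risk, B
        d_a = sum(1 for ti, e, g in data if ti == t and e == 1 and g == 0)
        d_b = sum(1 for ti, e, g in data if ti == t and e == 1 and g == 1)
        n, d = n_a + n_b, d_a + d_b
        o_minus_e += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    stat = o_minus_e ** 2 / variance
    # Survival function of a chi-square with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical toy data: all events observed, group A relapses earlier.
stat, p = logrank_p([1, 2, 3], [1, 1, 1], [4, 5, 6], [1, 1, 1])
print("chi2:", stat, "p:", p)

# Identical groups give no evidence of a difference.
stat_same, p_same = logrank_p([1, 2, 3], [1, 1, 1], [1, 2, 3], [1, 1, 1])
print("chi2:", stat_same, "p:", p_same)
```

Feeding the posted T1/E1 and T2/E2 values into such a function would show whether the discrepancy comes from the test itself or from the inputs (for example, different group membership or different follow-up truncation in the other tool).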
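To validate my analysis, I compared it with KM-plotter. The good news is that it shows the same plot.
However, it reports a different log-rank value: logrank p = 0.046
According to my analysis, the difference between the two survival curves is not statistically significant, but according to KM-plotter it is.
I suspect that the censored samples may play a role in how p is calculated, but I am not sure how to resolve it.
Question: can anyone guide me on how to compute the log-rank test correctly with lifelines?
Thanks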
Edit: adding the raw data:
T1:
SURV_RELAPSE
100.32
98.76
97.44
96.84
76.56
97.56
48.96
97.32
66.36
16.68
75.84
91.32000000000001
96.35999999999999
88.56
96.35999999999999
79.80000000000001
92.76
93.72
95.4
95.52
82.32000000000001
84.36
101.52000000000001
73.56
86.52
97.56
84.84
47.88
90.6
93.0
96.96000000000001
18.6
100.80000000000001
90.48
75.72
99.60000000000001
84.84
97.80000000000001
98.88
72.24
70.19999999999999
69.72
70.32000000000001
68.52
67.56
95.88
19.32
99.96000000000001
69.84
70.32000000000001
51.599999999999994
95.03999999999999
78.24
77.52
101.16
74.88
46.68
91.08
2.7600000000000002
92.03999999999999
94.67999999999999
89.28
33.36
75.12
71.4
71.76
72.6
81.12
95.28
76.80000000000001
88.80000000000001
77.28
15.600000000000001
95.88
94.08
92.03999999999999
77.28
100.32
99.84
T2:
SURV_RELAPSE
45.839999999999996
97.80000000000001
96.84
99.60000000000001
41.64
6.720000000000001
92.52
53.28
94.32000000000001
35.04
13.080000000000002
52.32000000000001
90.6
14.28
61.92
91.80000000000001
87.48
96.35999999999999
98.39999999999999
91.08
101.76
100.08
87.36
94.08
91.80000000000001
87.36
92.76
22.200000000000003
37.8
60.0
69.0
68.16
66.72
68.28
99.24
12.48
72.36
89.03999999999999
51.239999999999995
90.12
99.60000000000001
93.36
71.4
99.84
46.32
94.67999999999999
97.56
99.84
78.6
78.6
84.6
13.440000000000001
68.28
92.03999999999999
47.04
93.36
90.0
88.32000000000001
92.76
13.440000000000001
17.52
101.64000000000001
96.35999999999999
9.120000000000001
91.56
93.0
15.120000000000001
79.32000000000001
86.28
10.8
75.84
101.88
101.88
71.28
16.080000000000002
8.040000000000001
33.480000000000004
16.56
67.44
66.96000000000001
E1:
RELAPSE
0
0
0
0
0
0
1
0
1
1
0
0
0
0
0
1
0
0
0
0
0
0
0
0
0
0
0
1
0
0
0
1
0
0
0
0
0
0
0
0
0
0
0
0
0
0
1
0
0
0
1
0
0
0
0
1
1
0
1
0
0
0
1
0
0
0
0
0
0
0
0
1
1
0
0
0
0
0
0
E2:
RELAPSE
1
0
0
0
1
1
0
1
0
1
1
1
0
1
1
0
0
0
0
0
0
0
0
0
0
0
0
1
1
1
0
0
0
0
0
1
0
0
1
0
0
0
0
0
1
0
0
0
0
0
0
1
0
0
1
0
0
0
0
1
1
0
0
1
0
0
1
0
0
1
0
0
0
0
1
1
1
1
0
0