I built a stacked model out of 5 EfficientNet models for a Kaggle competition. The architecture of the stacked model is given below:

Model: "model"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_1_0 (InputLayer)          [(None, 600, 600, 3) 0                                            
__________________________________________________________________________________________________
input_3_1 (InputLayer)          [(None, 600, 600, 3) 0                                            
__________________________________________________________________________________________________
input_5_2 (InputLayer)          [(None, 600, 600, 3) 0                                            
__________________________________________________________________________________________________
input_7_3 (InputLayer)          [(None, 600, 600, 3) 0                                            
__________________________________________________________________________________________________
input_9_4 (InputLayer)          [(None, 600, 600, 3) 0                                            
__________________________________________________________________________________________________
effnet_layer0_0 (Functional)    (None, None, None, 2 64097680    input_1_0[0][0]                  
__________________________________________________________________________________________________
effnet_layer1_1 (Functional)    (None, None, None, 2 64097680    input_3_1[0][0]                  
__________________________________________________________________________________________________
effnet_layer2_2 (Functional)    (None, None, None, 2 64097680    input_5_2[0][0]                  
__________________________________________________________________________________________________
effnet_layer3_3 (Functional)    (None, None, None, 2 64097680    input_7_3[0][0]                  
__________________________________________________________________________________________________
effnet_layer4_4 (Functional)    (None, None, None, 2 64097680    input_9_4[0][0]                  
__________________________________________________________________________________________________
global_average_pooling2d_0 (Glo (None, 2560)         0           effnet_layer0_0[0][0]            
__________________________________________________________________________________________________
global_average_pooling2d_1_1 (G (None, 2560)         0           effnet_layer1_1[0][0]            
__________________________________________________________________________________________________
global_average_pooling2d_2_2 (G (None, 2560)         0           effnet_layer2_2[0][0]            
__________________________________________________________________________________________________
global_average_pooling2d_3_3 (G (None, 2560)         0           effnet_layer3_3[0][0]            
__________________________________________________________________________________________________
global_average_pooling2d_4_4 (G (None, 2560)         0           effnet_layer4_4[0][0]            
__________________________________________________________________________________________________
dropout_0 (Dropout)             (None, 2560)         0           global_average_pooling2d_0[0][0] 
__________________________________________________________________________________________________
dropout_1_1 (Dropout)           (None, 2560)         0           global_average_pooling2d_1_1[0][0
__________________________________________________________________________________________________
dropout_2_2 (Dropout)           (None, 2560)         0           global_average_pooling2d_2_2[0][0
__________________________________________________________________________________________________
dropout_3_3 (Dropout)           (None, 2560)         0           global_average_pooling2d_3_3[0][0
__________________________________________________________________________________________________
dropout_4_4 (Dropout)           (None, 2560)         0           global_average_pooling2d_4_4[0][0
__________________________________________________________________________________________________
dense_0 (Dense)                 (None, 4)            10244       dropout_0[0][0]                  
__________________________________________________________________________________________________
dense_1_1 (Dense)               (None, 4)            10244       dropout_1_1[0][0]                
__________________________________________________________________________________________________
dense_2_2 (Dense)               (None, 4)            10244       dropout_2_2[0][0]                
__________________________________________________________________________________________________
dense_3_3 (Dense)               (None, 4)            10244       dropout_3_3[0][0]                
__________________________________________________________________________________________________
dense_4_4 (Dense)               (None, 4)            10244       dropout_4_4[0][0]                
__________________________________________________________________________________________________
concatenate (Concatenate)       (None, 20)           0           dense_0[0][0]                    
                                                                 dense_1_1[0][0]                  
                                                                 dense_2_2[0][0]                  
                                                                 dense_3_3[0][0]                  
                                                                 dense_4_4[0][0]                  
__________________________________________________________________________________________________
dense (Dense)                   (None, 10)           210         concatenate[0][0]                
__________________________________________________________________________________________________
dense_1 (Dense)                 (None, 4)            44          dense[0][0]                      
==================================================================================================
Total params: 320,539,874
Trainable params: 254
Non-trainable params: 320,539,620
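
As a sanity check on the summary above, the totals can be reproduced from the per-layer counts. The sketch below is only arithmetic on the numbers printed in the summary; the variable and function names are illustrative and not part of the original model code. It suggests that all five branches are frozen and that only the last two Dense layers are trained.

```python
# Per-layer counts copied from the model summary above.
NUM_BRANCHES = 5
EFFNET_PARAMS = 64_097_680   # each effnet_layer*_* (Functional) block
BRANCH_DENSE_PARAMS = 10_244  # each per-branch Dense layer (2560 -> 4)


def dense_params(n_in: int, n_out: int) -> int:
    """Parameter count of a fully connected layer with a bias term."""
    return n_in * n_out + n_out


# The per-branch Dense count matches a 2560 -> 4 layer with bias.
assert dense_params(2560, 4) == BRANCH_DENSE_PARAMS

# Layers after the concatenation: 20 -> 10 -> 4.
meta_hidden = dense_params(20, 10)  # 210, the "dense" row
meta_output = dense_params(10, 4)   # 44, the "dense_1" row

trainable = meta_hidden + meta_output
non_trainable = NUM_BRANCHES * (EFFNET_PARAMS + BRANCH_DENSE_PARAMS)
total = trainable + non_trainable

print(f"trainable={trainable:,} non_trainable={non_trainable:,} total={total:,}")
```

In other words, the stacking part of the network has 254 parameters and sees only the 20 concatenated branch outputs.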

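Performance metrics of the stacked model:

(Figure: stacked model accuracy)

(Figure: stacked model loss)

Performance metrics of the base model:

(Figure: base model accuracy)

(Figure: base model loss)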

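However, when I use the stacked model for the Kaggle predictions I get a score of 0.551, whereas one of the base models on its own gets a score of 0.581.

Why is that? Shouldn't a stacked model give better results than a base model?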

1 Answer

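Larger models are generally harder to train [0], so if you simply increase the size of the model or stack models, don't expect a large improvement over a simpler model.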

Also, how large is your dataset? Both models seem to show signs of overfitting (or at least the test loss has plateaued).
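
One rough way to check for this is to look at the per-epoch losses directly. The sketch below is a hypothetical helper, not something from the question: it assumes a dict shaped like the `history` attribute of the object returned by Keras `model.fit` (keys `loss` and `val_loss`), and the `patience` threshold and the example numbers are made up for illustration.

```python
def summarize_history(history, patience=3):
    """Report where the validation loss bottomed out and how far it sits
    above the training loss at the end of training.

    `history` is assumed to map "loss" and "val_loss" to per-epoch lists.
    """
    train = history["loss"]
    val = history["val_loss"]

    # Epoch (0-based) with the lowest validation loss.
    best_epoch = min(range(len(val)), key=val.__getitem__)
    epochs_since_best = len(val) - 1 - best_epoch

    return {
        "best_epoch": best_epoch,
        "epochs_since_best": epochs_since_best,
        # A large positive gap is one common symptom of overfitting.
        "final_gap": val[-1] - train[-1],
        # No improvement for `patience` epochs is treated as a plateau.
        "stagnated": epochs_since_best >= patience,
    }


# Made-up numbers: training loss keeps falling, validation loss does not.
example = {
    "loss": [1.2, 0.9, 0.7, 0.5, 0.4, 0.3],
    "val_loss": [1.1, 0.95, 0.9, 0.92, 0.95, 0.97],
}
report = summarize_history(example)
print(report)
```

If the validation loss stops improving early while the training loss keeps dropping, more data or stronger regularization is likely to help more than a bigger model.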

[0] https://arxiv.org/abs/1512.03385

Answered 2021-09-27T09:24:08.500