Feature selection with caret rfe + SVM, using ROC as the metric

 蓝紫藤田_835 posted on 2023-02-03 12:08

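I have been trying to apply recursive feature selection using the caret package. What I need is for rfe to use the AUC as its performance measure. After a month of googling I still cannot get the procedure to work. Here is the code I used: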

library(caret)
library(doMC)
registerDoMC(cores = 4)

data(mdrr)

subsets <- c(1:10)

ctrl <- rfeControl(functions=caretFuncs, 
                   method = "cv",
                   repeats =5, number = 10,
                   returnResamp="final", verbose = TRUE)

trainctrl <- trainControl(classProbs= TRUE)

caretFuncs$summary <- twoClassSummary

set.seed(326)

rf.profileROC.Radial <- rfe(mdrrDescr, mdrrClass, sizes=subsets,
                            rfeControl=ctrl,
                            method="svmRadial",
                            metric="ROC",
                            trControl=trainctrl)

When I run this script, I get the following result:

Recursive feature selection

Outer resampling method: Cross-Validation (10 fold) 

Resampling performance over subset size:

Variables Accuracy  Kappa AccuracySD KappaSD Selected
     1   0.7501 0.4796    0.04324 0.09491         
     2   0.7671 0.5168    0.05274 0.11037         
     3   0.7671 0.5167    0.04294 0.09043         
     4   0.7728 0.5289    0.04439 0.09290         
     5   0.8012 0.5856    0.04144 0.08798         
     6   0.8049 0.5926    0.02871 0.06133         
     7   0.8049 0.5925    0.03458 0.07450         
     8   0.8124 0.6090    0.03444 0.07361         
     9   0.8181 0.6204    0.03135 0.06758        *
    10   0.8069 0.5971    0.04234 0.09166         
   342   0.8106 0.6042    0.04701 0.10326         

The top 5 variables (out of 9):
nC, X3v, Sp, X2v, X1v

The procedure always uses Accuracy as the performance measure. A second problem appears when I try to get predictions from the resulting model with:

predictions <- predict(rf.profileROC.Radial$fit,mdrrDescr)

I get the following message:

In predictionFunction(method, modelFit, tempX, custom = models[[i]]$control$custom$prediction) :
  kernlab class prediction calculations failed; returning NAs

so it turns out to be impossible to get any predictions from the model.

Here is the output of sessionInfo():

R version 3.0.2 (2013-09-25)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=es_ES.UTF-8       LC_NUMERIC=C               LC_TIME=es_ES.UTF-8       
 [4] LC_COLLATE=es_ES.UTF-8     LC_MONETARY=es_ES.UTF-8    LC_MESSAGES=es_ES.UTF-8   
 [7] LC_PAPER=es_ES.UTF-8       LC_NAME=C                  LC_ADDRESS=C              
 [10] LC_TELEPHONE=C             LC_MEASUREMENT=es_ES.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
 [1] grid      parallel  splines   stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] e1071_1.6-2     class_7.3-9     pROC_1.6.0.1    doMC_1.3.2      iterators_1.0.6 foreach_1.4.1  
 [7] caret_6.0-21    ggplot2_0.9.3.1 lattice_0.20-24 kernlab_0.9-19 

loaded via a namespace (and not attached):
 [1] car_2.0-19         codetools_0.2-8    colorspace_1.2-4   compiler_3.0.2     dichromat_2.0-0   
 [6] digest_0.6.4       gtable_0.1.2       labeling_0.2       MASS_7.3-29        munsell_0.4.2     
 [11] nnet_7.3-7         plyr_1.8           proto_0.3-10       RColorBrewer_1.0-5 Rcpp_0.10.6       
 [16] reshape2_1.2.2     scales_0.2.3       stringr_0.6.2      tools_3.0.2       

topepo (7)

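One problem is a minor typo ('trControl=' instead of 'trainControl='; the argument that train expects is named trControl). Also, you changed caretFuncs after it had already been attached to rfe's control function, so the control object never saw the new summary function. Lastly, you need to tell trainControl to calculate the ROC curves.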
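
The second point comes down to R's copy semantics: a control object keeps the function list as it was at the moment of the call, so a later change to the original list does not reach it. A minimal base-R sketch of that behavior, using hypothetical stand-ins (funcs, make_control and the placeholder strings) rather than caret itself:

```r
# Hypothetical stand-ins for caretFuncs and rfeControl(); this is not caret code.
funcs <- list(summary = "accuracy_summary")
make_control <- function(functions) list(functions = functions)

# Order used in the question: build the control object, then change the list.
ctrl_question <- make_control(funcs)
funcs$summary <- "roc_summary"

# Order used in the answer: the list is already changed when it is passed in.
ctrl_answer <- make_control(funcs)

print(ctrl_question$functions$summary)  # the old value is still stored
print(ctrl_answer$functions$summary)    # the updated value is stored
```

This would explain why the question's output still reports Accuracy and Kappa even though caretFuncs$summary had been reassigned.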

This code works:

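 ## Set the summary function before building the rfe control object
 caretFuncs$summary <- twoClassSummary

 ## Note: repeats only takes effect with method = "repeatedcv"
 ctrl <- rfeControl(functions=caretFuncs, 
                    method = "cv",
                    repeats =5, number = 10,
                    returnResamp="final", verbose = TRUE)

 ## The inner train() calls also need class probabilities and
 ## the two-class summary to report ROC
 trainctrl <- trainControl(classProbs= TRUE,
                           summaryFunction = twoClassSummary)
 rf.profileROC.Radial <- rfe(mdrrDescr, mdrrClass, 
                             sizes=subsets,
                             rfeControl=ctrl,
                             method="svmRadial",
                             ## I also added this line to
                             ## avoid a warning:
                             metric = "ROC",
                             trControl = trainctrl)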


 > rf.profileROC.Radial

 Recursive feature selection

 Outer resampling method: Cross-Validated (10 fold) 

 Resampling performance over subset size:

  Variables    ROC   Sens   Spec   ROCSD  SensSD  SpecSD Selected
          1 0.7805 0.8356 0.6304 0.08139 0.10347 0.10093         
          2 0.8340 0.8491 0.6609 0.06955 0.10564 0.09787         
          3 0.8412 0.8491 0.6565 0.07222 0.10564 0.09039         
          4 0.8465 0.8491 0.6609 0.06581 0.09584 0.10207         
          5 0.8502 0.8624 0.6652 0.05844 0.08536 0.09404         
          6 0.8684 0.8923 0.7043 0.06222 0.06893 0.09999         
          7 0.8642 0.8691 0.6913 0.05655 0.10837 0.06626         
          8 0.8697 0.8823 0.7043 0.05411 0.08276 0.07333         
          9 0.8792 0.8753 0.7348 0.05414 0.08933 0.07232        *
         10 0.8622 0.8826 0.6696 0.07457 0.08810 0.16550         
        342 0.8650 0.8926 0.6870 0.07392 0.08140 0.17367         

 The top 5 variables (out of 9):
    nC, X3v, Sp, X2v, X1v
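
The ROC, Sens and Spec columns come from twoClassSummary, which needs the class probabilities as well as the predicted classes; that is why classProbs = TRUE is set. A small sketch on made-up data, assuming caret (and the packages it relies on for the AUC) is installed. The six observations and probabilities below are invented for illustration and have nothing to do with mdrr:

```r
library(caret)

# Made-up resampling output in the layout twoClassSummary expects: observed
# class, predicted class, and one probability column per class level.
lev <- c("Active", "Inactive")
toy <- data.frame(
  obs    = factor(c("Active", "Active", "Active",
                    "Inactive", "Inactive", "Inactive"), levels = lev),
  Active = c(0.9, 0.8, 0.4, 0.3, 0.6, 0.1)
)
toy$Inactive <- 1 - toy$Active
toy$pred <- factor(ifelse(toy$Active > 0.5, "Active", "Inactive"), levels = lev)

# Returns a named numeric vector with elements ROC, Sens and Spec.
res <- twoClassSummary(toy, lev = lev)
print(res)
```

Without the probability columns the function has nothing to rank the cases by, so no area under the curve can be computed.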

For the prediction problem, you should pass rf.profileROC.Radial itself to predict rather than its fit component:

 > predict(rf.profileROC.Radial, head(mdrrDescr))
       pred    Active  Inactive
 1 Inactive 0.4392768 0.5607232
 2   Active 0.6553482 0.3446518
 3   Active 0.6387261 0.3612739
 4 Inactive 0.3060582 0.6939418
 5   Active 0.6661557 0.3338443
 6   Active 0.7513180 0.2486820
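
The data frame returned here holds the predicted class plus one probability column per class, so it can be post-processed directly. A minimal base-R sketch that re-enters the six rows printed above by hand and applies a cutoff to the Active column; the helper name classify_at and the cutoff values are assumptions for illustration, not part of caret:

```r
# The six rows printed above, typed in by hand so the sketch runs without refitting.
preds <- data.frame(
  pred     = c("Inactive", "Active", "Active", "Inactive", "Active", "Active"),
  Active   = c(0.4392768, 0.6553482, 0.6387261, 0.3060582, 0.6661557, 0.7513180),
  Inactive = c(0.5607232, 0.3446518, 0.3612739, 0.6939418, 0.3338443, 0.2486820),
  stringsAsFactors = FALSE
)

# Hypothetical helper: label a row "Active" when its probability exceeds the cutoff.
classify_at <- function(prob_active, cutoff = 0.5) {
  ifelse(prob_active > cutoff, "Active", "Inactive")
}

# For these six rows a 0.5 cutoff gives the same labels as the pred column.
same_as_pred <- all(classify_at(preds$Active) == preds$pred)
print(same_as_pred)

# A stricter cutoff labels fewer compounds as Active.
print(table(classify_at(preds$Active, cutoff = 0.65)))
```

Whether a cutoff other than 0.5 makes sense depends on the relative cost of false positives and false negatives in the application.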

Max

Answered 2023-02-03 12:09