
R: removing duplicated entries if they come within a year

I'm new to R. I have a data frame of 500000 entries containing patient IDs, dates, and other variables.

I want to remove any repeated patient ID (PtID) that appears within one year of that patient's first appearance. For example:

    PtID    date
 1. 1       01/01/2006
 2. 2       01/01/2006
 3. 1       24/02/2006
 4. 4       26/03/2006
 5. 1       04/05/2006
 6. 1       05/05/2007

In this case I want to remove the 3rd and 5th rows and keep the 1st and 6th rows.

Can somebody help me with this, please? This is the str() output of my data, which is called final1:

str(final1)
'data.frame':   605870 obs. of  70 variables:
...
 $ Date          : Date, format: "2006-03-12" "2006-04-01" ...
 $ PtID          : int  11251 11251 11251 11251 11251 11251 11251 30938 30938 11245 ...
...

1 solution

#1

Here's one solution that uses plyr and lubridate. First load the packages:

library(plyr)       # ddply(): split a data frame, apply a function, recombine
library(lubridate)  # dmy(): parse day/month/year strings

Next, create some sample data (notice that this is a bit more straightforward than your example!):

PtID = c(1, 2, 1, 4, 1, 1)
date = c("01/01/2006", "01/01/2006", "24/02/2006", "26/03/2006", "04/05/2006",
  "05/05/2007")
dd = data.frame(PtID, date)

Now we convert the date column to an R date object:

dd$date = dmy(dd$date)

Then we write a function that encodes the rule for whether a row should be kept:

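keepId = function(dates) {
  # Days elapsed since this patient's first visit (works for Date and POSIXct)
  days_since_first = as.numeric(difftime(dates, min(dates), units = "days"))
  # Keep the first visit and any visit more than 365 days after it
  keep = (days_since_first > 365) | (dates == min(dates))
  return(keep)
}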
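To see what this rule does, here is a small base R check on the four dates for patient 1 from the question. It is only a sketch: the variable names `dates`, `days_since_first`, and `keep` are chosen for illustration, and the dates are parsed with `as.Date()` rather than lubridate so the snippet has no package dependencies.

```r
# Dates for patient 1 from the example, parsed with base R
dates <- as.Date(c("01/01/2006", "24/02/2006", "04/05/2006", "05/05/2007"),
                 format = "%d/%m/%Y")

# Days elapsed since the patient's first visit
days_since_first <- as.numeric(difftime(dates, min(dates), units = "days"))
print(days_since_first)  # 0 54 123 489

# Keep the first visit and anything more than 365 days after it
keep <- dates == min(dates) | days_since_first > 365
print(keep)              # TRUE FALSE FALSE TRUE
```

Only the first visit and the visit 489 days later survive, which matches the rows the question asks to keep for this patient.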

All that remains is to use ddply to partition the data frame by PtID and apply the rule within each group:

dd_sub = ddply(dd, c("PtID"), transform, keep = keepId(date))
dd_sub[dd_sub$keep,]
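If you would rather avoid extra packages, the same rule can probably be expressed in base R with `ave()`. The sketch below is an assumption-laden illustration, not a drop-in for your data: `final1_demo` is a hypothetical stand-in that only borrows the column names `PtID` and `Date` from the `str(final1)` output, and it assumes `Date` is already of class Date.

```r
# Hypothetical stand-in for final1, using the column names shown in str(final1)
final1_demo <- data.frame(
  PtID = c(1, 2, 1, 4, 1, 1),
  Date = as.Date(c("01/01/2006", "01/01/2006", "24/02/2006",
                   "26/03/2006", "04/05/2006", "05/05/2007"),
                 format = "%d/%m/%Y")
)

# First visit per patient (as days since the epoch), repeated on each of
# that patient's rows; ave() returns values in the original row order
first_visit <- ave(as.numeric(final1_demo$Date), final1_demo$PtID, FUN = min)

# Days between each visit and that patient's first visit
days_since_first <- as.numeric(final1_demo$Date) - first_visit

# Keep first visits and any visit more than 365 days after the first
keep <- days_since_first == 0 | days_since_first > 365
result <- final1_demo[keep, ]
print(result)
```

Unlike ddply, which returns the rows grouped by PtID, this keeps the rows in their original order. On the sample data it drops rows 3 and 5. Note that, like the function above, it measures every visit against the patient's first visit only, and it treats a year as 365 days.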
