Author: mobiledu2502868653 | Source: Internet | 2023-05-17 14:23
I have a set of nucleotide sequences in a vector of strings called x.
I want to check whether some (say 10) motifs are present in x. I want to produce a data frame or table where the rows are the sequences in x and the columns are the patterns/motifs in the vector sdseqs.
sdframe <- data.frame
sdseqs = c("AGGAG.+ATG", "AGAAG.+ATG", "AAAGG.+ATG", "GGAGG.+ATG", "GAAGA.+ATG",
           "GGAGA.+ATG", "AAGGT.+ATG", "AGGAA.+ATG", "AAGGA.+ATG", "GTGGA.+ATG")
for (i in 1:10) {
  sdframe <- cbind(sdframe, grepl(sdseqs[i], x))
}
This code mostly works, but the first column of the result is empty and shows only question marks. The other columns are populated with TRUE and FALSE, which is what I want.
I tried to define an empty data frame before the loop. I am new to R and I am coming from Perl. This is what I usually did in Perl: you define the variables used inside a loop before the loop starts. How can I do this in R?
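The question-mark column most likely comes from `sdframe <- data.frame`, which stores the function `data.frame` itself rather than the result of calling it, so `cbind` carries that function along as the first column. A minimal sketch of the "define it before the loop" idea in R is to preallocate a logical matrix of the right size and fill it column by column. The contents of `x` below are hypothetical (the real sequences are not shown), only two of the ten motifs are used to keep the example short, and the name `hits` is introduced here for illustration.

```r
# Hypothetical example data; the real x from the question is not shown.
x <- c("AGGAGTTATG", "CCCC", "GGAGGCATG")
sdseqs <- c("AGGAG.+ATG", "GGAGG.+ATG")

# Preallocate a logical matrix: one row per sequence, one column per motif.
hits <- matrix(FALSE, nrow = length(x), ncol = length(sdseqs),
               dimnames = list(NULL, sdseqs))

# Fill one column per motif; seq_along() avoids hard-coding 1:10.
for (i in seq_along(sdseqs)) {
  hits[, i] <- grepl(sdseqs[i], x)
}

# Convert to a data frame if one is needed.
sdframe <- as.data.frame(hits)
print(sdframe)
```

Because the container already has the right shape, there is no leftover placeholder column to remove afterwards.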
Also, a viable option would be to delete the first column from my data frame, but that does not seem so straightforward to me.
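Dropping a first column can be sketched with negative indexing, which works on both data frames and matrices. The data frame `df` and its column names below are made up for illustration; they are not the object from the question.

```r
# Hypothetical data frame with an unwanted placeholder first column.
df <- data.frame(junk = c("?", "?", "?"),
                 m1 = c(TRUE, FALSE, FALSE),
                 m2 = c(FALSE, FALSE, TRUE))

# Negative index drops column 1; drop = FALSE keeps the result
# two-dimensional even if only one column remains.
cleaned <- df[, -1, drop = FALSE]
print(cleaned)

# Alternative for data frames: remove a column by name, in place.
df$junk <- NULL
```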
Any help is appreciated.
The output I get with my code now:
sdframe
[1,] ? TRUE FALSE TRUE TRUE FALSE TRUE TRUE TRUE TRUE FALSE
[2,] ? FALSE TRUE FALSE FALSE FALSE FALSE FALSE FALSE TRUE TRUE
[3,] ? FALSE FALSE TRUE FALSE TRUE FALSE TRUE TRUE TRUE TRUE
[4,] ? TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[5,] ? FALSE TRUE FALSE FALSE TRUE FALSE FALSE FALSE FALSE FALSE
[6,] ? FALSE FALSE FALSE TRUE FALSE FALSE FALSE TRUE FALSE TRUE
[7,] ? FALSE FALSE FALSE FALSE FALSE FALSE FALSE TRUE FALSE FALSE
[8,] ? FALSE FALSE TRUE FALSE FALSE TRUE FALSE FALSE TRUE FALSE
[9,] ? FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[10,] ? FALSE FALSE FALSE FALSE TRUE FALSE FALSE FALSE FALSE FALSE
[11,] ? FALSE FALSE TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
I want the same, but without the first column of ?. Note that my x has 11 sequences (the rows), and the motifs I checked for are the columns (10 columns, or 11 counting the first one with ?).
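A table of that shape (one row per sequence, one column per motif, no placeholder column) can also be built without an explicit loop. The following is only a sketch using `sapply`, again with hypothetical contents for `x` and a shortened `sdseqs`; it is one possible approach, not necessarily the one given in the answers.

```r
# Hypothetical example data; the real x from the question is not shown.
x <- c("AGGAGTTATG", "CCCC", "GGAGGCATG")
sdseqs <- c("AGGAG.+ATG", "GGAGG.+ATG")

# Apply grepl once per motif; sapply binds the logical vectors
# into a matrix whose columns are named after the motifs.
hits <- sapply(sdseqs, function(p) grepl(p, x))
sdframe <- as.data.frame(hits)
print(hits)
```

Note that `sapply` returns a plain vector instead of a matrix when `x` has only one element, so this sketch assumes at least two sequences.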
2 solutions