Question: Splitting strings in a vector at a delimiter and expanding them in R


I have this vector, myvec (the real one is large). I need to split each element on / and build a second, expanded vector, resvector. How can I do this in R?

myvec<-c("IID:WE:G12D/V/A","GH:SQ:p.R172W/G", "HH:WG:p.S122F/H")

resvector

IID:WE:G12D, IID:WE:G12V, IID:WE:G12A, GH:SQ:p.R172W, GH:SQ:p.R172G, HH:WG:p.S122F, HH:WG:p.S122H

4
2017-07-21 07:30




Answers:


You can try this, using strsplit as mentioned by @Tensibai:

sp_vec <- strsplit(myvec, "/") # split the element of the vector by "/" : you will get a list where each element is the decomposition (vector) of one element of your vector, according to "/"
ts_vec <- lapply(sp_vec, # for each element of the previous list, do
                 function(x){
                     base <- sub("\\w$", "", x[1]) # get the common beginning of the column names (so first item of vector without the last letter)
                     x[-1] <- paste0(base, x[-1]) # paste this common beginning to the rest of the vector items (so the other letters)
                     x}) # return the vector
resvector <- unlist(ts_vec) # finally, unlist to get the needed vector

resvector
# [1] "IID:WE:G12D"   "IID:WE:G12V"   "IID:WE:G12A"   "GH:SQ:p.R172W" "GH:SQ:p.R172G" "HH:WG:p.S122F" "HH:WG:p.S122H"
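
To make the steps in the code above easier to follow, here is a small sketch that traces them on a single element. The names `one`, `base_part` and `expanded` are introduced here only for illustration, and the sketch assumes, as the `\\w$` pattern in the answer does, that the first variant is a single word character.

```r
# Illustrative only: trace the split-and-paste steps on one element
one <- strsplit("IID:WE:G12D/V/A", "/")[[1]]
# one is now c("IID:WE:G12D", "V", "A")

# Strip the last word character of the first piece to get the shared prefix
base_part <- sub("\\w$", "", one[1])
# base_part is now "IID:WE:G12"

# Keep the first piece as is and prepend the shared prefix to the other variants
expanded <- c(one[1], paste0(base_part, one[-1]))
# expanded is now c("IID:WE:G12D", "IID:WE:G12V", "IID:WE:G12A")
```

If the first variant could be longer than one character, the `\\w$` pattern would strip only its last character, so the prefix would need a different rule.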

5
2017-07-21 07:40



Here is a concise answer using regex and a bit of functional programming:

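# x: the shared prefix of each element (everything before the first variant letter)
x <- gsub('[A-Z]/.+', '', myvec)
# y: the variant letters of each element, split on "/" (the lookahead needs perl = TRUE)
y <- strsplit(gsub('[^/]+(?=[A-Z]/.+)', '', myvec, perl = TRUE), '/')

# Map() names its result after x, so drop the names to get a plain character vector
unlist(Map(paste0, x, y), use.names = FALSE)
# [1] "IID:WE:G12D"   "IID:WE:G12V"   "IID:WE:G12A"   "GH:SQ:p.R172W" "GH:SQ:p.R172G" "HH:WG:p.S122F" "HH:WG:p.S122H"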
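
As a hedged illustration of what the two regex steps above produce, the sketch below recomputes the intermediate values on the question's sample data. The names `prefixes`, `variants` and `combined` are introduced here for readability and are not from the answer.

```r
myvec <- c("IID:WE:G12D/V/A", "GH:SQ:p.R172W/G", "HH:WG:p.S122F/H")

# Remove the first capital letter that is followed by "/" and everything after it
prefixes <- gsub("[A-Z]/.+", "", myvec)
# prefixes is now c("IID:WE:G12", "GH:SQ:p.R172", "HH:WG:p.S122")

# Remove the prefix instead (a PCRE lookahead, hence perl = TRUE), then split on "/"
variants <- strsplit(gsub("[^/]+(?=[A-Z]/.+)", "", myvec, perl = TRUE), "/")
# variants is now list(c("D", "V", "A"), c("W", "G"), c("F", "H"))

# Paste each prefix onto its own variants and flatten without names
combined <- unlist(Map(paste0, prefixes, variants), use.names = FALSE)
```

Both patterns assume that every variant is a single capital letter; an element without any "/" passes through both `gsub` calls unchanged and would need separate handling.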

5
2017-07-21 14:41



myvec<-c("IID:WE:G12D/V/A","GH:SQ:p.R172W/G", "HH:WG:p.S122F/H")

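custmSplit <- function(str){
  # Split one element on "/": the first piece is the prefix plus the first variant letter
  splitbysep <-  strsplit(str, '/')[[1]]
  # Drop the last character of the first piece to get the shared prefix, then prepend it to the other variants
  splitbysep[-1] <- paste0(substr(splitbysep[1], 1, nchar(splitbysep[1]) - 1), splitbysep[-1])
  return(splitbysep)
}

do.call('c', lapply(myvec, custmSplit))
# [1] "IID:WE:G12D"   "IID:WE:G12V"   "IID:WE:G12A"   "GH:SQ:p.R172W" "GH:SQ:p.R172G" "HH:WG:p.S122F" "HH:WG:p.S122H"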

1
2017-07-21 07:50