Any way to simplify R code?
I have the following code, which seems long-winded. I'm doing the same thing for each file, so I figured there must be a method to simplify it, but it eludes me at present! Any help is appreciated, as always:
    library(zoo)  # provides zoo() and na.approx()
    # rmse() is not base R; it comes from an add-on package that the question does not name

    .lvb.sf.1.1 <- read.csv("lvb_sf_1-1.csv", header=TRUE, sep=","); .lvb.sf.1.6 <- read.csv("lvb_sf_1-6.csv", header=TRUE, sep=",")
    .lvb.sf.1.2 <- read.csv("lvb_sf_1-2.csv", header=TRUE, sep=","); .lvb.sf.1.7 <- read.csv("lvb_sf_1-7.csv", header=TRUE, sep=",")
    .lvb.sf.1.3 <- read.csv("lvb_sf_1-3.csv", header=TRUE, sep=","); .lvb.sf.1.8 <- read.csv("lvb_sf_1-8.csv", header=TRUE, sep=",")
    .lvb.sf.1.4 <- read.csv("lvb_sf_1-4.csv", header=TRUE, sep=","); .lvb.sf.1.9 <- read.csv("lvb_sf_1-9.csv", header=TRUE, sep=",")
    .lvb.sf.1.5 <- read.csv("lvb_sf_1-5.csv", header=TRUE, sep=","); .lvb.sf.2.0 <- read.csv("lvb_sf_2.csv", header=TRUE, sep=",")

    # interpolate missing monthly values - linear interpolation of the above
    x <- zoo(.lvb.sf.1.1); .lvb.sf.1.1 <- as.data.frame(na.approx(x)); x <- zoo(.lvb.sf.1.2); .lvb.sf.1.2 <- as.data.frame(na.approx(x))
    x <- zoo(.lvb.sf.1.3); .lvb.sf.1.3 <- as.data.frame(na.approx(x)); x <- zoo(.lvb.sf.1.4); .lvb.sf.1.4 <- as.data.frame(na.approx(x))
    x <- zoo(.lvb.sf.1.5); .lvb.sf.1.5 <- as.data.frame(na.approx(x)); x <- zoo(.lvb.sf.1.6); .lvb.sf.1.6 <- as.data.frame(na.approx(x))
    x <- zoo(.lvb.sf.1.7); .lvb.sf.1.7 <- as.data.frame(na.approx(x)); x <- zoo(.lvb.sf.1.8); .lvb.sf.1.8 <- as.data.frame(na.approx(x))
    x <- zoo(.lvb.sf.1.9); .lvb.sf.1.9 <- as.data.frame(na.approx(x)); x <- zoo(.lvb.sf.2.0); .lvb.sf.2.0 <- as.data.frame(na.approx(x))

    # create rowMeans columns for the above
    .lvb.sf.1.1$mean <- rowMeans(.lvb.sf.1.1[,c(2:4)]); .lvb.sf.1.6$mean <- rowMeans(.lvb.sf.1.6[,c(2:4)])
    .lvb.sf.1.2$mean <- rowMeans(.lvb.sf.1.2[,c(2:4)]); .lvb.sf.1.7$mean <- rowMeans(.lvb.sf.1.7[,c(2:4)])
    .lvb.sf.1.3$mean <- rowMeans(.lvb.sf.1.3[,c(2:4)]); .lvb.sf.1.8$mean <- rowMeans(.lvb.sf.1.8[,c(2:4)])
    .lvb.sf.1.4$mean <- rowMeans(.lvb.sf.1.4[,c(2:4)]); .lvb.sf.1.9$mean <- rowMeans(.lvb.sf.1.9[,c(2:4)])
    .lvb.sf.1.5$mean <- rowMeans(.lvb.sf.1.5[,c(2:4)]); .lvb.sf.2.0$mean <- rowMeans(.lvb.sf.2.0[,c(2:4)])

    # rmse calculation
    lvb.rmse.tws.1.1 <- rmse(lvb.obs.tws.lag_only[,1], .lvb.sf.1.1[,5]); lvb.rmse.tws.1.6 <- rmse(lvb.obs.tws.lag_only[,1], .lvb.sf.1.6[,5])
    lvb.rmse.tws.1.2 <- rmse(lvb.obs.tws.lag_only[,1], .lvb.sf.1.2[,5]); lvb.rmse.tws.1.7 <- rmse(lvb.obs.tws.lag_only[,1], .lvb.sf.1.7[,5])
    lvb.rmse.tws.1.3 <- rmse(lvb.obs.tws.lag_only[,1], .lvb.sf.1.3[,5]); lvb.rmse.tws.1.8 <- rmse(lvb.obs.tws.lag_only[,1], .lvb.sf.1.8[,5])
    lvb.rmse.tws.1.4 <- rmse(lvb.obs.tws.lag_only[,1], .lvb.sf.1.4[,5]); lvb.rmse.tws.1.9 <- rmse(lvb.obs.tws.lag_only[,1], .lvb.sf.1.9[,5])
    lvb.rmse.tws.1.5 <- rmse(lvb.obs.tws.lag_only[,1], .lvb.sf.1.5[,5]); lvb.rmse.tws.2.0 <- rmse(lvb.obs.tws.lag_only[,1], .lvb.sf.2.0[,5])
Thanks!
When performing the same sequence of actions multiple times, function composition should help a lot. For example:
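    # linear interpolation of missing values (zoo() and na.approx() come from the zoo package)
    interpolate <- function(x) as.data.frame(na.approx(zoo(x)))

    # take a data.frame and add a 'mean' column containing the mean of columns 2:4
    addrowmeans <- function(x) {
      x$mean <- rowMeans(x[ , 2:4])
      x
    }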
Using these makes the code less bulky, as shown at the end.
As for iterating through the data sets to perform the above actions, use a list of data.frames as the structure and go through it using a for loop. This reduces code copying and pasting and makes the script more flexible, since changing the number of files won't require manual work.
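A minimal sketch of the loop approach, using two small made-up data frames in place of the CSV files so that it runs on its own. The list name data.list and the toy values are assumptions for illustration; addrowmeans is repeated from above only to keep the snippet self-contained.

```r
# Toy stand-ins for the data frames read from the CSV files (made-up values).
data.list <- list(
  a = data.frame(month = 1:3, s1 = c(1, 2, 3), s2 = c(2, 3, 4), s3 = c(3, 4, 5)),
  b = data.frame(month = 1:3, s1 = c(10, 20, 30), s2 = c(10, 20, 30), s3 = c(10, 20, 30))
)

# Same helper as above: append the row-wise mean of columns 2:4.
addrowmeans <- function(x) {
  x$mean <- rowMeans(x[, 2:4])
  x
}

# Apply the same step to every element of the list.
# seq_along() adapts to the length of the list, so adding files needs no code change.
for (i in seq_along(data.list)) {
  data.list[[i]] <- addrowmeans(data.list[[i]])
}

data.list$a$mean
```

With real data, the list would be filled from the files instead of being typed in, and the interpolation step would go inside the same loop.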
A better idea than loops is to use the apply family of functions in R, which is often faster and has a more comprehensible syntax.
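As a rough illustration of the apply family, the sketch below uses lapply to transform every data frame in a list and vapply to collect one number per data frame. Everything in it is made up for the example: the toy values, the labels "1.1" and "1.2", and rmse_base, a plain base-R stand-in defined here because the question does not say which package its rmse comes from.

```r
# Toy list of data frames (made-up values); the names are hypothetical labels.
data.list <- list(
  "1.1" = data.frame(month = 1:3, s1 = c(0, 1, 2), s2 = c(1, 2, 3), s3 = c(2, 3, 4)),
  "1.2" = data.frame(month = 1:3, s1 = c(1, 2, 3), s2 = c(2, 3, 4), s3 = c(3, 4, 5))
)

# lapply returns a list of the same length and with the same names as its input.
with.means <- lapply(data.list, function(x) {
  x$mean <- rowMeans(x[, 2:4])
  x
})

# Stand-in RMSE, defined only so the example runs without extra packages.
rmse_base <- function(obs, sim) sqrt(mean((obs - sim)^2))

obs <- c(1, 2, 3)  # made-up observations

# vapply returns a named numeric vector: one RMSE per data frame.
rmse.all <- vapply(with.means, function(x) rmse_base(obs, x[, 5]), numeric(1))
rmse.all
```

Because the result is a named vector rather than ten separate variables, something like names(which.min(rmse.all)) picks out the best-fitting label directly.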
With the functions defined above and lapply from base R, the OP's algorithm reduces to:
    # read the files and store them as a list of data.frames
    data.list <- lapply(files, read.csv, header = TRUE)

    # interpolate missing monthly values - linear interpolation of the above
    data.interpolated <- lapply(data.list, interpolate)

    # create rowMeans columns for the above
    data.interpolated <- lapply(data.interpolated, addrowmeans)

    # rmse calculation (assuming rmse takes two arguments, x and y)
    # column 5 of each data.frame is the 'mean' column added in the previous step
    lapply(data.interpolated, function(x) rmse(lvb.obs.tws.lag_only[, 1], x[, 5]))
where files is created as follows:
    nums <- sprintf('%1.1f', seq(from = 1.9, to = 2.1, by = .1))
    files <- paste('prefix_', nums, '.csv', sep = '')
    files
    # [1] "prefix_1.9.csv" "prefix_2.0.csv" "prefix_2.1.csv"
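The same idea can be adapted to the file names in the question, which look like lvb_sf_1-1.csv through lvb_sf_1-9.csv plus lvb_sf_2.csv. The sketch below is one possible way to build them; the labels vector and the decision to name the entries "1.1" to "2.0" are assumptions, chosen to mirror the variable suffixes in the original code.

```r
# File names as they appear in the question: lvb_sf_1-1.csv ... lvb_sf_1-9.csv,
# plus lvb_sf_2.csv, which breaks the pattern and is appended by hand.
files <- c(sprintf("lvb_sf_1-%d.csv", 1:9), "lvb_sf_2.csv")

# Hypothetical labels "1.1", ..., "1.9", "2.0", one per file.
labels <- sprintf("%.1f", (11:20) / 10)

# Naming the vector means a later lapply(files, ...) returns a list with these names.
names(files) <- labels

files[c("1.1", "2.0")]
```

Passing this named vector to lapply(files, read.csv, header = TRUE) should give a list whose elements can be looked up by label, for example data.list[["1.6"]].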