dataframe - R table manipulation -


i have data.frame below

product=c(rep("a",4),rep("b",2)) ww1=c(201438,201440,201444,201446,201411,201412) ww2=ww1-6 diff=rep(6,6) demand=rep(100,6)  df=data.frame(product,ww1,ww2,diff,demand) df<- df[with(df,order(product, ww1)),]  df    product    ww1    ww2 diff demand 1       201438 201432    6    100 2       201440 201434    6    100 3       201444 201438    6    100 4       201446 201440    6    100 5       b 201411 201405    6    100 6       b 201412 201406    6    100 

i want add rows based upon conditions below.

for row in data, if product on earlier row same product on current row, ww1 on earlier row not same ww1-1 on current row (basically ww1 difference 1), add new row.

for newly added row:

product same product on earlier row. ww1 ww1 on earlier row + 1 ww2 ww2 on earlier row + 1 ww_diff 6 demand 0 

the final output need below:

product ww1 ww2 ww_diff demand   201438  201432  6   100   201439  201433  6   0   201440  201434  6   100   201441  201435  6   0   201442  201436  6   100   201443  201437  6   0   201444  201438  6   100   201445  201439  6   0   201446  201440  6   100 b   201411  201405  6   100 b   201412  201406  6   100 

as of thinking of writing macro in excel, slow , therefore prefer r solution

update1===============================

how add column seq? column 1 earliest entry of ww1 of every product , increments 1.

product ww1 ww2 ww_diff demand  seq   201438  201432  6   100 1   201439  201433  6   0   2   201440  201434  6   100 3   201441  201435  6   0   4   201442  201436  6   100 5   201443  201437  6   0   6   201444  201438  6   100 7   201445  201439  6   0   8   201446  201440  6   100 9 b   201411  201405  6   100 1 b   201412  201406  6   100 2 

update2=======================================================

i posting questions again (i unchecked accepted answer of alistaire answer not working on original data, works on small sample of data :(

in below solution user alistaire, df3 <- right_join(df, data.frame(ww1=ww1big)) seem causing issue.

in final solution, prefer if columns specified names. way won't forced arrange columns in particular order.

here's similar data.table solution assume should more efficient i'm minimizing calculations per group , using binary join instead.

library(data.table) setkey(setdt(df), product, ww1) ## sorting `product` , `ww1` indx <- setkey(df[, list(ww1 = seq.int(ww1[1l], ww1[.n], = 1l)), = product]) ## running `seq.int` on `ww1` per group res <- df[indx][is.na(ww2), `:=`(ww2 = ww1 - 6l, diff = 6l, demand = 0l)] ## filling missing values res[, seq := seq_len(.n), = product] # adding index res #     product    ww1    ww2 diff demand seq #  1:       201438 201432    6    100   1 #  2:       201439 201433    6      0   2 #  3:       201440 201434    6    100   3 #  4:       201441 201435    6      0   4 #  5:       201442 201436    6      0   5 #  6:       201443 201437    6      0   6 #  7:       201444 201438    6    100   7 #  8:       201445 201439    6      0   8 #  9:       201446 201440    6    100   9 # 10:       b 201411 201405    6    100   1 # 11:       b 201412 201406    6    100   2 

Comments

Popular posts from this blog

Java 3D LWJGL collision -

spring - SubProtocolWebSocketHandler - No handlers -

methods - python can't use function in submodule -