Writing lines to a CSV file in a for loop in Python
I have a large CSV file with 5000 rows in it. The first column contains an identifying name for each row, e.g. lhgzz01. The first 9 rows have the name lhgzz01, the next 10 have something else, and so on. There is no pattern as such, so I have used np.unique to find the index at which the name changes.
I want to write a loop that writes each row of the source CSV to new CSV files, with each new file containing the rows that share the same name.
    import csv
    import numpy as np

    datafile = open('source.csv','rb')
    reader = csv.reader(datafile)
    data = []
    idx = []
    dataidx = []
    next(reader, None)#skip the headers
    for row in reader:
        d = row[0]
        idx.append(d)
        data.append(row)
        dataidx.append(row[0])
    index = np.sort(np.unique(idx,return_index=True)[1])
    nme = []#list of unique names
    for row in index:
        nm = data[row][0]
        nme.append(nm)
    for i in np.arange(0,9):
        with open(str(out_dir)+str(nme[0])+'.csv','w') as f1:
            row = data[i]
            writer=csv.writer(f1, delimiter=',')#lineterminator='\n',
            writer.writerow(row)
The code above writes the first row of the new CSV and then stops.
My question is: how do I loop through the source.csv file, splitting the data after every name change, and write the rows with the same name to their own unique CSV?
Apologies for the long-winded question, but this problem is beyond my Python skills, unfortunately, and it is driving me nuts.
Any help or suggestions are appreciated.
Sample CSV:
id       north_dms  east_dms  dist
lhgzz01  403921     374459    12500m
lhgzz01  403610     353000    12500m
lhgzz01  404640     360400    12500m
lhgzz01  404515     361900    12500m
lhgzz01  411240     381900    12500m
lhgzz01  415629     400600    12500m
lhgzz01  401503     384400    12500m
lhgzz01  400319     382200    12500m
lhgzz01  403921     372800    12500m
lhgzz02  412000     353200    12500m
lhgzz02  412749     343200    12500m
lhgzz02  403111     353000    12500m
lhgzz02  400600     374459    12500m
lhgzz02  401818     400600    12500m
lhgzz02  401525     393100    12500m
lhgzz02  401605     392400    12500m
lhgzz02  412000     384400    12500m
lhgzz02  372912     382157    8400m
gppha01  381500     382200    8400m
gppha01  393000     375252    8400m
gppha01  395400     370602    8400m
gppha01  401503     372912    8400m
gppha01  400831     382157    8400m
gppha01  390651     365700    8400m
gppha01  372912     382954    8400m
gppha02  392130     370602    12500m
gppha02  400319     364000    12500m
gppha02  400831     361900    12500m
gppha02  390651     365700    12500m
gppha02  382157     400600    12500m
gppha02  382200     401818    12500m
gppha02  375252     401525    12500m
gppha02  385112     401605    12500m
gppha02  392020     400319    12500m
gppha02  392130     392130    12500m
gppha03  392020     392020    9800m
gppha03  385112     383000    9800m
gppha03  382954     400600    9800m
gppha03  365700     364000    9800m
gppha03  381900     372912    9800m
gppha03  383000     380700    9800m
gppha03  392020     373724    9800m
gppha03  385112     363842    7500m
vvdfb01  374459     361210    12500m
vvdfb01  353000     360002    12500m
vvdfb01  360400     360002    12500m
vvdfb01  361900     364000    12500m
vvdfb01  381900     360002    12500m
vvdfb01  400600     360002    12500m
vvdfb01  384400     361210    12500m
vvdfb01  382200     350530    12500m
vvdfb02  372800     344400    12500m
vvdfb02  353200     343100    12500m
vvdfb02  343200     351448    12500m
vvdfb02  353000     360002    12500m
vvdfb02  374459     364000    12500m
vvdfb02  400600     351448    12500m
vvdfb02  393100     345353    12500m
vvdfb02  392400     341731    12500m
Every time you open the file in w mode, you overwrite whatever was there. You should open the file one time, then loop over the calls to writerow, like:
    with open(str(out_dir)+str(nme[0])+'.csv','w') as f1:
        writer=csv.writer(f1, delimiter=',')#lineterminator='\n',
        for i in np.arange(0,9):
            row = data[i]
            writer.writerow(row)
instead of reopening the file on each iteration of the for loop.
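The same idea can be extended to the full question of splitting on every name change. What follows is only a rough sketch, not the original poster's method: it assumes Python 3, a comma-delimited source file, and that rows with the same name are contiguous, as the question describes. The function name split_csv_by_first_column and its parameters are made up for illustration. It uses itertools.groupby to collect each run of rows with the same first column and opens each output file exactly once.

```python
import csv
import os
from itertools import groupby
from operator import itemgetter


def split_csv_by_first_column(source_path, out_dir):
    """Write each run of consecutive rows sharing row[0] to <out_dir>/<name>.csv.

    Hypothetical helper: assumes same-name rows are contiguous in the source.
    Returns the list of output paths in the order they were written.
    """
    written = []
    with open(source_path, newline='') as src:
        reader = csv.reader(src)
        header = next(reader, None)  # read the header once so every output can repeat it
        non_empty = (row for row in reader if row)  # skip blank lines
        # groupby yields one (name, rows) pair per run of consecutive equal keys
        for name, rows in groupby(non_empty, key=itemgetter(0)):
            out_path = os.path.join(out_dir, name + '.csv')
            # opened once per name, so 'w' mode cannot clobber rows already written
            with open(out_path, 'w', newline='') as out:
                writer = csv.writer(out)
                if header is not None:
                    writer.writerow(header)
                writer.writerows(rows)
            written.append(out_path)
    return written
```

If a name could reappear later in the file, this sketch would overwrite the earlier file for that name; collecting rows in a dict keyed by name before writing would avoid that. If the file is really whitespace-separated as in the sample, the reader would need a matching delimiter.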