python - Matching range of timestamps in pandas
I am having trouble doing this in pandas.
df1 (my left join):
name | timestart | timeend | values | order
john | 12/24/2014 08:10:32 | 12/24/2014 08:14:21 | 2 | 1
john | 12/24/2014 08:15:03 | 12/24/2014 08:22:49 | 2 | 2
df2
name | timestart | timeend | values
john | 12/24/2014 08:12:57 | 12/24/2014 08:13:31 | 8
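The timestart in df2 is greater than the timestart in df1 (this typically happens during an interaction). To match, it also has to be less than the timestart of the next row of data for the same individual.
Here is my thought process: shift the rows to see whether the names match, then check whether the df2 timestart is greater than the initial timestart on df1 and less than the next row's timestart (for the same name).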
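import numpy as np
import pandas as pd

# sort so that each person's rows are in time order
df1.sort_values(['name', 'timestart'], ascending=[1, 1], inplace=True)
# name on the following row, to detect whether the next row is the same person
df1['name_r'] = df1['name'].shift(-1)
df1['matching row'] = np.where(df1['name_r'] == df1['name'], 1, 0)
# timestart of the next row for the same person, NaT otherwise
df1['next timestamp'] = np.where(df1['matching row'] == 1, df1['timestart'].shift(-1), np.datetime64('NaT'))
# intended check: df1 timestart < df2 timestart < next timestamp
# (only meaningful if df2 lines up row-for-row with df1, which it does not in general)
df1['test'] = np.where((df2['timestart'] > df1['timestart']) & (df2['timestart'] < df1['next timestamp']), 1, 0)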
Edit - is this possible with an asof command? The trick is that the name has to match, along with the closest timestamp to timestart on each file/dataframe.
There is an "asof join" in pandas 0.19. For example, this ignores the starting time and joins on the most recent ending time:
pd.merge_asof(df1, df2, on='timeend')
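To also require that the name matches and that the df2 timestart falls after a df1 timestart but before the next one, something like the following might work. This is only a sketch under assumptions: it rebuilds the two sample frames from the question, it puts df2 on the left of the asof join (the opposite of the call above), and the `_df1` column suffixes are made up for illustration so the df1 columns stay distinguishable. Because an asof join picks the most recent df1 row at or before each df2 timestart, the "less than the next row" condition should hold automatically once both frames are sorted.

```python
import pandas as pd

fmt = "%m/%d/%Y %H:%M:%S"

# Sample data copied from the question.
df1 = pd.DataFrame({
    "name": ["john", "john"],
    "timestart": pd.to_datetime(["12/24/2014 08:10:32", "12/24/2014 08:15:03"], format=fmt),
    "timeend": pd.to_datetime(["12/24/2014 08:14:21", "12/24/2014 08:22:49"], format=fmt),
    "values": [2, 2],
    "order": [1, 2],
})
df2 = pd.DataFrame({
    "name": ["john"],
    "timestart": pd.to_datetime(["12/24/2014 08:12:57"], format=fmt),
    "timeend": pd.to_datetime(["12/24/2014 08:13:31"], format=fmt),
    "values": [8],
})

# Rename the df1 columns (hypothetical "_df1" suffix) so both timestarts survive the join.
right = df1.rename(columns={
    "timestart": "timestart_df1",
    "timeend": "timeend_df1",
    "values": "values_df1",
}).sort_values("timestart_df1")

# merge_asof requires both frames to be sorted on their join keys.
left = df2.sort_values("timestart")

matched = pd.merge_asof(
    left,
    right,
    left_on="timestart",
    right_on="timestart_df1",
    by="name",                  # names must match exactly
    allow_exact_matches=False,  # df2 timestart strictly greater than df1 timestart
)

print(matched[["name", "timestart", "timestart_df1", "order"]])
```

With the sample data, the single df2 row should pair with the df1 row whose order is 1. To get back to a left join from the df1 side, the result could then be merged onto df1 using name and order as keys.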