java - Fast way to execute multiple CREATE statements
I access a Neo4j database via Java and want to create 1.3 million nodes. Therefore I create 1.3 million "CREATE" statements. As I figured out, the query gets way too long. I can only execute ~100 CREATE statements per query, otherwise the query fails:
    Client client;
    WebResource cypher;
    String request;
    ClientResponse cypherResponse;
    String query = "";
    int nrQueries = 0;

    for (HashMap<String, String> entity : entities) {
        nrQueries++;
        query += " CREATE [...] ";

        // Send the accumulated statements once every 100 entities.
        // Note: entities left over after the last full batch of 100 are not sent by this loop.
        if (nrQueries % 100 == 0) {
            client = Client.create();
            cypher = client.resource(SERVER_ROOT_URI + "cypher");
            request = "{\"query\":\"" + query + "\"}";
            cypherResponse = cypher.accept(MediaType.APPLICATION_JSON).post(ClientResponse.class, request);
            cypherResponse.close();
            query = "";
        }
    }
Well, I want to execute 1.3 million queries, and even if I can combine 100 of them into one request, I would still have 13,000 requests, which take a long time. Is there a way to do this faster?
You have two other options that you should be considering: the import tool and the LOAD CSV option.
The right question here is "how do I put data into Neo4j fast" rather than "how do I execute a lot of CREATE statements quickly". Both of these options will be way faster than doing individual CREATE statements, so I wouldn't mess with individual CREATEs anymore.
Michael Hunger wrote a great blog post describing multiple facets of importing data into Neo4j. You should check it out if you want to understand more about why these are good options, not just that they are good options.
The LOAD CSV option is going to do what the name suggests. You'll use the Cypher query language to load data directly from files, and it goes substantially faster because you commit the records in "batches" (the documentation describes this). So you're still using transactions to get your data in, but you're doing it faster, in batches, and while being able to create complex relationships along the way.
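To make the LOAD CSV route concrete, here is a minimal sketch of how the `entities` list from the question might be written to a CSV file and paired with a batched LOAD CSV statement. The `Entity` label, the `name` and `type` columns, and the batch size of 1000 are assumptions for illustration only, since the original CREATE clause is elided. The `USING PERIODIC COMMIT` prefix is the batching syntax of the Neo4j versions current when this was written; newer releases are understood to use a different construct, so check the documentation for your version.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class LoadCsvSketch {

    // Quote a field using the common CSV convention: wrap it in double
    // quotes and double any embedded double quote.
    public static String csvField(String value) {
        return "\"" + value.replace("\"", "\"\"") + "\"";
    }

    // Render a header line plus one line per entity, in the given column order.
    public static List<String> toCsvLines(List<String> columns,
                                          List<Map<String, String>> entities) {
        List<String> lines = new ArrayList<>();
        lines.add(String.join(",", columns));
        for (Map<String, String> entity : entities) {
            List<String> fields = new ArrayList<>();
            for (String column : columns) {
                fields.add(csvField(entity.getOrDefault(column, "")));
            }
            lines.add(String.join(",", fields));
        }
        return lines;
    }

    // Build a LOAD CSV statement that commits every batchSize rows.
    // The Entity label and the name/type properties are hypothetical.
    public static String loadCsvQuery(String fileUrl, int batchSize) {
        return "USING PERIODIC COMMIT " + batchSize + "\n"
             + "LOAD CSV WITH HEADERS FROM '" + fileUrl + "' AS row\n"
             + "CREATE (:Entity {name: row.name, type: row.type})";
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the question's entities collection.
        List<Map<String, String>> entities = new ArrayList<>();
        Map<String, String> first = new LinkedHashMap<>();
        first.put("name", "Alice");
        first.put("type", "person");
        entities.add(first);

        List<String> columns = Arrays.asList("name", "type");
        Path file = Files.createTempFile("entities", ".csv");
        Files.write(file, toCsvLines(columns, entities), StandardCharsets.UTF_8);

        // Print the statement; it would be sent to the server as a single query.
        System.out.println(loadCsvQuery(file.toUri().toString(), 1000));
    }
}
```

The point of the sketch is that the 13,000 HTTP requests collapse into one file and one statement, with the server handling the batching. Depending on server configuration, the file may need to live in a directory the server is allowed to read from.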
The import tool is similar, except it's for high-performance creates of large volumes of data. The magic here (and why it's so fast) is that it skips the transaction layer. That's both a good thing and a bad thing, depending on your perspective (Michael Hunger's blog post, I believe, explains the tradeoffs).
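For the import tool, the input is again CSV, but with a typed header. The sketch below builds such a header and the command line as it looked for the `neo4j-import` tool of the Neo4j 2.2 era; later releases moved this under `neo4j-admin`, so treat the command name and flags as assumptions to verify against your version. The `entityId`, `name` and `type` columns and the file names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ImportToolSketch {

    // Build a node-file header: an ID column, plain property columns,
    // and a trailing column holding each node's label.
    public static String nodeHeader(String idColumn, List<String> propertyColumns) {
        List<String> columns = new ArrayList<>();
        columns.add(idColumn + ":ID");
        columns.addAll(propertyColumns);
        columns.add(":LABEL");
        return String.join(",", columns);
    }

    // Assemble the command line (assumed 2.2-era syntax, not verified
    // against newer releases).
    public static List<String> importCommand(String storeDir, String nodesCsv) {
        return Arrays.asList("neo4j-import", "--into", storeDir, "--nodes", nodesCsv);
    }

    public static void main(String[] args) {
        // prints: entityId:ID,name,type,:LABEL
        System.out.println(nodeHeader("entityId", Arrays.asList("name", "type")));
        // prints: neo4j-import --into graph.db --nodes entities.csv
        System.out.println(String.join(" ", importCommand("graph.db", "entities.csv")));
    }
}
```

A data row under that header would look like `e1,Alice,person,Entity`. Unlike the Cypher route, this runs as a separate command line step rather than as a query sent from your Java code.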
Without knowing your data, it's hard to make a specific recommendation, but as a generality, I'd start with LOAD CSV as the default, and move to the import tool if and only if the volume of data is really big, or your insert performance requirements are really intense. This reflects a slight bias on my part that transactions are a good thing, and that staying at the Cypher layer (rather than using a separate command line tool) is also a good thing, but YMMV.