Parse a large number of XML files in Java
I'll be getting a large number of XML files (numbering in the tens of thousands every few minutes) from MQ. The XML files aren't big. I have to extract information from them and save it to a database. I cannot use third-party libraries, unfortunately (except Apache Commons). What strategies/techniques are usually used in this scenario? Is there an XML parser in Java or Apache that can handle such situations well?
I might add that I'm using JDK 1.4.
Based on the comments and discussion around the topic, I propose a consolidated solution.
Parse the XML files using SAX - as @markspace mentioned, you should go with SAX, which is built into the JDK and performs well.
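A minimal sketch of what SAX-based extraction looks like (the element names here are made up for illustration). Unlike DOM, SAX streams the document through callbacks instead of building a tree in memory, which suits a steady flow of many small files:

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Minimal SAX sketch: stream one XML document and count its elements.
// The JAXP classes used here (SAXParserFactory etc.) are available in JDK 1.4.
public class SaxSketch {

    // Returns the number of element start tags, or -1 on a parse error.
    public static int countElements(String xml) {
        final int[] count = {0};
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
                // Called once per opening tag; a real handler would pick
                // out the fields to persist here.
                public void startElement(String uri, String local, String qName, Attributes atts) {
                    count[0]++;
                }
            });
            return count[0];
        } catch (Exception e) {
            return -1; // malformed document
        }
    }

    public static void main(String[] args) {
        System.out.println(countElements("<order><id>42</id><qty>3</qty></order>"));
    }
}
```

In a real handler you would collect the text of the elements you care about and hand the finished record off for database insertion once `endDocument` fires.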
Use bulk inserts if possible - since you plan to insert a large amount of data, consider what type of data you are reading and how you are storing it in the database. Do all the XML files share the same schema (which means they correspond to a single table in the database), or do they represent different objects (which means you end up inserting data into multiple tables)?
In case the data from all the XML files needs to be inserted into the same table in the database, consider batching these data objects and bulk-inserting them into the database. This is more performant in terms of both time and resources (you open a single connection to persist a whole batch, as opposed to one connection per object). Of course, you will need to spend time tuning the batch size and deciding on an error handling strategy for batch inserts (discard all vs. discard only the erroneous records).
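A sketch of batched inserts over a single connection, using the standard JDBC `addBatch`/`executeBatch` API (available since JDBC 2.0, so usable on JDK 1.4). The table and column names are invented for illustration:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

// Sketch of batched inserts: one connection, one PreparedStatement,
// one database round-trip per batch instead of per row.
public class BatchInsertSketch {

    // rows is a List of Object[]{id, payload}; flushed in chunks of batchSize.
    public static void insertAll(Connection con, List rows, int batchSize) throws SQLException {
        PreparedStatement ps = con.prepareStatement(
                "INSERT INTO orders (id, payload) VALUES (?, ?)");
        try {
            int pending = 0;
            for (int i = 0; i < rows.size(); i++) {
                Object[] row = (Object[]) rows.get(i);
                ps.setObject(1, row[0]);
                ps.setObject(2, row[1]);
                ps.addBatch();
                if (++pending == batchSize) { // flush a full batch
                    ps.executeBatch();
                    pending = 0;
                }
            }
            if (pending > 0) ps.executeBatch(); // flush the remainder
        } finally {
            ps.close();
        }
    }

    // How many executeBatch round-trips the loop above performs -
    // handy when reasoning about batch size.
    public static int batchCount(int totalRows, int batchSize) {
        return (totalRows + batchSize - 1) / batchSize;
    }

    public static void main(String[] args) {
        System.out.println(batchCount(25000, 1000)); // 25 round-trips instead of 25000
    }
}
```

The raw `List` (no generics) matches what JDK 1.4 supports. For the error-handling question, `executeBatch` throws `BatchUpdateException`, from which you can decide whether to discard the whole batch or retry only the failed rows.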
If the schemas of the XML files differ, consider clubbing similar XMLs into groups so that you can bulk insert each group later.
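One simple way to club documents is by their root element, on the assumption that documents with the same root map to the same table (raw collections again, as JDK 1.4 has no generics):

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Sketch: group XML documents by their root element so each group can
// be bulk-inserted into its own table later.
public class XmlGrouper {

    // Records the first (root) element name seen by SAX.
    static class RootHandler extends DefaultHandler {
        String root;
        public void startElement(String uri, String local, String qName, Attributes atts) {
            if (root == null) root = qName;
        }
    }

    // Returns the root element name, or null for an unparseable document.
    public static String rootElement(String xml) {
        try {
            RootHandler h = new RootHandler();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), h);
            return h.root;
        } catch (Exception e) {
            return null;
        }
    }

    // Maps root element name -> List of documents sharing that root.
    public static Map groupByRoot(List xmls) {
        Map groups = new HashMap();
        for (int i = 0; i < xmls.size(); i++) {
            String xml = (String) xmls.get(i);
            String key = rootElement(xml);
            List bucket = (List) groups.get(key);
            if (bucket == null) {
                bucket = new ArrayList();
                groups.put(key, bucket);
            }
            bucket.add(xml);
        }
        return groups;
    }
}
```

Since the files are small, fully parsing each one just to read its root is cheap; each bucket can then be fed to the batch-insert routine for its table.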
Finally - and most important: ensure you release resources such as file handles, database connections etc. once you are done processing, or in case you encounter errors. In simple words, use
try-catch-finally
at the correct places.
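On JDK 1.4 there is no try-with-resources, so the `finally` block is the only reliable place to release resources; it runs whether processing succeeded or threw. A small sketch with a reader (the same shape applies to closing a JDBC `Connection` or `PreparedStatement`):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;

// JDK 1.4-style resource handling: close in finally so the stream is
// released on both the success and the error path.
public class ResourceSketch {

    // Reads the first line of a stream, always closing it afterwards.
    public static String firstLine(Reader in) throws IOException {
        BufferedReader reader = new BufferedReader(in);
        try {
            return reader.readLine();
        } finally {
            reader.close(); // runs on success and on exception alike
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(firstLine(new java.io.StringReader("<order/>\nrest")));
    }
}
```

With tens of thousands of files every few minutes, a single leaked file handle or connection per file will exhaust OS limits quickly, which is why this point matters as much as raw parsing speed.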
While by no means complete, I hope this answer provides a set of critical checkpoints you need to consider while writing scalable, performant code.