How to save memory while using multiprocessing in Python?
I've got a function that takes a node ID of a graph as input, does some calculation in the graph (without altering the graph object), and saves the results to the filesystem. The code looks like this:
...
# the graph file is loaded
g = loadgraph(gfile='data/graph.txt')
# the list of node ids is loaded
nodeids = loadseeds(sfile='data/seeds.txt')

import multiprocessing as mp

# parallel part of the code
print("entering parallel part ..")
num_workers = mp.cpu_count()  # 4 on my machine
p = mp.Pool(num_workers)
# _myparallelfunction(nodeid) {calculates for nodeid in g and saves to a file}
p.map(_myparallelfunction, nodeids)
p.close()
...
The problem is that when the graph is loaded, Python takes a lot of memory (about 2 GB; it's a big graph with thousands of nodes), and when the parallel part of the code starts (the parallel map execution), it seems every process is given its own separate copy of g, and I run out of memory on my machine (it has 6 GB of RAM and 3 GB of swap). I wanted to see whether there is a way to give each process access to the same copy of g, so that memory holds only the one copy required. Any suggestions are appreciated, thanks in advance.
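One common way to keep a single in-memory copy (not mentioned in the answer below, and assuming a POSIX system where multiprocessing uses the fork start method) is to make the graph a module-level global that workers only read; forked children then share the parent's memory pages copy-on-write, although CPython's reference counting still dirties some of them over time. The sketch below uses a toy adjacency dict and a trivial worker function purely for illustration; on Windows or macOS with the spawn start method, each worker would rebuild its own copy instead.

import multiprocessing as mp

# Toy stand-in for the real graph: an adjacency dict created at module
# level before the Pool exists, so forked workers inherit it via
# copy-on-write rather than receiving a pickled copy per task.
g = {0: [1, 2], 1: [2], 2: [0]}

def _myparallelfunction(nodeid):
    # Read-only access to the shared global g; in the real code the
    # result would be written to the filesystem instead of returned.
    return nodeid, len(g[nodeid])

if __name__ == "__main__":
    nodeids = list(g)
    with mp.Pool(mp.cpu_count()) as p:
        print(p.map(_myparallelfunction, nodeids))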
If dividing the graph into smaller parts does not work, you may be able to find a solution using this or multiprocessing.sharedctypes, depending on what kind of object the graph is.
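As a rough illustration of the multiprocessing.sharedctypes suggestion: the sketch below flattens a toy adjacency list into shared ctypes arrays using a CSR-style offsets/targets layout, which is an assumed representation, not one from the post. With the fork start method the workers read the single shared buffer directly; with spawn you would pass the arrays to workers through a Pool initializer instead of relying on module-level globals.

import multiprocessing as mp
from multiprocessing import sharedctypes

# Toy adjacency list standing in for the real graph.
adj = {0: [1, 2], 1: [2], 2: [0, 1]}

# Flatten into a CSR-style layout: the neighbours of node i are
# targets[offsets[i]:offsets[i + 1]].
offsets, targets = [0], []
for node in range(len(adj)):
    targets.extend(adj[node])
    offsets.append(len(targets))

# RawArray allocates the data in shared memory (no per-access locking),
# so forked workers see one shared buffer instead of a pickled copy.
shared_offsets = sharedctypes.RawArray('i', offsets)
shared_targets = sharedctypes.RawArray('i', targets)

def degree(nodeid):
    # Read-only lookup in the shared arrays; the real code would do the
    # per-node computation here and save the result to disk.
    start, end = shared_offsets[nodeid], shared_offsets[nodeid + 1]
    return nodeid, end - start

if __name__ == "__main__":
    with mp.Pool(mp.cpu_count()) as p:
        print(p.map(degree, range(len(adj))))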