Javascript (node.js) capped number of child processes - Stack Overflow

hopefully I can describe what I'm looking for clearly enough. Working with Node and Python.

I'm trying to run a number of child processes (.py scripts, using child_process.exec()) in parallel, but no more than a specified number at a time (say, 2). I receive an unknown number of requests in batches (say this batch has 3 requests). I'd like to stop spawning processes until one of the current ones finishes.

for (var i = 0; i < requests.length; i++) {

    //code that would ideally block execution for a moment
    while (active_pids.length == max_threads){
        console.log("Waiting for more threads...");
        sleep(100)
        continue
    };

    //code that needs to run if threads are available
    active_pids.push(i);

    cp.exec('python python-test.py '+ requests[i],function(err, stdout){
        console.log("Data processed for: " + stdout);

        active_pids.shift();

          if (err != null){
              console.log(err);
          }
    });
}

I know that while loop doesn't work, it was the first attempt.

I'm guessing there's a way to do this with

setTimeout(someSpawningFunction(){

    if (active_pids.length == max_threads){
        return
    } else {
        //spawn process?
    }

},100)

But I can't quite wrap my head around it.

Or maybe

waitpid(-1)

Inserted in the for loop above in an if statement in place of the while loop? However I can't get the waitpid() module to install at the moment.

And yes, I understand that blocking execution is considered very bad in JS, but in my case, I need it to happen. I'd rather avoid external cluster manager-type libraries if possible.

Thanks for any help.

EDIT/Partial Solution

An ugly hack would be to use the answer from: this SO question (execSync()). But that would block the loop until the LAST child finished. That's my plan so far, but not ideal.


asked Jun 2, 2015 by Antoine Zambelli; edited May 23, 2017
  • Why do you say you need blocking execution to happen? Your setTimeout solution-sketch is not synchronous. – apsillers Commented Jun 2, 2015 at 13:17
  • @apsillers: yeah, setTimeout would not be ideal. I was thinking there might be a messy way to do it though. Like I said, I'm a little confused by it. I do need to limit the number of processes though. Implementing your answer right now and seeing if I can get it to work (it should do what I want, indeed). – Antoine Zambelli Commented Jun 2, 2015 at 13:20

2 Answers


async.timesLimit from the async library is the perfect tool to use here. It allows you to asynchronously run a function n times, but run a maximum of k of those function calls in parallel at any given time.

var cp = require('child_process');
var async = require('async');   // npm install async

async.timesLimit(requests.length, max_threads, function(i, next){
    cp.exec('python python-test.py '+ requests[i], function(err, stdout){
        console.log("Data processed for: " + stdout);

        if (err != null){
            console.log(err);
        }

        // this task is resolved
        next(null, stdout);
    });
}, function(err, stdoutArray) {
  // this runs after all processes have run; what's next?
});

Or, if you want errors to be fatal and stop the loop, call next(err, stdout).

You can just maintain a queue of external processes waiting to run and a counter for how many are currently running. The queue would simply contain an object for each process that had properties containing the data you need to know what process to run. You can just use an array of those objects for the queue.

Whenever you get a new request to run an external process, you add it to the queue, then start external processes, incrementing your counter each time you start one, until the counter hits your maximum.

Then, while monitoring those external processes, whenever one finishes, you decrement the counter and if your queue of tasks waiting to run is not empty, you start up another one and increase the counter again.

The async library has this type of functionality built in (running a specific number of operations at a time), though it isn't very difficult to implement yourself with a queue and a counter. The key is that you just have to hook into the completion event for your external process so you can maintain the counter and start up any new tasks that are waiting.

There is no reason to need to use synchronous or serial execution or to block in order to achieve your goal here.
