Go and Rust comparison - building a task scheduler

Go and Rust comparison - building a task scheduler


Rust

Months ago I found this amazing YouTube video by Jyotinder Singh: How to build a Distributed Task Scheduler with Go, Postgres, and gRPC

It’s a project that he built a distributed task scheduler in Go using PostgreSQL and gRPC. The architecture looks very practical compare to other ordinary projects such as pseudo e-commerce site, or todo app (they are also great too). And I thought it would be great if I make my own in Rust.

So here are some things I learned though my version of distributed task scheduler.

No worker pool pattern

In Go, one way to maximize machine’s concurrent capability is to use worker pool. You can spin up a number of workers on start up and send a task to each worker through a channel.

Example Code:

func (w *Worker) Run() error {
	w.startWorkerPool(workerPoolSize)
	w.initGRPCClient()
	defer w.closeGRPCConnection()
	go w.periodicHeartBeat()
	w.startGRPCServer()
	log.Println("Serving now...")
	return w.awaitShutdown()
}

func (w *Worker) startWorkerPool(poolSize int) {
	for i := 0; i < poolSize; i++ {
        // spawn workers
		go func() {
			w.wg.Add(1)
			defer w.wg.Done()
			w.process()
		}()
	}
}

func (w *Worker) process() {
	for {
		select {
		case <-w.ctx.Done():
			return
		case t := <-w.taskQueue:
			req := &pb.UpdateTaskStatusRequest{
				TaskId:    t.GetTaskId(),
				Status:    pb.TaskStatus_STARTED,
				StartedAt: time.Now().Unix(),
			}
			w.grpcClient.UpdateTaskStatus(w.ctx, req)
            // ...
		}
	}
}

This is totally valid code in Go as channel is thread-safe and multiple goroutines can access to a single channel. It also takes advantage of memory utilization (say, gRPC client and server respectively)

The story is slightly different in async Rust world. It provides channels that only allow a single consumer. And through my research I learned these “message passing” pattern is not suitable for this async programming because a channel blocks all async tasks until it has something to get consumed.

This time I chose a simpler way to achieve it in async Rust - just spawn a new async task.

Example code in Rust:

#[tonic::async_trait]
impl WorkerService for Worker {
    async fn submit_task(
        &self,
        req: Request<TaskRequest>,
    ) -> Result<Response<TaskResponse>, Status> {
        let req = req.into_inner();
        println!("Received a task from coordinator. ID: {}", req.task_id);
        let response = req.clone().into();
        let addr = self.coordinator_addr.clone();
        tokio::spawn(handle_task(req.task_id.clone(), addr)); // pass it to another async function!
        Ok(Response::new(response))
    }
}

async fn handle_task(task_id: String, addr: String) -> Result<(), String> {
    println!("Connecting to gRPC server: {}", addr);
    let mut client = CoordinatorServiceClient::connect(addr.to_string())
        .await
        .map_err(|e| e.to_string())?;
    println!("Created a new gRPC client for coordinator");
    let req = UpdateTaskStatusRequest::start(&task_id)?;
    let _ = client
        .update_task_status(Request::new(req.clone()))
        .await
        .map_err(|e| e.to_string())?;
    // Processing task now...
    Ok(())
}

One downside here is that in Rust code, each time the worker receives a request, it initializes gRPC client. This could be a huge disadvantage for a service as scale such as one Rust might get involved.

Well, there actually is a library called async-channel. This should be the answer for my question. I will try it later on.

Thanks for reading ✌️

© 2024 Hiro