Leveraging Intra-Function Parallelism in Serverless Machine Learning
Running stateful machine learning algorithms with serverless architectures inherently induces overheads, as serverless functions are not directly network-addressable, hence one must rely on a remote storage service for storing the shared state. To hide the access latency to the remote storage, one can employ intra-function parallelism to take advantage of the multicore computing resources of the serverless functions. In this work, we port to serverless two stateful machine learning algorithms, k-means clustering and logistic regression, and then adopt intra-function parallelism to parallelize the execution of the serverless functions. Several experiments have demonstrated that intra-function parallelism delivers performance improvements in serverless machine learning. Improved performances of up to 68% have been achieved when running k-means on serverless functions that employ intra-function parallelism. We demonstrate with k-means and logistic regression that from a performance perspective it is preferable to execute a smaller number of multiple-vCPUs workers than a larger number of single-vCPU workers, due to decreased synchronization overheads.