@sergaderg Oh yeah, that completely slipped my mind. And yet, it doesn't seem like it helps a lot considering the massive hardware requirements.
Edit: I looked into the performance characteristics, and it seems performance stops improving past a batch size of 64. At a scale of millions of requests, that's pretty much negligible.