Notes from the Tail

The Last Percentile of Large-Scale Systems

I'm Ashraf. This is a collection of my notes and views on tail latency, memory systems, GPU performance, and the engineering of large-scale AI/ML training and inference infrastructure.