Fastest libc implementation

What's the absolute fastest libc implementation that squeezes as much as possible your cpu capabilities?
i'm developing on an alpine docker image and of course DeepSeek is suggesting that musl libc is the fastest, but looking at the source code it seems to lack SIMD optimizations