Matrix multiplication – a case study of micro-optimization in C/C++ (www.reddit.com)