Inline assembly in C#
I was trying to figure out how to squeeze some more speed out of a calculation that involved the following bit of math. $$SumOfSquares = \sum_{i=0}^{N} data_i^2$$ The language of choice was C#, built with Visual Studio 2013 as a 64-bit process. One idea was to use SSE to improve the throughput of the computation. Unfortunately, the .NET Framework currently (as of .NET 4.2) does not generate SSE instructions as part of its JIT compiler. A workaround would be to build a DLL that exports the math function and call that exported function from C# using P/Invoke. The Visual Studio C/C++ compiler (2012 and above) is an excellent optimizing compiler with the ability to auto-vectorize loops. But then I would have to add a C DLL as a dependency of my C# application: one more file to keep track of, just for a single calculation. ...
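To make the DLL workaround concrete, here is a minimal sketch of what the exported C function might look like. The name `sum_of_squares`, the `EXPORT` macro, and the choice of `double` as the element type are assumptions for illustration; the post does not say what type `data` holds. The loop uses SSE2 intrinsics by hand and processes two doubles per iteration, though a plain scalar loop could instead be left to the compiler's auto-vectorizer.

```c
#include <stddef.h>
#include <emmintrin.h>  /* SSE2 intrinsics: __m128d, _mm_mul_pd, ... */

/* Hypothetical export macro: dllexport on Windows, nothing elsewhere. */
#if defined(_WIN32)
#define EXPORT __declspec(dllexport)
#else
#define EXPORT
#endif

/* Returns data[0]^2 + data[1]^2 + ... + data[n-1]^2.
   Assumes the elements are doubles (not stated in the post). */
EXPORT double sum_of_squares(const double *data, size_t n)
{
    __m128d acc = _mm_setzero_pd();  /* two running partial sums */
    size_t i = 0;

    /* Main loop: square and accumulate two elements at a time. */
    for (; i + 2 <= n; i += 2) {
        __m128d v = _mm_loadu_pd(data + i);       /* unaligned load of 2 doubles */
        acc = _mm_add_pd(acc, _mm_mul_pd(v, v));  /* acc += v * v, per lane */
    }

    /* Combine the two lanes into one scalar. */
    double lanes[2];
    _mm_storeu_pd(lanes, acc);
    double total = lanes[0] + lanes[1];

    /* Tail: handle the last element when n is odd. */
    for (; i < n; ++i) {
        total += data[i] * data[i];
    }
    return total;
}
```

On the C# side, a function like this would be bound with a `[DllImport]` declaration and called through P/Invoke, which is exactly the extra native dependency described above.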