-
Notifications
You must be signed in to change notification settings - Fork 109
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Eliminate the use of Linq #109
Comments
Also see #95 |
Pop quiz: Allocations:
Anything I'm missing? I watched the relevant section of the video, but that LINQ call is a much larger beast than ours. There's no captured variable in the lambda, so no helper class must be created, System.Array is not a value type (so no allocation from boxing), and no I will watch the rest of the video, as I love watching these kinds of videos, but I'm also aware of performance issues in general. The code I wrote for this project is largely ~5 years old. I would definitely consider myself a better programmer now than I was then. Of course there will be some bad code in there that needs fixing. I am generally against LINQ because (in this case, AFAIK) the performance hit of using LINQ is really not that bad and it's not used anywhere in common "hot paths" for font rendering, just auxiliary things that are most likely going to be called once or twice during initalization. The main reason that I'd prefer to keep LINQ in is that LINQ is available in 3.5, and 3.5 is the earliest version of .NET that isn't completely end-of-life (no security updates, even if a massive flaw is found). Supporting a 15 year old unsupported runtime with minimal usage nowadays seems to me like it should be an edge case rather than a promise. Edit 1: Yeah, that If we are going to drop LINQ support, I would prefer to not use tuples either and stick to only .NET 2.0 language features with a .NET 2.0 build and a PCL build. The actual logical return type for Edit 2: All of the TrueType table structs are good candidates for conversion to structs similar to |
I built the API largely looking at FreeType docs, which gets a bit sparse when you get to things like TrueType and PostScript headers, so I pretty much stuck with the native types.
|
Microsoft discontinued support for all frameworks besides .NET 3.5 and .NET 4.5.2. |
Of the versions of .NET that are currently supported, LINQ is available in all of them. I guess it's less about LINQ itself and more that I'd prefer the minimum targeted version to be 3.5 instead of 2.0. LINQ just happened to be the only we're using that won't work on 2.0. Looking at it now, there's very little to gain from that bump up, and we're not gaining anything from LINQ, so I changed my mind. I guess an analogy might help - if you were doing a major overhaul on a piece of software today that runs on Windows 98, would you keep supporting 98 or move the minimum requirements up to at least XP? And yeah, no worries - I have generally been neglecting my open source work for the past year or so and I'm aware of that. I have a full time job lined up to start eventually (hoping within the next month? HR is painfully slow) and my freelance work will be scaled back a lot, the work I end up doing won't be super-top-priority like it is now |
Cheers. Would you mind merging the repos at the same time as your current changes? That will create a permanent landing place for Harfbuzz pull requests going forward. |
yeah, I'll commit what I have so far on the .NET 2.0 build and merge in HarfBuzz. |
HarfBuzz merged in |
Super, but think you missed the nuspec files. Also I think the current convention is to have the nuspec files live alongside the .csproj files rather than NuGet/Build - this allows properties from the .csproj to be referenced from the .nuspec. |
@corliss do you know of any issues tracking the linq performance overhead in roslyn? (Our https://github.com/neosmart/linqplus attempts to boost linq performance by specializing extension methods by data structure to provide a more efficient option than a naive |
great! |
These are all the uses of Linq I could find in SharpFont:
TrueType\Header.cs:
return rec.Created.Select(x => (int)x).ToArray();
TrueType\Header.cs:
return rec.Modified.Select(x => (int)x).ToArray();
FaceInfo.cs:
return rec.xuid.Select(x => (uint)x).ToArray();
Library.cs:
ParameterRec[] paramRecs = parameters.Select(x => x.Record).ToArray();
Fnt\Header.cs:
return rec.reserved1.Select(x => (uint) x).ToArray();
These are trivial uses of Linq, that do the same thing: cast an array to a different type. It is easier, cleaner, and much more efficient, to do this without Linq.
Linq adds a lot of overhead. This is one of the main learnings of the C# community over the last many years, and even Microsoft ran into this when creating Roslyn - see https://channel9.msdn.com/Events/TechEd/NorthAmerica/2013/DEV-B333. The relevant portion of the talk is around 36:30, but I recommend watching the entire talk - it can change everything about programming in .net.
In addition linq is hard to debug. Try walking the code above for each element. Compare that to debugging a simple
for
orforeach
loop.Note: I'm providing this as data and information, but don't wish to be drawn into an argument, don't have time for it. If you really believe that these lines are important enough to warrant a separate project with LinqBridge, as mentioned in #104, that's your choice. But a decision like this is about the level of programming competence.
tl;dr
Replacing this Linq code with plain C# makes it simpler, faster, with less dependencies.
Pop quiz
return rec.Select(x => (int)x).ToArray();
The plain C# implementation causes just one allocation - the output array.
How many allocations occur with the Linq implementation, on an input of, e.g., 7 items? (Hint - the CLR and libraries are open source.) It is really important to know this, to appreciate why Linq is such an issue.
Edit
Turns out to be even worse than I thought. A call to
SharpFont.TrueType.Header.Created
results in:which calls
SharpFont.TrueType.Internal.HeaderRec.Created
:which creates a 2-element
FT_Long
array on each invocation, and then the calling property uses Linq to convert that into anint
array!!! This happens on every invocation.These arrays are not even needed. Just return a struct with the two values from the inner property and you have zero allocations and zero method calls. Here's the same thing rewritten using tuples (you can use any struct if you don't want to use tuples.)
Contrast this to the present situation where you have a bewildering number of allocations and method calls just to return
Created
.I don't mean to sound harsh - this code was probably put together in haste without a good review. But it's worth understanding just how easy it is to do the wrong thing in C# and to compound code smells with band-aids that only seem attractive, but just mask the real problem.
Edit 2
Looks like there isn't any need to wrap the struct
SharpFont.TrueType.Internal.HeaderRec
with a classSharpFont.TrueType.Header
. Just expose the struct, and save yet another allocation, and a whole load of code duplication by eliminating the class entirely. This also reduces the above to justThe text was updated successfully, but these errors were encountered: