Sunday, March 6, 2011

Fast operation on Values of Dictionary in C#

At work I ran into a significant slowdown when mapping data objects onto domain objects, because the mapping was implemented using reflection. If you've worked with reflection before, you know it's not the fastest toolset available, especially when it's used to map properties from one object onto another for every object in a dictionary holding anywhere from 10000 to 1000000 entries.

Calling it slow is an understatement: it was unbearably slow, with delays of up to 30 seconds. And that was just the mapping, let alone displaying the result in a grid and/or performing calculations on it.

I thought it could be a lot faster. I've come across code fragments that cache the PropertyInfo objects, which speeds things up considerably, but calling GetValue() and SetValue() on them is still quite slow. I've also played with HyperDescriptor before (a custom TypeDescriptor implementation that emits the IL equivalent of the gets and sets), which is a great performance boost, but I thought I could still do better somehow.

The last few days I analyzed how the IL statements work and tried a few things until I got the hang of it.

Let me first explain what needs to be done:
- Our custom data engine reads from the database and stores the objects in a Dictionary.
- According to our DSDM method, we have several modules with their own domain, GUI and adapter.
- The translation between the module objects and the data engine objects is done using an XML schema that describes the mapping between the two.

This allows my colleagues to work in their isolated module, without having to worry about other objects related to other modules. Some objects can be shared between modules (e.g. a Country object, where module A uses CountryName, CountryCode and module B uses CountryName and CountryCitizenName).

The bottleneck was, as you might expect, the mapping between module objects and data objects.

The exact mapping that had to be done was Dictionary<string, Person> <----> Dictionary<string, PersonCopy> (using the types from my speed test), which was implemented like this:


var pCopyDic = new Dictionary<string, PersonCopy>(nrOfObjects);

foreach (var p in personsDic)
{
    PersonCopy pCopy = new PersonCopy();
    foreach (var pair in propMappings)
    {
        object value = pair.Key.GetValue(p.Value, null);
        pair.Value.SetValue(pCopy, value, null);
    }
    pCopyDic.Add(p.Key, pCopy);
}


This was not very fast (the code is a sample from my speed test); compared to a direct copy it was ungodly slow. The direct copy used for comparison is:



pCopyDic = new Dictionary<string, PersonCopy>(nrOfObjects);
foreach (var p in personsDic)
{
    PersonCopy pCopy = new PersonCopy();
    pCopy.DateOfBirth = p.Value.DateOfBirth;
    pCopy.FirstName = p.Value.FirstName;
    pCopy.Function = p.Value.Function;
    pCopy.GUID = p.Value.GUID;
    pCopy.IsEmployee = p.Value.IsEmployee;
    pCopy.LastName = p.Value.LastName;
    pCopy.ParentGUID = p.Value.ParentGUID;
    pCopy.Rating = p.Value.Rating;
    pCopyDic.Add(p.Key, pCopy);
}


Running both code fragments on a dictionary with 1 million randomly generated Person objects, I got the following average times (measured with Stopwatch):

Map props with reflection, 1000000 persons: 32208ms / 57522777ticks
Direct copy 1000000 persons: 821ms / 1467400ticks
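
For anyone who wants to reproduce this kind of comparison, here is a minimal sketch of a timing harness. It is not the original test code: the trimmed-down Person and PersonCopy classes, the CopyBenchmark helper and the same-name property matching are all assumptions made for illustration (the real mapping comes from an XML schema).

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Reflection;

// Hypothetical stand-ins for the article's types, cut down to two properties.
public class Person { public string FirstName { get; set; } public int Rating { get; set; } }
public class PersonCopy { public string FirstName { get; set; } public int Rating { get; set; } }

public static class CopyBenchmark
{
    // Builds a dictionary of n persons with synthetic keys and values.
    public static Dictionary<string, Person> BuildPersons(int n)
    {
        var dic = new Dictionary<string, Person>(n);
        for (int i = 0; i < n; i++)
            dic.Add("key" + i, new Person { FirstName = "Name" + i, Rating = i % 10 });
        return dic;
    }

    // Runs one mapping strategy under a Stopwatch and prints ms and ticks.
    public static Dictionary<string, PersonCopy> Measure(
        string label,
        Dictionary<string, Person> source,
        Func<Dictionary<string, Person>, Dictionary<string, PersonCopy>> map)
    {
        Stopwatch sw = Stopwatch.StartNew();
        Dictionary<string, PersonCopy> result = map(source);
        sw.Stop();
        Console.WriteLine("{0}, {1} persons: {2}ms / {3}ticks",
            label, source.Count, sw.ElapsedMilliseconds, sw.ElapsedTicks);
        return result;
    }

    // Hand-written property-by-property copy.
    public static Dictionary<string, PersonCopy> DirectCopy(Dictionary<string, Person> source)
    {
        var result = new Dictionary<string, PersonCopy>(source.Count);
        foreach (var p in source)
            result.Add(p.Key, new PersonCopy { FirstName = p.Value.FirstName, Rating = p.Value.Rating });
        return result;
    }

    // Reflection-based copy; properties are paired by name (an assumption).
    public static Dictionary<string, PersonCopy> ReflectionCopy(Dictionary<string, Person> source)
    {
        var propMappings = new Dictionary<PropertyInfo, PropertyInfo>();
        foreach (PropertyInfo from in typeof(Person).GetProperties())
        {
            PropertyInfo to = typeof(PersonCopy).GetProperty(from.Name);
            if (to != null) propMappings.Add(from, to);
        }

        var result = new Dictionary<string, PersonCopy>(source.Count);
        foreach (var p in source)
        {
            var copy = new PersonCopy();
            foreach (var pair in propMappings)
                pair.Value.SetValue(copy, pair.Key.GetValue(p.Value, null), null);
            result.Add(p.Key, copy);
        }
        return result;
    }
}
```

Calling CopyBenchmark.Measure("Direct copy", persons, CopyBenchmark.DirectCopy) prints a line in the same shape as the results above. The numbers depend on the machine and runtime, so treat any single run as a rough indication only.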

The first thing I tried was emitting those property gets and sets in IL. It was my first time writing actual IL, so it took a lot of trial and error. I started off with code I found here and modified it to work with the backing fields rather than with the properties (iirc accessing a field directly is faster than calling a property's get and set methods, although the JIT seems to optimize this very well on my machine). That became the following code:



/// <summary>
/// Map compiler generated properties (e.g. public string LastName { get; set; }) to another object's properties.
/// </summary>
/// <typeparam name="From"></typeparam>
/// <typeparam name="To"></typeparam>
/// <param name="myObject"></param>
/// <param name="mapping"></param>
/// <param name="flags"></param>
/// <returns></returns>
public static To MapProperties<From, To>(this From myObject, Dictionary<string, string> mapping, BindingFlags flags = System.Reflection.BindingFlags.Instance | System.Reflection.BindingFlags.Public | System.Reflection.BindingFlags.NonPublic)
{
MetaData data;
if (!store.TryGetValue(typeof(From), out data))
{
data = new MetaData();
store.Add(typeof(From), data);
}

if (data.Map == null)
{
// Create ILGenerator
DynamicMethod dymMethod = new DynamicMethod("DoMap", typeof(To), new Type[] { typeof(From) }, true);
ConstructorInfo cInfo = typeof(To).GetConstructor(new Type[] { });

ILGenerator generator = dymMethod.GetILGenerator();

LocalBuilder lbf = generator.DeclareLocal(typeof(To));
//lbf.SetLocalSymInfo("_temp");

generator.Emit(OpCodes.Newobj, cInfo);
generator.Emit(OpCodes.Stloc_0);
foreach (PropertyInfo prop in typeof(From).GetProperties(flags))
{
if ((prop.GetGetMethod(true) ?? prop.GetSetMethod(true)).IsDefined(typeof(CompilerGeneratedAttribute), false))
{
FieldInfo fromField = GetFieldInfo<From>(prop.Name);

if (!mapping.ContainsKey(prop.Name))
throw new ArgumentException("The mapping dictionary does not contain a mapping for " + prop.Name);

FieldInfo toField = GetFieldInfo<To>(mapping[prop.Name]);

if (fromField.FieldType != toField.FieldType)
throw new InvalidOperationException("Mapping between '" + prop.Name + "' from type '" + typeof(From).Name + "' and '" + mapping[prop.Name] + "' from type '" + typeof(To).Name + "' is invalid. The properties have different underlying types");

// Load the new object on the eval stack... (currently 1 item on eval stack)
generator.Emit(OpCodes.Ldloc_0);
// Load initial object (parameter) (currently 2 items on eval stack)
generator.Emit(OpCodes.Ldarg_0);
// Replace value by field value (still currently 2 items on eval stack)
generator.Emit(OpCodes.Ldfld, fromField);
// Store the value of the top on the eval stack into the object underneath that value on the value stack.
// (0 items on eval stack)
generator.Emit(OpCodes.Stfld, toField);
}
}

// Load new constructed obj on eval stack -> 1 item on stack
generator.Emit(OpCodes.Ldloc_0);
// Return constructed object. --> 0 items on stack
generator.Emit(OpCodes.Ret);

data.Map = dymMethod.CreateDelegate(typeof(Func<From, To>));
}
return ((Func<From, To>)data.Map)(myObject);
}




private static FieldInfo GetFieldInfo<From>(string propName)
{
    // Auto-properties are backed by a compiler-generated field named <PropName>k__BackingField.
    string fieldName = string.Format("<{0}>k__BackingField", propName);
    Type t = typeof(From);
    FieldInfo fromField = null;

    // Private fields of base classes are not returned for a derived type, so walk up the hierarchy.
    // Checking t before using it avoids a NullReferenceException once the top of the hierarchy is passed.
    while (fromField == null && t != null)
    {
        fromField = t.GetField(fieldName, BindingFlags.NonPublic | BindingFlags.Instance | BindingFlags.FlattenHierarchy);
        t = t.BaseType;
    }

    return fromField;
}
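
GetFieldInfo depends on how the C# compiler names the hidden field behind an auto-property. The following sketch shows that lookup in isolation; Sample and BackingFieldDemo are hypothetical names, and the naming pattern is a compiler implementation detail rather than something the language guarantees.

```csharp
using System;
using System.Reflection;

// Hypothetical type with one auto-property.
public class Sample
{
    public string LastName { get; set; }
}

public static class BackingFieldDemo
{
    // Looks up the compiler-generated field behind an auto-property by its conventional name.
    public static FieldInfo Find(Type type, string propName)
    {
        return type.GetField("<" + propName + ">k__BackingField",
            BindingFlags.NonPublic | BindingFlags.Instance);
    }

    // Writes through the field and reads back through the property.
    public static string RoundTrip(string value)
    {
        var sample = new Sample();
        FieldInfo field = Find(typeof(Sample), "LastName");
        field.SetValue(sample, value); // bypasses the property setter
        return sample.LastName;        // the getter reads the same field
    }
}
```

If Find returns null, the property is not an auto-property (or the compiler used a different naming scheme), and a mapper built on this convention cannot handle it.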



Wrapping this code in a cached delegate and calling it 1 million times in a loop gave me:


Map props with emitted method 1000000 persons: 1037ms / 1852797ticks



Hey, not bad: it approaches the direct copy and is far faster than reflection. (Hint: at first it took roughly 400-500ms longer, because I forgot to initialize the target dictionary with an initial capacity, which caused unnecessary resizes. Keep this in mind if you ever copy one collection over to another.)

I could probably have settled for this, since it's acceptable, but I figured that calling a delegate a million times still adds some overhead, so I moved creating the dictionary and adding the new pairs into the IL code as well. Several hours and much cursing later (at least I learned a lot from it) I got it working, but not much speed was gained. It was only 20-30ms faster, so I was somewhat disappointed.
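
Moving a loop into IL means writing the branches by hand. Here is a minimal sketch of that pattern on a deliberately trivial problem (summing the integers below n), so the labels and branch instructions are easy to follow. EmitLoopDemo and BuildSumBelow are hypothetical names, not part of the mapper.

```csharp
using System;
using System.Reflection.Emit;

public static class EmitLoopDemo
{
    // Emits the equivalent of:
    //   int sum = 0; for (int i = 0; i < n; i++) sum += i; return sum;
    public static Func<int, int> BuildSumBelow()
    {
        var method = new DynamicMethod("SumBelow", typeof(int), new[] { typeof(int) });
        ILGenerator il = method.GetILGenerator();

        il.DeclareLocal(typeof(int)); // local 0: sum
        il.DeclareLocal(typeof(int)); // local 1: i
        Label check = il.DefineLabel();
        Label body = il.DefineLabel();

        il.Emit(OpCodes.Ldc_I4_0);
        il.Emit(OpCodes.Stloc_0);   // sum = 0
        il.Emit(OpCodes.Ldc_I4_0);
        il.Emit(OpCodes.Stloc_1);   // i = 0
        il.Emit(OpCodes.Br, check); // jump to the condition first

        il.MarkLabel(body);
        il.Emit(OpCodes.Ldloc_0);
        il.Emit(OpCodes.Ldloc_1);
        il.Emit(OpCodes.Add);
        il.Emit(OpCodes.Stloc_0);   // sum = sum + i
        il.Emit(OpCodes.Ldloc_1);
        il.Emit(OpCodes.Ldc_I4_1);
        il.Emit(OpCodes.Add);
        il.Emit(OpCodes.Stloc_1);   // i = i + 1

        il.MarkLabel(check);
        il.Emit(OpCodes.Ldloc_1);
        il.Emit(OpCodes.Ldarg_0);
        il.Emit(OpCodes.Blt, body); // loop while i < n

        il.Emit(OpCodes.Ldloc_0);
        il.Emit(OpCodes.Ret);       // return sum

        return (Func<int, int>)method.CreateDelegate(typeof(Func<int, int>));
    }
}
```

The delegate returned by BuildSumBelow should be created once and reused; generating the method is far more expensive than calling it.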

Then I thought about the dictionary structure: the keys stay the same, and only the values are mapped onto their respective counterparts. Adding the objects to the new dictionary, however, calculates the hash code of each key and stores the index of the pair in a bucket (the actual pairs are stored in an array, Entry<TKey,TValue>[] entries). These calculations are unnecessary: the entire dictionary structure stays the same, and the only thing that differs is the value of each Entry.

Thus, the next thing I did was create a new dictionary and copy over all relevant private fields of the original dictionary, so that it has the same structure. Next I instantiated the entries[] of the target dictionary with the same size as in the original dictionary. Finally, instead of Adding each pair, I iterated over the original entries[], made a new Entry with the target type as value, and stored it at the same index in the entries array of the target dictionary, effectively bypassing all hash code calculations. This is also done entirely in IL.
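
To make the idea concrete without touching the framework's private fields, here is a toy hash table with a similar buckets/entries layout. ToyMap is a hypothetical, heavily simplified model (fixed capacity, no removal, no duplicate-key check) and not the real Dictionary implementation; its MapValues method copies the structure and converts only the values, so no hash code is recomputed.

```csharp
using System;

public class ToyMap<TValue>
{
    public struct Entry
    {
        public int HashCode;
        public int Next;   // index of the next entry in the same bucket, -1 at the end
        public string Key;
        public TValue Value;
    }

    public int[] Buckets;   // 1-based index of the first entry per bucket, 0 = empty
    public Entry[] Entries;
    public int Count;

    public ToyMap(int capacity)
    {
        if (capacity < 1) throw new ArgumentOutOfRangeException("capacity");
        Buckets = new int[capacity];
        Entries = new Entry[capacity];
    }

    public void Add(string key, TValue value)
    {
        if (Count == Entries.Length) throw new InvalidOperationException("toy map is full");
        int hash = key.GetHashCode() & 0x7FFFFFFF;
        int bucket = hash % Buckets.Length;
        // New entry becomes the head of its bucket's chain.
        Entries[Count] = new Entry { HashCode = hash, Next = Buckets[bucket] - 1, Key = key, Value = value };
        Buckets[bucket] = Count + 1;
        Count++;
    }

    public bool TryGetValue(string key, out TValue value)
    {
        int hash = key.GetHashCode() & 0x7FFFFFFF;
        for (int i = Buckets[hash % Buckets.Length] - 1; i >= 0; i = Entries[i].Next)
        {
            if (Entries[i].HashCode == hash && Entries[i].Key == key)
            {
                value = Entries[i].Value;
                return true;
            }
        }
        value = default(TValue);
        return false;
    }

    // Clones the bucket array and copies key/hash/next per entry; only the value is converted.
    public ToyMap<TResult> MapValues<TResult>(Func<TValue, TResult> convert)
    {
        var result = new ToyMap<TResult>(Entries.Length);
        result.Buckets = (int[])Buckets.Clone();
        result.Count = Count;
        for (int i = 0; i < Count; i++)
        {
            result.Entries[i] = new ToyMap<TResult>.Entry
            {
                HashCode = Entries[i].HashCode,
                Next = Entries[i].Next,
                Key = Entries[i].Key,
                Value = convert(Entries[i].Value)
            };
        }
        return result;
    }
}
```

Lookups on the mapped table work because every entry sits at the same index as in the source, so the copied bucket heads and Next links still point to the right places.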

After I got it working, I tested its speed and got:

Direct on dic. values with emitted method 1000000 persons: 614ms / 1097640ticks

That's fast! That's even faster than the direct copy method!

Here's the code:



/// <summary>
/// Map compiler generated properties (e.g. public string LastName { get; set; }) to another object's properties.
/// </summary>
/// <typeparam name="From"></typeparam>
/// <typeparam name="To"></typeparam>
/// <param name="fromDic"></param>
/// <param name="mapping"></param>
/// <param name="flags"></param>
/// <returns></returns>
public static Dictionary<string, To> MapProperties<From, To>(this Dictionary<string, From> fromDic, Dictionary<string, string> mapping, BindingFlags flags = System.Reflection.BindingFlags.Instance | System.Reflection.BindingFlags.Public | System.Reflection.BindingFlags.NonPublic)
{
MetaData data;
if (!store.TryGetValue(typeof(From), out data))
{
data = new MetaData();
store.Add(typeof(From), data);
}

if (data.MapDictionary == null)
{
// Create ILGenerator
DynamicMethod dymMethod = new DynamicMethod("DoDictionaryMap", typeof(Dictionary<string, To>), new Type[] { typeof(Dictionary<string, From>) }, true);
ConstructorInfo newTo = typeof(To).GetConstructor(Type.EmptyTypes);
ConstructorInfo newDictionaryTo = typeof(Dictionary<string, To>).GetConstructor(new Type[] { typeof(int) });
FieldInfo fldEntries = typeof(Dictionary<string, From>).GetField("entries", BindingFlags.NonPublic | BindingFlags.Instance);

FieldInfo fldEntryKey = fldEntries.FieldType.GetElementType().GetField("key", BindingFlags.Instance | BindingFlags.NonPublic | BindingFlags.Public);
FieldInfo fldEntryValue = fldEntries.FieldType.GetElementType().GetField("value", BindingFlags.Instance | BindingFlags.NonPublic | BindingFlags.Public);

ILGenerator generator = dymMethod.GetILGenerator();

Label loopConditionCheck = generator.DefineLabel();
Label insideLoop = generator.DefineLabel();


generator.DeclareLocal(typeof(Dictionary<string, To>)); // toDic , 0
generator.DeclareLocal(fldEntries.FieldType.GetElementType()); // pair entry<string, from>, 1
generator.DeclareLocal(typeof(int)); // i, 2
generator.DeclareLocal(typeof(int)); // count, 3;
generator.DeclareLocal(typeof(To)); // to, 4;
generator.DeclareLocal(typeof(Dictionary<string, To>).GetField("entries", BindingFlags.NonPublic | BindingFlags.Instance).FieldType.GetElementType()); // entry<to> 5;

// store count
generator.Emit(OpCodes.Ldarg_0);
generator.Emit(OpCodes.Callvirt, typeof(Dictionary<string, From>).GetProperty("Count").GetGetMethod());
generator.Emit(OpCodes.Stloc_S, 3);

generator.Emit(OpCodes.Ldloc_S, 3); // load count and pass it as capacity parameter for toDic
generator.Emit(OpCodes.Newobj, newDictionaryTo); // toDic = new ...
generator.Emit(OpCodes.Stloc_0);

// COPY Dictionary fields to toDic
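// toDic.buckets = (int[])fromDic.buckets.Clone();
// Copying only the reference would make both dictionaries share one bucket array.
generator.Emit(OpCodes.Ldloc_0);
generator.Emit(OpCodes.Ldarg_0);
generator.Emit(OpCodes.Ldfld, typeof(Dictionary<string, From>).GetField("buckets", BindingFlags.NonPublic | BindingFlags.Instance));
generator.Emit(OpCodes.Callvirt, typeof(Array).GetMethod("Clone"));
generator.Emit(OpCodes.Castclass, typeof(int[]));
generator.Emit(OpCodes.Stfld, typeof(Dictionary<string, To>).GetField("buckets", BindingFlags.NonPublic | BindingFlags.Instance));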

generator.Emit(OpCodes.Ldloc_0);
generator.Emit(OpCodes.Ldarg_0);
generator.Emit(OpCodes.Ldfld, typeof(Dictionary<string, From>).GetField("comparer", BindingFlags.NonPublic | BindingFlags.Instance));
generator.Emit(OpCodes.Stfld, typeof(Dictionary<string, To>).GetField("comparer", BindingFlags.NonPublic | BindingFlags.Instance));

generator.Emit(OpCodes.Ldloc_0);
generator.Emit(OpCodes.Ldarg_0);
generator.Emit(OpCodes.Ldfld, typeof(Dictionary<string, From>).GetField("count", BindingFlags.NonPublic | BindingFlags.Instance));
generator.Emit(OpCodes.Stfld, typeof(Dictionary<string, To>).GetField("count", BindingFlags.NonPublic | BindingFlags.Instance));

generator.Emit(OpCodes.Ldloc_0);
generator.Emit(OpCodes.Ldarg_0);
generator.Emit(OpCodes.Ldfld, typeof(Dictionary<string, From>).GetField("freeCount", BindingFlags.NonPublic | BindingFlags.Instance));
generator.Emit(OpCodes.Stfld, typeof(Dictionary<string, To>).GetField("freeCount", BindingFlags.NonPublic | BindingFlags.Instance));

generator.Emit(OpCodes.Ldloc_0);
generator.Emit(OpCodes.Ldarg_0);
generator.Emit(OpCodes.Ldfld, typeof(Dictionary<string, From>).GetField("freeList", BindingFlags.NonPublic | BindingFlags.Instance));
generator.Emit(OpCodes.Stfld, typeof(Dictionary<string, To>).GetField("freeList", BindingFlags.NonPublic | BindingFlags.Instance));

generator.Emit(OpCodes.Ldloc_0);
generator.Emit(OpCodes.Ldarg_0);
generator.Emit(OpCodes.Ldfld, fldEntries);
generator.Emit(OpCodes.Ldlen);
generator.Emit(OpCodes.Newarr, typeof(Dictionary<string, To>).GetField("entries", BindingFlags.NonPublic | BindingFlags.Instance).FieldType.GetElementType());
generator.Emit(OpCodes.Stfld, typeof(Dictionary<string, To>).GetField("entries", BindingFlags.NonPublic | BindingFlags.Instance));

// End COPY

generator.Emit(OpCodes.Ldc_I4_0); // i = 0
generator.Emit(OpCodes.Stloc_2);

generator.Emit(OpCodes.Br, loopConditionCheck); // perform loop test
generator.MarkLabel(insideLoop);

generator.Emit(OpCodes.Ldarg_0); // load fromDic on stack
generator.Emit(OpCodes.Ldfld, fldEntries); // load entries field from dic on stack
generator.Emit(OpCodes.Ldloc_2); // load i
generator.Emit(OpCodes.Ldelem, fldEntries.FieldType.GetElementType()); // load fromDic.entries[i]
generator.Emit(OpCodes.Stloc_1); // pair = fromDic.entries[i];

generator.Emit(OpCodes.Newobj, newTo); // To t = new To();
generator.Emit(OpCodes.Stloc_S, 4);

foreach (PropertyInfo prop in typeof(From).GetProperties(flags))
{
if ((prop.GetGetMethod(true) ?? prop.GetSetMethod(true)).IsDefined(typeof(CompilerGeneratedAttribute), false))
{
FieldInfo fromField = GetFieldInfo<From>(prop.Name);

if (!mapping.ContainsKey(prop.Name))
throw new ArgumentException("The mapping dictionary does not contain a mapping for " + prop.Name);

FieldInfo toField = GetFieldInfo<To>(mapping[prop.Name]);

if (fromField.FieldType != toField.FieldType)
throw new InvalidOperationException("Mapping between '" + prop.Name + "' from type '" + typeof(From).Name + "' and '" + mapping[prop.Name] + "' from type '" + typeof(To).Name + "' is invalid. The properties have different underlying types");

generator.Emit(OpCodes.Ldloc_S, 4); // load 'to'

// load pair.Value.<Field> on stack
generator.Emit(OpCodes.Ldloc_1);
generator.Emit(OpCodes.Ldfld, fldEntryValue); // load value from pair on stack
generator.Emit(OpCodes.Ldfld, fromField);

////// save to to.<tofield>
generator.Emit(OpCodes.Stfld, toField);
}
}


// bypass add & insert manually into entries

//entryTo = new Entry<,>();
generator.Emit(OpCodes.Ldloca_S, 5);
generator.Emit(OpCodes.Initobj, typeof(Dictionary<string, To>).GetField("entries", BindingFlags.NonPublic | BindingFlags.Instance).FieldType.GetElementType());


// entryTo.key = entryFrom.key
generator.Emit(OpCodes.Ldloca_S, 5);
generator.Emit(OpCodes.Ldloc_1);
generator.Emit(OpCodes.Ldfld, fldEntries.FieldType.GetElementType().GetField("key", BindingFlags.Instance | BindingFlags.NonPublic | BindingFlags.Public));
generator.Emit(OpCodes.Stfld, typeof(Dictionary<string, To>).GetField("entries", BindingFlags.NonPublic | BindingFlags.Instance).FieldType.GetElementType().GetField("key", BindingFlags.Instance | BindingFlags.NonPublic | BindingFlags.Public));

// entryTo.hashCode = entryFrom.hashCode
generator.Emit(OpCodes.Ldloca_S, 5);
generator.Emit(OpCodes.Ldloc_1);
generator.Emit(OpCodes.Ldfld, fldEntries.FieldType.GetElementType().GetField("hashCode", BindingFlags.Instance | BindingFlags.NonPublic | BindingFlags.Public));
generator.Emit(OpCodes.Stfld, typeof(Dictionary<string, To>).GetField("entries", BindingFlags.NonPublic | BindingFlags.Instance).FieldType.GetElementType().GetField("hashCode", BindingFlags.Instance | BindingFlags.NonPublic | BindingFlags.Public));

// entryTo.next = entryFrom.next
generator.Emit(OpCodes.Ldloca_S, 5);
generator.Emit(OpCodes.Ldloc_1);
generator.Emit(OpCodes.Ldfld, fldEntries.FieldType.GetElementType().GetField("next", BindingFlags.Instance | BindingFlags.NonPublic | BindingFlags.Public));
generator.Emit(OpCodes.Stfld, typeof(Dictionary<string, To>).GetField("entries", BindingFlags.NonPublic | BindingFlags.Instance).FieldType.GetElementType().GetField("next", BindingFlags.Instance | BindingFlags.NonPublic | BindingFlags.Public));

// entryTo.value = 'to'
generator.Emit(OpCodes.Ldloca_S, 5);
generator.Emit(OpCodes.Ldloc_S, 4);
generator.Emit(OpCodes.Stfld, typeof(Dictionary<string, To>).GetField("entries", BindingFlags.NonPublic | BindingFlags.Instance).FieldType.GetElementType().GetField("value", BindingFlags.Instance | BindingFlags.NonPublic | BindingFlags.Public));


generator.Emit(OpCodes.Ldloc_0); // load entries[]
generator.Emit(OpCodes.Ldfld, typeof(Dictionary<string, To>).GetField("entries", BindingFlags.NonPublic | BindingFlags.Instance));
generator.Emit(OpCodes.Ldloc_2); // load i
generator.Emit(OpCodes.Ldloc_S, 5);
generator.Emit(OpCodes.Stelem, typeof(Dictionary<string, To>).GetField("entries", BindingFlags.NonPublic | BindingFlags.Instance).FieldType.GetElementType()); // save element

generator.Emit(OpCodes.Ldc_I4_1); // load 1
generator.Emit(OpCodes.Ldloc_2); // load i
generator.Emit(OpCodes.Add); // i + 1
generator.Emit(OpCodes.Stloc_2); // i = i+1

generator.MarkLabel(loopConditionCheck);

generator.Emit(OpCodes.Ldloc_2); // load i
generator.Emit(OpCodes.Ldloc_S, 3); // load count
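// Loop while i < count (local 3 holds fromDic.Count, not entries.Length).
// This assumes nothing was ever removed from fromDic; after removals, used slots can lie beyond Count.
generator.Emit(OpCodes.Blt, insideLoop);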

generator.Emit(OpCodes.Ldloc_0); // return toDic;
generator.Emit(OpCodes.Ret);

data.MapDictionary = dymMethod.CreateDelegate(typeof(Func<Dictionary<string, From>, Dictionary<string, To>>));
}
return ((Func<Dictionary<string, From>, Dictionary<string, To>>)data.MapDictionary)(fromDic);
}


I'm still a novice at writing IL (and have no assembly background whatsoever), so I'm sure there are probably some tweaks that can be done to make it even faster (if you know any, please let me know :)).

Well, that's it for mapping properties. The last optimization can be generalized into a way to quickly generate a new dictionary, using a Func delegate to transform the values to a new type. Think of examples like casting each value in a dictionary, which is usually done with the .ToDictionary(p => p.Key, p => (To)p.Value) LINQ extension method but loses performance to the unnecessary hash code calculations.
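
For comparison, here is a sketch of the two approaches that use only the public Dictionary API: the LINQ call mentioned above and a presized loop. Both still hash every key again, but neither depends on private field names. PortableValueTransform and its method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PortableValueTransform
{
    // LINQ baseline: builds a new dictionary by re-adding every pair.
    public static Dictionary<string, TTo> WithLinq<TFrom, TTo>(
        Dictionary<string, TFrom> source, Func<TFrom, TTo> convert)
    {
        return source.ToDictionary(p => p.Key, p => convert(p.Value), source.Comparer);
    }

    // Presized loop: same result, but the target never has to resize while filling.
    public static Dictionary<string, TTo> WithPresizedLoop<TFrom, TTo>(
        Dictionary<string, TFrom> source, Func<TFrom, TTo> convert)
    {
        var target = new Dictionary<string, TTo>(source.Count, source.Comparer);
        foreach (KeyValuePair<string, TFrom> pair in source)
            target.Add(pair.Key, convert(pair.Value));
        return target;
    }
}
```

Passing source.Comparer along keeps key lookups behaving the same way in the new dictionary, for example when the source was created with a case-insensitive comparer.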

So this is the code you were probably looking for when you read the title of this post:



/// <summary>
/// Cast values from a dictionary using the specified delegate. This results in a new dictionary
/// </summary>
public static Dictionary<string, To> CastValues<From, To>(this Dictionary<string, From> fromDic, Func<From, To> cast)
{
MetaData data;
if (!store.TryGetValue(typeof(From), out data))
{
data = new MetaData();
store.Add(typeof(From), data);
}


if (data.CastDictionaryValues == null)
{
// Create ILGenerator
DynamicMethod dymMethod = new DynamicMethod("DoDictionaryCastValues", typeof(Dictionary<string, To>), new Type[] { typeof(Dictionary<string, From>) }, true);
ConstructorInfo newDictionaryTo = typeof(Dictionary<string, To>).GetConstructor(new Type[] { typeof(int) });
FieldInfo fldEntries = typeof(Dictionary<string, From>).GetField("entries", BindingFlags.NonPublic | BindingFlags.Instance);

FieldInfo fldEntryKey = fldEntries.FieldType.GetElementType().GetField("key", BindingFlags.Instance | BindingFlags.NonPublic | BindingFlags.Public);
FieldInfo fldEntryValue = fldEntries.FieldType.GetElementType().GetField("value", BindingFlags.Instance | BindingFlags.NonPublic | BindingFlags.Public);

ILGenerator generator = dymMethod.GetILGenerator();

// define labels for loop
Label loopConditionCheck = generator.DefineLabel();
Label insideLoop = generator.DefineLabel();

// define local variables
generator.DeclareLocal(typeof(Dictionary<string, To>)); // toDic , 0
generator.DeclareLocal(fldEntries.FieldType.GetElementType()); // pair entry<string, from>, 1
generator.DeclareLocal(typeof(int)); // i, 2
generator.DeclareLocal(typeof(int)); // count, 3;
generator.DeclareLocal(typeof(Dictionary<string, To>).GetField("entries", BindingFlags.NonPublic | BindingFlags.Instance).FieldType.GetElementType()); // entry<to> 4;
generator.DeclareLocal(typeof(int)); // bucketLength, 5;
generator.DeclareLocal(typeof(Dictionary<string, To>).GetField("buckets", BindingFlags.NonPublic | BindingFlags.Instance).FieldType); // newbuckets, 6;

// store count
generator.Emit(OpCodes.Ldarg_0);
generator.Emit(OpCodes.Callvirt, typeof(Dictionary<string, From>).GetProperty("Count").GetGetMethod());
generator.Emit(OpCodes.Stloc_S, 3);

generator.Emit(OpCodes.Ldloc_S, 3); // load count and pass it as capacity parameter for toDic
generator.Emit(OpCodes.Newobj, newDictionaryTo); // toDic = new ...
generator.Emit(OpCodes.Stloc_0);

// COPY Dictionary fields to toDic

// toDic.buckets = fromDic.buckets;

// This is incorrect, changes on old dictionary will influence the new one because the reference is copied!!!
//generator.Emit(OpCodes.Ldloc_0);
//generator.Emit(OpCodes.Ldarg_0);
//generator.Emit(OpCodes.Ldfld, typeof(Dictionary<string, From>).GetField("buckets", BindingFlags.NonPublic | BindingFlags.Instance));
//generator.Emit(OpCodes.Stfld, typeof(Dictionary<string, To>).GetField("buckets", BindingFlags.NonPublic | BindingFlags.Instance));

// This is the correct way to copy the bucket array
// bucketLength = fromDic.buckets.Length
generator.Emit(OpCodes.Ldarg_0);
generator.Emit(OpCodes.Ldfld, typeof(Dictionary<string, From>).GetField("buckets", BindingFlags.NonPublic | BindingFlags.Instance));
generator.Emit(OpCodes.Ldlen);
generator.Emit(OpCodes.Stloc_S, 5);
// newbuckets = new Int32[bucketLength]
generator.Emit(OpCodes.Ldloc_S, 5);
generator.Emit(OpCodes.Newarr, typeof(Int32));
generator.Emit(OpCodes.Stloc_S, 6);

// Array.Copy(fromDic.buckets, newbuckets, bucketLength)
generator.Emit(OpCodes.Ldarg_0);
generator.Emit(OpCodes.Ldfld, typeof(Dictionary<string, From>).GetField("buckets", BindingFlags.NonPublic | BindingFlags.Instance));
generator.Emit(OpCodes.Ldloc_S, 6);
generator.Emit(OpCodes.Ldloc_S, 5);
generator.Emit(OpCodes.Call, typeof(Array).GetMethod("Copy", new Type[] { typeof(Array), typeof(Array), typeof(Int32) }));

// toDic.buckets = newbuckets
generator.Emit(OpCodes.Ldloc_0);
generator.Emit(OpCodes.Ldloc_S, 6);
generator.Emit(OpCodes.Stfld, typeof(Dictionary<string, To>).GetField("buckets", BindingFlags.NonPublic | BindingFlags.Instance));


// toDic.comparer = fromDic.comparer;
generator.Emit(OpCodes.Ldloc_0);
generator.Emit(OpCodes.Ldarg_0);
generator.Emit(OpCodes.Ldfld, typeof(Dictionary<string, From>).GetField("comparer", BindingFlags.NonPublic | BindingFlags.Instance));
generator.Emit(OpCodes.Stfld, typeof(Dictionary<string, To>).GetField("comparer", BindingFlags.NonPublic | BindingFlags.Instance));

// toDic.count = fromDic.count;
generator.Emit(OpCodes.Ldloc_0);
generator.Emit(OpCodes.Ldarg_0);
generator.Emit(OpCodes.Ldfld, typeof(Dictionary<string, From>).GetField("count", BindingFlags.NonPublic | BindingFlags.Instance));
generator.Emit(OpCodes.Stfld, typeof(Dictionary<string, To>).GetField("count", BindingFlags.NonPublic | BindingFlags.Instance));

// toDic.freeCount = fromDic.freeCount;
generator.Emit(OpCodes.Ldloc_0);
generator.Emit(OpCodes.Ldarg_0);
generator.Emit(OpCodes.Ldfld, typeof(Dictionary<string, From>).GetField("freeCount", BindingFlags.NonPublic | BindingFlags.Instance));
generator.Emit(OpCodes.Stfld, typeof(Dictionary<string, To>).GetField("freeCount", BindingFlags.NonPublic | BindingFlags.Instance));

// toDic.freeList = fromDic.freeList;
generator.Emit(OpCodes.Ldloc_0);
generator.Emit(OpCodes.Ldarg_0);
generator.Emit(OpCodes.Ldfld, typeof(Dictionary<string, From>).GetField("freeList", BindingFlags.NonPublic | BindingFlags.Instance));
generator.Emit(OpCodes.Stfld, typeof(Dictionary<string, To>).GetField("freeList", BindingFlags.NonPublic | BindingFlags.Instance));

// toDic.entries = new Entry<,>[fromDic.entries.Length];
generator.Emit(OpCodes.Ldloc_0);
generator.Emit(OpCodes.Ldarg_0);
generator.Emit(OpCodes.Ldfld, fldEntries);
generator.Emit(OpCodes.Ldlen);
generator.Emit(OpCodes.Newarr, typeof(Dictionary<string, To>).GetField("entries", BindingFlags.NonPublic | BindingFlags.Instance).FieldType.GetElementType());
generator.Emit(OpCodes.Stfld, typeof(Dictionary<string, To>).GetField("entries", BindingFlags.NonPublic | BindingFlags.Instance));

// End COPY

// i = 0;
generator.Emit(OpCodes.Ldc_I4_0);
generator.Emit(OpCodes.Stloc_2);

generator.Emit(OpCodes.Br, loopConditionCheck); // perform loop test
generator.MarkLabel(insideLoop);
{
// pair = fromDic.entries[i];
generator.Emit(OpCodes.Ldarg_0); // load fromDic on stack
generator.Emit(OpCodes.Ldfld, fldEntries); // load entries field from dic on stack
generator.Emit(OpCodes.Ldloc_2); // load i
generator.Emit(OpCodes.Ldelem, fldEntries.FieldType.GetElementType()); // load fromDic.entries[i]
generator.Emit(OpCodes.Stloc_1);

// bypass add & insert manually into entries from toDic

// entryTo = new Entry<,>();
generator.Emit(OpCodes.Ldloca_S, 4);
generator.Emit(OpCodes.Initobj, typeof(Dictionary<string, To>).GetField("entries", BindingFlags.NonPublic | BindingFlags.Instance).FieldType.GetElementType());

// entryTo.key = entryFrom.key
generator.Emit(OpCodes.Ldloca_S, 4);
generator.Emit(OpCodes.Ldloc_1);
generator.Emit(OpCodes.Ldfld, fldEntries.FieldType.GetElementType().GetField("key", BindingFlags.Instance | BindingFlags.NonPublic | BindingFlags.Public));
generator.Emit(OpCodes.Stfld, typeof(Dictionary<string, To>).GetField("entries", BindingFlags.NonPublic | BindingFlags.Instance).FieldType.GetElementType().GetField("key", BindingFlags.Instance | BindingFlags.NonPublic | BindingFlags.Public));

// entryTo.hashCode = entryFrom.hashCode
generator.Emit(OpCodes.Ldloca_S, 4);
generator.Emit(OpCodes.Ldloc_1);
generator.Emit(OpCodes.Ldfld, fldEntries.FieldType.GetElementType().GetField("hashCode", BindingFlags.Instance | BindingFlags.NonPublic | BindingFlags.Public));
generator.Emit(OpCodes.Stfld, typeof(Dictionary<string, To>).GetField("entries", BindingFlags.NonPublic | BindingFlags.Instance).FieldType.GetElementType().GetField("hashCode", BindingFlags.Instance | BindingFlags.NonPublic | BindingFlags.Public));

// entryTo.next = entryFrom.next
generator.Emit(OpCodes.Ldloca_S, 4);
generator.Emit(OpCodes.Ldloc_1);
generator.Emit(OpCodes.Ldfld, fldEntries.FieldType.GetElementType().GetField("next", BindingFlags.Instance | BindingFlags.NonPublic | BindingFlags.Public));
generator.Emit(OpCodes.Stfld, typeof(Dictionary<string, To>).GetField("entries", BindingFlags.NonPublic | BindingFlags.Instance).FieldType.GetElementType().GetField("next", BindingFlags.Instance | BindingFlags.NonPublic | BindingFlags.Public));

// entryTo.value = 'to'
generator.Emit(OpCodes.Ldloca_S, 4);
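// call cast(pair.value)
// Note: this emits a direct call to cast.Method without loading a target instance,
// so it only works when that method is static (e.g. a lambda the compiler emitted as a static method).
// The generated delegate is also cached per From type, so the first cast passed in is the one that keeps being used.
generator.Emit(OpCodes.Ldloc_1);
generator.Emit(OpCodes.Ldfld, fldEntryValue); // load value from pair on stack
generator.Emit(OpCodes.Call, cast.Method);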
// and store the to value into the new entry
generator.Emit(OpCodes.Stfld, typeof(Dictionary<string, To>).GetField("entries", BindingFlags.NonPublic | BindingFlags.Instance).FieldType.GetElementType().GetField("value", BindingFlags.Instance | BindingFlags.NonPublic | BindingFlags.Public));

// toDic.entries[i] = entryTo;
generator.Emit(OpCodes.Ldloc_0); // load entries[]
generator.Emit(OpCodes.Ldfld, typeof(Dictionary<string, To>).GetField("entries", BindingFlags.NonPublic | BindingFlags.Instance));
generator.Emit(OpCodes.Ldloc_2); // load i
generator.Emit(OpCodes.Ldloc_S, 4);
generator.Emit(OpCodes.Stelem, typeof(Dictionary<string, To>).GetField("entries", BindingFlags.NonPublic | BindingFlags.Instance).FieldType.GetElementType()); // save element

generator.Emit(OpCodes.Ldc_I4_1); // load 1
generator.Emit(OpCodes.Ldloc_2); // load i
generator.Emit(OpCodes.Add); // i + 1
generator.Emit(OpCodes.Stloc_2); // i = i+1
}
generator.MarkLabel(loopConditionCheck);

generator.Emit(OpCodes.Ldloc_2); // load i
generator.Emit(OpCodes.Ldloc_S, 3); // load count
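// Loop while i < count (local 3 holds fromDic.Count, not entries.Length).
// This assumes nothing was ever removed from fromDic; after removals, used slots can lie beyond Count.
generator.Emit(OpCodes.Blt, insideLoop);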

generator.Emit(OpCodes.Ldloc_0); // return toDic;
generator.Emit(OpCodes.Ret);

data.CastDictionaryValues = dymMethod.CreateDelegate(typeof(Func<Dictionary<string, From>, Dictionary<string, To>>));
}
return ((Func<Dictionary<string, From>, Dictionary<string, To>>)data.CastDictionaryValues)(fromDic);
}


This can be used like:



pCopyDic = personsDic.CastValues<Person, PersonCopy>(p =>
{
    PersonCopy pCopy = new PersonCopy();
    pCopy.DateOfBirth = p.DateOfBirth;
    pCopy.FirstName = p.FirstName;
    pCopy.Function = p.Function;
    pCopy.GUID = p.GUID;
    pCopy.IsEmployee = p.IsEmployee;
    pCopy.LastName = p.LastName;
    pCopy.ParentGUID = p.ParentGUID;
    pCopy.Rating = p.Rating;
    return pCopy;
});



To check how Dictionary is implemented I used .NET Reflector. To figure out which IL statements to write, I searched various websites, wrote the equivalent in a dummy C# method, compiled it, and then disassembled it with Reflector.

I know micro-optimization like this is usually not needed, so don't count on it always paying off. This optimization can improve speed a lot, but it won't fix anything if you have a structural problem (one that scales exponentially). That said, it's awesome to tinker with IL and improve framework-related issues :).

Edit: It seems I had a slight oversight. Copying the buckets from one dictionary to another copied only the reference, so any changes to the source dictionary were reflected in the new dictionary as well! This is the correct code (sigh, now to change it in the already HTML marked-up code above):



// toDic.buckets = fromDic.buckets;
// This is incorrect, changes on old dictionary will influence the new one because the reference is copied!!!
//generator.Emit(OpCodes.Ldloc_0);
//generator.Emit(OpCodes.Ldarg_0);
//generator.Emit(OpCodes.Ldfld, typeof(Dictionary<string, From>).GetField("buckets", BindingFlags.NonPublic | BindingFlags.Instance));
//generator.Emit(OpCodes.Stfld, typeof(Dictionary<string, To>).GetField("buckets", BindingFlags.NonPublic | BindingFlags.Instance));

// This is the correct way to copy the bucket array
// bucketLength = fromDic.buckets.Length
generator.Emit(OpCodes.Ldarg_0);
generator.Emit(OpCodes.Ldfld, typeof(Dictionary<string, From>).GetField("buckets", BindingFlags.NonPublic | BindingFlags.Instance));
generator.Emit(OpCodes.Ldlen);
generator.Emit(OpCodes.Stloc_S, 5);
// newbuckets = new Int32[bucketLength]
generator.Emit(OpCodes.Ldloc_S, 5);
generator.Emit(OpCodes.Newarr, typeof(Int32));
generator.Emit(OpCodes.Stloc_S, 6);

// Array.Copy(fromDic.buckets, newbuckets, bucketLength)
generator.Emit(OpCodes.Ldarg_0);
generator.Emit(OpCodes.Ldfld, typeof(Dictionary<string, From>).GetField("buckets", BindingFlags.NonPublic | BindingFlags.Instance));
generator.Emit(OpCodes.Ldloc_S, 6);
generator.Emit(OpCodes.Ldloc_S, 5);
generator.Emit(OpCodes.Call, typeof(Array).GetMethod("Copy", new Type[] { typeof(Array), typeof(Array), typeof(Int32) }));

// toDic.buckets = newbuckets
generator.Emit(OpCodes.Ldloc_0);
generator.Emit(OpCodes.Ldloc_S, 6);
generator.Emit(OpCodes.Stfld, typeof(Dictionary<string, To>).GetField("buckets", BindingFlags.NonPublic | BindingFlags.Instance));
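
The difference between the two versions is easy to see outside of IL. This small sketch (BucketAliasDemo is a hypothetical name) contrasts handing out the same array reference with copying its elements using Array.Copy, which is the call the corrected IL makes.

```csharp
using System;

public static class BucketAliasDemo
{
    // Returns the very same array object: writes through either reference are visible to both.
    public static int[] ShareReference(int[] source)
    {
        return source;
    }

    // Allocates a new array and copies the elements: later writes to source do not affect it.
    public static int[] CopyElements(int[] source)
    {
        var copy = new int[source.Length];
        Array.Copy(source, copy, source.Length);
        return copy;
    }
}
```

With a shared reference, an Add or Remove on the source dictionary rewrites bucket slots that the new dictionary also reads, while its entries array stays as it was, so lookups on the new dictionary can go wrong.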


This reminds me that I really need to do some layout changes, because copy pasting code in here is a pain.
