Monday, March 23, 2009

NHibernate: Why override GetHashCode and Equals?

When you use NHibernate collections, or compare two objects that are mapped through NHibernate configuration, NHibernate calls the Equals and GetHashCode methods of the class. It is very important to implement these methods so that two different entities return different values and identical entities return identical values. That’s the only condition. The value doesn’t need to be related to anything in the table at all, as long as it passes that test. This is both easy and hard. Let me show you why.

The point here is to always generate a value that will represent a row in the table. In a normal table, any candidate key (that is, a unique index or a primary key) will serve this role. In the samples that come with Visual T4 Editor Professional edition we use the primary key to generate these methods:

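        public override int GetHashCode()
        {
            // Start from 0 and xor in each key column (only one here).
            int hashCode = 0;
            hashCode = hashCode ^ CustomerID.GetHashCode();
            return hashCode;
        }

        public override bool Equals(object obj)
        {
            // Anything that is not a Customers instance (including null) is never equal.
            Customers toCompare = obj as Customers;
            if (toCompare == null)
            {
                return false;
            }
            // Two entities are equal only when their primary keys match.
            if (this.CustomerID != toCompare.CustomerID)
            {
                return false;
            }
            return true;
        }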


The Equals method is fairly simple to override: just compare the keys, and return true if everything matches. GetHashCode is a little different. You must return an integer value that satisfies the two rules above. For the sample, we chose xor to combine the key values, because it gives a high success rate. For production code, you should prefer “shift and or”, or concatenate the key values and take the hash code of the resulting string. Both approaches make collisions far less likely, at varying levels of complexity and performance, although a 32-bit hash code cannot rule collisions out entirely for arbitrary keys.
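
To make the difference between xor and “shift and or” concrete, here is a minimal sketch. It assumes a hypothetical two-column key (OrderID and ProductID) that is not part of the original samples, and the KeyHash class and its method names are invented for illustration only.

```csharp
using System;

// Hypothetical helper for a two-column key (OrderID + ProductID).
// None of these names come from the original samples.
public static class KeyHash
{
    // Xor is symmetric: swapping the two values produces the same hash.
    public static int Xor(int orderId, int productId)
    {
        return orderId.GetHashCode() ^ productId.GetHashCode();
    }

    // Shift and or: OrderID goes into the high 16 bits, ProductID into the low 16.
    // Different pairs get different hashes only while both values fit in 16 bits.
    public static int ShiftAndOr(int orderId, int productId)
    {
        return (orderId << 16) | (productId & 0xFFFF);
    }
}
```

With this sketch, the pairs (1, 2) and (2, 1) get the same value from Xor but different values from ShiftAndOr. The 16-bit limit is an assumption of the example; larger keys would need a different way of combining them.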



There is an additional problem here, especially for code generation: if you have a transient object (that is, one that is not yet saved to the database), an identity value will always be a bad choice. The problem is that until NHibernate saves the object to the database, the value of the identity (or primary key, depending on configuration) will be the default value for the data type (0 in this case). Hence, two unsaved objects will compare as equal. Although this is technically correct (unsaved values may be considered “null” for some purposes), it is highly impractical.
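
The effect is easy to reproduce without a database. The sketch below uses a simplified stand-in for the generated Customers class, assuming an int identity; the CompanyName property and the TransientDemo helper are hypothetical and exist only for this illustration.

```csharp
using System.Collections.Generic;

// Simplified stand-in for a generated entity; a real class would have more properties.
public class Customers
{
    public int CustomerID { get; set; }     // stays 0 until NHibernate assigns the identity
    public string CompanyName { get; set; } // hypothetical extra column

    public override int GetHashCode()
    {
        return CustomerID.GetHashCode();
    }

    public override bool Equals(object obj)
    {
        Customers toCompare = obj as Customers;
        return toCompare != null && CustomerID == toCompare.CustomerID;
    }
}

public static class TransientDemo
{
    // Adds two unsaved customers to a set and reports how many the set keeps.
    public static int CountDistinctUnsaved()
    {
        var set = new HashSet<Customers>();
        set.Add(new Customers { CompanyName = "Alpha" });
        set.Add(new Customers { CompanyName = "Beta" });
        return set.Count; // 1: both have CustomerID == 0, so the second Add is rejected
    }
}
```

The same thing would happen in any hash-based collection: the second unsaved customer is treated as a duplicate of the first.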



In this case, the best solution is to take an alternate key (a unique, non-null key) and use that for the comparison. As long as the user actually fills in that value when constructing the object, everything should be fine. A code generator may find a candidate key and use it. It will not, however, be able to know whether this candidate key will be filled in when needed. As a default solution, either the primary key or a candidate key may work. As a definitive solution, you should use the generated class knowing that you may run into this issue, and manually override the methods (as explained in Extending Code Generation with Partial Classes and Inheritance) when needed. To be 100% sure, you should always write GetHashCode manually, and override Equals with this:



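        public override bool Equals(object obj)
        {
            Customers toCompare = obj as Customers;
            if (toCompare == null)
            {
                return false;
            }
            // Equal entities must have equal hash codes. This shortcut is only
            // safe if GetHashCode never collides for two different entities.
            return this.GetHashCode() == toCompare.GetHashCode();
        }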
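
For comparison, the alternate-key idea described above could be sketched as follows. CompanyCode is a hypothetical unique, non-null column chosen for illustration, not something taken from the samples. This variant compares the key values directly, so it does not depend on hash codes being free of collisions.

```csharp
using System;

// Sketch only: CompanyCode is a hypothetical unique, non-null alternate key.
public class Customers
{
    public int CustomerID { get; set; }     // identity, 0 until the object is saved
    public string CompanyCode { get; set; } // must be filled in before the object is compared

    public override int GetHashCode()
    {
        // Falls back to 0 when the key has not been filled in yet.
        return CompanyCode == null ? 0 : CompanyCode.GetHashCode();
    }

    public override bool Equals(object obj)
    {
        Customers toCompare = obj as Customers;
        if (toCompare == null)
        {
            return false;
        }
        // Compares the key itself. Two objects with an unfilled (null) key
        // still compare as equal, so the caller has to set it.
        return string.Equals(CompanyCode, toCompare.CompanyCode);
    }
}
```

Two unsaved customers with different CompanyCode values are now distinct, even though both still have a CustomerID of 0.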


--

Written by Joaquin, joj AT clariusconsulting DOT net. Disclaimer: opinions expressed in this post are my own and do not necessarily reflect those of Clarius Consulting.

3 comments:

  1. I think Equals must always return true when objects are equal, and always false otherwise. GetHashCode is different: two objects may have the same hash code even if they are different (because a hash code is rather short, and, say, strings are long).

    So this code will fail:

    return (this.GetHashCode() != toCompare.GetHashCode())

    because when HashCodes are same, but objects are not, this code will return ...

    What will this code return? TRUE if the hash codes are different? Hmmm.... Is it another bug? Shouldn't it be == instead of !=?