Hi all!
I found myself implementing hashCode and equals methods for a class, and googled a little bit looking for a good implementation.
I found that the most used is this one:
public int hashCode() {
final int prime = 31;
int result = 1;
result = prime * result + ((Int1 == null) ? 0 : Int1.hashCode());
result = prime * result + ((Int2 == null) ? 0 : Int2.hashCode());
result = prime * result + ((Int3 == null) ? 0 : Int3.hashCode());
result = prime * result + ((Int4 == null) ? 0 : Int4.hashCode());
result = prime * result + ((Int5 == null) ? 0 : Int5.hashCode());
result = prime * result + ((Int6 == null) ? 0 : Int6.hashCode());
result = prime * result + ((Int7 == null) ? 0 : Int7.hashCode());
return result;
}
Actually, this exact code was generated automatically by Eclipse tools for a test class I made.
What I found alarming, is that, in such an implementation, that prime variable, is multiplied by itself in every row!
This is what result equation would look like:
result = prime^7 + prime^6*Int7.hashCode() + prime^5*Int6.hashCode() + ... (where prime^n means a power).
And, translating it into numbers, we have that:
result > prime^7 = 31^7 = 27,512,614,111 > 2,147,483,647 = MAX VALUE OF JAVA PRIMITIVE INT!!!!
I dont want to think about classes with 10 or 20 fields!!
Im not sure about how java faces a value over its maximum, but, anyway I think that, over 6 fields, this implementation is a bit HEAVY for an int hashCode
Obviously, you can change the 31 for a lower prime, but dont think there would be much difference in that.
These would be the maximum valid number of fields for the corresponding primes:
31 6
29 6
23 6
19 7
17 7
13 8
A part from that, although it covers pretty fast the whole int range, there's no consistency about collisions, so it's far form being a "perfect" hash code ( an utopic hash code implementation that returns a unique value for each different instance of a class ).
So, thinking about a better implementation I came into this:
public int hashCode() {
double result = Math.pow(2,((Int1 == null) ? 0 : Int1.hashCode()));
result *= Math.pow(3,((Int1 == null) ? 0 : Int1.hashCode()));
result *= Math.pow(5,((Int1 == null) ? 0 : Int1.hashCode()));
result *= Math.pow(7,((Int1 == null) ? 0 : Int1.hashCode()));
result *= Math.pow(11,((Int1 == null) ? 0 : Int1.hashCode()));
result *= Math.pow(13,((Int1 == null) ? 0 : Int1.hashCode()));
result *= Math.pow(17,((Int1 == null) ? 0 : Int1.hashCode()));
return (int) result;
}
but I guess that, although there would be absolutely no collision its cost would be too much for optimizing hash tables searches.
So, what do you think about it?
Or better...Does anyone thought of it before? and Did you get into a better implementation?
Thanx, and good programming!