Home All Groups Group Topic Archive Search About

HashTable stores whole object as Key NOT the result of calling GetHashCode on the object when you do

Author
12 Apr 2006 10:50 AM
John Davis
Hi All

I was testing my understanding of Hashtables and I thought that if I
override GetHashCode then i could add an object to hashTable like so:

Dim ht As New Hashtable
ht.Add(s1, s1)
ht.Add(s2, s2)
ht.Add(s3, s3)

Where s* are instance of a student class that overrides GetHashCode to
return the StudentID (code at end of post).

I shouldn't add it like:

ht.Add(s1.GetHashCode(), s1)

as the HashTable calls GetHashCode on the Key for me.

This should then store the Hash (The ID returned from GHC) as the Key and
the student as the object.  However if you loop through the HashTable with:

       Dim de As DictionaryEntry
       For Each de In ht
            Console.WriteLine("Key = {0}, Value = {1}", de.Key, de.Value)
        Next

It shows that the student is stored as both the Key and the Value. Does this
mean that you should just add objects to the HashTable like:

Dim ht As New Hashtable
ht.Add(s1, Nothing)
ht.Add(s2, Nothing)
ht.Add(s3, Nothing)

Seems to me I have misunderstood something here and thw whole point of a
HashTable is to use a key to get to a value, was hoping somebody could clear
this up for me.

Cheers

John

Student Code

Class Student
    Private _Name As String
    Private _ID As Integer
    Private _Age As Integer

    Public Sub New(ByVal Name As String, ByVal ID As Integer, ByVal age As
Integer)
        _Name = Name
        _ID = ID
        _Age = age
    End Sub

    Public Overrides Function ToString() As String
        Return String.Format("{0} is {1} years old and is student no: {2}",
_Name, _Age, _ID)
    End Function

    Public Overrides Function GetHashCode() As Integer
        Return _ID
    End Function

End Class

Author
12 Apr 2006 3:33 PM
Damien
John Davis wrote:
Show quote
> Hi All
>
> I was testing my understanding of Hashtables and I thought that if I
> override GetHashCode then i could add an object to hashTable like so:
>
>  Dim ht As New Hashtable
>  ht.Add(s1, s1)
>  ht.Add(s2, s2)
>  ht.Add(s3, s3)
>
> Where s* are instance of a student class that overrides GetHashCode to
> return the StudentID (code at end of post).
>
> I shouldn't add it like:
>
> ht.Add(s1.GetHashCode(), s1)
>
> as the HashTable calls GetHashCode on the Key for me.
>
> This should then store the Hash (The ID returned from GHC) as the Key and
> the student as the object.  However if you loop through the HashTable with:
>
>        Dim de As DictionaryEntry
>        For Each de In ht
>             Console.WriteLine("Key = {0}, Value = {1}", de.Key, de.Value)
>         Next
>
> It shows that the student is stored as both the Key and the Value. Does this
> mean that you should just add objects to the HashTable like:
>
>  Dim ht As New Hashtable
>  ht.Add(s1, Nothing)
>  ht.Add(s2, Nothing)
>  ht.Add(s3, Nothing)
>
> Seems to me I have misunderstood something here and thw whole point of a
> HashTable is to use a key to get to a value, was hoping somebody could clear
> this up for me.
>
The purpose of a hashtable is to provide a fast lookup based on the
key. It achieves this lookup using the hash of the key value. But
remember, two keys may have the same hash value, so it has to store the
entire key to be sure it's located the correct item.

The best way to understand how it works is to use Reflector. Hashtable
is implemented in mscorlib, and the interesting things to look at are
Insert() and InitHash/GetHash. (Also look at ContainsKey() to
understand how the lookups work)

In your case, you're own objects can serve as their key, but in many
cases they will not - for instance, a hashtable containing a large
object relating to pending requests for a client may be keyed on the
ClientID. In your case, you could have chosen not to override
GetHashCode and instead called ht.Add(s1.StudentID,s1). This will allow
you to find an item using their StudentID, whereas in your example, you
need to have an Student object before you can search for... a Student
object. About the only thing you can do with your sample is, I believe,
use your hash table to test for membership (is the student in the hash
table), whereas with my example, I can determine membership and locate
the larger object.

Hope this helps,

Damien
Author
13 Apr 2006 8:55 AM
Nick Hounsome
Typically you would use Add(s1.StudentId,s1) as this allows you to look up a
student given ONLY the student ID [without doing something stupid like
creating a student from the ID for the sole purpose of lookup]

Whilst it is a good idea to override GetHashCode to return student ID, the
hash code is not generally required to be unique so writing
Add(s1.GetHashCode(),s1) would still be bad as it would confuse people.

Once you have your student you could then have a different hashtable to look
up some other stuff to do with the student such as Add(s1,degreeName). In
this case the hash table would indeed keep a REFERENCE to the student and
call the GetHashCode method. This is often useful as it allows you to find
all the students in the relationship in one step rather than mapping from ID
to degreeName which would require you to get all the IDs and then lookup
them up in the "ID to student" table.

To keep the code nice and deal with the asymetric access you will want 2
methods (or equivalent):

void Add(Student s) { ht.Add(s.StudentId,s); }
Student GetStudent(int id) { return ht[id]; }


Show quote
"Damien" <Damien_The_Unbelie***@hotmail.com> wrote in message
news:1144856001.378762.213920@z34g2000cwc.googlegroups.com...
> John Davis wrote:
>> Hi All
>>
>> I was testing my understanding of Hashtables and I thought that if I
>> override GetHashCode then i could add an object to hashTable like so:
>>
>>  Dim ht As New Hashtable
>>  ht.Add(s1, s1)
>>  ht.Add(s2, s2)
>>  ht.Add(s3, s3)
>>
>> Where s* are instance of a student class that overrides GetHashCode to
>> return the StudentID (code at end of post).
>>
>> I shouldn't add it like:
>>
>> ht.Add(s1.GetHashCode(), s1)
>>
>> as the HashTable calls GetHashCode on the Key for me.
>>
>> This should then store the Hash (The ID returned from GHC) as the Key and
>> the student as the object.  However if you loop through the HashTable
>> with:
>>
>>        Dim de As DictionaryEntry
>>        For Each de In ht
>>             Console.WriteLine("Key = {0}, Value = {1}", de.Key, de.Value)
>>         Next
>>
>> It shows that the student is stored as both the Key and the Value. Does
>> this
>> mean that you should just add objects to the HashTable like:
>>
>>  Dim ht As New Hashtable
>>  ht.Add(s1, Nothing)
>>  ht.Add(s2, Nothing)
>>  ht.Add(s3, Nothing)
>>
>> Seems to me I have misunderstood something here and thw whole point of a
>> HashTable is to use a key to get to a value, was hoping somebody could
>> clear
>> this up for me.
>>
> The purpose of a hashtable is to provide a fast lookup based on the
> key. It achieves this lookup using the hash of the key value. But
> remember, two keys may have the same hash value, so it has to store the
> entire key to be sure it's located the correct item.
>
> The best way to understand how it works is to use Reflector. Hashtable
> is implemented in mscorlib, and the interesting things to look at are
> Insert() and InitHash/GetHash. (Also look at ContainsKey() to
> understand how the lookups work)
>
> In your case, you're own objects can serve as their key, but in many
> cases they will not - for instance, a hashtable containing a large
> object relating to pending requests for a client may be keyed on the
> ClientID. In your case, you could have chosen not to override
> GetHashCode and instead called ht.Add(s1.StudentID,s1). This will allow
> you to find an item using their StudentID, whereas in your example, you
> need to have an Student object before you can search for... a Student
> object. About the only thing you can do with your sample is, I believe,
> use your hash table to test for membership (is the student in the hash
> table), whereas with my example, I can determine membership and locate
> the larger object.
>
> Hope this helps,
>
> Damien
>
Author
13 Apr 2006 1:31 PM
John Davis
Hi Guys

Thanks for your input, I had a look using Reflector and it is a lot clearer
now.

Regarding your suggestion to add a student using

ht.Add(s1.StudentID, s1)

This is obviously a good approach if there is an obvious property or public
field to use as the hash but I just wondered if a class designer has
overriden GetHashCode then this would mean that the user of said object
wouldn't have to know which field property to use and could just call.

ht.Add(s1.GetHashCode(), s1)

Having said this the user of the object would need to know what GetHashCode
actually returned in order to do a lookup so they might as well just add it
using that property/field in the first place.  I guess like you say Nick,
GetHashCode is only useful if your are storing the student as the key in
order to store a different object (such as there exams results/degree name)
in the value.

Cheers again for your help on this matter.

John


Show quote
"Nick Hounsome" <nh***@nickhounsome.me.uk> wrote in message
news:hmo%f.43180$8Q3.43128@fe1.news.blueyonder.co.uk...
> Typically you would use Add(s1.StudentId,s1) as this allows you to look up
> a student given ONLY the student ID [without doing something stupid like
> creating a student from the ID for the sole purpose of lookup]
>
> Whilst it is a good idea to override GetHashCode to return student ID, the
> hash code is not generally required to be unique so writing
> Add(s1.GetHashCode(),s1) would still be bad as it would confuse people.
>
> Once you have your student you could then have a different hashtable to
> look up some other stuff to do with the student such as
> Add(s1,degreeName). In this case the hash table would indeed keep a
> REFERENCE to the student and call the GetHashCode method. This is often
> useful as it allows you to find all the students in the relationship in
> one step rather than mapping from ID to degreeName which would require you
> to get all the IDs and then lookup them up in the "ID to student" table.
>
> To keep the code nice and deal with the asymetric access you will want 2
> methods (or equivalent):
>
> void Add(Student s) { ht.Add(s.StudentId,s); }
> Student GetStudent(int id) { return ht[id]; }
>
>
> "Damien" <Damien_The_Unbelie***@hotmail.com> wrote in message
> news:1144856001.378762.213920@z34g2000cwc.googlegroups.com...
>> John Davis wrote:
>>> Hi All
>>>
>>> I was testing my understanding of Hashtables and I thought that if I
>>> override GetHashCode then i could add an object to hashTable like so:
>>>
>>>  Dim ht As New Hashtable
>>>  ht.Add(s1, s1)
>>>  ht.Add(s2, s2)
>>>  ht.Add(s3, s3)
>>>
>>> Where s* are instance of a student class that overrides GetHashCode to
>>> return the StudentID (code at end of post).
>>>
>>> I shouldn't add it like:
>>>
>>> ht.Add(s1.GetHashCode(), s1)
>>>
>>> as the HashTable calls GetHashCode on the Key for me.
>>>
>>> This should then store the Hash (The ID returned from GHC) as the Key
>>> and
>>> the student as the object.  However if you loop through the HashTable
>>> with:
>>>
>>>        Dim de As DictionaryEntry
>>>        For Each de In ht
>>>             Console.WriteLine("Key = {0}, Value = {1}", de.Key,
>>> de.Value)
>>>         Next
>>>
>>> It shows that the student is stored as both the Key and the Value. Does
>>> this
>>> mean that you should just add objects to the HashTable like:
>>>
>>>  Dim ht As New Hashtable
>>>  ht.Add(s1, Nothing)
>>>  ht.Add(s2, Nothing)
>>>  ht.Add(s3, Nothing)
>>>
>>> Seems to me I have misunderstood something here and thw whole point of a
>>> HashTable is to use a key to get to a value, was hoping somebody could
>>> clear
>>> this up for me.
>>>
>> The purpose of a hashtable is to provide a fast lookup based on the
>> key. It achieves this lookup using the hash of the key value. But
>> remember, two keys may have the same hash value, so it has to store the
>> entire key to be sure it's located the correct item.
>>
>> The best way to understand how it works is to use Reflector. Hashtable
>> is implemented in mscorlib, and the interesting things to look at are
>> Insert() and InitHash/GetHash. (Also look at ContainsKey() to
>> understand how the lookups work)
>>
>> In your case, you're own objects can serve as their key, but in many
>> cases they will not - for instance, a hashtable containing a large
>> object relating to pending requests for a client may be keyed on the
>> ClientID. In your case, you could have chosen not to override
>> GetHashCode and instead called ht.Add(s1.StudentID,s1). This will allow
>> you to find an item using their StudentID, whereas in your example, you
>> need to have an Student object before you can search for... a Student
>> object. About the only thing you can do with your sample is, I believe,
>> use your hash table to test for membership (is the student in the hash
>> table), whereas with my example, I can determine membership and locate
>> the larger object.
>>
>> Hope this helps,
>>
>> Damien
>>
>
>

AddThis Social Bookmark Button