Need help on efficient searches using hashes on large #s of data structures
800348 · Sep 1 2008 (edited Sep 3 2008)

I have a massive amount of data and I need a way to access it using generated indexes instead of a traditional search such as a binary chop, because I require a fast response time.
I have 100,000 Array data structures, each containing roughly 10 to 60 elements. The elements of the Arrays do not store any raw data; they just hold references to objects. There are about 10,000 unique objects that the arrays reference.
What I need to do is supply a number of objects as the query and the system should return all the arrays which contain two or more different objects from the query.
So for example:

Given the query objects: "DER", "ERE", "YPS"

and the arrays:
1. [DER, SFR, PPR]
2. [PER, ERE, SWE, YPS]
3. [ERE, PPD, DER, YPS, SWE]
4. [PRD, LDF, WSA, MMD]

it should return arrays 2 and 3.
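To pin down the matching rule, here is a brute-force sketch (assuming Java, since HashMaps are mentioned later; all names here are hypothetical): for each array, count how many distinct query objects it contains and keep it if the count is at least two. This is the behaviour any faster index would need to reproduce.

```java
import java.util.*;

public class NaiveQuery {
    // Return the 1-based positions of arrays containing two or more
    // distinct objects from the query set. O(total elements) per query.
    static List<Integer> matchingArrays(List<String[]> arrays, Set<String> query) {
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < arrays.size(); i++) {
            Set<String> distinct = new HashSet<>(Arrays.asList(arrays.get(i)));
            distinct.retainAll(query);               // distinct query objects in this array
            if (distinct.size() >= 2) result.add(i + 1);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String[]> arrays = List.of(
            new String[]{"DER", "SFR", "PPR"},
            new String[]{"PER", "ERE", "SWE", "YPS"},
            new String[]{"ERE", "PPD", "DER", "YPS", "SWE"},
            new String[]{"PRD", "LDF", "WSA", "MMD"});
        System.out.println(matchingArrays(arrays, Set.of("DER", "ERE", "YPS")));
        // prints [2, 3]
    }
}
```

Scanning all 100,000 arrays per query is what the generated-index idea below is meant to avoid; this version just defines correctness.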
I had a look at HashMaps, but I'm not entirely sure that's what I need.
Would I need to hash each array, then hash the query objects and look for overlaps between the two hashes, or something similar?
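Hashing whole arrays probably isn't the right shape for this query. One standard technique that fits (a sketch only, with hypothetical class and method names) is an inverted index: a HashMap from each object to the list of array ids containing it. A query then walks only the postings lists of its own objects, counting hits per array id, instead of scanning all 100,000 arrays.

```java
import java.util.*;

public class InvertedIndex {
    // object -> ids of arrays that contain it (the "postings list")
    private final Map<String, List<Integer>> index = new HashMap<>();

    // Index one array. De-duplicate its elements first so an object
    // repeated within an array is counted once per array.
    void add(int arrayId, String[] array) {
        for (String obj : new HashSet<>(Arrays.asList(array))) {
            index.computeIfAbsent(obj, k -> new ArrayList<>()).add(arrayId);
        }
    }

    // Ids of arrays containing at least minMatches distinct query objects.
    List<Integer> query(Set<String> queryObjects, int minMatches) {
        Map<Integer, Integer> counts = new HashMap<>();
        for (String obj : queryObjects) {
            for (int id : index.getOrDefault(obj, List.of())) {
                counts.merge(id, 1, Integer::sum);   // one hit per (array, object) pair
            }
        }
        List<Integer> result = new ArrayList<>();
        for (Map.Entry<Integer, Integer> e : counts.entrySet()) {
            if (e.getValue() >= minMatches) result.add(e.getKey());
        }
        Collections.sort(result);
        return result;
    }

    public static void main(String[] args) {
        InvertedIndex ix = new InvertedIndex();
        ix.add(1, new String[]{"DER", "SFR", "PPR"});
        ix.add(2, new String[]{"PER", "ERE", "SWE", "YPS"});
        ix.add(3, new String[]{"ERE", "PPD", "DER", "YPS", "SWE"});
        ix.add(4, new String[]{"PRD", "LDF", "WSA", "MMD"});
        System.out.println(ix.query(Set.of("DER", "ERE", "YPS"), 2));  // prints [2, 3]
    }
}
```

At the sizes described (100,000 arrays of 10-60 references, 10,000 unique objects), the index holds a few million (object, array-id) entries and is built once; each query then costs time proportional only to the postings lists of the queried objects, which should give the fast response time asked for.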
Thanks in advance.