Java Programming

HashMap memory usage

807588, Sep 27 2007 (edited Feb 11 2009)
Hi,
I am implementing an indexer/compressor for plain-text files (text, query logs, and URL files). The core of the indexer is a Huffman codec, plus various add-ons to boost performance.
Huffman coding is applied to words (Huffword). The first step is a complete scan of the file to collect term frequencies, which I then use to build the Huffman model. Frequencies are stored in a HashMap<String, Integer>.
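For reference, the frequency-collection pass looks roughly like the sketch below. This is a minimal illustration, not the actual indexer code: the class name `TermFrequencies`, the method `count`, and the whitespace tokenizer are assumptions, and it uses `Map.merge` and `BufferedReader.lines()`, which require Java 8 or later.

```java
import java.io.BufferedReader;
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper; the name and the whitespace-based tokenization are assumptions.
final class TermFrequencies {

    // Scans the input once and counts how often each whitespace-separated term occurs.
    static Map<String, Integer> count(BufferedReader reader) {
        Map<String, Integer> freq = new HashMap<>();
        reader.lines().forEach(line -> {
            for (String term : line.split("\\s+")) {
                // split() can yield an empty token for blank lines or leading whitespace
                if (!term.isEmpty()) {
                    freq.merge(term, 1, Integer::sum);
                }
            }
        });
        return freq;
    }

    public static void main(String[] args) {
        String sample = "to be or not to be\nthat is the question";
        Map<String, Integer> freq = count(new BufferedReader(new StringReader(sample)));
        System.out.println("distinct terms: " + freq.size());
    }
}
```

Every distinct term ends up as one String key, one Integer value, and one internal map entry, which is where the memory goes.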
The main problem is the size of the HashMap: I quickly run out of memory.
From a 300 MB query log I collect around 1,700,000 String-Integer pairs. Is it plausible that I need a 512 MB heap for this?
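A rough back-of-the-envelope estimate of the map's footprint can be sketched as follows. Every size here is an assumption, not a measurement: a 32-bit JVM with 8-byte object headers, 4-byte references, 8-byte alignment, a String that holds a separate char[] plus three int fields, and the default HashMap load factor of 0.75. The class `HeapEstimate` and the average term length are hypothetical. Shared Integer instances for small counts are ignored, so the Integer part is an upper bound.

```java
// Hypothetical estimator; all object sizes below are assumed, not measured.
final class HeapEstimate {

    // Rounds an object size up to the assumed 8-byte alignment.
    static long align8(long bytes) {
        return (bytes + 7) / 8 * 8;
    }

    // Assumed cost of one String-Integer pair, including the map's entry object.
    static long bytesPerEntry(int avgTermChars) {
        long entry = align8(8 + 4 + 4 + 4 + 4);         // header, key ref, value ref, next ref, hash
        long string = align8(8 + 4 + 4 + 4 + 4);        // header, char[] ref, offset, count, hash
        long chars = align8(8 + 4 + 2L * avgTermChars); // header, length, 2 bytes per char
        long integer = align8(8 + 4);                   // header, int value
        return entry + string + chars + integer;
    }

    // Assumed cost of the bucket array: a power-of-two capacity that is
    // doubled until capacity * 0.75 is at least the number of entries.
    static long tableBytes(long entries) {
        long capacity = 16;
        while (capacity * 0.75 < entries) {
            capacity *= 2;
        }
        return align8(8 + 4 + 4 * capacity);            // array header, length, one ref per bucket
    }

    static long estimate(long entries, int avgTermChars) {
        return entries * bytesPerEntry(avgTermChars) + tableBytes(entries);
    }

    public static void main(String[] args) {
        // 10 characters per term is a guess; substitute the real average.
        System.out.println(estimate(1_700_000L, 10) + " bytes");
    }
}
```

Under these assumptions, and with a guessed average of 10 characters per term, each pair costs about 96 bytes, so 1,700,000 pairs come to roughly 180 MB including the bucket array. The real figure depends on the JVM, on the actual term lengths, and on how the key strings are created.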
Post Details
Locked on Mar 11 2009
Added on Sep 27 2007