Base40 encode and decode

962952, Sep 18 2012 (edited Sep 18 2012)
Hi

I'm looking to compress (and decompress) some already-short strings, each roughly the amount of text a search engine might show for a result.

One of the most effective ways of compressing short strings seems to be base-40 encoding, as found in the accepted answer here: http://stackoverflow.com/questions/7389252/shorten-an-already-short-string-in-java

At least against my data, it seems to outperform LZF and Smaz.

However, I can't for the life of me figure out how to decode it. I've even found encode and decode implementations in C, but my C isn't good enough to port them to Java: http://www.drdobbs.com/embedded-systems/slimming-strings-with-custom-base-40-pac/229400732

To show willing, here's one of many attempts at writing a method to decode it:

    public String unpack(byte[] input) { //FIXME: No workie.
        ByteArrayInputStream bois = new ByteArrayInputStream(input);
        DataInputStream dis = new DataInputStream(bois);

        StringBuilder sb = new StringBuilder();

        char a,b,c;
        try {
            while ((a = dis.readChar()) != '\0' && (b = dis.readChar()) != '\0' && (c = dis.readChar()) != '\0') {
                sb.append(chars.charAt(a % 40));
                sb.append(chars.charAt(b / 40 % 40));
                sb.append(chars.charAt(c / 40 / 40));
            }
        } catch (IOException e) {
            throw new AssertionError(e);
        }

        return sb.toString();
    }

Could anyone help me out? Thanks in advance.
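
For reference, here is a minimal sketch of how a matching encode/decode pair might look. It is not the code from the linked answer or the Dr. Dobb's article. It assumes that the packer stores three alphabet indices in each big-endian 16-bit value as a * 1600 + b * 40 + c (possible because 40^3 = 64000 fits in 65536), that index 0 is reserved as a padding marker, and that the 40-symbol alphabet is the one in CHARS below. The class name Base40, the alphabet, and the helper names are all assumptions made for illustration. Under that scheme, the attempt above would fail because it reads three 16-bit values per loop iteration and takes one symbol from each, whereas every single 16-bit value already holds all three symbols.

```java
import java.nio.ByteBuffer;

final class Base40 {
    // Assumed alphabet: index 0 is a padding marker, followed by 39 usable symbols.
    static final String CHARS = "\0 abcdefghijklmnopqrstuvwxyz0123456789.-";

    // Packs three symbols into each big-endian 16-bit value.
    static byte[] pack(String text) {
        int groups = (text.length() + 2) / 3;
        ByteBuffer out = ByteBuffer.allocate(groups * 2);
        for (int i = 0; i < text.length(); i += 3) {
            int a = indexAt(text, i);
            int b = indexAt(text, i + 1);
            int c = indexAt(text, i + 2);
            out.putChar((char) (a * 1600 + b * 40 + c));
        }
        return out.array();
    }

    // Reverses pack(): one 16-bit value in, up to three symbols out.
    static String unpack(byte[] input) {
        ByteBuffer in = ByteBuffer.wrap(input);
        StringBuilder sb = new StringBuilder(input.length / 2 * 3);
        while (in.remaining() >= 2) {
            int v = in.getChar(); // unsigned 16-bit, same byte order as DataInputStream.readChar()
            if (v >= 64000) {
                throw new IllegalArgumentException("Not a valid packed value: " + v);
            }
            appendSymbol(sb, v / 1600);    // first symbol of the group
            appendSymbol(sb, v / 40 % 40); // second symbol
            appendSymbol(sb, v % 40);      // third symbol
        }
        return sb.toString();
    }

    // Returns the alphabet index of the character at pos, or 0 (padding) past the end.
    private static int indexAt(String text, int pos) {
        if (pos >= text.length()) {
            return 0;
        }
        int idx = CHARS.indexOf(text.charAt(pos));
        if (idx <= 0) {
            throw new IllegalArgumentException("Unsupported character: " + text.charAt(pos));
        }
        return idx;
    }

    // Appends the symbol for index, skipping the padding marker.
    private static void appendSymbol(StringBuilder sb, int index) {
        if (index != 0) {
            sb.append(CHARS.charAt(index));
        }
    }
}
```

With this sketch, Base40.unpack(Base40.pack(s)) should return s for any string made only of the 39 usable symbols, and the packed form takes two bytes per three characters. Input outside the assumed alphabet (uppercase letters, for example) is rejected rather than silently mangled; if the real packer uses a different alphabet or bit layout, CHARS and the arithmetic would need to change to match it.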
Locked on Oct 16 2012
Added on Sep 18 2012