HTML String to text

843841, Aug 4 2004 (edited Aug 4 2004)
Hi,
I wrote a servlet that receives e-mails and prints them. The problem is that some e-mails contain an HTML message, so I want to convert the string that holds the HTML message into plain text.
I found this program, which reads an HTML page from a URL and prints its text. That is exactly what I want, except that it reads the HTML from a URL and not from a String:

import java.io.*;
import java.net.*;
import java.util.*;
import javax.swing.*;
import javax.swing.text.*;
import javax.swing.text.html.*;

class HTML2Text {
    public static void main(String[] args) {
        EditorKit kit = new HTMLEditorKit();
        Document doc = kit.createDefaultDocument();
        // The Document class does not yet handle charsets properly.
        doc.putProperty("IgnoreCharsetDirective", Boolean.TRUE);
        int status = 0;
        // Create a reader on the HTML content; it is closed automatically.
        try (Reader rd = getReader(args[0])) {
            // Parse the HTML into the document.
            kit.read(rd, doc, 0);
            // Print the document's text content, without the markup.
            System.out.println(doc.getText(0, doc.getLength()));
        }
        catch (Exception e) {
            e.printStackTrace();
            status = 1;
        }
        // Exit explicitly; report failure only if an exception occurred.
        System.exit(status);
    }

    // Returns a reader on the HTML data. If 'uri' begins
    // with "http:", it's treated as a URL; otherwise,
    // it's assumed to be a local filename.
    static Reader getReader(String uri)
            throws IOException {
        // Retrieve from Internet.
        if (uri.startsWith("http:")) {
            URLConnection conn = new URL(uri).openConnection();
            return new InputStreamReader(conn.getInputStream());
        }
        // Retrieve from file.
        else {
            return new FileReader(uri);
        }
    }
}

Can someone tell me how to change this program so that it takes a String containing the HTML instead of the URL of the HTML page?

Thanks, Naor.
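
One possible approach, offered as a minimal sketch rather than a definitive answer: kit.read() accepts any java.io.Reader, so the HTML string can be wrapped in a java.io.StringReader in place of the reader returned by getReader(). The class name HtmlStringToText and the method name toText below are made up for illustration, and the sketch assumes the same HTMLEditorKit setup as the program above.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import javax.swing.text.BadLocationException;
import javax.swing.text.Document;
import javax.swing.text.EditorKit;
import javax.swing.text.html.HTMLEditorKit;

// Hypothetical helper class; the name is an assumption for this sketch.
class HtmlStringToText {

    // Parses the given HTML string and returns its text content.
    static String toText(String html) {
        EditorKit kit = new HTMLEditorKit();
        Document doc = kit.createDefaultDocument();
        // Same workaround as the original program: ignore charset directives.
        doc.putProperty("IgnoreCharsetDirective", Boolean.TRUE);
        // StringReader exposes the in-memory String as a Reader.
        try (Reader rd = new StringReader(html)) {
            kit.read(rd, doc, 0);
            return doc.getText(0, doc.getLength());
        } catch (IOException | BadLocationException e) {
            // Not expected for an in-memory string; rethrown unchecked.
            throw new IllegalArgumentException("Could not parse HTML", e);
        }
    }

    public static void main(String[] args) {
        String html = "<html><body><p>Hello <b>world</b></p></body></html>";
        System.out.println(toText(html));
    }
}
```

In the servlet, the body of the received e-mail would be passed to toText() instead of the sample string. The returned text may contain leading or trailing line breaks, so calling trim() on the result before printing may be useful.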