Encoding problem saving the content of an HTML page (in Turkish) as a text file
807600, Nov 6 2007 (edited Nov 7 2007)
I am working on a project about classifying blogs in Turkish.
I have managed to parse the HTML and extract the content of a blog. But I need some help, because when I write the content to the screen or to a file, some characters (the Turkish ones) are corrupted. I think it is caused by the character encoding, but I don't know where to start to solve this problem.
It would be very kind if you could point me to where I should start.
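
To make the symptom concrete, here is a minimal sketch of where this kind of corruption usually comes from. It assumes Java (the post does not name the language) and assumes the page is served as ISO-8859-9, the Latin-5 Turkish charset; a real page may declare something else, such as windows-1254 or UTF-8. The class and method names are hypothetical. The idea is to decode the raw bytes with the charset the page declares, and to write the file with an explicitly chosen charset instead of the platform default.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration class; not taken from the original post.
public class TurkishEncodingSketch {

    // Decode raw page bytes with the charset the page declares
    // (HTTP Content-Type header or <meta> tag), not the platform default.
    public static String decode(byte[] rawBytes, Charset pageCharset) {
        return new String(rawBytes, pageCharset);
    }

    // Write text with an explicit charset so the file does not depend
    // on the platform default encoding.
    public static void saveAsUtf8(String text, Path target) throws IOException {
        Files.write(target, text.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the page uses ISO-8859-9.
        Charset pageCharset = Charset.forName("ISO-8859-9");

        // Turkish letters c-cedilla, g-breve, dotless i, o-umlaut, s-cedilla,
        // u-umlaut, written as Unicode escapes so the source file's own
        // encoding cannot interfere.
        String original = "\u00e7\u011f\u0131\u00f6\u015f\u00fc";

        // Simulate the bytes as they would arrive from the server.
        byte[] rawBytes = original.getBytes(pageCharset);

        // Correct charset: the text survives.
        String good = decode(rawBytes, pageCharset);
        // Wrong charset (ISO-8859-1): g-breve, dotless i and s-cedilla
        // come out as different letters.
        String bad = decode(rawBytes, StandardCharsets.ISO_8859_1);

        System.out.println("decoded correctly: " + good.equals(original));
        System.out.println("decoded with wrong charset matches: " + bad.equals(original));

        // Round trip through a UTF-8 file.
        Path file = Files.createTempFile("blog", ".txt");
        saveAsUtf8(good, file);
        String reread = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        System.out.println("file round trip intact: " + reread.equals(original));
        Files.delete(file);
    }
}
```

If the characters look wrong only on the screen but the file is fine when opened as UTF-8, the console encoding is the more likely culprit than the parsing step.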