Hi,
I have problem when extracting RSS from some feeds, Scandinavian characters are converted.
When I view feed source, characters are correct. So I assume problem might be my XML query.
Example RSS feed from
http://www.mtv3.fi/rss/urheilu_f1.rss
I get results like
Pakokaasuvirtausten hyödyntäminen rajoitetaan minimiin
That should say
Pakokaasuvirtausten hyädyntäminen rajoitetaan minimiin
And problem do not occur e.g. feed from here
http://www.hs.fi/uutiset/rss/
I use below table and package to get RSS fead and store it to clob column
CREATE TABLE rss
(payload CLOB
);
create or replace
PACKAGE RSS_UTIL AS
g_wallet_path CONSTANT VARCHAR2(200) := 'file:/somepath/wallets';
g_wallet_pass CONSTANT VARCHAR2(200) := 'verysecret';
g_collection_name CONSTANT VARCHAR2(200) := 'RSS_RESPONSE';
g_user_agent CONSTANT VARCHAR2(200) := 'Mozilla/5.0';
--------------------------------------------------------------------------------
PROCEDURE get_rss_feed(
p_url IN VARCHAR2
);
END;
/
create or replace
PACKAGE BODY RSS_UTIL AS
PROCEDURE get_rss_feed(
p_url IN VARCHAR2
)
AS
l_clob CLOB;
l_buffer VARCHAR2(32767);
l_http_req utl_http.req;
l_http_resp utl_http.resp;
BEGIN
--
--utl_http.set_wallet(g_wallet_path, g_wallet_pass);
--
utl_http.set_proxy(apex_application.g_proxy_server, NULL);
--
l_http_req := utl_http.begin_request(p_url);
--
utl_http.set_header(l_http_req, 'User-Agent', g_user_agent);
--
l_http_resp := utl_http.get_response(l_http_req);
--
dbms_lob.createtemporary(l_clob, FALSE);
dbms_lob.open(l_clob, dbms_lob.lob_readwrite);
--
BEGIN
LOOP
utl_http.read_text(l_http_resp, l_buffer);
dbms_lob.writeappend(l_clob, LENGTH(l_buffer), l_buffer);
END LOOP;
EXCEPTION WHEN utl_http.end_of_body THEN
NULL;
END;
--
utl_http.end_response(l_http_resp);
--
EXECUTE IMMEDIATE 'TRUNCATE TABLE RSS';
INSERT INTO RSS VALUES(l_clob);
COMMIT;
--
END;
END;
/
BEGIN
rss_util.get_rss_feed('http://www.mtv3.fi/rss/urheilu_f1.rss');
END;
/
And I query data from clob by
SELECT extractValue(value(t), '//link/text()') AS LINK,
extractValue(value(t), '//category[1]/text()') AS category,
extractValue(value(t), '//category[2]/text()') AS sub_category,
extractValue(value(t), '//title/text()') AS title,
extractValue(value(t), '//description/text()') AS description,
extractValue(value(t), '//author/text()') AS author,
extractValue(value(t), '//pubDate/text()') AS rss_date
FROM rss,
TABLE(xmlsequence(extract(xmltype(rss.payload),'//rss/channel/item'))) t ;
I have same also as Apex app if you like check
http://actionet.homelinux.net/htmldb/lspdemo?p=48
Just press login button to enter application.
Only difference is that I store RSS to apex_collection clob column and query it from there to report.
How I should change my code or query to fix this problem?
Regards,
Jari
http://dbswh.webhop.net/dbswh/f?p=BLOG:HOME:0