Hi,
I'm trying to find a way to load JSON data stored in HDFS into an Oracle database via the Oracle SQL Connector for HDFS (OSCH).
The most promising approach seemed to be to set up an external Hive table with a JSON SerDe and then create an Oracle external table on top of it with the OSCH ExternalTable tool.
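Each input file holds one JSON object per line, with attributes matching the three columns of the Hive table; schematically it looks something like this (the values here are invented for illustration, not my real data):

{"col1": "some text", "col2": 42.0, "col3": "more text"}

The Hive table definition: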
ADD JAR hive-json-serde-0.2.jar;

CREATE EXTERNAL TABLE owners_test2 (
  col1 string,
  col2 double,
  col3 string
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.JsonSerde'
LOCATION 'hdfs:';
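To create the Oracle external table on top of it, I then run the OSCH ExternalTable tool, roughly like this (a sketch of my call; the JDBC URL, user, and directory name are placeholders, not my real settings):

# note: connection URL, user and default directory below are placeholders
hadoop jar $OSCH_HOME/jlib/orahdfs.jar oracle.hadoop.exttab.ExternalTable \
    -D oracle.hadoop.exttab.tableName=OWNERS_TEST2_EXT \
    -D oracle.hadoop.exttab.sourceType=hive \
    -D oracle.hadoop.exttab.hive.databaseName=default \
    -D oracle.hadoop.exttab.hive.tableName=owners_test2 \
    -D oracle.hadoop.exttab.defaultDirectory=OSCH_DEFAULT_DIR \
    -D oracle.hadoop.exttab.connection.url=jdbc:oracle:thin:@//dbhost:1521/orcl \
    -D oracle.hadoop.exttab.connection.user=SCOTT \
    -createTable

That run fails with the output below: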
2014-05-12 16:41:25,927 INFO [main] metastore.HiveMetaStore (HiveMetaStore.java:newRawStore(404)) - 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
2014-05-12 16:41:25,955 INFO [main] metastore.ObjectStore (ObjectStore.java:initialize(222)) - ObjectStore, initialize called
2014-05-12 16:41:26,178 INFO [main] DataNucleus.Persistence (Log4JLogger.java:info(77)) - Property datanucleus.cache.level2 unknown - will be ignored
2014-05-12 16:41:26,697 INFO [main] metastore.ObjectStore (ObjectStore.java:getPMF(267)) - Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
2014-05-12 16:41:26,732 INFO [main] metastore.ObjectStore (ObjectStore.java:setConf(205)) - Initialized ObjectStore
2014-05-12 16:41:27,745 INFO [main] metastore.HiveMetaStore (HiveMetaStore.java:logInfo(447)) - 0: get_table : db=default tbl=owners_test2
2014-05-12 16:41:27,834 INFO [main] HiveMetaStore.audit (HiveMetaStore.java:logAuditEvent(238)) - ugi=oracle ip=unknown-ip-addr cmd=get_table : db=default tbl=owners_test2
2014-05-12 16:41:27,839 INFO [main] DataNucleus.Datastore (Log4JLogger.java:info(77)) - The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
2014-05-12 16:41:27,840 INFO [main] DataNucleus.Datastore (Log4JLogger.java:info(77)) - The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
oracle.hadoop.exttab.ExternalTableException: Unsupported Hive table serialization library org.apache.hadoop.hive.contrib.serde2.JsonSerde
at oracle.hadoop.exttab.hive.HiveSource.initialize(HiveSource.java:262)
at oracle.hadoop.exttab.hive.HiveSource.getDataSet(HiveSource.java:444)
at oracle.hadoop.exttab.ExternalTable.doCreateTable(ExternalTable.java:1367)
at oracle.hadoop.exttab.ExternalTable.run(ExternalTable.java:1673)
at oracle.hadoop.exttab.ExternalTable.run(ExternalTable.java:1745)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at oracle.hadoop.exttab.ExternalTable.main(ExternalTable.java:1851)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
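The only workaround I can think of so far is to materialize the JSON table into a plain delimited text table first and point OSCH at that instead; a sketch, assuming OSCH accepts the default text SerDe of such a table:

-- staging copy in delimited text (duplicates the data in HDFS)
CREATE TABLE owners_test2_text
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE
AS SELECT col1, col2, col3 FROM owners_test2;

But that copies all the data, so I'd much rather read the JSON directly.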
Does anybody have any suggestions? I've literally found nothing on Google.
What are your preferred ways to load JSON data from HDFS into Oracle tables?
Thanks!