Import of tables and views from Hive was introduced in Oracle SQL Developer Data Modeler 4.0.2 and probably some of you have seen that description https://blogs.oracle.com/datawarehousing/entry/oracle_sql_developer_data_modeler
I'm going to describe some detail on the process
1. The goal - there is no dedicated physical model for Hive so the goal was to extract metadata about tables defined in Hive and prepare them for use with Oracle Big Data SQL | Oracle Database | Oracle
2. Setting the connection
If you use SQL Developer 4.0.3 then steps described here https://blogs.oracle.com/datawarehousing/entry/oracle_sql_developer_data_modeler will work for you, setting the connection in standalone version of Data Modeler looks different:

If you use a JDBC driver from another vendor you need to check driver class used (could be org.apache.hive.jdbc.HiveDriver ).
3. Mapping of Hive primitive types
There is no physical model for Hive thus Hive native primitive types cannot fit into rules around logical types to native types mapping:
- Import – direct name match between hive primitive type and existing Logical data type is required. New “String “ logical data types is added in order to cover “String” primitive Hive data type and it’s mapped for all supported databases to logical type Varchar with default size set to “max” – max means it’ll take max possible value for each database version supported
- DDL generation and complex type presentation in compare/merge – the name of logical data type is taken if it matches Hive primitive type otherwise data type is transformed to Oracle 12 c data type and then transformed to Hive type:
• CHAR, NCHAR – to CHAR
• VARCHAR2, NVARCHAR2 – to Varchar
• LONGRAW, BLOB, RAW – to Binary
• FLOAT, BINARY_FLOAT – to Float
• BINARY_DOUBLE – to Double
• Date – to Date
• Timestamp – to Timestamp
• LONG, CLOB, NCLOB and everything else – to STRING
4. Mapping of Hive complex types
Structured types and collection types in Data Types model are used to represent Object relational definitions in Oracle database. The same constructs are used to represent Hive complex type:
- ARRAY – mapped to collection type
- STRUCT – mapped to Structured type
- MAP and UNIONTYPE – mapped to Structured type – one more property is added to structured type – “Hive Type” with possible values (empty – Hive Struct, MAP – Hive MAP, UNIONTYPE – Hive Uniontype
Examples:
CREATE TABLE emp1 (
name STRING,
salary FLOAT,
subordinates ARRAY<STRING>,
deductions MAP<STRING, FLOAT>,
address STRUCT<street:STRUCT<name:STRING, number:INT>, city:STRING, state:STRING, zip:INT>
)
CREATE TABLE emp_addresses (
name STRING,
salary FLOAT,
subordinates ARRAY<STRING>,
deductions MAP<STRING, FLOAT>,
addresses ARRAY<STRUCT<street:STRUCT<name:Varchar(30), number:INT>, city:STRING, state:varchar(2), zip:INT>>
)
create table union_test(
foo UNION_TYPE< int , double , aray <string> , struct< a: int , b : string>>);
Following types will be created in data types model after importing of these definitions from Hive:

So Hive complex type - STRUCT<street:STRUCT<name:Varchar(30), number:INT>, city:STRING, state:varchar(2), zip:INT>
Will be presented in Data Types model :

And column with same complex type definition will be based on the same structured type – equivalent definitions are recognized and reused
5. Imported objects
5.1 Tables - columns with data types and defined comments for table and columns are imported; also details about some Hive specific constructs are imported and stored in dynamic properties for table and columns – partitioned columns, clustering and sorting columns, table type (external or managed – in Hive terms), location …; Those dynamic properties can be included into reports or visualized on relational diagram using settings for existing dynamic properties. In DM 4.1 they can be promoted to User defined Properties for better visualization (color) and manipulation (type, list of values,default value and group)
5.2 Views- definitions are imported as they are defined in Hive – no translation is applied
6.Hive tables as external tables in Oracle 12c physical model
New functionality for presenting Oracle external tables in relational model is used to represent Hive tables in relational model (with set classification type as “External”) and in Oracle physical model (with organization set to “External”).
Size for column with primitive types can be changed using already existing in DM functionality for setting size, precision and scale.
By default complex types represented by structured types and collection types are represented in generated DDL (for Oracle external tables) with size 4000 (columns subordinates, deductions and address in the example below):
CREATE TABLE emp1_addresses
(
name VARCHAR2 (4000) ,
salary FLOAT ,
subordinates VARCHAR2(4000) ,
deductions VARCHAR2(4000) ,
address VARCHAR2(4000)
)
ORGANIZATION EXTERNAL
(
TYPE ORACLE\_HIVE
DEFAULT DIRECTORY Dir1
ACCESS PARAMETERS
(
com.oracle.bigdata.tablename:emp1_addresses
)
)
;
That size for complex types can be changed at structured type/collection type or at column level in physical model for Oracle.
New property is introduced for structured and collection types – “Max Size as String”

Complex type can be used in definitions of several tables and if the size is different then the size can be redefined at column level in physical model:

So we can tune generated DDL to:
CREATE TABLE emp1_addresses
(
name VARCHAR2 (50) ,
salary FLOAT ,
subordinates VARCHAR2(160) ,
deductions VARCHAR2(70) ,
address VARCHAR2(70)
)
ORGANIZATION EXTERNAL
(
TYPE ORACLE\_HIVE
DEFAULT DIRECTORY Dir1
ACCESS PARAMETERS
(
com.oracle.bigdata.tablename:emp1_addresses
)
)
;
7. Expanded visualization of complex type
New option is added to relational model diagrams that allow seeing details of structured types and collection of structured types:

After "resize to visible" details of complex types can be seen on diagram:

UnionType (Hive type) has a specific presentation – its meaning in Hive is that column can accommodate value of one of declared types. In used structured type those possible types are represented by attributes (>attr_1, >attr_2..) each one with related type – column foo in table default.union_test3.
Map (Hive type) is represented by structured type with 2 attributes “key” and “value” each one with type related to Hive definition – column deductions in table emp2_changed_name.
Structured type has one more property – “Hive Type” and it can be set to Map or UNIONTYPE
8. Visibility of existing dynamic properties
Visibility can be set for dynamic properties already created on tables and entities. New page is added for that in design level settings (settings in design properties dialog

And option to show them in View>details menu on diagram ("Dynamic properties" need to be chcked - the same is used for User Defined Properties in DM 4.1):

9. Hive DDl preview
Limited Hive DDL can be generated If table has dynamic property hiveDDL with value true - then it’s possible to have DDL preview for Hive.
(Data Modeler 4.1 provides possibility for user defined DDL using scripting that can replace DDL generated by built-in DDL generator - so it's possible customers to generate complete Hive DDL based on Dynamic/user defined properties set for tables and columns)

That also works on tables not imported from Hive – as the example above from OE schema

DDL statement:
CREATE TABLE OE.CUSTOMERS
(
CUSTOMER\_ID DECIMAL(6) ,
CUST\_FIRST\_NAME VARCHAR(20) ,
CUST\_LAST\_NAME VARCHAR(20) ,
CUST\_ADDRESS STRUCT\<STREET\_ADDRESS:VARCHAR(40),POSTAL\_CODE:VARCHAR(10),CITY:VARCHAR(30),STATE\_PROVINCE:VARCHAR(10),COUNTRY\_ID:CHAR(2)> ,
PHONE\_NUMBERS ARRAY\<VARCHAR(25)> ,
NLS\_LANGUAGE VARCHAR(3) ,
NLS\_TERRITORY VARCHAR(30) ,
CREDIT\_LIMIT DECIMAL(9,2) ,
CUST\_EMAIL VARCHAR(30) ,
ACCOUNT\_MGR\_ID DECIMAL(6) ,
CUST\_GEO\_LOCATION STRUCT\<SDO\_GTYPE:DECIMAL,SDO\_SRID:DECIMAL,SDO\_POINT:STRUCT\<X:DECIMAL,Y:DECIMAL,Z:DECIMAL>,SDO\_ELEM\_INFO:ARRAY\<DECIMAL>,SDO\_ORDINATES:ARRAY\<DECIMAL>> ,
DATE\_OF\_BIRTH Date ,
MARITAL\_STATUS VARCHAR(20) ,
GENDER VARCHAR(1) ,
INCOME\_LEVEL VARCHAR(20)
)
;
Philip Stoyanov