


Oracle large table fetch from Databricks takes too long

User_1WSU2, Mar 10 2022 (edited Mar 10 2022)

I have an Oracle table containing 50 million records, about 13-15 columns, and a composite primary key. I am trying to fetch this table into Databricks using oracle.jdbc.driver.OracleDriver as shown below.
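
For completeness, the url and connectionProperties referenced in both approaches are set up roughly like this (the host, service name, and credentials here are placeholders, not the real values):

import java.util.Properties

// Placeholder connection details; the actual host, service, and credentials differ
val url = "jdbc:oracle:thin:@//dbhost:1521/ORCLPDB1"
val connectionProperties = new Properties()
connectionProperties.setProperty("user", "myuser")
connectionProperties.setProperty("password", "mypassword")
connectionProperties.setProperty("driver", "oracle.jdbc.driver.OracleDriver")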
Approaches I tried:

Approach 1.

val myDF = spark.read.jdbc(
  url = url,
  table = "TableName",
  columnName = "PartitionColumn",  // numeric column used to split the read
  lowerBound = lowerBound,
  upperBound = upperBound,
  numPartitions = 10,
  connectionProperties = connectionProperties)

myDF.write.option("mergeSchema", "true").format("delta").mode("overwrite").saveAsTable("TableName")
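
For context on what this read does: with numPartitions = 10, Spark issues ten parallel queries, each covering one stride of the lowerBound..upperBound range on the partition column. A rough sketch of that stride arithmetic follows (illustrative only, not Spark's exact internals; the bound values assume keys running roughly 1..50M):

// Rough sketch of how Spark derives per-partition predicates for a partitioned JDBC read
val lowerBound = 1L
val upperBound = 50000000L  // assumed: partition column values span roughly 1..50M
val numPartitions = 10
val stride = (upperBound - lowerBound) / numPartitions
val predicates = (0 until numPartitions).map { i =>
  val lo = lowerBound + i * stride
  val hi = lo + stride
  if (i == 0) s"PartitionColumn < $hi"                        // first partition is open-ended below
  else if (i == numPartitions - 1) s"PartitionColumn >= $lo"  // last partition is open-ended above
  else s"PartitionColumn >= $lo AND PartitionColumn < $hi"
}
predicates.foreach(println)

If the supplied bounds are much wider than the actual key range, most rows fall into the first or last partition and the parallelism is largely lost.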

Approach 2.

// No partitioning options here, so Spark reads the entire table over a single JDBC connection
val myDF = spark.read.jdbc(url, query, connectionProperties)
myDF.write.option("mergeSchema", "true").format("delta").mode("overwrite").saveAsTable("TableName")
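
The query variable isn't shown above; with this overload it has to be a table name or a parenthesized subquery with an alias, something like:

// Assumed shape of the query string (not shown in the original post)
val query = "(SELECT * FROM TableName) t"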

Both of these approaches take more than 25 hours to ingest the table into Databricks.
Also, when I try to load the data into a DataFrame and display it, nothing comes back.
Can someone suggest what I am doing wrong here? Any help on the best approach to achieve this would be appreciated.

Thanks
