
Java Database Connectivity (JDBC)


Loading extremely large JSON files into an Oracle/PostgreSQL database via Python

User_23889, Oct 26 2022 (edited Oct 30 2022)

I am starting a new project where I have to load extremely large JSON files into an Oracle database (or PostgreSQL, though that is not very important).
The files are approximately 100-150 GB, holding 13 different arrays of at least 500 million lines each. Fortunately, the structure isn't very deep; most arrays are just one level deep, but with a lot of properties.
I am very experienced with databases, but new to Python and JSON.
The size of the files means that I cannot just use the most common Python libraries, or at least that I have to use them in ways that aren't very intuitive to me as a beginner.
I have a working solution for my 1 MB test JSON, but it is not scalable, and I have not been able to produce a solution for the real file because of its sheer size.
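To show the kind of thing I mean, my test solution looks roughly like this (a simplified sketch; the array, table, and column names are placeholders, and I am using the python-oracledb driver):

```python
import json
import oracledb  # python-oracledb driver

# Load the entire file into memory. Fine for a 1 MB test file,
# hopeless for a 100+ GB production file.
with open("test.json", "r", encoding="utf-8") as f:
    data = json.load(f)

conn = oracledb.connect(user="scott", password="tiger", dsn="localhost/XEPDB1")
cur = conn.cursor()

# One INSERT per element: hundreds of millions of round trips at full scale.
for item in data["some_array"]:  # placeholder array name
    cur.execute(
        "INSERT INTO some_table (id, name) VALUES (:1, :2)",  # placeholder table
        (item["id"], item["name"]),
    )

conn.commit()
cur.close()
conn.close()
```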
I would be eternally grateful if someone could guide me in the right direction regarding useful libraries, solutions, or helpful websites for:
- Loading, parsing, and traversing such huge files.
- A good, Pythonic way to load them into the database, as I suspect 800 million individual inserts are not the ideal approach (see the sketch below for the kind of thing I mean).
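Is something along these lines a sensible direction? This is only a sketch of what I have pieced together so far, assuming the third-party ijson library for streaming parsing and python-oracledb's executemany() for batched inserts; the array, table, and column names are again placeholders:

```python
import ijson     # third-party streaming JSON parser
import oracledb  # python-oracledb driver

BATCH_SIZE = 10_000  # rows per round trip; a tuning guess, not a measured value

conn = oracledb.connect(user="scott", password="tiger", dsn="localhost/XEPDB1")
cur = conn.cursor()

with open("huge.json", "rb") as f:
    # ijson.items() yields one array element at a time without ever
    # holding the whole document in memory. "some_array.item" is a
    # placeholder prefix for one of the 13 top-level arrays.
    batch = []
    for item in ijson.items(f, "some_array.item"):
        batch.append((item["id"], item["name"]))  # placeholder columns
        if len(batch) >= BATCH_SIZE:
            cur.executemany(
                "INSERT INTO some_table (id, name) VALUES (:1, :2)",
                batch,
            )
            conn.commit()
            batch.clear()

    if batch:  # flush the final partial batch
        cur.executemany(
            "INSERT INTO some_table (id, name) VALUES (:1, :2)",
            batch,
        )
        conn.commit()

cur.close()
conn.close()
```

For the PostgreSQL variant, I gather COPY (e.g. via psycopg) is the usual bulk-loading path, but I have not tried it yet.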
If more info is needed, I will gladly expand on the topic.
