database - Best practices for huge volumes of data load/unload?


My question applies to ETL scenarios where the transformation is done completely outside the database. If you had to Extract, Transform and Load huge volumes of data (20+ million records or more), and the databases involved are Oracle and MSSQL Server, what would be the best way to:

  1. Read efficiently from the source database: is there a way I can avoid all the querying over the network? I've heard good things about the Direct Path Extract method / bulk unload approach - I'm not sure how they work, but I'm guessing that for any kind of non-network-based read I would need the data dumped out to a file first?
  2. Write the transformed data efficiently to the target database: should I consider Apache Hadoop? Would it let me parallelise the transformation and load all my data into the destination database, and would that be faster than, say, Oracle's bulk load utility? If not, is there a way to remotely invoke the bulk load utility on Oracle / MSSQL Server? (A sketch of driving those utilities from a script follows below.)
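To make the utility question concrete, here is a rough sketch I put together (placeholder hosts, credentials, paths and table names - not a vendor-recommended wrapper) of driving the native tools from a script: bcp unloading from SQL Server into a delimited dump file, then SQL*Loader doing a direct-path load of that file into Oracle.

```python
import subprocess

DUMP_FILE = "/data/stage/orders.dat"   # placeholder staging path
DELIM = "|"

# Bulk-unload from SQL Server with bcp in character mode.
# Table name, server and credentials are placeholders.
subprocess.run(
    ["bcp", "sales.dbo.orders", "out", DUMP_FILE,
     "-S", "mssql-host", "-U", "etl_user", "-P", "secret",
     "-c", f"-t{DELIM}"],
    check=True,
)

# Bulk-load into Oracle with SQL*Loader; orders.ctl is a control file
# describing the same delimiter and column order (not shown here).
subprocess.run(
    ["sqlldr", "userid=etl_user/secret@ORCL",
     "control=orders.ctl", f"data={DUMP_FILE}",
     "direct=true"],   # direct-path load bypasses conventional INSERTs
    check=True,
)
```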

Appreciate your thoughts / suggestions

For this I always use the DB's bulk load facility. Controlling the bulk load remotely is a sysadmin problem; there is always a way to do it.
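As a rough sketch of what "there is always a way" can mean in practice (assuming ssh access to the database host and that the dump file is already visible there, e.g. on a shared mount; host name, credentials and paths are placeholders):

```python
import subprocess

# Kick off SQL*Loader on the Oracle host itself over ssh, so the load
# reads a local (or NFS-mounted) dump file instead of pushing rows over
# a client connection. Everything below is a placeholder.
remote_cmd = (
    "sqlldr userid=etl_user/secret@ORCL "
    "control=/data/stage/orders.ctl data=/data/stage/orders.dat direct=true"
)
subprocess.run(["ssh", "oracle-host", remote_cmd], check=True)
```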

That means the first two stages of the ETL are applications that produce a file in the correct format for the bulk load facility, and the final step is the bulk load itself.
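A minimal sketch of that shape of pipeline (the columns and the transformation are invented purely for illustration): the extract/transform code writes a delimited file that matches the loader's control file, and the bulk loader then does the actual inserting as a separate final step.

```python
import csv

# Transform step: take already-extracted source rows, apply the
# transformation in ordinary application code, and emit the delimited
# file the bulk loader's control file expects. Columns are illustrative.
def write_load_file(rows, path, delim="|"):
    with open(path, "w", newline="") as f:
        writer = csv.writer(f, delimiter=delim)
        for order_id, amount, currency in rows:
            # Example transformation: fixed 2 decimal places, upper-case codes.
            writer.writerow([order_id, f"{amount:.2f}", currency.upper()])

write_load_file([(1, 19.5, "usd"), (2, 7, "eur")], "/data/stage/orders.dat")
# ...then run sqlldr / bcp against that file as the final step.
```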

