Sunday, November 6, 2016

extract entity data from MS CRM Online

If you need to extract data from CRM Online, you can use one of the many third-party adapters, such as KingswaySoft.

I also put together a highly concurrent, parallel downloader and general-purpose tool, located here: https://github.com/aappddeevv/mscrm-soap-auth.

The tool can download entity data by paging through an entity's attributes, page after page, and writing them to disk. The program can run multiple entity downloads concurrently.
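To make the paging idea concrete, here is a minimal sketch in Python. It is not the tool's actual code. It assumes a hypothetical `fetch_page(cookie)` function that returns one page of rows plus a cookie for the next page (or `None` when there are no more pages), loosely modeled on how CRM paging cookies work. `fake_fetch_page` is an in-memory stand-in so the sketch runs without a CRM connection.

```python
import io
import json
from typing import Callable, Dict, Iterator, List, Optional, Tuple

# (rows on this page, cookie for the next page or None when done)
Page = Tuple[List[Dict], Optional[str]]


def iter_pages(fetch_page: Callable[[Optional[str]], Page]) -> Iterator[List[Dict]]:
    """Yield pages one at a time until the server reports no more pages."""
    cookie = None
    while True:
        rows, cookie = fetch_page(cookie)
        if rows:
            yield rows
        if cookie is None:
            break


def dump(fetch_page: Callable[[Optional[str]], Page], out) -> int:
    """Write every row as one JSON line to `out`; return the row count."""
    count = 0
    for rows in iter_pages(fetch_page):
        for row in rows:
            out.write(json.dumps(row) + "\n")
            count += 1
    return count


def fake_fetch_page(cookie: Optional[str]) -> Page:
    """Hypothetical stand-in for a CRM call: 7 rows served 3 per page."""
    data = [{"id": i} for i in range(7)]
    start = int(cookie) if cookie else 0
    end = start + 3
    next_cookie = str(end) if end < len(data) else None
    return data[start:end], next_cookie


if __name__ == "__main__":
    buf = io.StringIO()
    print(dump(fake_fetch_page, buf), "rows written")
```

Each page has to finish before the next one can be requested, which is why a single page-by-page download is strictly sequential.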

The tool has two modes for downloading an individual entity. It can go page by page, which is actually quite slow.

Alternatively, you can download the entity's IDs to disk and, as soon as you have enough of them (think of a queue), start downloading the entity data using the key file while the key file is still being generated. Since it takes longer to download the attributes than just the keys, the attribute downloads can run highly concurrently. Millions of rows per hour are easy to achieve on a reasonable laptop or server. I have achieved 10 million entities per hour on my laptop.

Don't forget to set your JAVA_OPT="-Xmx" setting, e.g. "-Xmx13G" for a 16GB machine. If you use an EC2 instance designed for large-memory jobs, e.g. one with 30GB of memory, you can achieve 2-3x what you can do on a laptop.
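The key-first mode is essentially a producer/consumer pipeline. The sketch below illustrates the idea in Python and is not the tool's implementation: it swaps the on-disk key file for an in-memory bounded queue, and the names `key_pages`, `fetch_attributes`, `n_consumers` and `max_queued` are all hypothetical. One producer pushes IDs as they arrive while several workers fetch the attributes for each ID in parallel.

```python
import queue
import threading
from typing import Callable, Dict, Iterable, List

SENTINEL = None  # marks the end of the key stream for one worker


def produce_keys(key_pages: Iterable[List[int]], q: queue.Queue, n_consumers: int) -> None:
    """Push IDs onto the queue as each key page arrives, then signal completion."""
    for page in key_pages:
        for key in page:
            q.put(key)  # blocks when the queue is full (back-pressure)
    for _ in range(n_consumers):
        q.put(SENTINEL)  # one sentinel per worker


def consume(q: queue.Queue, fetch_attributes: Callable[[int], Dict],
            results: List[Dict], lock: threading.Lock) -> None:
    """Pull IDs off the queue and fetch the full row for each one."""
    while True:
        key = q.get()
        if key is SENTINEL:
            return
        row = fetch_attributes(key)  # the slow call; runs in parallel across workers
        with lock:
            results.append(row)


def download(key_pages: Iterable[List[int]], fetch_attributes: Callable[[int], Dict],
             n_consumers: int = 4, max_queued: int = 1000) -> List[Dict]:
    """Start the workers first so they begin fetching while keys are still arriving."""
    q: queue.Queue = queue.Queue(maxsize=max_queued)
    results: List[Dict] = []
    lock = threading.Lock()
    workers = [
        threading.Thread(target=consume, args=(q, fetch_attributes, results, lock))
        for _ in range(n_consumers)
    ]
    for w in workers:
        w.start()
    produce_keys(key_pages, q, n_consumers)
    for w in workers:
        w.join()
    return results


if __name__ == "__main__":
    rows = download([[1, 2, 3], [4, 5]], lambda k: {"id": k, "name": f"n{k}"})
    print(len(rows), "rows downloaded")
```

Because the key download is cheap and the attribute download is expensive, the producer stays ahead of the workers and the workers are rarely idle. Rows come back in no particular order, so sort or index them by ID afterwards if order matters.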