Many people have suffered from the slow speed of Magento Commerce Dataflow. I ran into a serious problem myself when I tried to import 10,000 items into a Magento database. I wrote my own import script, which is one possible workaround for the Dataflow speed problem, but it still took about 30-60 seconds per item. That's terrible.
So I did a bit of research with Varien_Profiler, and the major bottleneck turned out to be database queries. A little googling took me to the Magento Blog article about performance, which is a really good starting point if you experience Magento Dataflow speed issues. Set up your MySQL configuration following the example in that article; it helps a lot. My queries got about 10 times faster on my average Windows Vista laptop.
But 3-6 seconds per item times 10,000 items still isn't a good result. So I dug deeper into the Magento Forums threads about speed and benchmarked the Magento Dataflow import and export profiles. Export was already fast enough, but import was a real pain. Even the original Magento script (the Dataflow API) didn't run any faster than my own script.
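To put those per-item times in perspective, here is a quick back-of-the-envelope calculation in Python. It only uses the figures quoted above (30-60 and 3-6 seconds per item, 10,000 items); the function name `total_hours` is made up for this illustration.

```python
def total_hours(seconds_per_item: float, items: int) -> float:
    """Total import time in hours, assuming a fixed cost per item."""
    return seconds_per_item * items / 3600


ITEMS = 10_000  # catalog size mentioned in the article

# (label, fastest seconds per item, slowest seconds per item)
measurements = [
    ("before MySQL tuning", 30, 60),
    ("after MySQL tuning", 3, 6),
]

for label, low, high in measurements:
    print(f"{label}: {total_hours(low, ITEMS):.1f}-{total_hours(high, ITEMS):.1f} hours")
```

So even after the MySQL tuning, a full import would still run for most of a working day or longer.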
I found a lot of people with the same issue on the forums, so I didn't give up. Another commonly suggested way to improve Magento performance is to install Memcache or the APC cache, but I didn't get better results with either of them. Something was wrong, because the script wasn't using 100% of the CPU or memory.
After reading the profile XMLs created by Magento Admin at least 100 times, I finally found the holy grail of Magento Dataflow import speed: 10,000 items in 20 minutes. The solution is very easy: set Number of records in your import profile to 10, or 100, or maybe even 1,000. With this setting Magento doesn't run one query per imported item, but caches 10, 100 or 1,000 rows and then runs them all at once (well, not really, but that's the point).
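The general idea of batching can be sketched in a few lines of Python. This is only a generic illustration of processing rows in chunks, not Magento's actual Dataflow code; `batched`, `import_rows` and `save_batch` are hypothetical names, and `save_batch` stands in for whatever expensive step handles a whole chunk at once.

```python
from itertools import islice
from typing import Callable, Dict, Iterable, Iterator, List

Row = Dict[str, str]


def batched(rows: Iterable[Row], size: int) -> Iterator[List[Row]]:
    """Yield consecutive chunks of at most `size` rows."""
    iterator = iter(rows)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def import_rows(rows: Iterable[Row], batch_size: int,
                save_batch: Callable[[List[Row]], None]) -> int:
    """Hand rows to `save_batch` chunk by chunk; return the number of calls."""
    calls = 0
    for chunk in batched(rows, batch_size):
        save_batch(chunk)  # one expensive round trip per chunk, not per row
        calls += 1
    return calls


def make_rows(count: int) -> Iterator[Row]:
    """Generate dummy product rows for the demonstration."""
    return ({"sku": f"sku-{i}"} for i in range(count))


def discard(chunk: List[Row]) -> None:
    """Placeholder for the real save step."""


print(import_rows(make_rows(10_000), 1, discard))    # 10000 calls
print(import_rows(make_rows(10_000), 100, discard))  # 100 calls
```

With a batch size of 100, the expensive step runs 100 times instead of 10,000 times, at the price of holding 100 rows in memory at once.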
I hope you found this article useful. Don't forget to set a high enough memory_limit for your PHP script when using this method.
I'm pretty sure you can also speed up the import considerably by setting the scope of your custom product attributes to GLOBAL instead of store view. If you use per-store attributes, all of this data is duplicated for each store, and the amount of data handled by Dataflow import/export grows rapidly.
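As a rough illustration of that growth, the Python snippet below counts attribute values under the simplifying assumption that every per-store attribute value is stored once per store view. The numbers of attributes and store views are made up, and `attribute_value_rows` is a hypothetical helper, not anything from Magento.

```python
def attribute_value_rows(products: int, attributes: int,
                         stores: int, per_store: bool) -> int:
    """Estimate how many attribute values an import has to handle.

    Simplifying assumption: a per-store attribute is written once per
    store view, while a GLOBAL attribute is written only once.
    """
    copies = stores if per_store else 1
    return products * attributes * copies


PRODUCTS = 10_000   # catalog size from the article
ATTRIBUTES = 20     # hypothetical number of custom attributes
STORES = 5          # hypothetical number of store views

print(attribute_value_rows(PRODUCTS, ATTRIBUTES, STORES, per_store=False))  # 200000
print(attribute_value_rows(PRODUCTS, ATTRIBUTES, STORES, per_store=True))   # 1000000
```

Under these assumptions, per-store scope means five times as many values to import, so the effect scales directly with the number of store views.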












