The mover will, as the name suggests, move your files to the location they belong in according to your share configuration. That is it.
Shares are your way of telling Unraid how to distribute the content you have on your server and which drive(s) that content should end up on.
Those shares can be configured in different ways: at what folder depth Unraid should split or keep content together (split level), how disks should be filled (allocation method), and whether a cache should be used (use cache).
The cache has multiple functions:
it can act as a buffer to temporarily hold the data you copy on your server
it can be used as permanent storage for things that are frequently written
it can be a very fast storage medium
When your array has parity drives, each time you write to the array the parity information on the parity drive(s) needs to be updated through parity calculations. If you want to read more on how parity works: https://wiki.unraid.net/Parity
Because of that parity calculation, writing to your array is slower than what the drives would otherwise be capable of.
That is where the cache comes into play: the drives assigned to a cache pool are independent of the array and therefore not covered by your parity. That also means that while your array is protected from one or two drive failures through one or two parity drives, the cache needs to be protected independently (for example with a mirrored cache pool).
Since the cache is not covered by the parity, writes to it are already faster, and they can be made faster still by using SSDs.
Now, if you set your share to use the cache, then everything that is written to that share will be saved on the cache drives first. Later, at a time when you usually wouldn't use your Unraid server (like at night), Unraid invokes the mover to move the files to where they belong.
That means that if you have configured your share to "use cache=yes", then all of its data will be moved to the array when the mover is invoked. "use cache=prefer", in contrast, will move everything FROM the array TO the cache instead.
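The behaviour per "use cache" setting (Unraid offers yes, no, only and prefer) can be summed up in a tiny sketch. This is my own summary of the rules above, not actual Unraid code:

```shell
# mover_direction SETTING: my own summary of which way the mover moves
# a share's files, based on the share's "use cache" setting.
mover_direction() {
  case "$1" in
    yes)    echo "cache -> array" ;;          # writes land on cache, mover moves them to the array
    prefer) echo "array -> cache" ;;          # mover pulls the share's files onto the cache
    only)   echo "none (stays on cache)" ;;   # mover ignores the share
    no)     echo "none (stays on array)" ;;   # writes go straight to the array
    *)      echo "unknown setting" ;;
  esac
}

mover_direction yes     # prints: cache -> array
mover_direction prefer  # prints: array -> cache
```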
This configuration matters depending on what files you are dealing with. Data that is frequently written is best stored on the cache so the parity information doesn't have to be updated constantly. Data that is frequently read but rarely or never written again makes more sense on the array.
For example, Docker containers have configuration data which makes sense to keep exclusively on your cache drive, while video files make sense to keep on your array. So you would set the share that contains the Docker container configuration (appdata) to "use cache=prefer" while you set your other share (for example, storage) to "use cache=yes".
"Priming" your Unraid server (copying your data onto the server) can be done in many different ways, each of which has its advantages and disadvantages.
Prepare the server as you want it, with parity, array drives and cache pools. You could then copy the data to the server in chunks that fit on the cache drive and invoke the mover after each chunk to move the files to the array. This is a fairly safe method because anything that is copied is parity protected right away, but it will take a long time.
The second way would be to have parity but not use a cache for the initial copy. This also takes longer than an unprotected copy, but it saves you from filling up your cache and having to trigger the mover all the time.
The last option would be to assign no parity drive and use no cache. Without a parity drive there are no parity calculations happening, so writes to your array are faster, and you don't need to invoke the mover because nothing lands on a cache. After the copy process is finished, you would add your parity drive(s) and let parity be built so that you have your redundancy, and then configure your shares the way you see fit.
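The first option above boils down to a loop: pick a chunk of files that fits on the cache, copy it, run the mover, repeat. Here is a rough sketch; the chunking helper is plain shell, while the copy/mover lines are Unraid-specific (mover is Unraid's own script) and commented out, so treat the whole thing as an illustration rather than a ready-made tool:

```shell
# pick_chunk DIR BUDGET_KB: print files from DIR until their combined
# on-disk size (in KB, as reported by du) would exceed BUDGET_KB.
pick_chunk() {
  dir="$1"; budget="$2"; used=0
  for f in "$dir"/*; do
    [ -f "$f" ] || continue
    sz=$(du -k "$f" | awk '{print $1}')
    used=$((used + sz))
    [ "$used" -gt "$budget" ] && break
    echo "$f"
  done
}

# On Unraid you would then do something like this per chunk (paths are
# examples, commented out because they are system-specific):
#   for f in $(pick_chunk /mnt/disks/usb 400000000); do
#     rsync -a "$f" /mnt/user/storage/
#   done
#   mover   # Unraid's own mover script flushes the cache to the array
```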
That means that you don't "need" to disable the mover if you don't use a cache.
Thanks for the in-depth reply. I'm not really worried about the data dying on the initial ingestion, I'm more worried about a multi-day transfer being borked by the mover or a full cache drive.
Transfers will fail when the cache is full.
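If a full cache is the worry, a quick pre-flight check before starting a multi-day transfer helps. This is a hypothetical helper of my own, not an Unraid tool; on Unraid the cache pool is typically mounted at /mnt/cache:

```shell
# fits_on_cache SRC CACHE_MOUNT: compare the size of what you're about
# to copy against the free space on the cache mount point.
fits_on_cache() {
  src="$1"; cache="$2"
  need=$(du -sk "$src" | awk '{print $1}')          # KB needed
  free=$(df -kP "$cache" | awk 'NR==2 {print $4}')  # KB available
  if [ "$need" -lt "$free" ]; then echo "fits"; else echo "does not fit"; fi
}

# Example (paths are illustrative; the cache is usually /mnt/cache):
# fits_on_cache /mnt/disks/usb /mnt/cache
```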
Mover will only move files not in use. So if you copy 10 files, 8 are done and you're copying the 9th, the mover will not touch the 9th file (unless by the time it's done with the first 8 you have finished the 9th and are on the 10th).
You can mostly turn the cache on and off on the fly. I say mostly, because if you disable the cache, the files on it are no longer available on the share (until you turn it on again), so that is often an issue with your appdata, docker, VM, ... shares. But for a data share, just run the mover so there's nothing related to the share left on the cache, then disable it.
Fragmentation shouldn't really matter much to begin with and will happen over time anyway, but it shouldn't increase from not using the cache: it doesn't matter whether a new file goes directly RAM to array, or RAM to cache and later cache to RAM to array.
As for speed, it will be slower, but I wonder by how much. You use USB disks; judging by their size, spinning disks (there are SSDs of that size, but it's unlikely). They are probably slower than your array disks. With normal parity the array might be the bottleneck, but you can turn on turbo write (google it, also called reconstruct write) for your first intake. So unless you start copying from multiple disks at the same time, you are limited by your USB disk speed. Cache is just for small bursts: getting a "small" (500 GB is still a lot) amount of data onto your server as fast as possible. Anything past that, even with the smartest way to handle it, is limited by array speed.
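For reference, turbo write is normally switched in the GUI; the command-line equivalent below is what I've seen quoted on the forums, so verify it against your Unraid version before relying on it:

```
# GUI: Settings -> Disk Settings -> Tunable (md_write_method) -> reconstruct write
# Command line (to my knowledge; check your version):
mdcmd set md_write_method 1   # enable reconstruct write (turbo write)
mdcmd set md_write_method 0   # back to the default read/modify/write
```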
Thanks for the reply. I'm just using this as data storage; if I ever want to mess with VMs or Docker I'll get another M.2 and run them on that, if it's possible to segregate such things.
I'm not really in a rush to get the data on it ASAP, I just want to plug in a drive, drag it over, and come back in a day or two when it's done transferring. I'll just turn off the cache for that.