* Disable copying to anywhere but the GPU
* Remove unused import and drop references to transferring from the GPU in the docs
* Skip GPU test in CPU mode
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>
* Add async transferer class
* Add async ndarray copy interface
* Add python bindings
* Fix comment
* Add python class
* Fix linting issues
* Add python unit test
* Update python interface
* Move async_transferer to CUDA-only directory
* Fix linting issue
* Move out of contrib
* Add doc strings
* Move test compute from backend
* Update comment
* Fix test naming
* Fix argument usage
* Wrap/unwrap backend parameters
* Move to dataloading
* Move to 'dataloading'
* Make GPU/CPU compatible
* Fix unit tests
* Add docs
* Use only backend interface for data movement in unit test