Commit Graph

50 Commits

Author SHA1 Message Date
Muhammed Fatih BALIN
b1e39432bf [Build] Organize cmake file (Fixed) (#7715) 2024-08-18 10:15:01 -04:00
Hongzhi (Steve), Chen
4910eace8c Revert "[Build] Organize cmake file." (#7710) 2024-08-16 15:29:56 +08:00
Muhammed Fatih BALIN
4032c2a80a [Build] Organize cmake file. (#7657) 2024-08-06 08:48:10 -04:00
Muhammed Fatih BALIN
e9661d3b36 [CUDA] Add Ada architecture to DGL architecture list. (#7282) 2024-04-08 03:49:31 -04:00
Muhammed Fatih BALIN
60a27bc6e8 [GraphBolt][CUDA] Pass GPU architecture option from DGL (#6792) 2023-12-26 13:55:06 +08:00
Rhett Ying
fdeda8a862 [dev] update cuda cmake (#6574) 2023-11-16 11:57:15 +08:00
Rhett Ying
19096c6a8e [dev] enable cuda12.1 build (#6567) 2023-11-15 11:08:07 +08:00
czkkkkkk
edcecdd0a1 [CMAKE] Move DGL cuda file declaration to the main CMakeLists.txt (#6300) 2023-09-12 17:07:40 +08:00
Hugo MacDermott-Opeskin
4a42027d4a [Build] Add CMake changes from conda-forge build (#6189)
Co-authored-by: Hongzhi (Steve), Chen <chenhongzhi.nkcs@gmail.com>
2023-09-01 11:21:03 +08:00
Hongzhi (Steve), Chen
346410928f [Misc] Cleanup old flags, and rely on BUILD_TYPE for all features. (#6154)
Co-authored-by: Ubuntu <ubuntu@ip-172-31-28-63.ap-northeast-1.compute.internal>
2023-08-15 16:31:44 +08:00
Muhammed Fatih BALIN
f0d8ca1e40 [Dev] Change CXX standard to 17 (#6138) 2023-08-14 10:26:40 +08:00
Hongzhi (Steve), Chen
cff938c6ad [Misc] Add comment to clarify __dgl_option. (#6106)
Co-authored-by: Ubuntu <ubuntu@ip-172-31-28-63.ap-northeast-1.compute.internal>
2023-08-07 13:06:46 +08:00
Hongzhi (Steve), Chen
f7fef600e3 [Misc] Support build option: "all". (#6102)
Co-authored-by: Ubuntu <ubuntu@ip-172-31-28-63.ap-northeast-1.compute.internal>
2023-08-07 12:48:32 +08:00
Hongzhi (Steve), Chen
d7410cf468 [Misc] Support DGL feature option. (#6088)
Co-authored-by: Ubuntu <ubuntu@ip-172-31-28-63.ap-northeast-1.compute.internal>
2023-08-03 14:24:37 +08:00
Hongzhi (Steve), Chen
12ade95c9e [Misc] Cleanup unused cmake util. (#6084)
Co-authored-by: Ubuntu <ubuntu@ip-172-31-28-63.ap-northeast-1.compute.internal>
2023-08-02 14:55:51 +08:00
Hongzhi (Steve), Chen
ffd8edeb2a [Misc] Cleanup duplicated flags. (#6081)
Co-authored-by: Ubuntu <ubuntu@ip-172-31-28-63.ap-northeast-1.compute.internal>
2023-08-02 13:51:24 +08:00
Muhammed Fatih BALIN
224d6a6936 [Dev] Resolve compile issue with gcc 11.3.0 (#6072) 2023-08-01 14:37:24 +08:00
Muhammed Fatih BALIN
69a532c1ab [Feature] Gpu cache for node and edge data (#4341)
Co-authored-by: xiny <xiny@nvidia.com>
2023-07-24 13:17:10 +08:00
Hongzhi (Steve), Chen
9ff56d2098 [Cleanup] Remove featgraph and unused TVM dependency. (#5767)
Co-authored-by: Ubuntu <ubuntu@ip-172-31-28-63.ap-northeast-1.compute.internal>
2023-06-02 15:14:31 +08:00
Xin Yao
bea5c78b3f [Fix] Remove curand host functions (#5552) 2023-04-17 10:37:46 +08:00
Xin Yao
acb955e15f [Cleanup] Cleanup unused CMake options (#5470)
* cleanup unused cmake options

* disable BUILD_TORCH for cugraph

* resolve comments
2023-03-22 16:06:44 +08:00
Xin Yao
8d5d8962ad [Refactor] Replace third_party/nccl with PyTorch's NCCL backend (#4989)
* expose GeneratePermutation

* add sparse_all_to_all_push

* add sparse_all_to_all_pull

* add unit test

* handle world_size=1

* remove python nccl wrapper

* remove the nccl dependency

* use pinned memory to speedup D2H copy

* fix lint

* resolve comments

* fix lint

* fix ut

* resolve comments
2023-03-08 12:59:10 +08:00
Xin Yao
7ee550f004 update cmake for cuda12 (#5048) 2023-01-05 10:47:10 +08:00
Xin Yao
65b34702e6 [Makefile] Refactor CUDA makefile and add Hopper (SM90) to default build (#4830)
* Update CUDA.cmake to align with PyTorch's

* add Ada and Hopper

* add more comments

* resolve comments

Co-authored-by: Triston <triston.cao@gmail.com>
2022-11-19 12:17:07 -08:00
czkkkkkk
06438d7033 [Sparse] Link to DGL (#4877) 2022-11-17 10:14:08 +08:00
lixiaobai
dd762a1e8a [PinSAGESampler] support PinSAGE sampler on GPU (#3567)
* Feat: support API "randomwalk_topk" in library

* Feat: use the new API "randomwalk_topk" for PinSAGESampler

* Minor

* Minor

* Refactor: modified codes as checker required

* Minor

* Minor

* Minor

* Minor

* Fix: checking errors in RandomWalkTopk

* Refactor: modified the docstring for randomwalk_topk

* change randomwalk_topk to internal

* fix

* rename

* Minor for pinsage.py

* Feat: support randomwalk and SelectPinSageNeighbors on GPU

Port RandomWalk algorithm on GPU,
and port SelectPinSageNeighbors on GPU.

* Feat: support GPU on python APIs

* Feat: remove perf print information in FrequenchHashmap

* Fix: modified the code format

Modified the code format as task_lint.sh suggested

* Feat: let test script support PinSAGESampler on GPU

Let test script support PinSAGESampler on GPU,
minor of "restart_prob".

* Minor

* Minor

* Minor

* Refactor: use the atomic operations from the array module

* Minor: change the long lines

* Refactor: modified the get_node_types for gpu

* Feat: update the contributor date

* Perf: remove unnecessary stream sync

* Feat: support other random walk

But the non-uniform choice is still not supported.

* Fix: add CUDA switch for random walk

Co-authored-by: Quan Gan <coin2028@hotmail.com>
2021-12-15 13:42:26 +08:00
Hongyu Cai
9c41e97cc0 [Doc] Fix type in CUDA.cmake (#3479)
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>
2021-11-08 14:21:48 +08:00
David Min
905c0aa578 [Feature][Performance][GPU] Introducing UnifiedTensor for efficient zero-copy host memory access from GPU (#3086)
* Add pytorch-direct version

* Initial commit of unified tensor

* Merge branch 'master' of https://github.com/davidmin7/dgl

* Remove unnecessary things

* Fix error message

* Fix/Add descriptions

* whitespace fix

* add unpin

* disable IndexSelectCPUFromGPU with no CUDA

* add a newline for unified_tensor.py

* Apply changes based on feedback

* add 'os' module

* skip unified tensor unit test for cpu only

* Update tests/pytorch/test_unified_tensor.py

Co-authored-by: xiang song(charlie.song) <classicxsong@gmail.com>

* reflect feedback

Co-authored-by: shhssdm <shhssdm@gmail.com>
Co-authored-by: Jinjing Zhou <VoVAllen@users.noreply.github.com>
Co-authored-by: xiang song(charlie.song) <classicxsong@gmail.com>
2021-07-17 00:06:20 +08:00
nv-dlasalle
7d069d62eb Add Ampere support to cmake files (#3031)
* Update cmake to build Ampere

* Fix version check
2021-06-16 12:57:28 -07:00
nv-dlasalle
66eb240d15 [Bugfix] Include NCCL as a submodule (#2934)
* Add NCCL as a submodule

* Allow using third_party/nccl or system nccl

* Add nccl_external as a dependency

* Fix conditional

Co-authored-by: Quan (Andy) Gan <coin2028@hotmail.com>
2021-05-25 09:46:16 +08:00
nv-dlasalle
ae8dbe6d3c [Feature][Performance] Implement NCCL wrapper for communicating NodeEmbeddings and sparse gradients. (#2825)
* Split NCCL wrapper from sparse optimizer and sparse embedding

* Add more unit tests for single node nccl

* Fix unit test for tf

* Switch to device histogram

* Fix histgram issues

* Finish migration to histogram

* Handle cases with zero send/recieve data

* Start on partition object

* Get compiling

* Updates

* Add unit tests

* Switch to partition object

* Fix linting issues

* Rename partition file

* Add python doc

* Fix python assert and finish doxygen comments

* Remove stubs for range based partition to satisfy pylint

* Wrap unit test in GPU only

* Wrap explicit cuda call in ifdef

* Merge with partition.py

* update docstrings

* Cleanup partition_op

* Add Workspace object

* Switch to using workspace object

* Move last remainder based function out of nccl_api

* Add error messages

* Update docs with examples

* Fix linting erros

Co-authored-by: xiang song(charlie.song) <classicxsong@gmail.com>
2021-05-20 10:58:17 -07:00
Tianqi Zhang (张天启)
c88fca5055 [Feature] Add edge coarsening for homogeneous undirected graphs (#2691)
* finish graph matching gpu version

* use C++ shuffle

* finish graph matching

* fix bug

* fix bug

* change name and use swap

* upt

* fix format problem

* fix format problem

* stronger test

* upt

* upt

* change python api

* upt

* upt

* format check

* upt

* upt

* fix bug

Co-authored-by: Tong He <hetong007@gmail.com>
2021-03-09 16:35:27 +08:00
nv-dlasalle
bc3a532f5e [Sampling] Implement dgl.to_block() for the GPU (#2339)
* Add start of to_block gpu implementation

* Pull in more changes from 0.4.2 cuda_to_block

* Move more code to IdArray

* Refactor DeviceNodeMapMaker

* Updates

* get compiling

* Integrate to_block

* Fix ID allocation

* Minor fixes

* Cleanup cuda calls to use cuda_common

* Reduce kernel calls

* Lint cleanup

* Expand documentation

* Remove unused function

* Rename variables for consistency

* Add doxygen comments

* Fix file extension

* Remove raw asynccopy for deviceapi

* Remove unused function

* Fix block/tile configuration

* Add cuda_device_common.cuh

* Add basic hashtable

* Migrate part of hashtable

* Refactor to use external hashtable

* Make functions members

* Format hash table functions

* Migrate duplicate filling

* Move last function over

* Refactor with cu file

* lint c++ code

* Move context check to C++ code

* Use macro switch

* Add missing files

* Update docstring

* update docs

* Move atomic functions

* Refactor hashtable

* Fix linting

* Expand docs

* Fix mismatched argument names

* Switch doxygen comments from using @param to \param

Co-authored-by: Jinjing Zhou <VoVAllen@users.noreply.github.com>
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>
2021-02-08 13:00:43 +08:00
Zihao Ye
7bab1365e2 [feature] Supporting half precision floating data type (fp16). (#2552)
* add tvm as submodule

* compilation is ok but calling fails

* can call now

* pack multiple modules, change names

* upd

* upd

* upd

* fix cmake

* upd

* upd

* upd

* upd

* fix

* relative path

* upd

* upd

* upd

* singleton

* upd

* trigger

* fix

* upd

* count reducible

* upd

* upd

* upd

* upd

* upd

* upd

* upd

* upd

* upd

* only keep related files

* upd

* upd

* upd

* upd

* lint

* lint

* lint

* lint

* pylint

* upd

* upd

* compilation

* fix

* upd

* upd

* upd

* upd

* upd

* upd

* upd doc

* refactor

* fix

* upd number

Co-authored-by: Zhi Lin <linzhilynn@gmail.com>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-42-78.us-east-2.compute.internal>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-21-156.us-east-2.compute.internal>
Co-authored-by: Jinjing Zhou <VoVAllen@users.noreply.github.com>
2021-01-28 11:21:58 +08:00
Zhi Lin
4208ce2b9e [Feature] Tvm integration (#2367)
Co-authored-by: Zihao Ye <expye@outlook.com>
2020-12-31 17:40:25 +08:00
Quan (Andy) Gan
9a7235faf2 [Performance] Use allocator from PyTorch if possible (#2328)
* first commit

* some thoughts

* move around

* more commit

* more fixes

* now it uses torch allocator

* fix symbol export error

* fix

* fixes

* test fix

* add script

* building separate library per version

* fix for vs2019

* more fixes

* fix on windows build

* update jenkinsfile

* auto copy built dlls for windows

* lint and installation guide update

* fix

* specify conda environment

* set environment for ci

* fix

* fix

* fix

* fix again

* revert

* fix cmake

* fix

* switch to using python interpreter path

* remove scripts

* debug

* oops sorry

* Update index.rst

* Update index.rst

* copies automatically, no need for this

* do not print message if library not found

* tiny fixes

* debug on nightly

* replace add_compile_definitions to make CMake 3.5 happy

* fix linking to wrong lib for multiple pytorch envs

* changed building strategy

* fix nightly

* fix windows

* fix windows again

* setup bugfix

* address comments

* change README
2020-12-25 13:57:51 +08:00
Zihao Ye
5d3da4bcef [hotfix] Enable AVX optimization by default. (#2438) 2020-12-21 12:35:17 +08:00
Zihao Ye
e379e52585 [hotfix] Make USE_AVX a flag in cmake to avoid compilation error for arm user (#2428)
* upd cmake

* upd

* format
2020-12-17 17:29:15 +08:00
Minjie Wang
77968e30b5 [Build] use different flags for NVCC and CC (#2342) 2020-11-14 15:48:24 +08:00
Quan (Andy) Gan
501b2b75a5 [Bug] Multiple fixes for CUDA 11 support (#2333)
* multiple fixes

* fix CI

* fiddle

* revert stubs

* remove stubs

* poke

* remove linking of driver library

* minor

Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>
2020-11-13 15:37:29 +08:00
Minjie Wang
4fb0241bfb [CUDA] Add CUDA11 support (#2308)
* add support for cuda 11

* fix inc bug in pytorch 1.8

* poke ci

* fix

* small fix

* try fix

* try fix

Co-authored-by: Jinjing Zhou <VoVAllen@users.noreply.github.com>
Co-authored-by: Quan (Andy) Gan <coin2028@hotmail.com>
2020-11-07 22:16:51 +08:00
Zihao Ye
5cff2f1cb2 [Feature] Use new cusparse API to support CUDA 11. (#1979)
* upd

* upd

* upd

* upd

* upd

* upd

* upd

* upd
2020-08-27 15:35:56 +08:00
Tong He
3d47693b1f [Op] Farthest Point Sampler in Cpp and CUDA (#1630)
* working framework without actual algorithm logic

* rename

* fix

* fps passes compilation

* correct algorithm

* add cuda implementation

* update random start

* before refactor

* pass compilation but cuda not working

* working

* code working, will add docstring

* add mxnet support

* update docstring

* update doc and test

* cpplint

* cpcplint

* pylint

* temporary fix

* fix for win64

* fix unitetest

* fix

* fix

* remove comment

* move to geometry package

* remove redundant include

* add docstrings and comments

* add proof

* add validity check
2020-06-22 00:52:20 +08:00
Minjie Wang
b0d9e7aa43 [Refactor] Separating graph and sparse matrix operations (#699)
* WIP: array refactoring

* WIP: implementation

* wip

* most csr part

* WIP: on coo

* WIP: coo

* finish refactoring immutable graph

* compiled

* fix undefined ndarray copy bug; add COOToCSR when coo has no data array

* fix bug in COOToCSR

* fix bug in CSR constructor

* fix bug in in_edges(vid)

* fix OutEdges bug

* pass test_graph

* pass test_graph

* fix bug in CSR constructor

* fix bug in CSR constructor

* fix bug in CSR constructor

* fix stupid bug

* pass gpu test

* remove debug printout

* fix lint

* rm biparate grpah

* fix lint

* address comments

* fix bug in Clone

* cpp utests
2019-07-17 17:49:38 -04:00
Quan (Andy) Gan
059b1a6d6f [Release] Bump up version (#636)
* bump up version

* conda+cuda trial

* switch conda branch

* revert

* disable cudnn
2019-06-12 16:50:05 +08:00
Quan (Andy) Gan
a1513f7c8f update known gpu arch list (#629) 2019-06-10 10:40:36 +08:00
Quan (Andy) Gan
e35e860ae3 [Build] Support older CMake & OpenMP toggle (#619)
* cmake fixes for older systems

* allow specification of cuda path

* test script fixes to enable openmp & test

* update minigun; disable minigun partial frontier compile
2019-06-07 14:00:34 -04:00
Lingfan Yu
653428bdc7 [Feature][Kernel] DGL kernel support (#596)
* [Kernel] Minigun integration and fused kernel support (#519)

* kernel interface

* add minigun

* Add cuda build

* functors

* working on binary elewise

* binary reduce

* change kernel interface

* WIP

* wip

* fix minigun

* compile

* binary reduce kernels

* compile

* simple test passed

* more reducers

* fix thrust problem

* fix cmake

* fix cmake; add proper guard for atomic

* WIP: bcast

* WIP

* bcast kernels

* update to new minigun pass-by-value practice

* broadcasting dim

* add copy src and copy edge

* fix linking

* fix none array problem

* fix copy edge

* add device_type and device_id to backend operator

* cache csr adj, remove cache for adjmat and incmat

* custom ops in backend and pytorch impl

* change dgl-mg kernel python interface

* add id_mapping var

* clean up plus v2e spmv schedule

* spmv schedule & clean up fall back

* symbolic message and reduce func, remove bundle func

* new executors

* new backend interface for dgl kernels and pytorch impl

* minor fix

* fix

* fix docstring, comments, func names

* nodeflow

* fix message id mapping and bugs...

* pytorch test case & fix

* backward binary reduce

* fix bug

* WIP: cusparse

* change to int32 csr for cusparse workaround

* disable cusparse

* change back to int64

* broadcasting backward

* cusparse; WIP: add rev_csr

* unit test for kernels

* pytorch backward with dgl kernel

* edge softmax

* fix backward

* improve softmax

* cache edge on device

* cache mappings on device

* fix partial forward code

* cusparse done

* copy_src_sum with cusparse

* rm id getter

* reduce grad for broadcast

* copy edge reduce backward

* kernel unit test for broadcasting

* full kernel unit test

* add cpu kernels

* edge softmax unit test

* missing ref

* fix compile and small bugs

* fix bug in bcast

* Add backward both

* fix torch utests

* expose infershape

* create out tensor in python

* fix c++ lint

* [Kernel] Add GPU utest and kernel utest (#524)

* fix gpu utest

* cuda utest runnable

* temp disable test nodeflow; unified test for kernel

* cuda test kernel done

* [Kernel] Update kernel branch (#550)

* [Model] add multiprocessing training with sampling. (#484)

* reorganize sampling code.

* add multi-process training.

* speed up gcn_cv

* fix graphsage_cv.

* add new API in graph store.

* update barrier impl.

* support both local and distributed training.

* fix multiprocess train.

* fix.

* fix barrier.

* add script for loading data.

* multiprocessing sampling.

* accel training.

* replace pull with spmv for speedup.

* nodeflow copy from parent with context.

* enable GPU.

* fix a bug in graph store.

* enable multi-GPU training.

* fix lint.

* add comments.

* rename to run_store_server.py

* fix gcn_cv.

* fix a minor bug in sampler.

* handle error better in graph store.

* improve graphsage_cv for distributed mode.

* update README.

* fix.

* update.

* [Tutorial] add sampling tutorial. (#522)

* add sampling tutorial.

* add readme

* update author list.

* fix indent in the code.

* rename the file.

* update tutorial.

* fix the last API.

* update image.

* [BUGFIX] fix the problems in the sampling tutorial. (#523)

* add index.

* update.

* update tutorial.

* fix gpu utest

* cuda utest runnable

* temp disable test nodeflow; unified test for kernel

* cuda test kernel done

* Fixing typo in JTNN after interface change (#536)

* [BugFix] Fix getting src and dst id of ALL edges in NodeFlow.apply_block (#515)

* [Bug Fix] Fix inplace op at backend (#546)

* Fix inplace operation

* fix line seprator

* [Feature] Add batch and unbatch for immutable graph (#539)

* Add batch and unbatch for immutable graph

* fix line seprator

* fix lintr

* remove unnecessary include

* fix code review

* [BUGFix] Improve multi-processing training (#526)

* fix.

* add comment.

* remove.

* temp fix.

* initialize for shared memory.

* fix graphsage.

* fix gcn.

* add more unit tests.

* add more tests.

* avoid creating shared-memory exclusively.

* redefine remote initializer.

* improve initializer.

* fix unit test.

* fix lint.

* fix lint.

* initialize data in the graph store server properly.

* fix test.

* fix test.

* fix test.

* small fix.

* add comments.

* cleanup server.

* test graph store with a random port.

* print.

* print to stderr.

* test1

* test2

* remove comment.

* adjust the initializer signature.

* [API] update graph store API. (#549)

* add init_ndata and init_edata in DGLGraph.

* adjust SharedMemoryGraph API.

* print warning.

* fix comment.

* update example

* fix.

* fix examples.

* add unit tests.

* add comments.

* [Refactor] Immutable graph index (#543)

* WIP

* header

* WIP .cc

* WIP

* transpose

* wip

* immutable graph .h and .cc

* WIP: nodeflow.cc

* compile

* remove all tmp dl managed ctx; they caused refcount issue

* one simple test

* WIP: testing

* test_graph

* fix graph index

* fix bug in sampler; pass pytorch utest

* WIP on mxnet

* fix lint

* fix mxnet unittest w/ unfortunate workaround

* fix msvc

* fix lint

* SliceRows and test_nodeflow

* resolve reviews

* resolve reviews

* try fix win ci

* try fix win ci

* poke win ci again

* poke

* lazy multigraph flag; stackoverflow error

* revert node subgraph test

* lazy object

* try fix win build

* try fix win build

* poke ci

* fix build script

* fix compile

* add a todo

* fix reviews

* fix compile

* [Kernel] Update kernel branch (#576)

* [Model] add multiprocessing training with sampling. (#484)

* reorganize sampling code.

* add multi-process training.

* speed up gcn_cv

* fix graphsage_cv.

* add new API in graph store.

* update barrier impl.

* support both local and distributed training.

* fix multiprocess train.

* fix.

* fix barrier.

* add script for loading data.

* multiprocessing sampling.

* accel training.

* replace pull with spmv for speedup.

* nodeflow copy from parent with context.

* enable GPU.

* fix a bug in graph store.

* enable multi-GPU training.

* fix lint.

* add comments.

* rename to run_store_server.py

* fix gcn_cv.

* fix a minor bug in sampler.

* handle error better in graph store.

* improve graphsage_cv for distributed mode.

* update README.

* fix.

* update.

* [Tutorial] add sampling tutorial. (#522)

* add sampling tutorial.

* add readme

* update author list.

* fix indent in the code.

* rename the file.

* update tutorial.

* fix the last API.

* update image.

* [BUGFIX] fix the problems in the sampling tutorial. (#523)

* add index.

* update.

* update tutorial.

* fix gpu utest

* cuda utest runnable

* temp disable test nodeflow; unified test for kernel

* cuda test kernel done

* Fixing typo in JTNN after interface change (#536)

* [BugFix] Fix getting src and dst id of ALL edges in NodeFlow.apply_block (#515)

* [Bug Fix] Fix inplace op at backend (#546)

* Fix inplace operation

* fix line seprator

* [Feature] Add batch and unbatch for immutable graph (#539)

* Add batch and unbatch for immutable graph

* fix line seprator

* fix lintr

* remove unnecessary include

* fix code review

* [BUGFix] Improve multi-processing training (#526)

* fix.

* add comment.

* remove.

* temp fix.

* initialize for shared memory.

* fix graphsage.

* fix gcn.

* add more unit tests.

* add more tests.

* avoid creating shared-memory exclusively.

* redefine remote initializer.

* improve initializer.

* fix unit test.

* fix lint.

* fix lint.

* initialize data in the graph store server properly.

* fix test.

* fix test.

* fix test.

* small fix.

* add comments.

* cleanup server.

* test graph store with a random port.

* print.

* print to stderr.

* test1

* test2

* remove comment.

* adjust the initializer signature.

* [API] update graph store API. (#549)

* add init_ndata and init_edata in DGLGraph.

* adjust SharedMemoryGraph API.

* print warning.

* fix comment.

* update example

* fix.

* fix examples.

* add unit tests.

* add comments.

* [Refactor] Immutable graph index (#543)

* WIP

* header

* WIP .cc

* WIP

* transpose

* wip

* immutable graph .h and .cc

* WIP: nodeflow.cc

* compile

* remove all tmp dl managed ctx; they caused refcount issue

* one simple test

* WIP: testing

* test_graph

* fix graph index

* fix bug in sampler; pass pytorch utest

* WIP on mxnet

* fix lint

* fix mxnet unittest w/ unfortunate workaround

* fix msvc

* fix lint

* SliceRows and test_nodeflow

* resolve reviews

* resolve reviews

* try fix win ci

* try fix win ci

* poke win ci again

* poke

* lazy multigraph flag; stackoverflow error

* revert node subgraph test

* lazy object

* try fix win build

* try fix win build

* poke ci

* fix build script

* fix compile

* add a todo

* fix reviews

* fix compile

* all demo use python-3 (#555)

* [DEMO] Reproduce numbers of distributed training in AMLC giant graph paper (#556)

* update

* update

* update

* update num_hops

* fix bug

* update

* report numbers of distributed training in AMLC giant graph paper

* [DEMO] Remove duplicate code for sampling (#557)

* update

* update

* re-use single-machine code

* update

* use relative path

* update

* update

* update

* add __init__.py

* add __init__.py

* import sys, os

* fix typo

* update

* [Perf] Improve performance of graph store. (#554)

* fix.

* use inplace.

* move to shared memory graph store.

* fix.

* add more unit tests.

* fix.

* fix test.

* fix test.

* disable test.

* fix.

* [BUGIFX] fix a bug in edge_ids (#560)

* add test.

* fix compute.

* fix test.

* turn on test.

* fix a bug.

* add test.

* fix.

* disable test.

* [DEMO] Add Pytorch demo for distributed sampler (#562)

* update

* update

* update

* add sender

* update

* remove duplicate cpde

* [Test] Add gtest to project (#547)

* add gtest module

* add gtest

* fix

* Update CMakeLists.txt

* Update README.md

* [Perf] lazily create msg_index. (#563)

* lazily create msg_index.

* update test.

* [BUGFIX] fix bugs for running GCN on giant graphs. (#561)

* load mxnet csr.

* enable load large csr.

* fix

* fix.

* fix int overflow.

* fix test.

* [BugFix] Fix error when bfs_level = 0 in Entity Classification with RGCN (#559)

* [DEMO] Update demo of distributed sampler (#564)

* update

* update

* update demo

* add network cpp test (#565)

* Add unittest for C++ RPC (#566)

* [CI] Fix CI for cpp test (#570)

* fix CI for cpp test

* update port number

* [Docker] update docker image (#575)

* update docker image

* specify lint version

* rm torch import from unified tests

* [Kernel][Scheduler][MXNet] Scheduler for DGL kernels and MXNet backend support (#541)

* [Model] add multiprocessing training with sampling. (#484)

* reorganize sampling code.

* add multi-process training.

* speed up gcn_cv

* fix graphsage_cv.

* add new API in graph store.

* update barrier impl.

* support both local and distributed training.

* fix multiprocess train.

* fix.

* fix barrier.

* add script for loading data.

* multiprocessing sampling.

* accel training.

* replace pull with spmv for speedup.

* nodeflow copy from parent with context.

* enable GPU.

* fix a bug in graph store.

* enable multi-GPU training.

* fix lint.

* add comments.

* rename to run_store_server.py

* fix gcn_cv.

* fix a minor bug in sampler.

* handle error better in graph store.

* improve graphsage_cv for distributed mode.

* update README.

* fix.

* update.

* [Tutorial] add sampling tutorial. (#522)

* add sampling tutorial.

* add readme

* update author list.

* fix indent in the code.

* rename the file.

* update tutorial.

* fix the last API.

* update image.

* [BUGFIX] fix the problems in the sampling tutorial. (#523)

* add index.

* update.

* update tutorial.

* fix gpu utest

* cuda utest runnable

* temp disable test nodeflow; unified test for kernel

* cuda test kernel done

* edge softmax module

* WIP

* Fixing typo in JTNN after interface change (#536)

* mxnet backend support

* improve reduce grad

* add max to unittest backend

* fix kernel unittest

* [BugFix] Fix getting src and dst id of ALL edges in NodeFlow.apply_block (#515)

* lint

* lint

* win build

* [Bug Fix] Fix inplace op at backend (#546)

* Fix inplace operation

* fix line seprator

* [Feature] Add batch and unbatch for immutable graph (#539)

* Add batch and unbatch for immutable graph

* fix line seprator

* fix lintr

* remove unnecessary include

* fix code review

* [BUGFix] Improve multi-processing training (#526)

* fix.

* add comment.

* remove.

* temp fix.

* initialize for shared memory.

* fix graphsage.

* fix gcn.

* add more unit tests.

* add more tests.

* avoid creating shared-memory exclusively.

* redefine remote initializer.

* improve initializer.

* fix unit test.

* fix lint.

* fix lint.

* initialize data in the graph store server properly.

* fix test.

* fix test.

* fix test.

* small fix.

* add comments.

* cleanup server.

* test graph store with a random port.

* print.

* print to stderr.

* test1

* test2

* remove comment.

* adjust the initializer signature.

* try

* fix

* fix

* fix

* fix

* fix

* try

* test

* test

* test

* try

* try

* try

* test

* fix

* try gen_target

* fix gen_target

* fix msvc var_args expand issue

* fix

* [API] update graph store API. (#549)

* add init_ndata and init_edata in DGLGraph.

* adjust SharedMemoryGraph API.

* print warning.

* fix comment.

* update example

* fix.

* fix examples.

* add unit tests.

* add comments.

* [Refactor] Immutable graph index (#543)

* WIP

* header

* WIP .cc

* WIP

* transpose

* wip

* immutable graph .h and .cc

* WIP: nodeflow.cc

* compile

* remove all tmp dl managed ctx; they caused refcount issue

* one simple test

* WIP: testing

* test_graph

* fix graph index

* fix bug in sampler; pass pytorch utest

* WIP on mxnet

* fix lint

* fix mxnet unittest w/ unfortunate workaround

* fix msvc

* fix lint

* SliceRows and test_nodeflow

* resolve reviews

* resolve reviews

* try fix win ci

* try fix win ci

* poke win ci again

* poke

* lazy multigraph flag; stackoverflow error

* revert node subgraph test

* lazy object

* try fix win build

* try fix win build

* poke ci

* fix build script

* fix compile

* add a todo

* fix reviews

* fix compile

* WIP

* WIP

* all demo use python-3 (#555)

* ToImmutable and CopyTo

* [DEMO] Reproduce numbers of distributed training in AMLC giant graph paper (#556)

* update

* update

* update

* update num_hops

* fix bug

* update

* report numbers of distributed training in AMLC giant graph paper

* [DEMO] Remove duplicate code for sampling (#557)

* update

* update

* re-use single-machine code

* update

* use relative path

* update

* update

* update

* add __init__.py

* add __init__.py

* import sys, os

* fix typo

* update

* [Perf] Improve performance of graph store. (#554)

* fix.

* use inplace.

* move to shared memory graph store.

* fix.

* add more unit tests.

* fix.

* fix test.

* fix test.

* disable test.

* fix.

* [BUGIFX] fix a bug in edge_ids (#560)

* add test.

* fix compute.

* fix test.

* turn on test.

* fix a bug.

* add test.

* fix.

* disable test.

* DGLRetValue DGLContext conversion

* [DEMO] Add Pytorch demo for distributed sampler (#562)

* update

* update

* update

* add sender

* update

* remove duplicate cpde

* [Test] Add gtest to project (#547)

* add gtest module

* add gtest

* fix

* Update CMakeLists.txt

* Update README.md

* Add support to convert immutable graph to 32 bits

* [Perf] lazily create msg_index. (#563)

* lazily create msg_index.

* update test.

* fix binary reduce following new minigun template

* enable both int64 and int32 kernels

* [BUGFIX] fix bugs for running GCN on giant graphs. (#561)

* load mxnet csr.

* enable load large csr.

* fix

* fix.

* fix int overflow.

* fix test.

* new kernel interface done for CPU

* docstring

* rename & docstring

* copy reduce and backward

* [BugFix] Fix error when bfs_level = 0 in Entity Classification with RGCN (#559)

* [DEMO] Update demo of distributed sampler (#564)

* update

* update

* update demo

* adapt cuda kernels to the new interface

* add network cpp test (#565)

* fix bug

* Add unittest for C++ RPC (#566)

* [CI] Fix CI for cpp test (#570)

* fix CI for cpp test

* update port number

* [Docker] update docker image (#575)

* update docker image

* specify lint version

* rm torch import from unified tests

* remove pytorch-specific test_function

* fix unittest

* fix

* fix unittest backend bug in converting tensor to numpy array

* fix

* mxnet version

* [BUGFIX] fix for MXNet 1.5. (#552)

* remove clone.

* turn on numpy compatible.

* Revert "remove clone."

This reverts commit 17bbf76ed7.

* revert format changes

* fix mxnet api name

* revert mistakes in previous revert

* roll back CI to 20190523 build

* fix unittest

* disable test_shared_mem_store.py for now

* remove mxnet/test_specialization.py

* sync win64 test script

* fix lowercase

* missing backend in gpu unit test

* transpose to get forward graph

* pass update all

* add sanity check

* passing test_specialization.py

* fix and pass test_function

* fix check

* fix pytorch softmax

* mxnet kernels

* c++ lint

* pylint

* try

* win build

* fix

* win

* ci enable gpu build

* init submodule recursively

* backend docstring

* try

* test win dev

* doc string

* disable pytorch test_nn

* try to fix windows issue

* bug fixed, revert changes

* [Test] fix CI. (#586)

* disable unit test in mxnet tutorial.

* retry socket connection.

* roll back to set_np_compat

* try to fix multi-processing test hangs when it fails.

* fix test.

* fix.

* doc string

* doc string and clean up

* missing field in ctypes

* fix node flow schedule and unit test

* rename

* pylint

* copy from parent default context

* fix unit test script

* fix

* demo bug in nodeflow gpu test

* [Kernel][Bugfix] fix nodeflow bug (#604)

* fix nodeflow bug

* remove debug code

* add build gtest option

* fix cmake; fix graph index bug in spmv.py

* remove clone

* fix div rhs grad bug

* [Kernel] Support full builtin method, edge softmax and unit tests (#605)

* add full builtin support

* unit test

* unit test backend

* edge softmax

* apply edge with builtin

* fix kernel unit test

* disable mxnet test_shared_mem_store

* gen builtin reduce

* enable mxnet gpu unittest

* revert some changes

* docstring

* add note for the hack

* [Kernel][Unittest][CI] Fix MXNet GPU CI (#607)

* update docker image for MXNet GPU CI

* force all dgl graph input and output on CPU

* fix gpu unittest

* speedup compilation

* add some comments

* lint

* add more comments

* fix as requested

* add some comments

* comment

* lint

* lint

* update pylint

* fix as requested

* lint

* lint

* lint

* docstrings of python DGL kernel entries

* disable lint warnings on arguments in kernel.py

* fix docstring in scheduler

* fix some bug in unittest; try again

* Revert "Merge branch 'kernel' of github.com:zzhang-cn/dgl into kernel"

This reverts commit 1d2299e68b, reversing
changes made to ddc97fbf1b.

* Revert "fix some bug in unittest; try again"

This reverts commit ddc97fbf1b.

* more comprehensive kernel test

* remove shape check in test_specialization
2019-06-06 15:47:55 -04:00
Lingfan Yu
a1d50f0f53 [Refactor] Rename before release (#261)
* include/dgl/runtime

* include

* src/runtime

* src/graph

* src/scheduler

* src

* clean up CMakeLists

* further clean up in cmake

* install commands

* python/dgl/_ffi/_cython

* python/dgl/_ffi/_ctypes

* python/dgl/_ffi

* python/dgl

* some fix

* copy right
2018-12-05 16:45:36 -05:00
Minjie Wang
2694b12725 import ffi solution from TVM 2018-09-05 10:51:31 -04:00