52 Commits

Author SHA1 Message Date
Hongzhi (Steve), Chen
86fe4c20fc Remove datapipe from dgl dependency. (#7667)
Co-authored-by: Ubuntu <ubuntu@ip-172-31-28-63.ap-northeast-1.compute.internal>
2024-08-08 10:39:41 +08:00
Rhett Ying
64750575a2 [CI] upgrade torch to 2.3.0 and cuda to 12.1 (#7399) 2024-05-11 16:21:05 +08:00
Rhett Ying
9fde953d4b [dev] remove unused dockerfiles (#7398) 2024-05-11 10:39:29 +08:00
Rhett Ying
e5263013a9 [CI] use torch 2.0.0, cu118, ubuntu2004, python310 (#7158) 2024-02-27 16:37:48 +08:00
Rhett Ying
3df4f8cb43 [CI] add pyg into CI image (#6997) 2024-01-23 15:23:41 +08:00
Rhett Ying
6dcdaf59ce [CI] add torcheval into CI docker (#6527) 2023-11-03 18:49:13 +08:00
Rhett Ying
9e26c8e388 [GraphBolt] upgrade CUDA to 17 and update CI gpu image to torch2.0+cu118 (#6505) 2023-10-30 13:15:29 +08:00
Daniil Sizov
9c36f24fb4 Cpu docker tcmalloc (#5969) 2023-08-11 10:12:58 +08:00
Rhett Ying
b328b85f61 [CI] update python version for tensorflow (#6125) 2023-08-10 16:17:37 +08:00
Rhett Ying
e6e5430419 [CI]upgrade pytorch to 1.13+cpu_cu116 (#5977) 2023-07-11 22:18:28 +08:00
Rhett Ying
fa3fbbfb8f [CI] update CI docker images (#5796) 2023-06-06 19:39:32 +08:00
Rhett Ying
a1fe08a8ca [CI] install torchdata in docker CI (#5795) 2023-06-06 14:23:55 +08:00
Hongzhi (Steve), Chen
0552466c9c sort (#5281)
Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>
2023-02-16 13:06:46 +08:00
Hongzhi (Steve), Chen
f7a0c4bbf9 remove (#5280)
Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>
2023-02-16 12:57:49 +08:00
Rhett Ying
465828cdbc [CI] pre-install required py packages in image for unit/tutorial test (#5276) 2023-02-10 19:11:30 +08:00
Rhett Ying
46455328e4 [CI] fix bugs for multigpu benchmarks (#5140) 2023-01-11 14:50:37 +08:00
Rhett Ying
46a3fc2b5f [CI] upgrade pytorch version for benchmark CI (#5102)
* [CI] upgrade pytorch version for benchmark CI

* update build arguments

* update

* update

* upgrade torch to 1.13

* update docker image

* update cmake args

* try with cu116_torch112

* update build

* update

* update

* update

* update docker image

* update

* update

* update

* update

* final update

* fix continue running

* update

* update

* update
2023-01-06 14:35:36 +08:00
Quan (Andy) Gan
00f2099934 [CI] upgrade CI pytorch version (#5047)
* upgrade CI pytorch version

* update

* revert tensorflow image

* old image (220816) backup

* Update Dockerfile.ci_gpu
2022-12-19 17:29:03 +08:00
Rhett Ying
d248e7686f [CI] enable ssh in docker image for dist test (#4432) 2022-08-18 15:23:51 +08:00
Rhett Ying
cf4727a9d9 [CI] upgrade python version to 3.7.0 (#4406)
* [CI] upgrade python version to 3.7.0

* do not upgrade for mxnet cpu due to seg fault

* fix test failure for mxnet
2022-08-17 15:28:34 +08:00
Xin Yao
32f12ee19e [Doc] Unify the minimal versions required for PyTorch/TensorFlow/MXNet (#4180) 2022-06-29 18:37:19 +08:00
Rhett Ying
5640b12969 [CI] Upgrade software version of CI docker image (#4189)
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>
2022-06-29 13:34:13 +08:00
Jinjing Zhou
338db32d21 [Feature] Try upload report to s3 (#3891)
refactor CI report and log
2022-04-22 16:02:12 +08:00
Jinjing Zhou
8f0136b0e3 Remove unnecessary jenkins part (#3806)
* fix

* try mount

* what's happening

* fix

* ci
2022-03-04 17:21:19 +08:00
Jinjing Zhou
e424d29614 [CI] Use torch 1.9.0 in CI (#3752) 2022-02-18 16:35:24 +08:00
Quan (Andy) Gan
92db4bd5a0 fix (#3700) 2022-01-28 20:51:15 +08:00
Jinjing Zhou
dc629fc564 update docker image to newer pytorch (#3680)
Co-authored-by: Mufei Li <mufeili1996@gmail.com>
2022-01-23 23:25:34 +08:00
Jinjing Zhou
a62f2c14f6 [CI] Increase CI cpu machine cpus (#3595) 2021-12-17 18:18:19 +08:00
Jinjing Zhou
a5dc230af9 Fix shared memory in unit test (#3204) 2021-07-30 15:02:09 +08:00
Jinjing Zhou
e57c6e3506 [Fix] Fix lint resource usage & Fix Docs (#3032)
* fix

* remove nvidiasmi

* fix

* fix docs

* fix

* fix
2021-06-18 18:33:52 +08:00
Jinjing Zhou
6a56562a7c [CI] Use k8s cluster (#2957)
* add

* fix

* set default

* fix

* try master

* try fix

* try

* fix

* 111

* fix

* fix

* update

* ccc

* try

* fix

* fix

* try new machine

* fix

* fix

* fix

* Revert "fix"

This reverts commit e716d66b04.

* try

* more parallel

* use k8s for all

* fix name

* try not specify instance type

* ci

* use one yaml

* Revert "use one yaml"

This reverts commit 717d8d852b.

* add timeout

* fix permission

* mount efs

* print

* fix pvc

* fix

* restrict num of gpu instances

* check

* fix

* fix
2021-06-04 18:31:11 +08:00
Quan (Andy) Gan
97bdae9e85 [Docker] Add Dependency Packages for CI (#2713)
* add ogb to docker build

* mxnet 1.6 disappeared
2021-03-02 22:16:31 +08:00
Jinjing Zhou
704ec68576 [Benchmark] Fix benchmark tests and add script to generate excel result (#2681)
* add script

* 21

* log

* fix regression tests

* add iter time

* fix

* Revert "fix"

This reverts commit 9b4587ad61.

* fix

* fix

* add

* add ogb instruction

* add cu11 dockerfile

* fix

* fix

* more iter apply_edge

* add iteration time

Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>
2021-02-24 19:41:49 +08:00
Jinjing Zhou
92321347c0 [CI] Update PyTorch in CI image to 1.5.1 (#1808)
* test

* Revert "test"

This reverts commit be555b06a9.

* change
2020-07-15 17:04:04 +08:00
Jinjing Zhou
6d04885381 [Backend] Turn to official dlpack for Tensorflow (#1511)
* Turn to official dlpack

* fix

* fix
2020-05-11 21:16:25 +08:00
Jinjing Zhou
e9440acb06 [TF] TF backend fix and new logic to choose backend (#1393)
* TF backend fix and new logic to choose backend

* fix

* fix

* fix

* fix

* fix backend

* fix

* dlpack alignment

* add flag

* flag

* lint

* lint

* remove unused

* several fixes

Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>
2020-03-30 18:45:17 +08:00
Jinjing Zhou
c23a61bd5c fix s3 link (#1310) 2020-03-04 21:01:29 +08:00
VoVAllen
d30a69bf27 [Backend] TF backend (#978)
* tf

* add builtin support

* fix

* pytest

* fix

* fix

* fix some bugs

* fix selecting

* fix todo

* fix test

* fix test fail in tf

* fix

* fix

* fix gather row

* fix gather row

* log backend

* fix gather row

* fix gather row

* fix for pytorch

* fix

* fix

* fix

* fix

* fix

* fix tests

* fix

* fix

* fix

* fix

* fix

* fix

* fix convert

* fix

* fix

* fix

* fix inplace

* add alignment setting

* add debug option

* Revert "add alignment setting"

This reverts commit ec63fb3506.

* tf ci

* fix lint

* fix lint

* add tfdlpack

* fix type

* add env

* fix backend

* fix

* fix tests

* remove one_hot

* remove comment

* remove comment

* fix

* use pip to install all

* fix test

* fix base

* fix

* fix

* add skip

* upgrade cmake

* change version

* change ci

* fix

* fix

* fix

* fix

* fix seg fault

* fix

* fix python version

* fix

* try fix

* fix

* fix

* tf takes longer time in ci

* change py version

* fix

* fix

* fix oom

* change kg env

* change kg env

* 啊啊啊啊啊啊啊啊啊啊啊啊啊啊啊啊啊啊啊啊啊啊啊啊啊啊啊啊啊啊啊啊啊啊啊啊啊啊

* I'm never messing with all these messy environments again...

* use pytest

* Change image
2019-12-20 15:56:51 +08:00
VoVAllen
dd65ee211e [CI] Change tests for flexibility
* change ci image

* fix

* force bash

* fix

* fix python version

* fix

* fix

* fix

* update gpu

* cuda

* jenkins

* fix build sh

* fix

* Revert "fix"

This reverts commit 6b091914b3.

* try fix

* fix

* Revert "fix"

This reverts commit e42c3035fa.

* try fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix tests

* try fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix ctx problem

* fix many tests

* fix typo

* add backend

* move to pytorch folder

* fix?

* fix ci

* try skip

* try false

* try?

* try?

* Revert to 7d9a52f

* fix various

* fix lint

* Revert Jenkinsfile

* revert env

* revert env

* address comment

* remove file
2019-12-16 14:54:37 +08:00
xiang song(charlie.song)
93e3c49ddc [KG] Update CI to cover Knowledge Graph (#913)
* upd

* fig edgebatch edges

* add test

* trigger

* Update README.md for pytorch PinSage example.

Add a note that the PinSage model example under
example/pytorch/recommendation only works with Python 3.6+,
as its dataset loader depends on the stanfordnlp package,
which works only with Python 3.6+.

* Provide a frame agnostic API to test nn modules on both CPU and CUDA side.

1. make dgl.nn.xxx frame agnostic
2. make test.backend include dgl.nn modules
3. modify test_edge_softmax of test/mxnet/test_nn.py and
    test/pytorch/test_nn.py work on both CPU and GPU

* Fix style

* Delete unused code

* Make agnostic test only related to tests/backend

1. clear all agnostic related code in dgl.nn
2. make test_graph_conv agnostic to cpu/gpu

* Fix code style

* fix

* doc

* Make all test code under tests.mxnet/pytorch.test_nn.py
work on both CPU and GPU.

* Fix syntax

* Remove rand

* Add TAGCN nn.module and example

* Now tagcn can run on CPU.

* Add unittest for TGConv

* Fix style

* For pubmed dataset, using --lr=0.005 can achieve better acc

* Fix style

* Fix some descriptions

* trigger

* Fix doc

* Add nn.TGConv and example

* Fix bug

* Update data in mxnet.tagcn test acc.

* Fix some comments and code

* delete useless code

* Fix naming

* Fix bug

* Fix bug

* Add test for mxnet TAGCov

* Add test code for mxnet TAGCov

* Update some docs

* Fix some code

* Update docs dgl.nn.mxnet

* Update weight init

* Fix

* reproduce the bug

* Fix concurrency bug reported at #755.
Also make test_shared_mem_store.py more deterministic.

* Update test_shared_mem_store.py

* Update dmlc/core

* Update Knowledge Graph CI with new Docker image

* Remove unused line_profierx

* Poke Jenkins

* Update test with exit code check and simplify docker

* Update Jenkinsfile to make app test a standalone stage

* Update kg_test

* Update Jenkinsfile

* Make some KG test parallel

* Update

* KG MXNet does not support ComplEx

* Update Jenkinsfile

* Update Jenkins file

* Change torch-1.2 to torch-1.2-cu92

* ci

* Update ubuntu_install_mxnet_cpu.sh

* Update ubuntu_install_mxnet_gpu.sh

* We only need to test train and eval script.
Delete some test code
2019-10-11 01:32:34 -07:00
Da Zheng
70ee86648b [Docker]Move to the latest MXNet version. (#610)
* use the right version of mxnet.

* fix.
2019-06-06 13:56:35 -07:00
Lingfan Yu
653428bdc7 [Feature][Kernel] DGL kernel support (#596)
* [Kernel] Minigun integration and fused kernel support (#519)

* kernel interface

* add minigun

* Add cuda build

* functors

* working on binary elewise

* binary reduce

* change kernel interface

* WIP

* wip

* fix minigun

* compile

* binary reduce kernels

* compile

* simple test passed

* more reducers

* fix thrust problem

* fix cmake

* fix cmake; add proper guard for atomic

* WIP: bcast

* WIP

* bcast kernels

* update to new minigun pass-by-value practice

* broadcasting dim

* add copy src and copy edge

* fix linking

* fix none array problem

* fix copy edge

* add device_type and device_id to backend operator

* cache csr adj, remove cache for adjmat and incmat

* custom ops in backend and pytorch impl

* change dgl-mg kernel python interface

* add id_mapping var

* clean up plus v2e spmv schedule

* spmv schedule & clean up fall back

* symbolic message and reduce func, remove bundle func

* new executors

* new backend interface for dgl kernels and pytorch impl

* minor fix

* fix

* fix docstring, comments, func names

* nodeflow

* fix message id mapping and bugs...

* pytorch test case & fix

* backward binary reduce

* fix bug

* WIP: cusparse

* change to int32 csr for cusparse workaround

* disable cusparse

* change back to int64

* broadcasting backward

* cusparse; WIP: add rev_csr

* unit test for kernels

* pytorch backward with dgl kernel

* edge softmax

* fix backward

* improve softmax

* cache edge on device

* cache mappings on device

* fix partial forward code

* cusparse done

* copy_src_sum with cusparse

* rm id getter

* reduce grad for broadcast

* copy edge reduce backward

* kernel unit test for broadcasting

* full kernel unit test

* add cpu kernels

* edge softmax unit test

* missing ref

* fix compile and small bugs

* fix bug in bcast

* Add backward both

* fix torch utests

* expose infershape

* create out tensor in python

* fix c++ lint

* [Kernel] Add GPU utest and kernel utest (#524)

* fix gpu utest

* cuda utest runnable

* temp disable test nodeflow; unified test for kernel

* cuda test kernel done

* [Kernel] Update kernel branch (#550)

* [Model] add multiprocessing training with sampling. (#484)

* reorganize sampling code.

* add multi-process training.

* speed up gcn_cv

* fix graphsage_cv.

* add new API in graph store.

* update barrier impl.

* support both local and distributed training.

* fix multiprocess train.

* fix.

* fix barrier.

* add script for loading data.

* multiprocessing sampling.

* accel training.

* replace pull with spmv for speedup.

* nodeflow copy from parent with context.

* enable GPU.

* fix a bug in graph store.

* enable multi-GPU training.

* fix lint.

* add comments.

* rename to run_store_server.py

* fix gcn_cv.

* fix a minor bug in sampler.

* handle error better in graph store.

* improve graphsage_cv for distributed mode.

* update README.

* fix.

* update.

* [Tutorial] add sampling tutorial. (#522)

* add sampling tutorial.

* add readme

* update author list.

* fix indent in the code.

* rename the file.

* update tutorial.

* fix the last API.

* update image.

* [BUGFIX] fix the problems in the sampling tutorial. (#523)

* add index.

* update.

* update tutorial.

* fix gpu utest

* cuda utest runnable

* temp disable test nodeflow; unified test for kernel

* cuda test kernel done

* Fixing typo in JTNN after interface change (#536)

* [BugFix] Fix getting src and dst id of ALL edges in NodeFlow.apply_block (#515)

* [Bug Fix] Fix inplace op at backend (#546)

* Fix inplace operation

* fix line separator

* [Feature] Add batch and unbatch for immutable graph (#539)

* Add batch and unbatch for immutable graph

* fix line separator

* fix lintr

* remove unnecessary include

* fix code review

* [BUGFix] Improve multi-processing training (#526)

* fix.

* add comment.

* remove.

* temp fix.

* initialize for shared memory.

* fix graphsage.

* fix gcn.

* add more unit tests.

* add more tests.

* avoid creating shared-memory exclusively.

* redefine remote initializer.

* improve initializer.

* fix unit test.

* fix lint.

* fix lint.

* initialize data in the graph store server properly.

* fix test.

* fix test.

* fix test.

* small fix.

* add comments.

* cleanup server.

* test graph store with a random port.

* print.

* print to stderr.

* test1

* test2

* remove comment.

* adjust the initializer signature.

* [API] update graph store API. (#549)

* add init_ndata and init_edata in DGLGraph.

* adjust SharedMemoryGraph API.

* print warning.

* fix comment.

* update example

* fix.

* fix examples.

* add unit tests.

* add comments.

* [Refactor] Immutable graph index (#543)

* WIP

* header

* WIP .cc

* WIP

* transpose

* wip

* immutable graph .h and .cc

* WIP: nodeflow.cc

* compile

* remove all tmp dl managed ctx; they caused refcount issue

* one simple test

* WIP: testing

* test_graph

* fix graph index

* fix bug in sampler; pass pytorch utest

* WIP on mxnet

* fix lint

* fix mxnet unittest w/ unfortunate workaround

* fix msvc

* fix lint

* SliceRows and test_nodeflow

* resolve reviews

* resolve reviews

* try fix win ci

* try fix win ci

* poke win ci again

* poke

* lazy multigraph flag; stackoverflow error

* revert node subgraph test

* lazy object

* try fix win build

* try fix win build

* poke ci

* fix build script

* fix compile

* add a todo

* fix reviews

* fix compile

* [Kernel] Update kernel branch (#576)

* [Model] add multiprocessing training with sampling. (#484)

* reorganize sampling code.

* add multi-process training.

* speed up gcn_cv

* fix graphsage_cv.

* add new API in graph store.

* update barrier impl.

* support both local and distributed training.

* fix multiprocess train.

* fix.

* fix barrier.

* add script for loading data.

* multiprocessing sampling.

* accel training.

* replace pull with spmv for speedup.

* nodeflow copy from parent with context.

* enable GPU.

* fix a bug in graph store.

* enable multi-GPU training.

* fix lint.

* add comments.

* rename to run_store_server.py

* fix gcn_cv.

* fix a minor bug in sampler.

* handle error better in graph store.

* improve graphsage_cv for distributed mode.

* update README.

* fix.

* update.

* [Tutorial] add sampling tutorial. (#522)

* add sampling tutorial.

* add readme

* update author list.

* fix indent in the code.

* rename the file.

* update tutorial.

* fix the last API.

* update image.

* [BUGFIX] fix the problems in the sampling tutorial. (#523)

* add index.

* update.

* update tutorial.

* fix gpu utest

* cuda utest runnable

* temp disable test nodeflow; unified test for kernel

* cuda test kernel done

* Fixing typo in JTNN after interface change (#536)

* [BugFix] Fix getting src and dst id of ALL edges in NodeFlow.apply_block (#515)

* [Bug Fix] Fix inplace op at backend (#546)

* Fix inplace operation

* fix line separator

* [Feature] Add batch and unbatch for immutable graph (#539)

* Add batch and unbatch for immutable graph

* fix line separator

* fix lintr

* remove unnecessary include

* fix code review

* [BUGFix] Improve multi-processing training (#526)

* fix.

* add comment.

* remove.

* temp fix.

* initialize for shared memory.

* fix graphsage.

* fix gcn.

* add more unit tests.

* add more tests.

* avoid creating shared-memory exclusively.

* redefine remote initializer.

* improve initializer.

* fix unit test.

* fix lint.

* fix lint.

* initialize data in the graph store server properly.

* fix test.

* fix test.

* fix test.

* small fix.

* add comments.

* cleanup server.

* test graph store with a random port.

* print.

* print to stderr.

* test1

* test2

* remove comment.

* adjust the initializer signature.

* [API] update graph store API. (#549)

* add init_ndata and init_edata in DGLGraph.

* adjust SharedMemoryGraph API.

* print warning.

* fix comment.

* update example

* fix.

* fix examples.

* add unit tests.

* add comments.

* [Refactor] Immutable graph index (#543)

* WIP

* header

* WIP .cc

* WIP

* transpose

* wip

* immutable graph .h and .cc

* WIP: nodeflow.cc

* compile

* remove all tmp dl managed ctx; they caused refcount issue

* one simple test

* WIP: testing

* test_graph

* fix graph index

* fix bug in sampler; pass pytorch utest

* WIP on mxnet

* fix lint

* fix mxnet unittest w/ unfortunate workaround

* fix msvc

* fix lint

* SliceRows and test_nodeflow

* resolve reviews

* resolve reviews

* try fix win ci

* try fix win ci

* poke win ci again

* poke

* lazy multigraph flag; stackoverflow error

* revert node subgraph test

* lazy object

* try fix win build

* try fix win build

* poke ci

* fix build script

* fix compile

* add a todo

* fix reviews

* fix compile

* all demo use python-3 (#555)

* [DEMO] Reproduce numbers of distributed training in AMLC giant graph paper (#556)

* update

* update

* update

* update num_hops

* fix bug

* update

* report numbers of distributed training in AMLC giant graph paper

* [DEMO] Remove duplicate code for sampling (#557)

* update

* update

* re-use single-machine code

* update

* use relative path

* update

* update

* update

* add __init__.py

* add __init__.py

* import sys, os

* fix typo

* update

* [Perf] Improve performance of graph store. (#554)

* fix.

* use inplace.

* move to shared memory graph store.

* fix.

* add more unit tests.

* fix.

* fix test.

* fix test.

* disable test.

* fix.

* [BUGFIX] fix a bug in edge_ids (#560)

* add test.

* fix compute.

* fix test.

* turn on test.

* fix a bug.

* add test.

* fix.

* disable test.

* [DEMO] Add Pytorch demo for distributed sampler (#562)

* update

* update

* update

* add sender

* update

* remove duplicate code

* [Test] Add gtest to project (#547)

* add gtest module

* add gtest

* fix

* Update CMakeLists.txt

* Update README.md

* [Perf] lazily create msg_index. (#563)

* lazily create msg_index.

* update test.

* [BUGFIX] fix bugs for running GCN on giant graphs. (#561)

* load mxnet csr.

* enable load large csr.

* fix

* fix.

* fix int overflow.

* fix test.

* [BugFix] Fix error when bfs_level = 0 in Entity Classification with RGCN (#559)

* [DEMO] Update demo of distributed sampler (#564)

* update

* update

* update demo

* add network cpp test (#565)

* Add unittest for C++ RPC (#566)

* [CI] Fix CI for cpp test (#570)

* fix CI for cpp test

* update port number

* [Docker] update docker image (#575)

* update docker image

* specify lint version

* rm torch import from unified tests

* [Kernel][Scheduler][MXNet] Scheduler for DGL kernels and MXNet backend support (#541)

* [Model] add multiprocessing training with sampling. (#484)

* reorganize sampling code.

* add multi-process training.

* speed up gcn_cv

* fix graphsage_cv.

* add new API in graph store.

* update barrier impl.

* support both local and distributed training.

* fix multiprocess train.

* fix.

* fix barrier.

* add script for loading data.

* multiprocessing sampling.

* accel training.

* replace pull with spmv for speedup.

* nodeflow copy from parent with context.

* enable GPU.

* fix a bug in graph store.

* enable multi-GPU training.

* fix lint.

* add comments.

* rename to run_store_server.py

* fix gcn_cv.

* fix a minor bug in sampler.

* handle error better in graph store.

* improve graphsage_cv for distributed mode.

* update README.

* fix.

* update.

* [Tutorial] add sampling tutorial. (#522)

* add sampling tutorial.

* add readme

* update author list.

* fix indent in the code.

* rename the file.

* update tutorial.

* fix the last API.

* update image.

* [BUGFIX] fix the problems in the sampling tutorial. (#523)

* add index.

* update.

* update tutorial.

* fix gpu utest

* cuda utest runnable

* temp disable test nodeflow; unified test for kernel

* cuda test kernel done

* edge softmax module

* WIP

* Fixing typo in JTNN after interface change (#536)

* mxnet backend support

* improve reduce grad

* add max to unittest backend

* fix kernel unittest

* [BugFix] Fix getting src and dst id of ALL edges in NodeFlow.apply_block (#515)

* lint

* lint

* win build

* [Bug Fix] Fix inplace op at backend (#546)

* Fix inplace operation

* fix line separator

* [Feature] Add batch and unbatch for immutable graph (#539)

* Add batch and unbatch for immutable graph

* fix line separator

* fix lintr

* remove unnecessary include

* fix code review

* [BUGFix] Improve multi-processing training (#526)

* fix.

* add comment.

* remove.

* temp fix.

* initialize for shared memory.

* fix graphsage.

* fix gcn.

* add more unit tests.

* add more tests.

* avoid creating shared-memory exclusively.

* redefine remote initializer.

* improve initializer.

* fix unit test.

* fix lint.

* fix lint.

* initialize data in the graph store server properly.

* fix test.

* fix test.

* fix test.

* small fix.

* add comments.

* cleanup server.

* test graph store with a random port.

* print.

* print to stderr.

* test1

* test2

* remove comment.

* adjust the initializer signature.

* try

* fix

* fix

* fix

* fix

* fix

* try

* test

* test

* test

* try

* try

* try

* test

* fix

* try gen_target

* fix gen_target

* fix msvc var_args expand issue

* fix

* [API] update graph store API. (#549)

* add init_ndata and init_edata in DGLGraph.

* adjust SharedMemoryGraph API.

* print warning.

* fix comment.

* update example

* fix.

* fix examples.

* add unit tests.

* add comments.

* [Refactor] Immutable graph index (#543)

* WIP

* header

* WIP .cc

* WIP

* transpose

* wip

* immutable graph .h and .cc

* WIP: nodeflow.cc

* compile

* remove all tmp dl managed ctx; they caused refcount issue

* one simple test

* WIP: testing

* test_graph

* fix graph index

* fix bug in sampler; pass pytorch utest

* WIP on mxnet

* fix lint

* fix mxnet unittest w/ unfortunate workaround

* fix msvc

* fix lint

* SliceRows and test_nodeflow

* resolve reviews

* resolve reviews

* try fix win ci

* try fix win ci

* poke win ci again

* poke

* lazy multigraph flag; stackoverflow error

* revert node subgraph test

* lazy object

* try fix win build

* try fix win build

* poke ci

* fix build script

* fix compile

* add a todo

* fix reviews

* fix compile

* WIP

* WIP

* all demo use python-3 (#555)

* ToImmutable and CopyTo

* [DEMO] Reproduce numbers of distributed training in AMLC giant graph paper (#556)

* update

* update

* update

* update num_hops

* fix bug

* update

* report numbers of distributed training in AMLC giant graph paper

* [DEMO] Remove duplicate code for sampling (#557)

* update

* update

* re-use single-machine code

* update

* use relative path

* update

* update

* update

* add __init__.py

* add __init__.py

* import sys, os

* fix typo

* update

* [Perf] Improve performance of graph store. (#554)

* fix.

* use inplace.

* move to shared memory graph store.

* fix.

* add more unit tests.

* fix.

* fix test.

* fix test.

* disable test.

* fix.

* [BUGFIX] fix a bug in edge_ids (#560)

* add test.

* fix compute.

* fix test.

* turn on test.

* fix a bug.

* add test.

* fix.

* disable test.

* DGLRetValue DGLContext conversion

* [DEMO] Add Pytorch demo for distributed sampler (#562)

* update

* update

* update

* add sender

* update

* remove duplicate code

* [Test] Add gtest to project (#547)

* add gtest module

* add gtest

* fix

* Update CMakeLists.txt

* Update README.md

* Add support to convert immutable graph to 32 bits

* [Perf] lazily create msg_index. (#563)

* lazily create msg_index.

* update test.

* fix binary reduce following new minigun template

* enable both int64 and int32 kernels

* [BUGFIX] fix bugs for running GCN on giant graphs. (#561)

* load mxnet csr.

* enable load large csr.

* fix

* fix.

* fix int overflow.

* fix test.

* new kernel interface done for CPU

* docstring

* rename & docstring

* copy reduce and backward

* [BugFix] Fix error when bfs_level = 0 in Entity Classification with RGCN (#559)

* [DEMO] Update demo of distributed sampler (#564)

* update

* update

* update demo

* adapt cuda kernels to the new interface

* add network cpp test (#565)

* fix bug

* Add unittest for C++ RPC (#566)

* [CI] Fix CI for cpp test (#570)

* fix CI for cpp test

* update port number

* [Docker] update docker image (#575)

* update docker image

* specify lint version

* rm torch import from unified tests

* remove pytorch-specific test_function

* fix unittest

* fix

* fix unittest backend bug in converting tensor to numpy array

* fix

* mxnet version

* [BUGFIX] fix for MXNet 1.5. (#552)

* remove clone.

* turn on numpy compatible.

* Revert "remove clone."

This reverts commit 17bbf76ed7.

* revert format changes

* fix mxnet api name

* revert mistakes in previous revert

* roll back CI to 20190523 build

* fix unittest

* disable test_shared_mem_store.py for now

* remove mxnet/test_specialization.py

* sync win64 test script

* fix lowercase

* missing backend in gpu unit test

* transpose to get forward graph

* pass update all

* add sanity check

* passing test_specialization.py

* fix and pass test_function

* fix check

* fix pytorch softmax

* mxnet kernels

* c++ lint

* pylint

* try

* win build

* fix

* win

* ci enable gpu build

* init submodule recursively

* backend docstring

* try

* test win dev

* doc string

* disable pytorch test_nn

* try to fix windows issue

* bug fixed, revert changes

* [Test] fix CI. (#586)

* disable unit test in mxnet tutorial.

* retry socket connection.

* roll back to set_np_compat

* try to fix multi-processing test hangs when it fails.

* fix test.

* fix.

* doc string

* doc string and clean up

* missing field in ctypes

* fix node flow schedule and unit test

* rename

* pylint

* copy from parent default context

* fix unit test script

* fix

* demo bug in nodeflow gpu test

* [Kernel][Bugfix] fix nodeflow bug (#604)

* fix nodeflow bug

* remove debug code

* add build gtest option

* fix cmake; fix graph index bug in spmv.py

* remove clone

* fix div rhs grad bug

* [Kernel] Support full builtin method, edge softmax and unit tests (#605)

* add full builtin support

* unit test

* unit test backend

* edge softmax

* apply edge with builtin

* fix kernel unit test

* disable mxnet test_shared_mem_store

* gen builtin reduce

* enable mxnet gpu unittest

* revert some changes

* docstring

* add note for the hack

* [Kernel][Unittest][CI] Fix MXNet GPU CI (#607)

* update docker image for MXNet GPU CI

* force all dgl graph input and output on CPU

* fix gpu unittest

* speedup compilation

* add some comments

* lint

* add more comments

* fix as requested

* add some comments

* comment

* lint

* lint

* update pylint

* fix as requested

* lint

* lint

* lint

* docstrings of python DGL kernel entries

* disable lint warnings on arguments in kernel.py

* fix docstring in scheduler

* fix some bug in unittest; try again

* Revert "Merge branch 'kernel' of github.com:zzhang-cn/dgl into kernel"

This reverts commit 1d2299e68b, reversing
changes made to ddc97fbf1b.

* Revert "fix some bug in unittest; try again"

This reverts commit ddc97fbf1b.

* more comprehensive kernel test

* remove shape check in test_specialization
2019-06-06 15:47:55 -04:00
Minjie Wang
5492994251 [CI] Fix CI bugs (#592)
* new jenkins script

* fix ci

* poke ci

* new config

* new config

* new config

* poke ci

* poke ci

* poke ci

* poke ci

* poke ci

* poke ci

* poke ci

* update docker image; poke ci

* poke ci

* poke ci

* poke ci

* poke ci

* poke ci

* poke ci

* poke ci

* poke ci

* poke ci

* poke ci

* poke ci

* update image

* update image

* fix

* Windows CI support

* typo

* typo*2

* missed sh

* typo*3

* missed dir
2019-06-01 14:32:39 -04:00
Lingfan Yu
ada3786c57 [Docker]Update MXNet CI Image (#582)
* use mxnet 20190528 build

* fix mxnet api name
2019-05-28 18:04:54 -07:00
Minjie Wang
c64405b1f6 [Docker] update docker image (#575)
* update docker image

* specify lint version

* rm torch import from unified tests
2019-05-27 17:04:46 -04:00
Minjie Wang
b5afaeac65 [CI] Small change on the MX CPU image (#343) 2019-01-07 15:59:34 -05:00
Da Zheng
dd26ff10a8 [CI] change the mxnet installation in docker. (#218) 2018-12-02 14:47:50 -05:00
Minjie Wang
cd3e25a064 [CI] add graphviz (#147) 2018-11-12 15:18:22 -05:00
VoVAllen
7cb50072a4 [Doc][Model] New Capsule Tutorial & Example (#143)
* new capsule tutorial

* capsule for new API

* fix deprecated API

* New tutorial and example

* investigate gc problem

* add viz code

* new capsule tutorial

* remove ipynb

* move u_hat

* add link

* add requirements.txt

* remove ani.save

* update ci to install requirements

* add graphviz
2018-11-12 12:43:13 -05:00
Minjie Wang
a95459e3a2 [CI] Improved CI (#141)
* change ci

* update ci

* update ci

* update ci

* update ci

* update ci

* update ci

* update ci

* update ci

* update ci

* update ci

* update ci

* update ci

* update ci

* update ci

* update ci

* nx package

* update ci

* update ci

* update ci

* fix

* mx dockerfile by zhengda

* python3.6->3.5

* update ci image

* add tutorial test

* fix ci

* fix ssl problem

* minor change

* small fix on traversal utest

* fix syntax

* add matplotlib in image

* fix

* update ci

* update ci
2018-11-12 01:06:05 -05:00