15 Commits

Author SHA1 Message Date
Wenxuan Cao
3bc8e228fc [DistGB] enable dist partition pipeline to save FusedCSCSamplingGraph partition directly (#7728)
Co-authored-by: Ubuntu <ubuntu@ip-172-31-8-126.us-west-2.compute.internal>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-52-174.us-west-2.compute.internal>
Co-authored-by: Rhett Ying <85214957+Rhett-Ying@users.noreply.github.com>
2024-09-19 17:05:11 +08:00
Theodore Vasiloudis
1ab0170a10 [Distributed] Ensure round-robin edge file downloads, reduce logging, other improvements. (#5578)
Co-authored-by: Rhett Ying <85214957+Rhett-Ying@users.noreply.github.com>
2023-04-27 11:17:55 -07:00
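The round-robin idea named in the #5578 title can be illustrated with a minimal sketch. This is not the code from that commit: the function name `assign_round_robin` and the file names are made up for illustration, and the only assumption is that files are dealt out to workers in turn.

```python
def assign_round_robin(files, num_workers):
    """Deal files out to workers in turn: file i goes to worker i % num_workers."""
    return [files[rank::num_workers] for rank in range(num_workers)]


# Hypothetical edge file names, used only for this example.
edge_files = [f"edges_{i}.csv" for i in range(5)]
assignment = assign_round_robin(edge_files, 2)

for rank, owned in enumerate(assignment):
    print(rank, owned)
```

Dealing files out in turn keeps the per-worker file counts within one of each other, whatever the number of files.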
Theodore Vasiloudis
5cab4230fb [Dist] Add argument in dispatch_data.py to allow user-defined metadata JSON filename. (#5445) 2023-03-17 08:43:29 +08:00
kylasa
a14f69c97d [DistDGL][Feature_Request]Changes in the metadata.json file for input graph dataset. (#5310)
* Implemented the following changes.

* Remove NUM_NODES_PER_CHUNK
* Remove NUM_EDGES_PER_CHUNK
* Remove the dependency between no. of edge files per edge type and no. of partitions
* Remove the dependency between no. of edge feature files per edge type and no. of partitions
* Remove the dependency between no. of edge feature files and no. of edge files per edge type.
* Remove the dependency between no. of node feature files and no. of partitions
* Add “node_type_counts”. This will be a list of integers. Each integer will represent total count of a node-type. The index in this list and the index in the “node_type” will be the same for a given node-type.
* Add “edge_type_counts”. This will be a list of integers. Each integer will represent total count of an edge-type. The index in this list and the index in the “edge_type” list will be the same for a given edge-type.

* Applying lintrunner patch.

* Adding missing keys to the metadata in the unit test framework.

* lintrunner patch.

* Resolving CI test failures due to merge conflicts.

* Applying lintrunner patch

* applying lintrunner patch

* Replacing tabspace with spaces - to satisfy lintrunner

* Fixing the CI Test Failure cases.

* Applying lintrunner patch

* lintrunner complaining about a blank line.

* Resolving issues with print statement for NoneType

* Removed the arbitrary chunks tests, since this functionality is no longer supported.

* Addressing CI review comments.

* addressing CI review comments

* lintrunner patch

* lintrunner patch.

* Addressing CI review comments.

* lintrunner patch.
2023-02-24 17:01:04 -08:00
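The index alignment that #5310 describes for "node_type_counts" and "edge_type_counts" can be checked with a short sketch. The four key names come from the commit message; the helper name `check_type_counts` and the sample type names and counts are hypothetical, and the real metadata.json carries other keys that are not shown here.

```python
def check_type_counts(metadata):
    """Pair each type name with its count, relying on matching list indices."""
    result = {}
    for type_key, count_key in (
        ("node_type", "node_type_counts"),
        ("edge_type", "edge_type_counts"),
    ):
        names = metadata[type_key]
        counts = metadata[count_key]
        if len(names) != len(counts):
            raise ValueError(f"{count_key} must have one entry per {type_key} entry")
        if any(not isinstance(c, int) or c < 0 for c in counts):
            raise ValueError(f"{count_key} must contain non-negative integers")
        result[type_key] = dict(zip(names, counts))
    return result


# Sample values invented for this example.
metadata = {
    "node_type": ["author", "paper"],
    "node_type_counts": [100, 250],
    "edge_type": ["author:writes:paper"],
    "edge_type_counts": [400],
}
totals = check_type_counts(metadata)
print(totals)
```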
kylasa
aa42aaeb9f [DistDGL][Lintrunner]Lintrunner for tools directory (#5261)
* lintrunner patch for gloo_wrapper.py

* lintrunner changes to the tools directory.
2023-02-03 09:56:47 -08:00
kylasa
c8ea9fa4e4 [Dist] Flexible pipeline - Initial commit (#4733)
* Flexible pipeline - Initial commit

1. Implementation of flexible pipeline feature.
2. With this implementation, the pipeline now supports multiple partitions per process and assumes that num_partitions is always a multiple of num_processes.

* Update test_dist_part.py

* Code changes to address review comments

* Code refactoring of exchange_features function into two functions for better readability

* Updating test_dist_part to fix merge issues with the master branch

* corrected variable names...

* Fixed code refactoring issues.

* Provide missing function arguments to exchange_feature function

* Providing the missing function argument to fix error.

* Provide missing function argument to 'get_shuffle_nids' function.

* Repositioned a variable within its scope.

* Removed tab space which is causing the indentation problem

* Fix issue with the CI test framework, which is the root cause for the failure of the CI tests.

1. Now we read files specific to the partition-id and store this data separately, identified by the local_part_id, in the local process.
2. Similarly, we also differentiate the node and edge feature type_ids using the same keys as above.
3. These two changes will help us get the appropriate feature data during the feature exchange and send it to the destination process correctly.

* Correct the parametrization for the CI unit test cases.

* Addressing Rui's code review comments.

* Addressing code review comments.
2022-11-18 08:21:55 -08:00
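The constraint stated in #4733, that num_partitions is a multiple of num_processes and that each process keys its data by local_part_id, can be sketched as follows. The contiguous-block mapping from local_part_id to a global partition id is an assumption made for this example; the commit message does not say which mapping the pipeline uses, and the function name `partitions_for_process` is hypothetical.

```python
def partitions_for_process(rank, num_processes, num_partitions):
    """Map each local_part_id on this process to a global partition id."""
    if num_partitions % num_processes != 0:
        raise ValueError("num_partitions must be a multiple of num_processes")
    parts_per_process = num_partitions // num_processes
    # Assumed layout: process `rank` owns a contiguous block of partitions.
    return {
        local_part_id: rank * parts_per_process + local_part_id
        for local_part_id in range(parts_per_process)
    }


owned = partitions_for_process(rank=1, num_processes=2, num_partitions=4)
print(owned)
```

Keying per-partition data by local_part_id lets one process hold several partitions without mixing their node and edge features.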
Rhett Ying
1990e797e1 [Dist] Reduce startup overhead: sort etypes and save in specified formats (#4735)
* [Dist] reduce startup overhead: enable to save in specified formats

* [Dist] reduce startup overhead: sort partitions when generating

* sort csc/csr only when multiple etypes

* refine
2022-10-26 19:07:10 +08:00
Hongzhi (Steve), Chen
ea48ce7a37 [Misc] Black auto fix. (#4697)
Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>
2022-10-11 13:31:13 +08:00
Rhett Ying
cf19254a19 [Dist] enable to partition many chunks into less partitions via pipeline (#4620)
* [Dist] enable to partition many chunks into less partitions via pipeline

* refine

* add meta file for num_parts, add more tests, refine docstring

* remove args.num_parts

* create pydantic class for partition metadata

* refine

* rename json file
2022-09-28 16:52:39 +08:00
Rhett Ying
6c1500d49f [Dist] save original node/edge IDs into separate files (#4649)
* [Dist] save original node/edge IDs into separate files

* separate nids and eids
2022-09-28 14:30:24 +08:00
kylasa
ace76327dd Garbage Collection and memory snapshot code for debugging partitioning pipeline (target as master branch) (#4598)
* Squashed commit of the following:

commit e605a550b3
Author: kylasa <kylasa@gmail.com>
Date:   Thu Sep 15 14:45:39 2022 -0700

    Delete pyproject.toml

commit f2db9e700d
Author: kylasa <kylasa@gmail.com>
Date:   Thu Sep 15 14:44:40 2022 -0700

    Changes suggested by isort program to sort imports.

commit 5a6078beac
Author: kylasa <kylasa@gmail.com>
Date:   Thu Sep 15 14:39:50 2022 -0700

    addressing code review comments from the CI process.

commit c8e92decb7
Author: kylasa <kylasa@gmail.com>
Date:   Wed Sep 14 18:23:59 2022 -0700

    Corrected a typo in the import statement

commit 14ddb0e9b5
Author: kylasa <kylasa@gmail.com>
Date:   Tue Sep 13 18:47:34 2022 -0700

    Memory snapshot code for debugging memory footprint of the graph partitioning pipeline

Squashed commit done

* Addressing code review comments.

* Update utils.py

* dummy change to trigger CI tests

Co-authored-by: Rhett Ying <85214957+Rhett-Ying@users.noreply.github.com>
2022-09-23 15:37:29 -07:00
peizhou001
166b273bec add ssh port config for dispatchdata (#4557) 2022-09-20 20:41:41 +08:00
Rhett Ying
099b173f6f [DistPart] expose timeout config for process group (#4532)
* [DistPart] expose timeout config for process group

* refine code

* Update tools/distpartitioning/data_proc_pipeline.py

Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>
2022-09-15 15:28:36 +08:00
Mufei Li
2e8ae9f980 [Dist][CI] Unit test for the new distributed partitioning pipeline (#4394)
* chunked graph data format

* Update

* Update

* Update task_distributed_test.sh

* Update

* Update

* Revert "Update"

This reverts commit 03c461870f.

* Update

* Update

* ssh-keygen

* CI

* install openssh

* openssh

* Update

* CI

* Update

* Update

Co-authored-by: Ubuntu <ubuntu@ip-172-31-53-142.us-west-2.compute.internal>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-16-87.us-west-2.compute.internal>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-20-21.us-west-2.compute.internal>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-9-26.ap-northeast-1.compute.internal>
2022-08-19 14:20:56 +08:00
kylasa
8086d1edde Adding launch script and wrapper script to trigger distributed graph … (#4276)
* Adding launch script and wrapper script to trigger distributed graph partitioning pipeline as defined in the UX document

1. dispatch_data.py is a wrapper script which builds the command and triggers the distributed partitioning pipeline
2. distgraphlaunch.py is the main Python script that triggers the pipeline; to simplify its usage, dispatch_data.py is included as a wrapper script around it.

* Added code to auto-detect python version and retrieve some parameters from the input metadata json file

1. Auto detect python version
2. Read the metadata json file and extract some parameters to pass to the user defined command which is used to trigger the pipeline.

* Updated the json file name to metadata.json file per UX documentation

1. Renamed json file name per UX documentation.

* address comments

* fix

* fix doc

* use unbuffered logging to cure anxiety

* cure more anxiety

* Update tools/dispatch_data.py

Co-authored-by: Minjie Wang <minjie.wang@nyu.edu>

* oops

Co-authored-by: Quan Gan <coin2028@hotmail.com>
Co-authored-by: Minjie Wang <minjie.wang@nyu.edu>
2022-08-11 15:46:38 +08:00
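The wrapper behaviour described in #4276 (detect the Python version, read the metadata JSON file, build the command that triggers the pipeline) can be sketched roughly as below. This is not the code of dispatch_data.py: the script name `run_pipeline.py`, the `--graph-name` flag, and the `graph_name` key are placeholders chosen for this example. Only the interpreter's `-u` flag, which turns off output buffering, is a real Python option.

```python
import json
import os
import sys
import tempfile


def build_command(metadata_path, script="run_pipeline.py"):
    """Build an argv list from the running Python version and a metadata file."""
    with open(metadata_path) as f:
        metadata = json.load(f)
    # Name the interpreter after the version that is running this wrapper.
    python_bin = f"python{sys.version_info.major}.{sys.version_info.minor}"
    # "-u" makes the child process write its output unbuffered.
    return [python_bin, "-u", script, "--graph-name", metadata["graph_name"]]


# Write a throwaway metadata file so the example is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "metadata.json")
    with open(path, "w") as f:
        json.dump({"graph_name": "demo"}, f)
    command = build_command(path)

print(" ".join(command))
```

Returning the command as a list rather than a single string means it can be passed to `subprocess.run` without shell quoting.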