Mirror of https://github.com/dmlc/dgl.git (synced 2026-06-03 19:34:33 +08:00)
[DGL-Go] Change name to dglgo (#3778)
* add
* remove
* fix
* rework the readme and some changes
* add png
* update png
* add recipe get

Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>
Co-authored-by: Quan (Andy) Gan <coin2028@hotmail.com>
397
dglgo/README.md
Normal file
@@ -0,0 +1,397 @@
# DGL-Go

DGL-Go is a command line tool for users to get started with training, using and
studying Graph Neural Networks (GNNs). Data scientists can quickly apply GNNs
to their problems, whereas researchers will find it useful to customize their
experiments.

## Installation and getting started

DGL-Go requires DGL v0.8+, so please make sure DGL is up to date.
Install DGL-Go with `pip install dglgo` and type `dgl` in your console:
```
Usage: dgl [OPTIONS] COMMAND [ARGS]...

Options:
  --help  Show this message and exit.

Commands:
  configure  Generate a configuration file
  export     Export a runnable python script
  recipe     Get example recipes
  train      Launch training
```

Using DGL-Go is as easy as three steps:

1. Use `dgl configure` to pick the task, dataset and model of interest. It generates
   a configuration file for later use. You can also use `dgl recipe get` to retrieve
   a configuration file we provide.
1. Use `dgl train` to launch training according to the configuration and see the results.
1. Use `dgl export` to generate a *self-contained, reproducible* Python script for advanced
   customization, or try the model on custom data stored in CSV format.

Next, we will walk through all these steps one by one.

## Training GraphSAGE for node classification on Cora

Let's use one of the most classical setups as an example: training a GraphSAGE
model for node classification on the Cora citation graph dataset.

### Step 1: `dgl configure`

First, use `dgl configure` to generate a YAML configuration file.

```
dgl configure nodepred --data cora --model sage --cfg cora_sage.yaml
```

Note that `nodepred` is the name of a DGL-Go *pipeline*. For now, you can think of
a pipeline as a training task: `nodepred` is for the node prediction task; other
options include `linkpred` for the link prediction task, etc. The command will
generate a configuration file `cora_sage.yaml` which includes:

* Options for the selected dataset (i.e., `cora` here).
* Model hyperparameters (e.g., number of layers, hidden size, etc.).
* Training hyperparameters (e.g., learning rate, loss function, etc.).

Different choices of task, model and dataset may give very different options,
so DGL-Go also adds a comment in the file explaining what each option does.
At this point you can also change the options to explore potential improvements.

The configuration file generated by the command above is shown below.

```yaml
version: 0.0.1
pipeline_name: nodepred
device: cpu
data:
  name: cora
  split_ratio:                # Ratio to generate split masks, for example set to [0.8, 0.1, 0.1] for 80% train/10% val/10% test. Leave blank to use builtin split in original dataset
model:
  name: sage
  embed_size: -1              # The dimension of created embedding table. -1 means using original node embedding
  hidden_size: 16             # Hidden size.
  num_layers: 1               # Number of hidden layers.
  activation: relu            # Activation function name under torch.nn.functional
  dropout: 0.5                # Dropout rate.
  aggregator_type: gcn        # Aggregator type to use (``mean``, ``gcn``, ``pool``, ``lstm``).
general_pipeline:
  early_stop:
    patience: 20              # Steps before early stop
    checkpoint_path: checkpoint.pth # Early stop checkpoint model file path
  num_epochs: 200             # Number of training epochs
  eval_period: 5              # Interval epochs between evaluations
  optimizer:
    name: Adam
    lr: 0.01
    weight_decay: 0.0005
  loss: CrossEntropyLoss
  save_path: model.pth        # Path to save the model
  num_runs: 1                 # Number of experiments to run
```

Apart from `dgl configure`, you can also get one of DGL-Go's built-in configuration files
(called *recipes*) using `dgl recipe`. Two sub-commands are relevant here:

```
dgl recipe list
```

will list the available recipes:

```
➜ dgl recipe list
==============================================================================
| Filename                       | Pipeline           | Dataset              |
==============================================================================
| linkpred_citation2_sage.yaml   | linkpred           | ogbl-citation2       |
| linkpred_collab_sage.yaml      | linkpred           | ogbl-collab          |
| nodepred_citeseer_sage.yaml    | nodepred           | citeseer             |
| nodepred_citeseer_gcn.yaml     | nodepred           | citeseer             |
| nodepred-ns_arxiv_gcn.yaml     | nodepred-ns        | ogbn-arxiv           |
| nodepred_cora_gat.yaml         | nodepred           | cora                 |
| nodepred_pubmed_sage.yaml      | nodepred           | pubmed               |
| linkpred_cora_sage.yaml        | linkpred           | cora                 |
| nodepred_pubmed_gcn.yaml       | nodepred           | pubmed               |
| nodepred_pubmed_gat.yaml       | nodepred           | pubmed               |
| nodepred_cora_gcn.yaml         | nodepred           | cora                 |
| nodepred_cora_sage.yaml        | nodepred           | cora                 |
| nodepred_citeseer_gat.yaml     | nodepred           | citeseer             |
| nodepred-ns_product_sage.yaml  | nodepred-ns        | ogbn-products        |
==============================================================================
```

Then use

```
dgl recipe get nodepred_cora_sage.yaml
```

to copy the YAML configuration file to your local folder.

### Step 2: `dgl train`

Simply running `dgl train --cfg cora_sage.yaml` will start the training process.
```log
...
Epoch 00190 | Loss 1.5225 | TrainAcc 0.9500 | ValAcc 0.6840
Epoch 00191 | Loss 1.5416 | TrainAcc 0.9357 | ValAcc 0.6840
Epoch 00192 | Loss 1.5391 | TrainAcc 0.9357 | ValAcc 0.6840
Epoch 00193 | Loss 1.5257 | TrainAcc 0.9643 | ValAcc 0.6840
Epoch 00194 | Loss 1.5196 | TrainAcc 0.9286 | ValAcc 0.6840
EarlyStopping counter: 12 out of 20
Epoch 00195 | Loss 1.4862 | TrainAcc 0.9643 | ValAcc 0.6760
Epoch 00196 | Loss 1.5142 | TrainAcc 0.9714 | ValAcc 0.6760
Epoch 00197 | Loss 1.5145 | TrainAcc 0.9714 | ValAcc 0.6760
Epoch 00198 | Loss 1.5174 | TrainAcc 0.9571 | ValAcc 0.6760
Epoch 00199 | Loss 1.5235 | TrainAcc 0.9714 | ValAcc 0.6760
Test Accuracy 0.7740
Accuracy across 1 runs: 0.774 ± 0.0
```

That's all! Basically you only need two commands to train a graph neural network.

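
The `EarlyStopping counter: 12 out of 20` line in the log comes from the `early_stop` option (`patience: 20`). The following is only a minimal sketch of patience-based early stopping, not DGL-Go's actual `EarlyStopping` class: the class below is a simplified stand-in that assumes "improvement" means a strictly higher validation accuracy and omits the checkpoint saving controlled by `checkpoint_path`.

```python
class EarlyStopping:
    """Simplified stand-in: stop once the score has not improved for `patience` checks."""

    def __init__(self, patience=20):
        self.patience = patience
        self.best_score = None
        self.counter = 0

    def step(self, score):
        """Record one validation score; return True when training should stop."""
        if self.best_score is None or score > self.best_score:
            # New best score: remember it and reset the counter.
            self.best_score = score
            self.counter = 0
            return False
        # No improvement: count this check against the patience budget.
        self.counter += 1
        print(f"EarlyStopping counter: {self.counter} out of {self.patience}")
        return self.counter >= self.patience


# Made-up validation accuracies, one per evaluation.
stopper = EarlyStopping(patience=3)
val_accs = [0.60, 0.68, 0.68, 0.67, 0.66, 0.70]
stopped_at = None
for check, acc in enumerate(val_accs):
    if stopper.step(acc):
        stopped_at = check
        break
print("stopped at check", stopped_at, "with best score", stopper.best_score)
```

With `patience=3`, the loop stops at the fifth evaluation because the three evaluations after the best score (0.68) bring no improvement; the final 0.70 is never seen. A larger `patience` trades longer training for a lower risk of stopping too early.
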
### Step 3: `dgl export` for more advanced customization

That's not everything yet. You may want to open the hood and invoke deeper
customization. DGL-Go can export a **self-contained, reproducible** Python
script for you to do anything you like.

Try `dgl export --cfg cora_sage.yaml --output script.py`,
and you'll get the script used to train the model. Here's the code snippet:

```python
...

class GraphSAGE(nn.Module):
    def __init__(self,
                 data_info: dict,
                 embed_size: int = -1,
                 hidden_size: int = 16,
                 num_layers: int = 1,
                 activation: str = "relu",
                 dropout: float = 0.5,
                 aggregator_type: str = "gcn"):
        """GraphSAGE model

        Parameters
        ----------
        data_info : dict
            The information about the input dataset.
        embed_size : int
            The dimension of created embedding table. -1 means using original node embedding
        hidden_size : int
            Hidden size.
        num_layers : int
            Number of hidden layers.
        dropout : float
            Dropout rate.
        activation : str
            Activation function name under torch.nn.functional
        aggregator_type : str
            Aggregator type to use (``mean``, ``gcn``, ``pool``, ``lstm``).
        """
        super(GraphSAGE, self).__init__()
        self.data_info = data_info
        self.embed_size = embed_size
        if embed_size > 0:
            self.embed = nn.Embedding(data_info["num_nodes"], embed_size)
            in_size = embed_size
        else:
            in_size = data_info["in_size"]
        self.layers = nn.ModuleList()
        self.dropout = nn.Dropout(dropout)
        self.activation = getattr(nn.functional, activation)

        for i in range(num_layers):
            in_hidden = hidden_size if i > 0 else in_size
            out_hidden = hidden_size if i < num_layers - 1 else data_info["out_size"]
            self.layers.append(dgl.nn.SAGEConv(in_hidden, out_hidden, aggregator_type))

    def forward(self, graph, node_feat, edge_feat=None):
        if self.embed_size > 0:
            dgl_warning(
                "The embedding for node feature is used, and input node_feat is ignored, due to the provided embed_size.",
                norepeat=True)
            h = self.embed.weight
        else:
            h = node_feat
        h = self.dropout(h)
        for l, layer in enumerate(self.layers):
            h = layer(graph, h, edge_feat)
            if l != len(self.layers) - 1:
                h = self.activation(h)
                h = self.dropout(h)
        return h

...

def train(cfg, pipeline_cfg, device, data, model, optimizer, loss_fcn):
    g = data[0]  # Only train on the first graph
    g = dgl.remove_self_loop(g)
    g = dgl.add_self_loop(g)
    g = g.to(device)

    node_feat = g.ndata.get('feat', None)
    edge_feat = g.edata.get('feat', None)
    label = g.ndata['label']
    train_mask, val_mask, test_mask = g.ndata['train_mask'].bool(
    ), g.ndata['val_mask'].bool(), g.ndata['test_mask'].bool()

    stopper = EarlyStopping(**pipeline_cfg['early_stop'])

    val_acc = 0.
    for epoch in range(pipeline_cfg['num_epochs']):
        model.train()
        logits = model(g, node_feat, edge_feat)
        loss = loss_fcn(logits[train_mask], label[train_mask])

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        train_acc = accuracy(logits[train_mask], label[train_mask])
        if epoch != 0 and epoch % pipeline_cfg['eval_period'] == 0:
            val_acc = accuracy(logits[val_mask], label[val_mask])

            if stopper.step(val_acc, model):
                break

        print("Epoch {:05d} | Loss {:.4f} | TrainAcc {:.4f} | ValAcc {:.4f}".
              format(epoch, loss.item(), train_acc, val_acc))

    stopper.load_checkpoint(model)

    model.eval()
    with torch.no_grad():
        logits = model(g, node_feat, edge_feat)
        test_acc = accuracy(logits[test_mask], label[test_mask])
    return test_acc


def main():
    cfg = {
        'version': '0.0.1',
        'device': 'cuda:0',
        'model': {
            'embed_size': -1,
            'hidden_size': 16,
            'num_layers': 2,
            'activation': 'relu',
            'dropout': 0.5,
            'aggregator_type': 'gcn'},
        'general_pipeline': {
            'early_stop': {
                'patience': 100,
                'checkpoint_path': 'checkpoint.pth'},
            'num_epochs': 200,
            'eval_period': 5,
            'optimizer': {
                'lr': 0.01,
                'weight_decay': 0.0005},
            'loss': 'CrossEntropyLoss',
            'save_path': 'model.pth',
            'num_runs': 10}}
    device = cfg['device']
    pipeline_cfg = cfg['general_pipeline']
    # load data
    data = AsNodePredDataset(CoraGraphDataset())
    # create model
    model_cfg = cfg["model"]
    cfg["model"]["data_info"] = {
        "in_size": model_cfg['embed_size'] if model_cfg['embed_size'] > 0 else data[0].ndata['feat'].shape[1],
        "out_size": data.num_classes,
        "num_nodes": data[0].num_nodes()
    }
    model = GraphSAGE(**cfg["model"])
    model = model.to(device)
    loss = torch.nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(
        model.parameters(),
        **pipeline_cfg["optimizer"])
    # train
    test_acc = train(cfg, pipeline_cfg, device, data, model, optimizer, loss)
    torch.save(model, pipeline_cfg["save_path"])
    return test_acc

...
```

You can see that everything is collected into one Python script, which includes the
entire `GraphSAGE` model definition, data processing and training loop. Simply running
`python script.py` will give you the *exact same* result as you've seen with `dgl train`.
At this point, you can change any part as you wish, such as plugging in your own GNN module,
changing the loss function and so on.

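
Before changing `num_layers` or `hidden_size`, it can help to see which feature sizes each layer ends up with. The sketch below copies the `in_hidden` / `out_hidden` logic from the `GraphSAGE` constructor above into a plain function; the function name `sage_layer_sizes` is made up for illustration, and the sizes 1433 and 7 are Cora's feature and class counts used as example inputs.

```python
def sage_layer_sizes(in_size, hidden_size, out_size, num_layers):
    """Return the (input size, output size) pair of every layer, first to last."""
    sizes = []
    for i in range(num_layers):
        # The first layer reads the raw features; later layers read hidden features.
        in_hidden = hidden_size if i > 0 else in_size
        # The last layer outputs one value per class; earlier layers output hidden features.
        out_hidden = hidden_size if i < num_layers - 1 else out_size
        sizes.append((in_hidden, out_hidden))
    return sizes


one_layer = sage_layer_sizes(1433, 16, 7, num_layers=1)
two_layers = sage_layer_sizes(1433, 16, 7, num_layers=2)
print(one_layer)   # [(1433, 7)]
print(two_layers)  # [(1433, 16), (16, 7)]
```

Note that with `num_layers: 1` the single layer maps the input features directly to the output classes, so `hidden_size` has no effect in that case.
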
## Use DGL-Go on your own dataset

DGL-Go supports training a model on a custom dataset via DGL's `CSVDataset`.

### Step 1: Prepare your CSV and metadata files

Follow the tutorial at [Loading data from CSV
files](https://docs.dgl.ai/en/latest/guide/data-loadcsv.html#guide-data-pipeline-loadcsv)
to prepare your dataset. Generally, the dataset folder should include:

* At least one CSV file for node data.
* At least one CSV file for edge data.
* A metadata file called `meta.yaml`.

### Step 2: `dgl configure` with `--data csv` option
Run

```
dgl configure nodepred --data csv --model sage --cfg csv_sage.yaml
```

to generate the configuration file. You will see that the file includes a section like
the following:

```yaml
...
data:
  name: csv
  split_ratio:                # Ratio to generate split masks, for example set to [0.8, 0.1, 0.1] for 80% train/10% val/10% test. Leave blank to use builtin split in original dataset
  data_path: ./               # metadata.yaml, nodes.csv, edges.csv should in this folder
...
```

Fill in the `data_path` option with the path to your dataset folder.

If your dataset does not have any native split for training, validation and test sets,
you can set the split ratio in the `split_ratio` option, which will
generate a random split for you.

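
To make the `split_ratio` option more concrete, here is a small sketch of how a ratio such as `[0.8, 0.1, 0.1]` can be turned into random train/validation/test masks. It is an illustration in plain Python only: the function `random_split_masks` and its `seed` argument are hypothetical, and DGL-Go's own splitting code may differ in its details.

```python
import random


def random_split_masks(num_nodes, split_ratio, seed=0):
    """Return boolean train/val/test masks whose sizes follow split_ratio."""
    train_ratio, val_ratio, _ = split_ratio
    indices = list(range(num_nodes))
    random.Random(seed).shuffle(indices)  # random but reproducible node order
    num_train = int(num_nodes * train_ratio)
    num_val = int(num_nodes * val_ratio)
    train_idx = set(indices[:num_train])
    val_idx = set(indices[num_train:num_train + num_val])
    train_mask = [i in train_idx for i in range(num_nodes)]
    val_mask = [i in val_idx for i in range(num_nodes)]
    # Every node that is in neither the training nor the validation set is a test node.
    test_mask = [not (t or v) for t, v in zip(train_mask, val_mask)]
    return train_mask, val_mask, test_mask


train_mask, val_mask, test_mask = random_split_masks(1000, [0.8, 0.1, 0.1])
print(sum(train_mask), sum(val_mask), sum(test_mask))  # 800 100 100
```

Each node ends up in exactly one of the three sets, which is the property the training loop relies on when it indexes logits and labels with the masks.
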
### Step 3: `train` the model / `export` the script
Then you can do the same as in the tutorial above: either train the model with
`dgl train --cfg csv_sage.yaml` or use `dgl export --cfg csv_sage.yaml
--output script.py` to get the training script.

## FAQ

**Q: What are the available options for each command?**
A: You can use `--help` for all commands. For example, use `dgl --help` for the general
help message; use `dgl configure --help` for the configuration options; use
`dgl configure nodepred --help` for the configuration options of the node prediction pipeline.

**Q: What exactly is nodepred/linkpred? How many are there?**
A: They are called DGL-Go pipelines. A pipeline represents the training methodology for
a certain task. Therefore, its naming convention is *<task_name>[-<method_name>]*. For example,
`nodepred` trains the selected GNN model for node classification using full-graph training,
while `nodepred-ns` trains the model for node classification but using neighbor sampling.
The first release includes three training pipelines (`nodepred`, `nodepred-ns` and `linkpred`),
and you can expect more in the future. Use `dgl configure --help` to see
all the available pipelines.

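
The naming convention *<task_name>[-<method_name>]* can be read mechanically. The helper below is purely illustrative (the name `parse_pipeline_name` is not part of DGL-Go): it splits a pipeline name at the first hyphen and returns `None` for the method when no suffix is present.

```python
def parse_pipeline_name(name):
    """Split '<task_name>[-<method_name>]' into a (task, method) pair."""
    task, sep, method = name.partition("-")
    # No hyphen means the pipeline name carries no explicit method suffix.
    return task, (method if sep else None)


for name in ["nodepred", "nodepred-ns", "linkpred"]:
    print(name, "->", parse_pipeline_name(name))
```

For `nodepred-ns` this yields the task `nodepred` and the method `ns` (neighbor sampling), while `nodepred` has no method suffix and uses full-graph training as described above.
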
**Q: How do I add my model to the official model recipe zoo?**
A: Currently not supported. We will enable this feature soon. Please stay tuned!

**Q: After training a model on some dataset, how can I apply it to another one?**
A: The `save_path` option in the generated configuration file allows you to specify where
to save the model after training. You can then modify the script generated by `dgl export`
to load the model checkpoint and evaluate it on another dataset.
BIN
dglgo/dglgo.png
Normal file
Binary file not shown (size after change: 148 KiB).
20
dglgo/dglgo/cli/cli.py
Normal file
@@ -0,0 +1,20 @@
import typer
from ..pipeline import *
from ..model import *
from .config_cli import config_app
from .train_cli import train
from .export_cli import export
from .recipe_cli import recipe_app

no_args_is_help = False
app = typer.Typer(no_args_is_help=True, add_completion=False)
app.add_typer(config_app, name="configure", no_args_is_help=no_args_is_help)
app.add_typer(recipe_app, name="recipe", no_args_is_help=True)
app.command(help="Launch training", no_args_is_help=no_args_is_help)(train)
app.command(help="Export a runnable python script", no_args_is_help=no_args_is_help)(export)

def main():
    app()

if __name__ == "__main__":
    app()
@@ -6,9 +6,9 @@ import typing
 import yaml
 from pathlib import Path

-config_app = typer.Typer(help="Generate the config files")
+config_app = typer.Typer(help="Generate a configuration file")
 for key, pipeline in PipelineFactory.registry.items():
     config_app.command(key, help=pipeline.get_description())(pipeline.get_cfg_func())

 if __name__ == "__main__":
-    config_app()
+    config_app()
@@ -10,8 +10,8 @@ import isort
 import autopep8

 def export(
-    cfg: str = typer.Option("cfg.yml", help="config yaml file name"),
-    output: str = typer.Option("output.py", help="output python file name")
+    cfg: str = typer.Option("cfg.yaml", help="config yaml file name"),
+    output: str = typer.Option("script.py", help="output python file name")
 ):
     user_cfg = yaml.safe_load(Path(cfg).open("r"))
     pipeline_name = user_cfg["pipeline_name"]
54
dglgo/dglgo/cli/recipe_cli.py
Normal file
@@ -0,0 +1,54 @@
from pathlib import Path
from typing import Optional
import typer
import os
import shutil
import yaml

def list_recipes():
    file_current_dir = Path(__file__).resolve().parent
    recipe_dir = file_current_dir.parent.parent / "recipes"
    file_list = list(recipe_dir.glob("*.yaml"))
    header = "| {:<30} | {:<18} | {:<20} |".format("Filename", "Pipeline", "Dataset")
    typer.echo("="*len(header))
    typer.echo(header)
    typer.echo("="*len(header))
    for file in file_list:
        cfg = yaml.safe_load(Path(file).open("r"))
        typer.echo("| {:<30} | {:<18} | {:<20} |".format(file.name, cfg["pipeline_name"], cfg["data"]["name"]))
    typer.echo("="*len(header))

def copy_recipes(dir: str = typer.Option("dglgo_example_recipes", help="directory name for recipes")):
    file_current_dir = Path(__file__).resolve().parent
    recipe_dir = file_current_dir.parent.parent / "recipes"
    current_dir = Path(os.getcwd())
    new_dir = current_dir / dir
    new_dir.mkdir(parents=True, exist_ok=True)
    for file in recipe_dir.glob("*.yaml"):
        shutil.copy(file, new_dir)
    print("Example recipes are copied to {}".format(new_dir.absolute()))

def get_recipe(recipe_name: Optional[str] = typer.Argument(None, help="The recipe filename to get, e.q. nodepred_citeseer_gcn.yaml")):
    if recipe_name is None:
        typer.echo("Usage: dgl recipe get [RECIPE_NAME] \n")
        typer.echo(" Copy the recipe to current directory \n")
        typer.echo(" Arguments:")
        typer.echo(" [RECIPE_NAME] The recipe filename to get, e.q. nodepred_citeseer_gcn.yaml\n")
        typer.echo("Here are all avaliable recipe filename")
        list_recipes()
    else:
        file_current_dir = Path(__file__).resolve().parent
        recipe_dir = file_current_dir.parent.parent / "recipes"
        current_dir = Path(os.getcwd())
        recipe_path = recipe_dir / recipe_name
        shutil.copy(recipe_path, current_dir)
        print("Recipe {} is copied to {}".format(recipe_path.absolute(), current_dir.absolute()))


recipe_app = typer.Typer(help="Get example recipes")
recipe_app.command(name="list", help="List all available example recipes")(list_recipes)
recipe_app.command(name="copy", help="Copy all available example recipes to current directory")(copy_recipes)
recipe_app.command(name="get", help="Copy the recipe to current directory")(get_recipe)

if __name__ == "__main__":
    recipe_app()
@@ -5,12 +5,11 @@ from enum import Enum
 import typing
 import yaml
 from pathlib import Path

 import isort
 import autopep8

 def train(
-    cfg: str = typer.Option("cfg.yml", help="config yaml file name"),
+    cfg: str = typer.Option("cfg.yaml", help="config yaml file name"),
 ):
     user_cfg = yaml.safe_load(Path(cfg).open("r"))
     pipeline_name = user_cfg["pipeline_name"]
@@ -18,8 +17,8 @@ def train(

     f_code = autopep8.fix_code(output_file_content, options={'aggressive': 1})
     f_code = isort.code(f_code)
-    exec(f_code, {'__name__': '__main__'})

+    code = compile(f_code, 'dglgo_tmp.py', 'exec')
+    exec(code, {'__name__': '__main__'})

 if __name__ == "__main__":
     train_app = typer.Typer()
@@ -49,7 +49,7 @@ class GCN(nn.Module):
             in_hidden = hidden_size if i > 0 else in_size
             out_hidden = hidden_size if i < num_layers - 1 else data_info["out_size"]

-            self.layers.append(dgl.nn.GraphConv(in_hidden, out_hidden, norm=norm))
+            self.layers.append(dgl.nn.GraphConv(in_hidden, out_hidden, norm=norm, allow_zero_in_degree=True))

         self.dropout = nn.Dropout(p=dropout)
         self.act = getattr(torch, activation)
@@ -12,6 +12,8 @@ class GIN(nn.Module):
                  aggregator_type='sum'):
         """Graph Isomophism Networks

+        Edge feature is ignored in this model.

         Parameters
         ----------
         data_info : dict
@@ -55,7 +55,7 @@ class GraphSAGE(nn.Module):
             h = node_feat
         h = self.dropout(h)
         for l, layer in enumerate(self.layers):
-            h = layer(graph, h)
+            h = layer(graph, h, edge_feat)
             if l != len(self.layers) - 1:
                 h = self.activation(h)
                 h = self.dropout(h)
@@ -64,7 +64,7 @@ class GraphSAGE(nn.Module):
     def forward_block(self, blocks, node_feat, edge_feat = None):
         h = node_feat
         for l, (layer, block) in enumerate(zip(self.layers, blocks)):
-            h = layer(block, h)
+            h = layer(block, h, edge_feat)
             if l != len(self.layers) - 1:
                 h = self.activation(h)
                 h = self.dropout(h)
@@ -14,6 +14,8 @@ class SGC(nn.Module):
                  bias=True, k=2):
         """ Simplifying Graph Convolutional Networks

+        Edge feature is ignored in this model.

         Parameters
         ----------
         data_info : dict
@@ -20,6 +20,7 @@ class LinkpredPipelineCfg(BaseModel):
     eval_period: int = 5
     optimizer: dict = {"name": "Adam", "lr": 0.005}
     loss: str = "BCELoss"
+    save_path: str = "model.pth"
     num_runs: int = 1


@@ -29,6 +30,7 @@ pipeline_comments = {
     "train_batch_size": "Edge batch size when training",
     "num_epochs": "Number of training epochs",
     "eval_period": "Interval epochs between evaluations",
+    "save_path": "Path to save the model",
     "num_runs": "Number of experiments to run",
 }

@@ -67,20 +69,18 @@ class LinkpredPipeline(PipelineBase):
         def config(
             data: DataFactory.filter("linkpred").get_dataset_enum() = typer.Option(..., help="input data name"),
             cfg: str = typer.Option(
-                "cfg.yml", help="output configuration path"),
+                "cfg.yaml", help="output configuration path"),
             node_model: NodeModelFactory.get_model_enum() = typer.Option(...,
                 help="Model name"),
             edge_model: EdgeModelFactory.get_model_enum() = typer.Option(...,
                 help="Model name"),
             neg_sampler: NegativeSamplerFactory.get_model_enum() = typer.Option(
-                "uniform", help="Negative sampler name"),
-            device: DeviceEnum = typer.Option(
-                "cpu", help="Device, cpu or cuda"),
+                "persource", help="Negative sampler name"),
         ):
             self.__class__.setup_user_cfg_cls()
             generated_cfg = {
                 "pipeline_name": "linkpred",
-                "device": device.value,
+                "device": "cpu",
                 "data": {"name": data.name},
                 "neg_sampler": {"name": neg_sampler.value},
                 "node_model": {"name": node_model.value},
@@ -89,6 +89,7 @@ class LinkpredPipeline(PipelineBase):
             output_cfg = self.user_cfg_cls(**generated_cfg).dict()
             output_cfg = deep_convert_dict(output_cfg)
             comment_dict = {
+                "device": "Torch device name, e.q. cpu or cuda or cuda:0",
                 "general_pipeline": pipeline_comments,
                 "node_model": NodeModelFactory.get_constructor_doc_dict(node_model.value),
                 "edge_model": EdgeModelFactory.get_constructor_doc_dict(edge_model.value),
@@ -99,6 +100,9 @@ class LinkpredPipeline(PipelineBase):
                 },
             }
             comment_dict = merge_comment(output_cfg, comment_dict)
+
+            if cfg is None:
+                cfg = "_".join(["linkpred", data.value, node_model.value, edge_model.value]) + ".yaml"
             yaml = ruamel.yaml.YAML()
             yaml.dump(comment_dict, Path(cfg).open("w"))
             print("Configuration file is generated at {}".format(Path(cfg).absolute()))
@@ -112,6 +112,7 @@ def main():
     loss = torch.nn.{{ loss }}()
     optimizer = torch.optim.Adam(params, **pipeline_cfg["optimizer"])
     test_hits = train(cfg, pipeline_cfg, device, dataset, model, optimizer, loss)
+    torch.save(model, pipeline_cfg["save_path"])
     return test_hits

 if __name__ == '__main__':
@@ -18,6 +18,7 @@ pipeline_comments = {
         "patience": "Steps before early stop",
         "checkpoint_path": "Early stop checkpoint model file path"
     },
+    "save_path": "Path to save the model",
     "num_runs": "Number of experiments to run",
 }

@@ -27,6 +28,7 @@ class NodepredPipelineCfg(BaseModel):
     eval_period: int = 5
     optimizer: dict = {"name": "Adam", "lr": 0.01, "weight_decay": 5e-4}
     loss: str = "CrossEntropyLoss"
+    save_path: str = "model.pth"
     num_runs: int = 1

 @PipelineFactory.register("nodepred")
@@ -54,15 +56,14 @@ class NodepredPipeline(PipelineBase):
     def get_cfg_func(self):
         def config(
             data: DataFactory.filter("nodepred").get_dataset_enum() = typer.Option(..., help="input data name"),
-            cfg: str = typer.Option(
-                "cfg.yml", help="output configuration path"),
+            cfg: Optional[str] = typer.Option(
+                None, help="output configuration path"),
             model: NodeModelFactory.get_model_enum() = typer.Option(..., help="Model name"),
-            device: DeviceEnum = typer.Option("cpu", help="Device, cpu or cuda"),
         ):
             self.__class__.setup_user_cfg_cls()
             generated_cfg = {
                 "pipeline_name": self.pipeline_name,
-                "device": device,
+                "device": "cpu",
                 "data": {"name": data.name},
                 "model": {"name": model.value},
                 "general_pipeline": {}
@@ -70,6 +71,7 @@ class NodepredPipeline(PipelineBase):
             output_cfg = self.user_cfg_cls(**generated_cfg).dict()
             output_cfg = deep_convert_dict(output_cfg)
             comment_dict = {
+                "device": "Torch device name, e.q. cpu or cuda or cuda:0",
                 "data": {
                     "split_ratio": 'Ratio to generate split masks, for example set to [0.8, 0.1, 0.1] for 80% train/10% val/10% test. Leave blank to use builtin split in original dataset'
                 },
@@ -79,6 +81,8 @@ class NodepredPipeline(PipelineBase):
             comment_dict = merge_comment(output_cfg, comment_dict)

             yaml = ruamel.yaml.YAML()
+            if cfg is None:
+                cfg = "_".join(["nodepred", data.value, model.value]) + ".yaml"
             yaml.dump(comment_dict, Path(cfg).open("w"))
             print("Configuration file is generated at {}".format(Path(cfg).absolute()))

@@ -88,7 +92,7 @@ class NodepredPipeline(PipelineBase):
     def gen_script(cls, user_cfg_dict):
         # Check validation
         cls.setup_user_cfg_cls()
-        user_cfg = cls.user_cfg_cls(**user_cfg_dict)
+        user_cfg = cls.user_cfg_cls(**user_cfg_dict)
         file_current_dir = Path(__file__).resolve().parent
         with open(file_current_dir / "nodepred.jinja-py", "r") as f:
             template = Template(f.read())
@@ -102,6 +106,8 @@ class NodepredPipeline(PipelineBase):
         render_cfg.update(DataFactory.get_generated_code_dict(user_cfg_dict["data"]["name"], '**cfg["data"]'))

         generated_user_cfg = copy.deepcopy(user_cfg_dict)
+        if "split_ratio" in generated_user_cfg["data"]:
+            generated_user_cfg["data"].pop("split_ratio")
         if len(generated_user_cfg["data"]) == 1:
             generated_user_cfg.pop("data")
         else:
@@ -116,9 +122,6 @@ class NodepredPipeline(PipelineBase):

         if user_cfg_dict["data"].get("split_ratio", None) is not None:
             render_cfg["data_initialize_code"] = "{}, split_ratio={}".format(render_cfg["data_initialize_code"], user_cfg_dict["data"]["split_ratio"])
-            if "split_ratio" in generated_user_cfg["data"]:
-                generated_user_cfg["data"].pop("split_ratio")

         render_cfg["user_cfg_str"] = f"cfg = {str(generated_user_cfg)}"
         render_cfg["user_cfg"] = user_cfg_dict
         return template.render(**render_cfg)
@@ -112,6 +112,7 @@ def main():
     optimizer = torch.optim.{{ user_cfg.general_pipeline.optimizer.name }}(model.parameters(), **pipeline_cfg["optimizer"])
     # train
     test_acc = train(cfg, pipeline_cfg, device, data, model, optimizer, loss)
+    torch.save(model, pipeline_cfg["save_path"])
     return test_acc

 if __name__ == '__main__':
@@ -36,6 +36,14 @@ pipeline_comments = {
|
||||
"patience": "Steps before early stop",
|
||||
"checkpoint_path": "Early stop checkpoint model file path"
|
||||
},
|
||||
"sampler": {
|
||||
"fan_out": "List of neighbors to sample per edge type for each GNN layer, with the i-th element being the fanout for the i-th GNN layer. Length should be the same as num_layers in model setting",
|
||||
"batch_size": "Batch size of seed nodes in training stage",
|
||||
"num_workers": "Number of workers to accelerate the graph data processing step",
|
||||
"eval_batch_size": "Batch size of seed nodes in training stage in evaluation stage",
|
||||
"eval_num_workers": "Number of workers to accelerate the graph data processing step in evaluation stage"
|
||||
},
|
||||
"save_path": "Path to save the model",
|
||||
"num_runs": "Number of experiments to run",
|
||||
}
|
||||
|
||||
@@ -47,6 +55,7 @@ class NodepredNSPipelineCfg(BaseModel):
|
||||
optimizer: dict = {"name": "Adam", "lr": 0.005, "weight_decay": 0.0}
|
||||
loss: str = "CrossEntropyLoss"
|
||||
num_runs: int = 1
|
||||
save_path: str = "model.pth"
|
||||
|
||||
@PipelineFactory.register("nodepred-ns")
|
||||
class NodepredNsPipeline(PipelineBase):
|
||||
@@ -60,7 +69,7 @@ class NodepredNsPipeline(PipelineBase):
|
||||
class NodePredUserConfig(UserConfig):
|
||||
eval_device: DeviceEnum = Field("cpu")
|
||||
data: DataFactory.filter("nodepred-ns").get_pydantic_config() = Field(..., discriminator="name")
|
||||
model : NodeModelFactory.get_pydantic_model_config() = Field(..., discriminator="name")
|
||||
model : NodeModelFactory.filter(lambda cls: hasattr(cls, "forward_block")).get_pydantic_model_config() = Field(..., discriminator="name")
|
||||
general_pipeline: NodepredNSPipelineCfg
|
||||
|
||||
cls.user_cfg_cls = NodePredUserConfig
|
||||
@@ -72,16 +81,14 @@ class NodepredNsPipeline(PipelineBase):
|
||||
def get_cfg_func(self):
|
||||
def config(
|
||||
data: DataFactory.filter("nodepred-ns").get_dataset_enum() = typer.Option(..., help="input data name"),
|
||||
cfg: str = typer.Option(
|
||||
"cfg.yml", help="output configuration path"),
|
||||
model: NodeModelFactory.get_model_enum() = typer.Option(..., help="Model name"),
|
||||
device: DeviceEnum = typer.Option(
|
||||
"cpu", help="Device, cpu or cuda"),
|
||||
cfg: Optional[str] = typer.Option(
|
||||
None, help="output configuration path"),
|
||||
model: NodeModelFactory.filter(lambda cls: hasattr(cls, "forward_block")).get_model_enum() = typer.Option(..., help="Model name"),
|
||||
):
|
||||
self.__class__.setup_user_cfg_cls()
|
||||
generated_cfg = {
|
||||
generated_cfg = {
|
||||
"pipeline_name": "nodepred-ns",
|
||||
"device": device,
|
||||
"device": "cpu",
|
||||
"data": {"name": data.name},
|
||||
"model": {"name": model.value},
|
||||
"general_pipeline": {"sampler":{"name": "neighbor"}}
|
||||
@@ -89,14 +96,21 @@ class NodepredNsPipeline(PipelineBase):
|
||||
output_cfg = self.user_cfg_cls(**generated_cfg).dict()
|
||||
output_cfg = deep_convert_dict(output_cfg)
|
||||
comment_dict = {
|
||||
"device": "Torch device name, e.g. cpu or cuda or cuda:0",
|
||||
"data": {
|
||||
"split_ratio": 'Ratio to generate split masks, for example set to [0.8, 0.1, 0.1] for 80% train/10% val/10% test. Leave blank to use builtin split in original dataset'
|
||||
},
|
||||
"general_pipeline": pipeline_comments,
|
||||
"model": NodeModelFactory.get_constructor_doc_dict(model.value)
|
||||
"model": NodeModelFactory.get_constructor_doc_dict(model.value),
|
||||
}
|
||||
comment_dict = merge_comment(output_cfg, comment_dict)
|
||||
|
||||
# truncate length fan_out to be the same as num_layers in model
|
||||
if "num_layers" in comment_dict["model"]:
|
||||
comment_dict['general_pipeline']["sampler"]["fan_out"] = [5,10,15,15,15][:int(comment_dict['model']["num_layers"])]
|
||||
|
||||
if cfg is None:
|
||||
cfg = "_".join(["nodepred-ns", data.value, model.value]) + ".yaml"
|
||||
yaml = ruamel.yaml.YAML()
|
||||
yaml.dump(comment_dict, Path(cfg).open("w"))
|
||||
print("Configuration file is generated at {}".format(
|
||||
@@ -112,6 +126,10 @@ class NodepredNsPipeline(PipelineBase):
|
||||
template = Template(f.read())
|
||||
pipeline_cfg = NodepredNSPipelineCfg(
|
||||
**user_cfg_dict["general_pipeline"])
|
||||
|
||||
if "num_layers" in user_cfg_dict["model"]:
|
||||
assert user_cfg_dict["model"]["num_layers"] == len(user_cfg_dict["general_pipeline"]["sampler"]["fan_out"]), \
|
||||
"The num_layers in model config should be the same as the length of fan_out in sampler. For example, if num_layers is 1, the fan_out cannot be [5, 10]"
|
||||
|
||||
render_cfg = copy.deepcopy(user_cfg_dict)
|
||||
model_code = NodeModelFactory.get_source_code(
|
||||
@@ -123,6 +141,8 @@ class NodepredNsPipeline(PipelineBase):
|
||||
user_cfg_dict["data"]["name"], '**cfg["data"]'))
|
||||
generated_user_cfg = copy.deepcopy(user_cfg_dict)
|
||||
|
||||
if "split_ratio" in generated_user_cfg["data"]:
|
||||
generated_user_cfg["data"].pop("split_ratio")
|
||||
if len(generated_user_cfg["data"]) == 1:
|
||||
generated_user_cfg.pop("data")
|
||||
else:
|
||||
@@ -135,8 +155,6 @@ class NodepredNsPipeline(PipelineBase):
|
||||
|
||||
if user_cfg_dict["data"].get("split_ratio", None) is not None:
|
||||
render_cfg["data_initialize_code"] = "{}, split_ratio={}".format(render_cfg["data_initialize_code"], user_cfg_dict["data"]["split_ratio"])
|
||||
if "split_ratio" in generated_user_cfg["data"]:
|
||||
generated_user_cfg["data"].pop("split_ratio")
|
||||
|
||||
render_cfg["user_cfg_str"] = f"cfg = {str(generated_user_cfg)}"
|
||||
render_cfg["user_cfg"] = user_cfg_dict
|
||||
@@ -145,4 +163,4 @@ class NodepredNsPipeline(PipelineBase):
|
||||
|
||||
@staticmethod
|
||||
def get_description() -> str:
|
||||
return "Node classification sampling pipeline"
|
||||
return "Node classification neighbor sampling pipeline"
|
||||
@@ -157,8 +157,8 @@ def main():
|
||||
model = model.to(device)
|
||||
loss = torch.nn.{{ user_cfg.general_pipeline.loss }}()
|
||||
optimizer = torch.optim.{{ user_cfg.general_pipeline.optimizer.name }}(model.parameters(), **pipeline_cfg["optimizer"])
|
||||
# train
|
||||
test_acc = train(cfg, pipeline_cfg, device, data, model, optimizer, loss)
|
||||
torch.save(model, pipeline_cfg["save_path"])
|
||||
return test_acc
|
||||
|
||||
if __name__ == '__main__':
|
||||
@@ -334,6 +334,14 @@ class ModelFactory:
|
||||
type_annotation_dict[k] = param.annotation
|
||||
return type_annotation_dict
|
||||
|
||||
def filter(self, filter_func):
|
||||
new_fac = ModelFactory()
|
||||
for name in self.registry:
|
||||
if filter_func(self.registry[name]):
|
||||
new_fac.registry[name] = self.registry[name]
|
||||
new_fac.code_registry[name] = self.code_registry[name]
|
||||
return new_fac
|
||||
|
||||
|
||||
class SamplerFactory:
|
||||
""" The factory class for creating executors"""
|
||||
@@ -411,7 +419,7 @@ class SamplerFactory:
|
||||
|
||||
|
||||
NegativeSamplerFactory = SamplerFactory()
|
||||
NegativeSamplerFactory.register("uniform")(GlobalUniform)
|
||||
NegativeSamplerFactory.register("global")(GlobalUniform)
|
||||
NegativeSamplerFactory.register("persource")(PerSourceUniform)
|
||||
|
||||
NodeModelFactory = ModelFactory()
|
||||
0
dglgo/recipes/__init__.py
Normal file
@@ -31,4 +31,5 @@ general_pipeline:
|
||||
name: Adam
|
||||
lr: 0.005
|
||||
loss: BCELoss
|
||||
save_path: "model.pth"
|
||||
num_runs: 1 # Number of experiments to run
|
||||
@@ -31,4 +31,5 @@ general_pipeline:
|
||||
name: Adam
|
||||
lr: 0.005
|
||||
loss: BCELoss
|
||||
save_path: "model.pth"
|
||||
num_runs: 1 # Number of experiments to run
|
||||
@@ -31,4 +31,5 @@ general_pipeline:
|
||||
name: Adam
|
||||
lr: 0.005
|
||||
loss: BCELoss
|
||||
save_path: "model.pth"
|
||||
num_runs: 1 # Number of experiments to run
|
||||
@@ -31,4 +31,5 @@ general_pipeline:
|
||||
lr: 0.005
|
||||
weight_decay: 0.0
|
||||
loss: CrossEntropyLoss
|
||||
save_path: "model.pth"
|
||||
num_runs: 5
|
||||
@@ -35,4 +35,5 @@ general_pipeline:
|
||||
lr: 0.005
|
||||
weight_decay: 0.0
|
||||
loss: CrossEntropyLoss
|
||||
save_path: "model.pth"
|
||||
num_runs: 5 # Number of experiments to run
|
||||
@@ -28,4 +28,5 @@ general_pipeline:
|
||||
lr: 0.005
|
||||
weight_decay: 0.0005
|
||||
loss: CrossEntropyLoss
|
||||
save_path: "model.pth"
|
||||
num_runs: 10 # Number of experiments to run
|
||||
@@ -24,4 +24,5 @@ general_pipeline:
|
||||
lr: 0.01
|
||||
weight_decay: 0.0005
|
||||
loss: CrossEntropyLoss
|
||||
save_path: "model.pth"
|
||||
num_runs: 10 # Number of experiments to run
|
||||
@@ -23,4 +23,5 @@ general_pipeline:
|
||||
lr: 0.01
|
||||
weight_decay: 0.0005
|
||||
loss: CrossEntropyLoss
|
||||
save_path: "model.pth"
|
||||
num_runs: 10 # Number of experiments to run
|
||||
@@ -28,4 +28,5 @@ general_pipeline:
|
||||
lr: 0.005
|
||||
weight_decay: 0.0005
|
||||
loss: CrossEntropyLoss
|
||||
save_path: "model.pth"
|
||||
num_runs: 10 # Number of experiments to run
|
||||
@@ -24,4 +24,5 @@ general_pipeline:
|
||||
lr: 0.01
|
||||
weight_decay: 0.0005
|
||||
loss: CrossEntropyLoss
|
||||
save_path: "model.pth"
|
||||
num_runs: 10 # Number of experiments to run
|
||||
@@ -23,4 +23,5 @@ general_pipeline:
|
||||
lr: 0.01
|
||||
weight_decay: 0.0005
|
||||
loss: CrossEntropyLoss
|
||||
save_path: "model.pth"
|
||||
num_runs: 10 # Number of experiments to run
|
||||
@@ -28,4 +28,5 @@ general_pipeline:
|
||||
lr: 0.005
|
||||
weight_decay: 0.001
|
||||
loss: CrossEntropyLoss
|
||||
save_path: "model.pth"
|
||||
num_runs: 10 # Number of experiments to run
|
||||
@@ -24,4 +24,5 @@ general_pipeline:
|
||||
lr: 0.01
|
||||
weight_decay: 0.0005
|
||||
loss: CrossEntropyLoss
|
||||
save_path: "model.pth"
|
||||
num_runs: 10 # Number of experiments to run
|
||||
@@ -23,4 +23,5 @@ general_pipeline:
|
||||
lr: 0.01
|
||||
weight_decay: 0.0005
|
||||
loss: CrossEntropyLoss
|
||||
save_path: "model.pth"
|
||||
num_runs: 10 # Number of experiments to run
|
||||
@@ -3,7 +3,7 @@
|
||||
from setuptools import find_packages
|
||||
from distutils.core import setup
|
||||
|
||||
setup(name='dglenter',
|
||||
setup(name='dglgo',
|
||||
version='0.0.1',
|
||||
description='DGL',
|
||||
author='DGL Team',
|
||||
@@ -15,12 +15,15 @@ setup(name='dglenter',
|
||||
'autopep8>=1.6.0',
|
||||
'numpydoc>=1.1.0',
|
||||
"pydantic>=1.9.0",
|
||||
"ruamel.yaml>=0.17.20"
|
||||
"ruamel.yaml>=0.17.20",
|
||||
"PyYAML>=5.1"
|
||||
],
|
||||
license='APACHE',
|
||||
package_data={"": ["./*"]},
|
||||
include_package_data=True,
|
||||
license='APACHE',
|
||||
entry_points={
|
||||
'console_scripts': [
|
||||
"dgl-enter = dglenter.cli.cli:main"
|
||||
"dgl = dglgo.cli.cli:main"
|
||||
]
|
||||
},
|
||||
url='https://github.com/dmlc/dgl',
|
||||
26
dglgo/tests/cfg.yml
Normal file
@@ -0,0 +1,26 @@
|
||||
version: 0.0.1
|
||||
pipeline_name: nodepred
|
||||
device: cpu
|
||||
data:
|
||||
name: cora
|
||||
split_ratio: # Ratio to generate split masks, for example set to [0.8, 0.1, 0.1] for 80% train/10% val/10% test. Leave blank to use builtin split in original dataset
|
||||
model:
|
||||
name: sage
|
||||
embed_size: -1 # The dimension of created embedding table. -1 means using original node embedding
|
||||
hidden_size: 16 # Hidden size.
|
||||
num_layers: 1 # Number of hidden layers.
|
||||
activation: relu # Activation function name under torch.nn.functional
|
||||
dropout: 0.5 # Dropout rate.
|
||||
aggregator_type: gcn # Aggregator type to use (``mean``, ``gcn``, ``pool``, ``lstm``).
|
||||
general_pipeline:
|
||||
early_stop:
|
||||
patience: 20 # Steps before early stop
|
||||
checkpoint_path: checkpoint.pth # Early stop checkpoint model file path
|
||||
num_epochs: 200 # Number of training epochs
|
||||
eval_period: 5 # Interval epochs between evaluations
|
||||
optimizer:
|
||||
name: Adam
|
||||
lr: 0.01
|
||||
weight_decay: 0.0005
|
||||
loss: CrossEntropyLoss
|
||||
num_runs: 1 # Number of experiments to run
|
||||
1
dglgo/tests/run_test.sh
Normal file
@@ -0,0 +1 @@
|
||||
python -m pytest --pdb -vv --capture=tee-sys test_pipeline.py::test_recipe
|
||||
62
dglgo/tests/test_pipeline.py
Normal file
@@ -0,0 +1,62 @@
|
||||
import subprocess
|
||||
from typing import NamedTuple
|
||||
import pytest
|
||||
from pathlib import Path
|
||||
# class DatasetSpec:
|
||||
|
||||
dataset_spec = {
|
||||
"cora": {"timeout": 30}
|
||||
}
|
||||
|
||||
|
||||
|
||||
class ExperimentSpec(NamedTuple):
|
||||
pipeline: str
|
||||
dataset: str
|
||||
model: str
|
||||
timeout: int
|
||||
extra_cfg: dict = {}
|
||||
|
||||
exps = [ExperimentSpec(pipeline="nodepred", dataset="cora", model="sage", timeout=0.5)]
|
||||
|
||||
@pytest.mark.parametrize("spec", exps)
|
||||
def test_train(spec):
|
||||
cfg_path = "/tmp/test.yaml"
|
||||
run = subprocess.run(["dgl", "config", spec.pipeline, "--data", spec.dataset, "--model", spec.model, "--cfg", cfg_path], timeout=spec.timeout, capture_output=True)
|
||||
assert run.stderr is None or len(run.stderr) == 0, "Found error message: {}".format(run.stderr)
|
||||
output = run.stdout.decode("utf-8")
|
||||
print(output)
|
||||
|
||||
run = subprocess.run(["dgl", "train", "--cfg", cfg_path], timeout=spec.timeout, capture_output=True)
|
||||
assert run.stderr is None or len(run.stderr) == 0, "Found error message: {}".format(run.stderr)
|
||||
output = run.stdout.decode("utf-8")
|
||||
print(output)
|
||||
|
||||
TEST_RECIPE_FOLDER = "my_recipes"
|
||||
|
||||
@pytest.fixture
|
||||
def setup_recipe_folder():
|
||||
run = subprocess.run(["dgl", "recipe", "copy", "--dir", TEST_RECIPE_FOLDER], timeout=15, capture_output=True)
|
||||
|
||||
@pytest.mark.parametrize("file", [str(f) for f in Path(TEST_RECIPE_FOLDER).glob("*.yaml")])
|
||||
def test_recipe(file, setup_recipe_folder):
|
||||
print("DGL enter train {}".format(file))
|
||||
try:
|
||||
run = subprocess.run(["dgl", "train", "--cfg", file], timeout=5, capture_output=True)
|
||||
sh_stdout, sh_stderr = run.stdout, run.stderr
|
||||
except subprocess.TimeoutExpired as e:
|
||||
sh_stdout = e.stdout
|
||||
sh_stderr = e.stderr
|
||||
if sh_stderr is not None and len(sh_stderr) != 0:
|
||||
error_str = sh_stderr.decode("utf-8")
|
||||
lines = error_str.split("\n")
|
||||
for line in lines:
|
||||
line = line.strip()
|
||||
if line.startswith("WARNING") or line.startswith("Aborted") or line.startswith("0%"):
|
||||
continue
|
||||
else:
|
||||
assert len(line) == 0, error_str
|
||||
print("{} stdout: {}".format(file, sh_stdout))
|
||||
print("{} stderr: {}".format(file, sh_stderr))
|
||||
|
||||
# test_recipe( , None)
|
||||
270
enter/README.md
@@ -1,270 +0,0 @@
|
||||
# DGL-Enter
|
||||
|
||||
(What is DGL-Enter? Why design this? What is it for?)
|
||||
|
||||
DGL-Enter is a command-line tool that lets users quickly bootstrap models on multiple datasets, and it gives users full capability to customize the pipeline for their own tasks.
|
||||
|
||||
## Installation guide
|
||||
You can install DGL-Enter with `pip install dglenter`. You should then be able to use DGL-Enter from your command line by typing `dgl-enter`:
|
||||
```
|
||||
Usage: dgl-enter [OPTIONS] COMMAND [ARGS]...
|
||||
|
||||
Options:
|
||||
--help Show this message and exit.
|
||||
|
||||
Commands:
|
||||
config Generate the config files
|
||||
export Export the python file from config
|
||||
train Train the model
|
||||
```
|
||||
|
||||
|
||||
## Train GraphSAGE on Cora from scratch
|
||||
Here we use one of the most classic models, GraphSAGE, and the Cora citation graph dataset as an example to show how easy it is to train a model with DGL-Enter.
|
||||
### Step 1: Use `dgl-enter config` to generate a yaml configuration file
|
||||
Run `dgl-enter config nodepred --data cora --model sage --cfg cora_sage.yml`. You'll get a configuration file, `cora_sage.yml`, that includes all the tunable options along with explanatory comments.
|
||||
|
||||
Optionally, you can change the config to achieve better performance. Below is a modified sample based on the template generated by the command above.
|
||||
The early stop part is removed for simplicity
|
||||
|
||||
```yaml
|
||||
version: 0.0.1
|
||||
pipeline_name: nodepred
|
||||
device: cpu
|
||||
data:
|
||||
name: cora
|
||||
split_ratio: # Ratio to generate split masks, for example set to [0.8, 0.1, 0.1] for 80% train/10% val/10% test. Leave blank to use builtin split in original dataset
|
||||
model:
|
||||
name: sage
|
||||
embed_size: -1 # The dimension of created embedding table. -1 means using original node embedding
|
||||
hidden_size: 16 # Hidden size.
|
||||
num_layers: 1 # Number of hidden layers.
|
||||
activation: relu # Activation function name under torch.nn.functional
|
||||
dropout: 0.5 # Dropout rate.
|
||||
aggregator_type: gcn # Aggregator type to use (``mean``, ``gcn``, ``pool``, ``lstm``).
|
||||
general_pipeline:
|
||||
num_epochs: 200 # Number of training epochs
|
||||
eval_period: 5 # Interval epochs between evaluations
|
||||
optimizer:
|
||||
name: Adam
|
||||
lr: 0.01
|
||||
weight_decay: 0.0005
|
||||
loss: CrossEntropyLoss
|
||||
num_runs: 1 # Number of experiments to run
|
||||
|
||||
```
|
||||
|
||||
### Step 2: Use `dgl-enter train` to initiate the training process.
|
||||
|
||||
Simply running `dgl-enter train --cfg cora_sage.yml` starts the training process:
|
||||
```log
|
||||
...
|
||||
Epoch 00190 | Loss 1.5225 | TrainAcc 0.9500 | ValAcc 0.6840
|
||||
Epoch 00191 | Loss 1.5416 | TrainAcc 0.9357 | ValAcc 0.6840
|
||||
Epoch 00192 | Loss 1.5391 | TrainAcc 0.9357 | ValAcc 0.6840
|
||||
Epoch 00193 | Loss 1.5257 | TrainAcc 0.9643 | ValAcc 0.6840
|
||||
Epoch 00194 | Loss 1.5196 | TrainAcc 0.9286 | ValAcc 0.6840
|
||||
EarlyStopping counter: 12 out of 20
|
||||
Epoch 00195 | Loss 1.4862 | TrainAcc 0.9643 | ValAcc 0.6760
|
||||
Epoch 00196 | Loss 1.5142 | TrainAcc 0.9714 | ValAcc 0.6760
|
||||
Epoch 00197 | Loss 1.5145 | TrainAcc 0.9714 | ValAcc 0.6760
|
||||
Epoch 00198 | Loss 1.5174 | TrainAcc 0.9571 | ValAcc 0.6760
|
||||
Epoch 00199 | Loss 1.5235 | TrainAcc 0.9714 | ValAcc 0.6760
|
||||
Test Accuracy 0.7740
|
||||
Accuracy across 1 runs: 0.774 ± 0.0
|
||||
```
|
||||
|
||||
That's all! You only need two commands to train a graph neural network.
|
||||
## Debug your model and advanced customization
|
||||
|
||||
That's not everything yet. We believe you may want to change more than the configuration file: modify the training pipeline, calculate new metrics, or look into the code for details.
|
||||
DGL-Enter can export a self-contained, runnable python script for you to do anything you like.
|
||||
|
||||
Try `dgl-enter export --cfg cora_sage.yml --output script.py`, and you'll get the script used to train the model, like magic!
|
||||
|
||||
Below is an excerpt of the exported script:
|
||||
```python
|
||||
...
|
||||
|
||||
def train(cfg, pipeline_cfg, device, data, model, optimizer, loss_fcn):
|
||||
g = data[0] # Only train on the first graph
|
||||
g = dgl.remove_self_loop(g)
|
||||
g = dgl.add_self_loop(g)
|
||||
g = g.to(device)
|
||||
|
||||
node_feat = g.ndata.get('feat', None)
|
||||
edge_feat = g.edata.get('feat', None)
|
||||
label = g.ndata['label']
|
||||
train_mask, val_mask, test_mask = g.ndata['train_mask'].bool(
|
||||
), g.ndata['val_mask'].bool(), g.ndata['test_mask'].bool()
|
||||
|
||||
val_acc = 0.
|
||||
for epoch in range(pipeline_cfg['num_epochs']):
|
||||
model.train()
|
||||
logits = model(g, node_feat, edge_feat)
|
||||
loss = loss_fcn(logits[train_mask], label[train_mask])
|
||||
|
||||
optimizer.zero_grad()
|
||||
loss.backward()
|
||||
optimizer.step()
|
||||
|
||||
train_acc = accuracy(logits[train_mask], label[train_mask])
|
||||
if epoch != 0 and epoch % pipeline_cfg['eval_period'] == 0:
|
||||
val_acc = accuracy(logits[val_mask], label[val_mask])
|
||||
|
||||
print("Epoch {:05d} | Loss {:.4f} | TrainAcc {:.4f} | ValAcc {:.4f}".
|
||||
format(epoch, loss.item(), train_acc, val_acc))
|
||||
|
||||
model.eval()
|
||||
with torch.no_grad():
|
||||
logits = model(g, node_feat, edge_feat)
|
||||
test_acc = accuracy(logits[test_mask], label[test_mask])
|
||||
return test_acc
|
||||
|
||||
|
||||
def main():
|
||||
cfg = {
|
||||
'version': '0.0.1',
|
||||
'device': 'cpu',
|
||||
'data': {
|
||||
'split_ratio': None},
|
||||
'model': {
|
||||
'embed_size': -1,
|
||||
'hidden_size': 16,
|
||||
'num_layers': 1,
|
||||
'activation': 'relu',
|
||||
'dropout': 0.5,
|
||||
'aggregator_type': 'gcn'},
|
||||
'general_pipeline': {
|
||||
'num_epochs': 200,
|
||||
'eval_period': 5,
|
||||
'optimizer': {
|
||||
'lr': 0.01,
|
||||
'weight_decay': 0.0005},
|
||||
'loss': 'CrossEntropyLoss',
|
||||
'num_runs': 1}}
|
||||
device = cfg['device']
|
||||
pipeline_cfg = cfg['general_pipeline']
|
||||
# load data
|
||||
data = AsNodePredDataset(CoraGraphDataset())
|
||||
# create model
|
||||
model_cfg = cfg["model"]
|
||||
cfg["model"]["data_info"] = {
|
||||
"in_size": model_cfg['embed_size'] if model_cfg['embed_size'] > 0 else data[0].ndata['feat'].shape[1],
|
||||
"out_size": data.num_classes,
|
||||
"num_nodes": data[0].num_nodes()
|
||||
}
|
||||
model = GraphSAGE(**cfg["model"])
|
||||
model = model.to(device)
|
||||
loss = torch.nn.CrossEntropyLoss()
|
||||
optimizer = torch.optim.Adam(
|
||||
model.parameters(),
|
||||
**pipeline_cfg["optimizer"])
|
||||
# train
|
||||
test_acc = train(cfg, pipeline_cfg, device, data, model, optimizer, loss)
|
||||
return test_acc
|
||||
|
||||
...
|
||||
|
||||
```
|
||||
|
||||
## Recipes
|
||||
|
||||
We've prepared a set of fine-tuned configs under `enter/recipes` that you can try to get reproducible results.
|
||||
|
||||
For example, to use GCN with the pubmed dataset, you can use `enter/recipes/nodepred_pubmed_gcn.yml`.
|
||||
|
||||
To try it, type in `dgl-enter train --cfg recipes/nodepred_pubmed_gcn.yml` to train the model, or `dgl-enter export --cfg recipes/nodepred_pubmed_gcn.yml` to get the full training script.
|
||||
|
||||
## Use DGL-Enter on your own dataset
|
||||
You can modify the generated script in any way you want. However, we also provide an end-to-end way to use your own dataset through our `CSVDataset`.
|
||||
|
||||
Step 1: Prepare your csv and metadata file.
|
||||
|
||||
Following the tutorial at [Loading data from CSV files](https://docs.dgl.ai/en/latest/guide/data-loadcsv.html#guide-data-pipeline-loadcsv), prepare your own CSV dataset, which needs at least three files: a node data CSV, an edge data CSV and the metadata file (meta.yml).
|
||||
|
||||
```yml
|
||||
dataset_name: my_csv_dataset
|
||||
edge_data:
|
||||
- file_name: edges.csv
|
||||
node_data:
|
||||
- file_name: nodes.csv
|
||||
```
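As a rough illustration of Step 1, the sketch below writes a tiny dataset folder containing the three files. It is not part of DGL-Enter: the helper `write_minimal_csv_dataset`, the column names (`node_id`, `label`, `src_id`, `dst_id`) and the default metadata file name are assumptions made for this example, so check the linked CSV guide for the names your DGL version expects.

```python
import csv
import tempfile
from pathlib import Path


def write_minimal_csv_dataset(folder, meta_name="meta.yaml"):
    """Write nodes.csv, edges.csv and a metadata file into `folder`.

    Hypothetical helper: column names and `meta_name` are assumptions.
    """
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)

    # Three nodes, each with an integer label.
    with (folder / "nodes.csv").open("w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["node_id", "label"])
        writer.writerows([[0, 1], [1, 0], [2, 1]])

    # Two directed edges between those nodes.
    with (folder / "edges.csv").open("w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["src_id", "dst_id"])
        writer.writerows([[0, 1], [1, 2]])

    # Metadata mirroring the YAML shown above, written as plain text
    # so that no YAML library is required.
    meta = (
        "dataset_name: my_csv_dataset\n"
        "edge_data:\n"
        "- file_name: edges.csv\n"
        "node_data:\n"
        "- file_name: nodes.csv\n"
    )
    (folder / meta_name).write_text(meta)

    # Return the file names that were created, in sorted order.
    return sorted(p.name for p in folder.iterdir())


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(write_minimal_csv_dataset(tmp))
```

Pointing `data_path` at the folder written here would then be the natural next step, assuming the file and column names match what your DGL version expects.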
|
||||
|
||||
Step 2: Choose the csv dataset in the `dgl-enter config` stage
|
||||
Try `dgl-enter config nodepred --data csv --model sage --cfg csv_sage.yml` to use the SAGE model on your dataset. The data section now holds the configuration for the CSV dataset. `data_path` specifies the data folder, and `./` means the current folder.
|
||||
|
||||
If your dataset doesn't have a built-in train/val/test split on the nodes, you need to set the split ratio manually in the config YAML file, and DGL will randomly generate the split for you.
|
||||
|
||||
```yml
|
||||
data:
|
||||
name: csv
|
||||
split_ratio: # Ratio to generate split masks, for example set to [0.8, 0.1, 0.1] for 80% train/10% val/10% test. Leave blank to use builtin split in original dataset
|
||||
data_path: ./ # metadata.yaml, nodes.csv, edges.csv should be in this folder
|
||||
```
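To make the meaning of `split_ratio` concrete, here is a minimal sketch of how a ratio such as `[0.8, 0.1, 0.1]` could be turned into random train/val/test masks. This is an illustration only, not DGL's implementation; the function `make_split_masks` and its `seed` parameter are hypothetical names introduced for this example.

```python
import random


def make_split_masks(num_nodes, split_ratio, seed=0):
    """Return boolean train/val/test masks as three lists of length num_nodes.

    Hypothetical helper for illustration; DGL's own splitting may differ.
    """
    train_ratio, val_ratio, test_ratio = split_ratio
    assert abs(train_ratio + val_ratio + test_ratio - 1.0) < 1e-6

    # Shuffle node ids reproducibly, then cut the permutation into three parts.
    order = list(range(num_nodes))
    random.Random(seed).shuffle(order)
    num_train = int(num_nodes * train_ratio)
    num_val = int(num_nodes * val_ratio)

    masks = [[False] * num_nodes for _ in range(3)]
    for position, node in enumerate(order):
        if position < num_train:
            masks[0][node] = True
        elif position < num_train + num_val:
            masks[1][node] = True
        else:
            masks[2][node] = True  # the remainder goes to the test split
    return masks


if __name__ == "__main__":
    train_mask, val_mask, test_mask = make_split_masks(10, [0.8, 0.1, 0.1])
    print(sum(train_mask), sum(val_mask), sum(test_mask))
```

Each node ends up in exactly one of the three masks, which is the property the training script relies on when it indexes logits with `train_mask`, `val_mask` and `test_mask`.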
|
||||
|
||||
|
||||
Step 3: `train` the model/`export` the script
|
||||
Then you can proceed as in the tutorial above: either train the model with `dgl-enter train --cfg csv_sage.yml`, or use `dgl-enter export --cfg csv_sage.yml --output my_dataset.py` to get the training script.
|
||||
|
||||
## API Reference
|
||||
|
||||
DGL-Enter is a new tool for users to bootstrap datasets and common models.
|
||||
|
||||
The entry point is `dgl-enter`, and it has three subcommands: `config`, `train` and `export`.
|
||||
|
||||
### Config
|
||||
The config stage generates a configuration file for a specific pipeline.
|
||||
|
||||
`dgl-enter` currently provides 3 pipelines:
|
||||
- nodepred (Node prediction tasks, suitable for small dataset to prototype)
|
||||
- nodepred-ns (Node prediction tasks with sampling method, suitable for medium and large dataset)
|
||||
- linkpred (Link prediction tasks, to predict whether edge exists among node pairs based on node features)
|
||||
|
||||
You can get the full list by `dgl-enter config --help`
|
||||
```
|
||||
Usage: dgl-enter config [OPTIONS] COMMAND [ARGS]...
|
||||
|
||||
Generate the config files
|
||||
|
||||
Options:
|
||||
--help Show this message and exit.
|
||||
|
||||
Commands:
|
||||
linkpred Link prediction pipeline
|
||||
nodepred Node classification pipeline
|
||||
nodepred-ns Node classification sampling pipeline
|
||||
```
|
||||
|
||||
Each pipeline has different options to specify. For example, for the node prediction pipeline, run `dgl-enter config nodepred --help` and you'll get:
|
||||
```
|
||||
Usage: dgl-enter config nodepred [OPTIONS]
|
||||
|
||||
Node classification pipeline
|
||||
|
||||
Options:
|
||||
--data [cora|citeseer|ogbl-collab|csv|reddit|co-buy-computer]
|
||||
input data name [required]
|
||||
--cfg TEXT output configuration path [default:
|
||||
cfg.yml]
|
||||
--model [gcn|gat|sage|sgc|gin] Model name [required]
|
||||
--device [cpu|cuda] Device, cpu or cuda [default: cpu]
|
||||
--help Show this message and exit.
|
||||
```
|
||||
|
||||
You can always get the detailed help information by adding `--help` to the command line
|
||||
|
||||
### Train
|
||||
You can train a model on the dataset based on the configuration file generated by `dgl-enter config`, by `dgl-enter train`.
|
||||
```
|
||||
Usage: dgl-enter train [OPTIONS]
|
||||
|
||||
Train the model
|
||||
|
||||
Options:
|
||||
--cfg TEXT yaml file name [default: cfg.yml]
|
||||
--help Show this message and exit.
|
||||
```
|
||||
|
||||
### Export
|
||||
Get the self-contained, runnable python script derived from the configuration file by `dgl-enter export`.
|
||||
@@ -1,18 +0,0 @@
|
||||
import typer
|
||||
from ..pipeline import *
|
||||
from ..model import *
|
||||
from .config_cli import config_app
|
||||
from .train_cli import train
|
||||
from .export_cli import export
|
||||
|
||||
no_args_is_help = False
|
||||
app = typer.Typer(no_args_is_help=no_args_is_help, add_completion=False)
|
||||
app.add_typer(config_app, name="config", no_args_is_help=no_args_is_help)
|
||||
app.command(help="Train the model", no_args_is_help=no_args_is_help)(train)
|
||||
app.command(help="Export the python file from config", no_args_is_help=no_args_is_help)(export)
|
||||
|
||||
def main():
|
||||
app()
|
||||
|
||||
if __name__ == "__main__":
|
||||
app()
|
||||