dgl/docs/source/api/python/data.rst
VoVAllen 0fb13f7b9d [Feature] Data format (#728)
2019-09-09 20:57:51 +08:00


.. _apidata:

Dataset
=======

.. currentmodule:: dgl.data

Utils
-----

.. autosummary::
    :toctree: ../../generated/

    utils.get_download_dir
    utils.download
    utils.check_sha1
    utils.extract_archive
    utils.split_dataset
    utils.save_graphs
    utils.load_graphs
    utils.load_labels

.. autoclass:: dgl.data.utils.Subset
    :members: __getitem__, __len__

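
The splitting utility and the ``Subset`` class listed above both expose the ``__getitem__``/``__len__`` protocol. The following is a minimal sketch of that idea in plain Python. It is not DGL's implementation: ``SubsetSketch`` and ``split_sketch`` are hypothetical names, and the fraction-based signature is an assumption made for illustration.

```python
import random


class SubsetSketch:
    """Index-based view over a dataset (hypothetical stand-in, not DGL's Subset)."""

    def __init__(self, dataset, indices):
        self.dataset = dataset
        self.indices = list(indices)

    def __getitem__(self, item):
        # Translate a position in the subset to a position in the parent dataset.
        return self.dataset[self.indices[item]]

    def __len__(self):
        return len(self.indices)


def split_sketch(dataset, frac_list=(0.8, 0.1, 0.1), shuffle=False, seed=None):
    """Split a dataset into consecutive subsets sized by the given fractions."""
    assert abs(sum(frac_list) - 1.0) < 1e-8, "fractions must sum to 1"
    indices = list(range(len(dataset)))
    if shuffle:
        random.Random(seed).shuffle(indices)
    subsets, start = [], 0
    for i, frac in enumerate(frac_list):
        # The last subset takes whatever remains, so no sample is dropped.
        if i == len(frac_list) - 1:
            end = len(dataset)
        else:
            end = start + int(frac * len(dataset))
        subsets.append(SubsetSketch(dataset, indices[start:end]))
        start = end
    return subsets


data = [f"graph_{i}" for i in range(10)]
train, val, test = split_sketch(data)
print(len(train), len(val), len(test))
```

Because each subset only stores indices, the underlying samples are never copied; any object that supports integer indexing and ``len`` can serve as the parent dataset.
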
Dataset Classes
---------------

Stanford sentiment treebank dataset
```````````````````````````````````

For more information about the dataset, see `Sentiment Analysis <https://nlp.stanford.edu/sentiment/index.html>`__.

.. autoclass:: SST
    :members: __getitem__, __len__

Mini graph classification dataset
`````````````````````````````````

.. autoclass:: MiniGCDataset
    :members: __getitem__, __len__, num_classes

Graph kernel dataset
````````````````````

For more information about the dataset, see `Benchmark Data Sets for Graph Kernels <https://ls11-www.cs.tu-dortmund.de/staff/morris/graphkerneldatasets>`__.

.. autoclass:: TUDataset
    :members: __getitem__, __len__

Graph isomorphism network dataset
`````````````````````````````````

A compact subset of the graph kernel dataset.

.. autoclass:: GINDataset
    :members: __getitem__, __len__

Protein-Protein Interaction dataset
```````````````````````````````````

.. autoclass:: PPIDataset
    :members: __getitem__, __len__

Molecular Graphs
----------------

To work on molecular graphs, make sure you have installed `RDKit 2018.09.3 <https://www.rdkit.org/docs/Install.html>`__.

Featurization
`````````````

Graph neural networks require numerical features for nodes (atoms) and edges (bonds).
The following featurization methods and utilities are available:

.. autosummary::
    :toctree: ../../generated/

    chem.one_hot_encoding
    chem.BaseAtomFeaturizer
    chem.CanonicalAtomFeaturizer

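
To illustrate what featurizing an atom means, here is a minimal sketch of one-hot encoding in plain Python. It does not call the utilities listed above: ``one_hot_sketch`` and ``featurize_atom_sketch`` are hypothetical names, and the element and degree vocabularies are assumptions chosen only for the example.

```python
def one_hot_sketch(value, allowable_set):
    """Return booleans marking which entry of allowable_set equals value."""
    return [value == candidate for candidate in allowable_set]


def featurize_atom_sketch(symbol, degree):
    """Concatenate one-hot encodings of two atom properties into one vector."""
    symbols = ["C", "N", "O"]   # hypothetical vocabulary of element symbols
    degrees = [0, 1, 2, 3, 4]   # hypothetical range of atom degrees
    flags = one_hot_sketch(symbol, symbols) + one_hot_sketch(degree, degrees)
    # Convert booleans to floats so the vector can be fed to a model.
    return [float(flag) for flag in flags]


features = featurize_atom_sketch("N", 3)
print(features)
```

A value outside the allowable set yields an all-zero segment in this sketch; a real featurizer may instead raise an error or reserve an "unknown" slot.
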
Graph Construction
``````````````````

Several methods for constructing DGLGraphs from SMILES/RDKit molecule objects are listed below:

.. autosummary::
    :toctree: ../../generated/

    chem.mol_to_graph
    chem.smile_to_bigraph
    chem.mol_to_bigraph
    chem.smile_to_complete_graph
    chem.mol_to_complete_graph

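
The function names above suggest two graph topologies: a bi-directed graph that follows the chemical bonds, and a complete graph over all atoms. The sketch below shows the difference using plain edge lists. It uses neither RDKit nor DGL; the helper names are hypothetical, and the assumption that each bond becomes two directed edges is an illustration, not a statement about the library's internals.

```python
from itertools import permutations


def bigraph_edges_sketch(bonds):
    """Turn each undirected bond (u, v) into two directed edges u->v and v->u."""
    src, dst = [], []
    for u, v in bonds:
        src.extend([u, v])
        dst.extend([v, u])
    return src, dst


def complete_graph_edges_sketch(num_atoms, add_self_loop=False):
    """Connect every ordered pair of distinct atoms, optionally adding self loops."""
    edges = list(permutations(range(num_atoms), 2))
    if add_self_loop:
        edges += [(i, i) for i in range(num_atoms)]
    return edges


# Ethanol (SMILES "CCO"): heavy atoms C0, C1, O2 with bonds C0-C1 and C1-O2,
# assuming atoms are indexed in order of appearance in the SMILES string.
bonds = [(0, 1), (1, 2)]
src, dst = bigraph_edges_sketch(bonds)
complete_edges = complete_graph_edges_sketch(3)
print(list(zip(src, dst)))
print(complete_edges)
```

The bond-based graph grows linearly with the number of bonds, while the complete graph grows quadratically with the number of atoms, which matters for larger molecules.
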
Dataset Classes
```````````````

If your dataset is stored in a ``.csv`` file, you may find the following class helpful:

.. autoclass:: dgl.data.chem.CSVDataset
    :members: __getitem__, __len__

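
As a rough picture of the kind of ``.csv`` layout such a dataset might hold, the sketch below parses a small in-memory table with one SMILES column and one column per task, using only the standard library. The column names, the ``load_csv_sketch`` helper, and the use of a mask for missing labels are assumptions for illustration; consult the class documentation for the actual constructor arguments.

```python
import csv
import io

# Hypothetical file content: one SMILES column plus one column per task.
CSV_TEXT = """smiles,task_a,task_b
CCO,1,0
c1ccccc1,,1
CC(=O)O,0,
"""


def load_csv_sketch(text, smiles_column="smiles"):
    """Parse SMILES strings, per-task labels, and masks for missing labels."""
    smiles, labels, masks = [], [], []
    reader = csv.DictReader(io.StringIO(text))
    tasks = [name for name in reader.fieldnames if name != smiles_column]
    for row in reader:
        smiles.append(row[smiles_column])
        # An empty cell means the label is missing: store 0.0 and mask it out.
        labels.append([float(row[t]) if row[t] else 0.0 for t in tasks])
        masks.append([1.0 if row[t] else 0.0 for t in tasks])
    return smiles, tasks, labels, masks


smiles, tasks, labels, masks = load_csv_sketch(CSV_TEXT)
print(smiles)
print(tasks)
print(labels)
print(masks)
```
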
Currently two datasets are supported:

* Tox21
* TencentAlchemyDataset

.. autoclass:: dgl.data.chem.Tox21
    :members: __getitem__, __len__, task_pos_weights

.. autoclass:: dgl.data.chem.TencentAlchemyDataset
    :members: __getitem__, __len__, set_mean_and_std