Project: OpenDAS / dgl
Commit d3560b71, authored Apr 02, 2020 by Mufei Li, committed by GitHub on Apr 02, 2020
[DGL-LifeSci] Documentation (#1414)
* Update * Update * Update
parent 7e0893e6
Changes: 30
Showing 20 changed files with 535 additions and 188 deletions
apps/life_sci/README.md (+2 -0)
apps/life_sci/docs/source/api/data.rst (+41 -27)
apps/life_sci/docs/source/api/model.gnn.rst (+21 -18)
apps/life_sci/docs/source/api/model.pretrain.rst (+66 -0)
apps/life_sci/docs/source/api/model.readout.rst (+16 -13)
apps/life_sci/docs/source/api/model.rst (+0 -8)
apps/life_sci/docs/source/api/model.zoo.rst (+49 -33)
apps/life_sci/docs/source/api/utils.complexes.rst (+11 -0)
apps/life_sci/docs/source/api/utils.mols.rst (+146 -0)
apps/life_sci/docs/source/api/utils.pipeline.rst (+24 -0)
apps/life_sci/docs/source/api/utils.rst (+0 -47)
apps/life_sci/docs/source/api/utils.splitters.rst (+39 -0)
apps/life_sci/docs/source/conf.py (+1 -1)
apps/life_sci/docs/source/get_started.rst (+0 -2)
apps/life_sci/docs/source/index.rst (+31 -16)
apps/life_sci/docs/source/install/index.rst (+64 -0)
apps/life_sci/python/dgllife/data/alchemy.py (+10 -8)
apps/life_sci/python/dgllife/data/csv_dataset.py (+8 -10)
apps/life_sci/python/dgllife/data/pdbbind.py (+5 -4)
apps/life_sci/python/dgllife/data/pubchem_aromaticity.py (+1 -1)
apps/life_sci/README.md
View file @
d3560b71
# DGL-LifeSci
[Documentation](https://lifesci.dgl.ai/index.html) | [Discussion Forum](https://discuss.dgl.ai)
## Introduction
Deep learning on graphs has been an arising trend in the past few years. There are a lot of graphs in
...
...
apps/life_sci/docs/source/api/data.rst
View file @
d3560b71
.. _apidata:
dgllife.data
============
Datasets
========
TBD by Murphy
.. contents:: Contents
:local:
dgllife.data.alchemy
--------------------
Molecular Property Prediction
-----------------------------
.. automodule:: dgllife.data.alchemy
:members:
Tox21
`````
.. autoclass:: dgllife.data.Tox21
:members: task_pos_weights, __getitem__, __len__
:show-inheritance:
dgllife.data.csv_dataset
------------------------
Alchemy for Quantum Chemistry
`````````````````````````````
.. automodule:: dgllife.data.csv_dataset
:members:
.. autoclass:: dgllife.data.TencentAlchemyDataset
:members: set_mean_and_std, __getitem__, __len__
PubChem Aromaticity
```````````````````
dgllife.data.pdbbind
---------------------
.. autoclass:: dgllife.data.PubChemBioAssayAromaticity
:members: __getitem__, __len__
:show-inheritance:
.. automodule:: dgllife.data.pdbbind
:members:
Adapting to New Datasets with CSV
`````````````````````````````````
.. autoclass:: dgllife.data.MoleculeCSVDataset
:members: __getitem__, __len__
dgllife.data.pubchem_aromaticity
---------------------------------
Reaction Prediction
-------------------
.. automodule:: dgllife.data.pubchem_aromaticity
:members:
USPTO
`````
.. autoclass:: dgllife.data.USPTO
:members: __getitem__, __len__
:show-inheritance:
dgllife.data.tox21
---------------------------------
Adapting to New Datasets for Weisfeiler-Lehman Networks
```````````````````````````````````````````````````````
.. automodule:: dgllife.data.tox21
:members:
.. autoclass:: dgllife.data.WLNReactionDataset
:members: __getitem__, __len__
Protein-Ligand Binding Affinity Prediction
------------------------------------------
dgllife.data.uspto
---------------------------------
PDBBind
```````
.. automodule:: dgllife.data.uspto
:members:
.. autoclass:: dgllife.data.PDBBind
:members: __getitem__, __len__
apps/life_sci/docs/source/api/model.gnn.rst
View file @
d3560b71
.. _apimodelgnn:
dgllife.model.gnn
==================
Graph Neural Networks for Updating Node/Edge Representations
============================================================
TBD by Murphy
All models based on graph neural networks start with updating node/edge representations.
We introduce various GNN models implemented in DGL-LifeSci for representation update.
dgllife.model.gnn.attentivefp
-------------------------------------------
.. contents:: Contents
:local:
AttentiveFP
-----------
.. automodule:: dgllife.model.gnn.attentivefp
:members:
dgllife.model.gnn.gat
----------------------------
GAT
---
.. automodule:: dgllife.model.gnn.gat
:members:
dgllife.model.gnn.gcn
----------------------------
GCN
---
.. automodule:: dgllife.model.gnn.gcn
:members:
dgllife.model.gnn.mgcn
----------------------------
MGCN
----
.. automodule:: dgllife.model.gnn.mgcn
:members:
dgllife.model.gnn.mpnn
----------------------------
MPNN
----
.. automodule:: dgllife.model.gnn.mpnn
:members:
dgllife.model.gnn.schnet
----------------------------
SchNet
------
.. automodule:: dgllife.model.gnn.schnet
:members:
dgllife.model.gnn.wln
----------------------------
WLN
---
.. automodule:: dgllife.model.gnn.wln
:members:
apps/life_sci/docs/source/api/model.pretrain.rst
0 → 100644
View file @
d3560b71
.. _apimodelpretrain:

Pre-trained Models
==================

We provide multiple pre-trained models for users to use without the need of training from scratch.

Example Usage
-------------

Property Prediction
```````````````````

.. code-block:: python

    from dgllife.data import Tox21
    from dgllife.model import load_pretrained
    from dgllife.utils import smiles_to_bigraph, CanonicalAtomFeaturizer

    dataset = Tox21(smiles_to_bigraph, CanonicalAtomFeaturizer())
    model = load_pretrained('GCN_Tox21') # Pretrained model loaded
    model.eval()

    smiles, g, label, mask = dataset[0]
    feats = g.ndata.pop('h')
    label_pred = model(g, feats)
    print(smiles)                   # CCOc1ccc2nc(S(N)(=O)=O)sc2c1
    print(label_pred[:, mask != 0]) # Mask non-existing labels
    # tensor([[ 1.4190, -0.1820,  1.2974,  1.4416,  0.6914,
    #           2.0957,  0.5919,  0.7715,  1.7273,  0.2070]])

Generative Models

.. code-block:: python

    from dgllife.model import load_pretrained

    model = load_pretrained('DGMG_ZINC_canonical')
    model.eval()
    smiles = []
    for i in range(4):
        smiles.append(model(rdkit_mol=True))

    print(smiles)
    # ['CC1CCC2C(CCC3C2C(NC2=CC(Cl)=CC=C2N)S3(=O)=O)O1',
    # 'O=C1SC2N=CN=C(NC(SC3=CC=CC=N3)C1=CC=CO)C=2C1=CCCC1',
    # 'CC1C=CC(=CC=1)C(=O)NN=C(C)C1=CC=CC2=CC=CC=C21',
    # 'CCN(CC1=CC=CC=C1F)CC1CCCN(C)C1']

If you are running the code block above in Jupyter notebook, you can also visualize the molecules generated with

.. code-block:: python

    from IPython.display import SVG
    from rdkit import Chem
    from rdkit.Chem import Draw

    mols = [Chem.MolFromSmiles(s) for s in smiles]
    SVG(Draw.MolsToGridImage(mols, molsPerRow=4, subImgSize=(180, 150), useSVG=True))

.. image:: https://data.dgl.ai/dgllife/dgmg/dgmg_model_zoo_example2.png

API
---

.. autofunction:: dgllife.model.load_pretrained
apps/life_sci/docs/source/api/model.readout.rst
View file @
d3560b71
.. _apimodelreadout:
dgllife.model.readout
========================
Readout for Computing Graph Representations
===========================================
TBD by Murphy
After updating node/edge representations with graph neural networks (GNNs), a common operation is to compute
graph representations out of updated node/edge representations. For example, we need to compute molecular
representations out of atom/bond representations in molecular property prediction. We call the various modules
for computing graph-level representations **readout** as in Neural Message Passing for Quantum Chemistry and this
section lists the readout modules implemented in DGL-LifeSci.
dgllife.model.readout.attentivefp_readout
------------------------------------------
.. contents:: Contents
:local:
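As a rough illustration of what a readout does, the sketch below pools per-node feature vectors into a single graph-level vector by concatenating their sum with their element-wise maximum. It works on plain Python lists rather than DGL tensors, and ``sum_and_max_readout`` is a hypothetical helper written for this example, not a function in DGL-LifeSci.

```python
from typing import List


def sum_and_max_readout(node_feats: List[List[float]]) -> List[float]:
    """Pool the node features of one graph into a single vector.

    Hypothetical helper for illustration only: returns the per-dimension
    sum followed by the per-dimension maximum.
    """
    if not node_feats:
        raise ValueError("a graph needs at least one node")
    # zip(*rows) iterates over feature dimensions (columns).
    columns = list(zip(*node_feats))
    summed = [sum(col) for col in columns]
    maxed = [max(col) for col in columns]
    return summed + maxed


# Three atoms, each with a 2-dimensional representation.
atom_feats = [[1.0, 0.0], [0.5, 2.0], [-1.0, 1.0]]
graph_feat = sum_and_max_readout(atom_feats)
print(graph_feat)
```

The output has twice the node feature size, regardless of how many atoms the molecule has, which is what lets a fixed-size predictor sit on top of variable-size graphs.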
AttentiveFP Readout
-------------------
.. automodule:: dgllife.model.readout.attentivefp_readout
:members:
dgllife.model.readout.mlp_readout
------------------------------------------
MLP Readout
-----------
.. automodule:: dgllife.model.readout.mlp_readout
:members:
dgllife.model.readout.weighted_sum_and_max
--------------------------------------------
Weighted Sum and Max Readout
----------------------------
.. automodule:: dgllife.model.readout.weighted_sum_and_max
:members:
apps/life_sci/docs/source/api/model.rst
deleted
100644 → 0
View file @
7e0893e6
.. _apimodelgnn:
dgllife.model
==================
TBD by Murphy
.. autofunction:: dgllife.model.load_pretrained
apps/life_sci/docs/source/api/model.zoo.rst
View file @
d3560b71
.. _apimodelzoo:
dgllife.model.model_zoo
=============================
Model Zoo
=========
TBD by Murphy
This section introduces complete models for various downstream tasks.
dgllife.model.model_zoo.attentivefp_predictor
-----------------------------------------------
.. contents:: Contents
:local:
Building Blocks
---------------
MLP Predictor
`````````````
.. automodule:: dgllife.model.model_zoo.mlp_predictor
:members:
Molecular Property Prediction
-----------------------------
AttentiveFP Predictor
`````````````````````
.. automodule:: dgllife.model.model_zoo.attentivefp_predictor
:members:
dgllife.model.model_zoo.gat_predictor
-------------------------------------------
GAT Predictor
`````````````
.. automodule:: dgllife.model.model_zoo.gat_predictor
:members:
dgllife.model.model_zoo.gcn_predictor
-------------------------------------------
GCN Predictor
`````````````
.. automodule:: dgllife.model.model_zoo.gcn_predictor
:members:
dgllife.model.model_zoo.mgcn_predictor
-------------------------------------------
MGCN Predictor
``````````````
.. automodule:: dgllife.model.model_zoo.mgcn_predictor
:members:
dgllife.model.model_zoo.mlp_predictor
-------------------------------------------
.. automodule:: dgllife.model.model_zoo.mlp_predictor
:members:
dgllife.model.model_zoo.mpnn_predictor
-------------------------------------------
MPNN Predictor
``````````````
.. automodule:: dgllife.model.model_zoo.mpnn_predictor
:members:
dgllife.model.model_zoo.schnet_predictor
-------------------------------------------
SchNet Predictor
````````````````
.. automodule:: dgllife.model.model_zoo.schnet_predictor
:members:
dgllife.model.model_zoo.wln_reaction_center
-------------------------------------------
.. automodule:: dgllife.model.model_zoo.wln_reaction_center
Generative Models
-----------------
DGMG
````
.. automodule:: dgllife.model.model_zoo.dgmg
:members:
dgllife.model.model_zoo.acnn
-------------------------------------------
.. automodule:: dgllife.model.model_zoo.acnn
JTNN
````
.. autoclass:: dgllife.model.model_zoo.jtnn.DGLJTNNVAE
:members:
dgllife.model.model_zoo.dgmg
-------------------------------------------
.. automodule:: dgllife.model.model_zoo.dgmg
Reaction Prediction
WLN for Reaction Center Prediction
``````````````````````````````````
.. automodule:: dgllife.model.model_zoo.wln_reaction_center
:members:
dgllife.model.model_zoo.jtnn
-------------------------------------------
.. autoclass:: dgllife.model.model_zoo.jtnn.DGLJTNNVAE
Protein-Ligand Binding Affinity Prediction
ACNN
````
.. automodule:: dgllife.model.model_zoo.acnn
:members:
\ No newline at end of file
apps/life_sci/docs/source/api/utils.complexes.rst
0 → 100644
View file @
d3560b71
.. _apiutilscomplexes:
Utils for protein-ligand complexes
==================================
Utilities in DGL-LifeSci for working with protein-ligand complexes.
.. autosummary::
:toctree: ../generated/
dgllife.utils.ACNN_graph_construction_and_featurization
apps/life_sci/docs/source/api/utils.mols.rst
0 → 100644
View file @
d3560b71
.. _apiutilsmols:
Utils for Molecules
===================
Utilities in DGL-LifeSci for working with molecules.
RDKit Utils
-----------
RDKit utils for loading molecules and accessing their information.
.. autosummary::
:toctree: ../generated/
dgllife.utils.get_mol_3d_coordinates
dgllife.utils.load_molecule
dgllife.utils.multiprocess_load_molecules
Graph Construction
------------------
The modeling of graph neural networks starts with constructing appropriate graph topologies. We provide
three common graph constructions:
* ``bigraph``: Bi-directed graphs corresponding exactly to molecular graphs
* ``complete_graph``: Graphs with all pairs of atoms connected
* ``nearest_neighbor_graph``: Graphs where each atom is connected to its closest (k) atoms based on molecule coordinates
.. autosummary::
:toctree: ../generated/
dgllife.utils.mol_to_graph
dgllife.utils.smiles_to_bigraph
dgllife.utils.mol_to_bigraph
dgllife.utils.smiles_to_complete_graph
dgllife.utils.mol_to_complete_graph
dgllife.utils.k_nearest_neighbors
dgllife.utils.mol_to_nearest_neighbor_graph
dgllife.utils.smiles_to_nearest_neighbor_graph
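To make the three graph constructions concrete, the following sketch builds their edge lists in plain Python. The helper names, the ``(source, destination)`` edge tuples, and the choice to leave out self-loops are assumptions made for this example; the library functions listed above work on RDKit molecules and return DGLGraphs instead.

```python
from itertools import permutations
from math import dist
from typing import List, Sequence, Tuple

Edge = Tuple[int, int]


def bigraph_edges(bonds: Sequence[Edge]) -> List[Edge]:
    """Turn each undirected bond (u, v) into two directed edges."""
    edges = []
    for u, v in bonds:
        edges.extend([(u, v), (v, u)])
    return edges


def complete_graph_edges(num_atoms: int) -> List[Edge]:
    """Connect every ordered pair of distinct atoms (no self-loops)."""
    return list(permutations(range(num_atoms), 2))


def nearest_neighbor_edges(coords: Sequence[Sequence[float]], k: int) -> List[Edge]:
    """Connect each atom to its k closest atoms by Euclidean distance."""
    edges = []
    for i, ci in enumerate(coords):
        others = [j for j in range(len(coords)) if j != i]
        others.sort(key=lambda j: dist(ci, coords[j]))
        edges.extend((i, j) for j in others[:k])
    return edges


# Toy 3-atom chain with made-up coordinates.
bonds = [(0, 1), (1, 2)]
coords = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (2.2, 1.2, 0.0)]

print(bigraph_edges(bonds))
print(complete_graph_edges(3))
print(nearest_neighbor_edges(coords, k=1))
```

The bi-directed graph follows the bonds exactly, the complete graph ignores them, and the nearest-neighbor graph depends only on geometry, so it needs 3D coordinates.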
Featurization for Molecules
---------------------------
To apply graph neural networks, we need to prepare node and edge features for molecules. Intuitively,
they can be developed based on various descriptors (features) of atoms/bonds/molecules. Particularly, we can
work with numerical descriptors directly or use ``one_hot_encoding`` for categorical descriptors. When using
multiple descriptors together, we can simply concatenate them with ``ConcatFeaturizer``.
General Utils
`````````````
.. autosummary::
:toctree: ../generated/
dgllife.utils.one_hot_encoding
dgllife.utils.ConcatFeaturizer
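The idea behind one-hot encoding and concatenation can be sketched without RDKit. In the example below an atom is stood in for by a plain dictionary, and ``one_hot`` and ``concat_features`` are hypothetical helpers written for illustration; they only mimic the roles of ``one_hot_encoding`` and ``ConcatFeaturizer`` and are not their implementations.

```python
from typing import Callable, Dict, List, Sequence


def one_hot(value, allowable_set: Sequence, encode_unknown: bool = False) -> List[float]:
    """One-hot encode `value` against `allowable_set`.

    With encode_unknown=True, an extra trailing slot flags values
    that are not in the allowable set.
    """
    encoding = [float(value == item) for item in allowable_set]
    if encode_unknown:
        encoding.append(float(value not in allowable_set))
    return encoding


def concat_features(atom: Dict, funcs: Sequence[Callable[[Dict], List[float]]]) -> List[float]:
    """Apply each descriptor function and concatenate the results."""
    features: List[float] = []
    for func in funcs:
        features.extend(func(atom))
    return features


# A dictionary standing in for an atom (assumption for this sketch).
atom = {"symbol": "N", "degree": 3, "formal_charge": 0}
funcs = [
    lambda a: one_hot(a["symbol"], ["C", "N", "O"], encode_unknown=True),  # categorical
    lambda a: one_hot(a["degree"], [0, 1, 2, 3, 4]),                       # categorical
    lambda a: [float(a["formal_charge"])],                                 # numerical, used directly
]
feats = concat_features(atom, funcs)
print(len(feats), feats)
```

The final feature size is the sum of the sizes of the individual descriptors, which is why the size has to be recomputed whenever a descriptor or its allowable set changes.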
Featurization for Nodes
```````````````````````
We consider the following atom descriptors:
* type/atomic number
* degree (excluding neighboring hydrogen atoms)
* total degree (including neighboring hydrogen atoms)
* explicit valence
* implicit valence
* hybridization
* total number of neighboring hydrogen atoms
* formal charge
* number of radical electrons
* aromatic atom
* ring membership
* chirality
* mass
We can employ their numerical values directly or with one-hot encoding.
.. autosummary::
:toctree: ../generated/
dgllife.utils.atom_type_one_hot
dgllife.utils.atomic_number_one_hot
dgllife.utils.atomic_number
dgllife.utils.atom_degree_one_hot
dgllife.utils.atom_degree
dgllife.utils.atom_total_degree_one_hot
dgllife.utils.atom_total_degree
dgllife.utils.atom_explicit_valence_one_hot
dgllife.utils.atom_explicit_valence
dgllife.utils.atom_implicit_valence_one_hot
dgllife.utils.atom_implicit_valence
dgllife.utils.atom_hybridization_one_hot
dgllife.utils.atom_total_num_H_one_hot
dgllife.utils.atom_total_num_H
dgllife.utils.atom_formal_charge_one_hot
dgllife.utils.atom_formal_charge
dgllife.utils.atom_num_radical_electrons_one_hot
dgllife.utils.atom_num_radical_electrons
dgllife.utils.atom_is_aromatic_one_hot
dgllife.utils.atom_is_aromatic
dgllife.utils.atom_is_in_ring_one_hot
dgllife.utils.atom_is_in_ring
dgllife.utils.atom_chiral_tag_one_hot
dgllife.utils.atom_mass
To combine featurization functions like those above into node features, use:
.. autosummary::
:toctree: ../generated/
dgllife.utils.BaseAtomFeaturizer
dgllife.utils.BaseAtomFeaturizer.feat_size
dgllife.utils.CanonicalAtomFeaturizer
dgllife.utils.CanonicalAtomFeaturizer.feat_size
Featurization for Edges
```````````````````````
We consider the following bond descriptors:
* type
* conjugated bond
* ring membership
* stereo configuration
.. autosummary::
:toctree: ../generated/
dgllife.utils.bond_type_one_hot
dgllife.utils.bond_is_conjugated_one_hot
dgllife.utils.bond_is_conjugated
dgllife.utils.bond_is_in_ring_one_hot
dgllife.utils.bond_is_in_ring
dgllife.utils.bond_stereo_one_hot
To combine featurization functions like those above into edge features, use:
.. autosummary::
:toctree: ../generated/
dgllife.utils.BaseBondFeaturizer
dgllife.utils.BaseBondFeaturizer.feat_size
dgllife.utils.CanonicalBondFeaturizer
dgllife.utils.CanonicalBondFeaturizer.feat_size
apps/life_sci/docs/source/api/utils.pipeline.rst
0 → 100644
View file @
d3560b71
.. _apiutilspipeline:
Model Development Pipeline
==========================
.. contents:: Contents
:local:
Model Evaluation
----------------
A utility class for evaluating model performance on (multi-label) supervised learning.
.. autoclass:: dgllife.utils.Meter
:members: update, compute_metric
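To show what multi-label evaluation with missing labels involves, the sketch below accumulates predictions batch by batch and then computes a per-task mean absolute error over only those entries whose mask is non-zero. ``MaskedMeter`` and its method names are hypothetical and merely follow an update-then-compute pattern; this is not the ``Meter`` implementation, and it uses plain lists instead of tensors.

```python
from typing import List


class MaskedMeter:
    """Hypothetical accumulator for masked multi-task evaluation."""

    def __init__(self) -> None:
        self.y_pred: List[List[float]] = []
        self.y_true: List[List[float]] = []
        self.mask: List[List[int]] = []

    def update(self, y_pred, y_true, mask) -> None:
        """Store one batch; each argument is a list of per-sample rows."""
        self.y_pred.extend(y_pred)
        self.y_true.extend(y_true)
        self.mask.extend(mask)

    def mae_per_task(self) -> List[float]:
        """Mean absolute error per task, skipping masked-out labels."""
        num_tasks = len(self.y_true[0])
        scores = []
        for t in range(num_tasks):
            errors = [
                abs(p[t] - y[t])
                for p, y, m in zip(self.y_pred, self.y_true, self.mask)
                if m[t] != 0
            ]
            # A task with no labels at all has no defined score.
            scores.append(sum(errors) / len(errors) if errors else float("nan"))
        return scores


meter = MaskedMeter()
# Two samples, two tasks; the second task is unlabeled for the first sample.
meter.update(
    y_pred=[[1.0, 0.0], [3.0, 5.0]],
    y_true=[[2.0, 9.0], [3.0, 4.0]],
    mask=[[1, 0], [1, 1]],
)
print(meter.mae_per_task())
```

Without the mask, the placeholder label of the first sample on the second task would be counted as a real error and distort that task's score.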
Early Stopping
--------------
Early stopping is a standard practice for preventing models from overfitting and we provide a utility
class for handling it.
.. autoclass:: dgllife.utils.EarlyStopping
:members:
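A minimal sketch of patience-based early stopping follows, assuming that a higher validation score is better by default. ``PatienceStopper`` is a hypothetical class written for this example and does not reproduce the interface of ``EarlyStopping``; in particular it does not save model checkpoints.

```python
from typing import Optional


class PatienceStopper:
    """Stop once the validation score has not improved for `patience` checks."""

    def __init__(self, patience: int = 3, higher_is_better: bool = True) -> None:
        self.patience = patience
        self.higher_is_better = higher_is_better
        self.best_score: Optional[float] = None
        self.counter = 0

    def step(self, score: float) -> bool:
        """Record a new validation score; return True when training should stop."""
        if self.best_score is None:
            improved = True
        elif self.higher_is_better:
            improved = score > self.best_score
        else:
            improved = score < self.best_score

        if improved:
            self.best_score = score
            self.counter = 0  # a real training loop would also checkpoint the model here
        else:
            self.counter += 1
        return self.counter >= self.patience


stopper = PatienceStopper(patience=2)
val_scores = [0.70, 0.75, 0.74, 0.73, 0.80]
stopped_at = None
for epoch, score in enumerate(val_scores):
    if stopper.step(score):
        stopped_at = epoch
        break
print(stopped_at, stopper.best_score)
```

Note that training stops before the later, better score is ever seen; a larger patience trades extra epochs for a lower risk of stopping too early.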
apps/life_sci/docs/source/api/utils.rst
deleted
100644 → 0
View file @
7e0893e6
.. _apiutils:
dgllife.utils
==================
TBD by Murphy
dgllife.utils.complex_to_graph
-------------------------------------------
.. automodule:: dgllife.utils.complex_to_graph
:members:
dgllife.utils.early_stop
-------------------------------------------
.. automodule:: dgllife.utils.early_stop
:members:
dgllife.utils.eval
-------------------------------------------
.. automodule:: dgllife.utils.eval
:members:
dgllife.utils.featurizers
-------------------------------------------
.. automodule:: dgllife.utils.featurizers
:members:
dgllife.utils.mol_to_graph
-------------------------------------------
.. automodule:: dgllife.utils.mol_to_graph
:members:
dgllife.utils.rdkit_utils
-------------------------------------------
.. automodule:: dgllife.utils.rdkit_utils
:members:
dgllife.utils.splitters
-------------------------------------------
.. automodule:: dgllife.utils.splitters
:members:
apps/life_sci/docs/source/api/utils.splitters.rst
0 → 100644
View file @
d3560b71
.. _apiutilssplitters:
Splitting Datasets
==================
We provide multiple splitting methods for datasets.
.. contents:: Contents
:local:
ConsecutiveSplitter
-------------------
.. autoclass:: dgllife.utils.ConsecutiveSplitter
:members: train_val_test_split, k_fold_split
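The simplest of these strategies, splitting a dataset consecutively in its existing order, can be sketched as below. ``consecutive_split`` and its parameter names are assumptions made for this example and are not the signature of ``ConsecutiveSplitter.train_val_test_split``.

```python
from typing import List, Sequence, Tuple


def consecutive_split(
    dataset: Sequence,
    frac_train: float = 0.8,
    frac_val: float = 0.1,
    frac_test: float = 0.1,
) -> Tuple[List, List, List]:
    """Split a dataset into train/val/test subsets without shuffling."""
    if abs(frac_train + frac_val + frac_test - 1.0) > 1e-8:
        raise ValueError("fractions must sum to 1")
    n = len(dataset)
    train_end = int(frac_train * n)
    val_end = train_end + int(frac_val * n)
    # Whatever remains after train and val goes to the test subset.
    return (
        list(dataset[:train_end]),
        list(dataset[train_end:val_end]),
        list(dataset[val_end:]),
    )


train, val, test = consecutive_split(range(10))
print(train, val, test)
```

Because the order is preserved, this split is only as representative as the ordering of the dataset itself, which is one reason the random, scaffold-based and stratified alternatives exist.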
RandomSplitter
--------------
.. autoclass:: dgllife.utils.RandomSplitter
:members: train_val_test_split, k_fold_split
MolecularWeightSplitter
-----------------------
.. autoclass:: dgllife.utils.MolecularWeightSplitter
:members: train_val_test_split, k_fold_split
ScaffoldSplitter
----------------
.. autoclass:: dgllife.utils.ScaffoldSplitter
:members: train_val_test_split, k_fold_split
SingleTaskStratifiedSplitter
----------------------------
.. autoclass:: dgllife.utils.SingleTaskStratifiedSplitter
:members: train_val_test_split, k_fold_split
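Each splitter above also exposes a ``k_fold_split`` method. The general idea of k-fold splitting can be sketched on indices alone; ``k_fold_indices`` is a hypothetical helper for illustration, and its arguments are assumptions rather than the library's signature.

```python
import random
from typing import List, Tuple


def k_fold_indices(
    num_samples: int, k: int = 5, shuffle: bool = False, seed: int = 0
) -> List[Tuple[List[int], List[int]]]:
    """Return k (train_indices, val_indices) pairs covering all samples."""
    indices = list(range(num_samples))
    if shuffle:
        # A fixed seed keeps the folds reproducible.
        random.Random(seed).shuffle(indices)
    folds = []
    for fold in range(k):
        start = num_samples * fold // k
        end = num_samples * (fold + 1) // k
        val = indices[start:end]
        train = indices[:start] + indices[end:]
        folds.append((train, val))
    return folds


folds = k_fold_indices(10, k=5)
for train_idx, val_idx in folds:
    print(val_idx, len(train_idx))
```

Every sample lands in exactly one validation fold, so averaging a metric over the folds uses each data point for validation once.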
apps/life_sci/docs/source/conf.py
View file @
d3560b71
...
...
@@ -157,7 +157,7 @@ man_pages = [
# dir menu entry, description, category)
texinfo_documents = [
    (master_doc, 'dgllife', 'DGL-LifeSci Documentation',
     author, 'dgllife', 'Application library for XXXXXXXXXXXXXXXX.',
     author, 'dgllife', 'Application library for life science.',
     'Miscellaneous'),
]
...
...
apps/life_sci/docs/source/get_started.rst
deleted
100644 → 0
View file @
7e0893e6
Get Started
===========
apps/life_sci/docs/source/index.rst
View file @
d3560b71
DGL-LifeSci: A GNN Package for Chemistry and Molecular Applications
DGL-LifeSci: Bringing Graph Neural Networks to Chemistry and Biology
===========================================================================================
Blahlah ...
DGL-LifeSci is a Python package for applying graph neural networks to various tasks in chemistry
and biology, on top of PyTorch and DGL. It provides:
Get Started
------------
* Various utilities for data processing, training and evaluation.
* Efficient and flexible model implementations.
* Pre-trained models for use without training from scratch.
You could borrow some from the README page.
We cover various applications in our
`examples <https://github.com/dmlc/dgl/tree/master/apps/life_sci/examples>`_, including:
API Reference
---------------
* `Molecular property prediction <https://github.com/dmlc/dgl/tree/master/apps/life_sci/examples/property_prediction>`_
* `Generative models <https://github.com/dmlc/dgl/tree/master/apps/life_sci/examples/generative_models>`_
* `Protein-ligand binding affinity prediction <https://github.com/dmlc/dgl/tree/master/apps/life_sci/examples/binding_affinity_prediction>`_
* `Reaction prediction <https://github.com/dmlc/dgl/tree/master/apps/life_sci/examples/reaction_prediction>`_
The highest level breakdown. What are the APIs for?
Get Started
------------
Follow the :doc:`instructions<install/index>` to install DGL-LifeSci.
.. toctree::
:maxdepth: 1
:caption: Get Started
:caption: Installation
:hidden:
:glob:
get_started
install/index
.. toctree::
:maxdepth: 2
...
...
@@ -28,12 +35,20 @@ The highest level breakdown. What are the APIs for?
:hidden:
:glob:
api/utils.mols
api/utils.splitters
api/utils.pipeline
api/utils.complexes
api/data
api/model
api/model.pretrain
api/model.gnn
api/model.zoo
api/model.readout
api/utils
api/model.zoo
Free software
-------------
DGL-LifeSci is free software; you can redistribute it and/or modify it under the terms
of the Apache License 2.0. We welcome contributions. Join us on `GitHub <https://github.com/dmlc/dgl/tree/master/apps/life_sci>`_.
Index
-----
...
...
apps/life_sci/docs/source/install/index.rst
0 → 100644
View file @
d3560b71
Install DGL-LifeSci
===================
This topic explains how to install DGL-LifeSci. We recommend installing DGL-LifeSci by using ``conda`` or ``pip``.
System requirements
-------------------
DGL-LifeSci works with the following operating systems:
* Ubuntu 16.04
* macOS
* Windows 10
DGL-LifeSci requires:
* Python 3.6 or later
* `DGL 0.4.3 or later <https://www.dgl.ai/pages/start.html>`_
* `PyTorch 1.2.0 or later <https://pytorch.org/>`_
If you have just installed DGL, the first time you use it, a message will pop up as follows:
.. code:: bash
DGL does not detect a valid backend option. Which backend would you like to work with?
Backend choice (pytorch, mxnet or tensorflow):
and you need to enter ``pytorch``.
Additionally, we require **RDKit 2018.09.3** for cheminformatics. We recommend installing it with
.. code:: bash
conda install -c conda-forge rdkit==2018.09.3
Other versions of RDKit are not tested.
Install from conda
----------------------
If ``conda`` is not yet installed, get either `miniconda <https://conda.io/miniconda.html>`_ or
the full `anaconda <https://www.anaconda.com/download/>`_.
.. code:: bash
conda install -c dglteam dgllife
Install from pip
----------------
.. code:: bash
pip install dgllife
.. _install-from-source:
Install from source
-------------------
To use the latest experimental features, install from source:
.. code:: bash
git clone https://github.com/dmlc/dgl.git
cd dgl/apps/life_sci/python
python setup.py install
apps/life_sci/python/dgllife/data/alchemy.py
View file @
d3560b71
...
...
@@ -259,30 +259,32 @@ class TencentAlchemyDataset(object):
SMILES for the ith datapoint
DGLGraph
DGLGraph for the ith datapoint
        Tensor of dtype float32
            Labels of the datapoint for all tasks
        Tensor of dtype float32 and shape (T)
            Labels of the datapoint for all tasks.
        """
        return self.smiles[item], self.graphs[item], self.labels[item]

    def __len__(self):
        """Length of the dataset
        """Size for the dataset.

        Returns
        -------
        int
            Length of Dataset
            Size for the dataset.
        """
        return len(self.graphs)

    def set_mean_and_std(self, mean=None, std=None):
        """Set mean and std or compute from labels for future normalization.

        The mean and std can be fetched later with ``self.mean`` and ``self.std``.

        Parameters
        ----------
        mean : int or float
            Default to be None.
        std : int or float
            Default to be None.
        mean : float32 tensor of shape (T)
            Mean of labels for all tasks.
        std : float32 tensor of shape (T)
            Std of labels for all tasks.
        """
        labels = np.array([i.numpy() for i in self.labels])
        if mean is None:
...
...
apps/life_sci/python/dgllife/data/csv_dataset.py
View file @
d3560b71
...
...
@@ -10,10 +10,9 @@ __all__ = ['MoleculeCSVDataset']
class MoleculeCSVDataset(object):
    """MoleculeCSVDataset
    This is a general class for loading molecular data from pandas.DataFrame.
    This is a general class for loading molecular data from :class:`pandas.DataFrame`.
In data pre-processing, we set non-existing labels to be 0,
and returning mask with 1 where label exists.
In data pre-processing, we construct a binary mask indicating the existence of labels.
All molecules are converted into DGLGraphs. After the first-time construction, the
DGLGraphs can be saved for reloading so that we do not need to reconstruct them every time.
...
...
@@ -22,8 +21,7 @@ class MoleculeCSVDataset(object):
----------
df: pandas.DataFrame
Dataframe including smiles and labels. Can be loaded by pandas.read_csv(file_path).
One column includes smiles and other columns for labels.
Column names other than smiles column would be considered as task names.
One column includes smiles and some other columns include labels.
smiles_to_graph: callable, str -> DGLGraph
A function turning a SMILES into a DGLGraph.
node_featurizer : callable, rdkit.Chem.rdchem.Mol -> dict
...
...
@@ -33,7 +31,7 @@ class MoleculeCSVDataset(object):
Featurization for edges like bonds in a molecule, which can be used to update
edata for a DGLGraph.
smiles_column: str
Column name that including smiles.
Column name that includes smiles in ``df``.
cache_file_path: str
Path to store the preprocessed DGLGraphs. For example, this can be ``'dglgraph.bin'``.
task_names : list of str or None
...
...
@@ -118,19 +116,19 @@ class MoleculeCSVDataset(object):
SMILES for the ith datapoint
DGLGraph
DGLGraph for the ith datapoint
        Tensor of dtype float32
        Tensor of dtype float32 and shape (T)
            Labels of the datapoint for all tasks
        Tensor of dtype float32
        Tensor of dtype float32 and shape (T)
            Binary masks indicating the existence of labels for all tasks
        """
        return self.smiles[item], self.graphs[item], self.labels[item], self.mask[item]

    def __len__(self):
        """Length of the dataset
        """Size for the dataset

        Returns
        -------
        int
            Length of Dataset
            Size for the dataset
        """
        return len(self.smiles)
apps/life_sci/python/dgllife/data/pdbbind.py
View file @
d3560b71
...
...
@@ -57,9 +57,10 @@ class PDBBind(object):
Whether we need to extract molecular conformation from proteins and ligands.
Default to True.
construct_graph_and_featurize : callable
Construct a DGLHeteroGraph for the use of GNNs. Mapping self.ligand_mols[i],
self.protein_mols[i], self.ligand_coordinates[i] and self.protein_coordinates[i]
to a DGLHeteroGraph. Default to :func:`ACNN_graph_construction_and_featurization`.
Construct a DGLHeteroGraph for the use of GNNs. Mapping ``self.ligand_mols[i]``,
``self.protein_mols[i]``, ``self.ligand_coordinates[i]`` and
``self.protein_coordinates[i]`` to a DGLHeteroGraph.
Default to :func:`dgllife.utils.ACNN_graph_construction_and_featurization`.
zero_padding : bool
Whether to perform zero padding. While DGL does not necessarily require zero padding,
pooling operations for variable length inputs can introduce stochastic behaviour, which
...
...
apps/life_sci/python/dgllife/data/pubchem_aromaticity.py
View file @
d3560b71
...
...
@@ -12,7 +12,7 @@ class PubChemBioAssayAromaticity(MoleculeCSVDataset):
"""Subset of PubChem BioAssay Dataset for aromaticity prediction.
The dataset was constructed in `Pushing the Boundaries of Molecular Representation for Drug
Discovery with the Graph Attention Mechanism.
Discovery with the Graph Attention Mechanism
<https://www.ncbi.nlm.nih.gov/pubmed/31408336>`__ and is accompanied by the task of predicting
the number of aromatic atoms in molecules.
...
...