Graph Descriptors

Graph descriptor functions for converting graphs into feature vectors.

This module provides functions that convert graphs into numerical representations suitable for kernel methods. Each descriptor is a callable that takes an iterable of graphs and returns either a dense numpy.ndarray or a sparse scipy.sparse.csr_array of shape (n_graphs, n_features). All descriptors implement the GraphDescriptor interface.

Graphs may be passed as either nx.Graph or rdkit.Chem.Mol objects.

Descriptors may, for example, be implemented as follows:

from typing import Iterable
import networkx as nx
import numpy as np

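def my_descriptor(graphs: Iterable[nx.Graph]) -> np.ndarray:
    # Degree histogram per graph: entry d counts the nodes with degree d.
    hists = [nx.degree_histogram(graph) for graph in graphs]
    # Zero-pad every histogram to 128 bins (assumes the maximum degree is below 128).
    hists = [
        np.concatenate([hist, np.zeros(128 - len(hist))], axis=0)
        for hist in hists
    ]
    hists = np.stack(hists, axis=0)
    # Normalize each row to sum to 1 (assumes every graph has at least one node).
    return hists / hists.sum(axis=1, keepdims=True)  # shape: (n_graphs, n_features)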
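
As a usage sketch, a descriptor of this kind can be called on a few small networkx graphs. The function is restated here so that the snippet is self-contained; the name degree_histogram_descriptor, the n_bins parameter, and the explicit error for overly large degrees are assumptions made for illustration and are not part of the module.

```python
from typing import Iterable

import networkx as nx
import numpy as np


def degree_histogram_descriptor(
    graphs: Iterable[nx.Graph], n_bins: int = 128
) -> np.ndarray:
    """Normalized, zero-padded degree histograms (hypothetical helper)."""
    rows = []
    for graph in graphs:
        hist = np.asarray(nx.degree_histogram(graph), dtype=float)
        if len(hist) > n_bins:
            # Assumption: fail loudly instead of silently truncating.
            raise ValueError("maximum degree exceeds n_bins - 1")
        # Pad with zeros on the right up to n_bins entries.
        rows.append(np.pad(hist, (0, n_bins - len(hist))))
    features = np.stack(rows, axis=0)
    # Each row becomes a distribution over node degrees.
    return features / features.sum(axis=1, keepdims=True)


graphs = [nx.path_graph(4), nx.cycle_graph(5), nx.star_graph(3)]
features = degree_histogram_descriptor(graphs)
print(features.shape)   # (3, 128)
print(features[0, :3])  # path_graph(4): half the nodes have degree 1, half degree 2
```

Every row sums to 1, so graphs of different sizes remain comparable.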
Generic graph descriptors
Molecule descriptors

polygraph.utils.descriptors.GraphDescriptor

Bases: Protocol, Generic[GraphType]

Interface for graph descriptors.

A graph descriptor is a callable that takes an iterable of graphs and returns a numpy array or a sparse matrix. Graphs must be of the type specified by the GraphType generic parameter. In practice, this may either be nx.Graph or rdkit.Chem.Mol.
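
To illustrate what such an interface means in practice, here is a minimal sketch of a generic callable protocol written with typing.Protocol. It is an illustration under assumptions, not the library's source: GraphDescriptorSketch and node_count_descriptor are hypothetical names, and the return type is simplified to a dense array.

```python
from typing import Generic, Iterable, Protocol, TypeVar

import networkx as nx
import numpy as np

# The graph type appears only as an input, hence a contravariant type variable.
GraphType = TypeVar("GraphType", contravariant=True)


class GraphDescriptorSketch(Protocol, Generic[GraphType]):
    """Hypothetical stand-in for a graph descriptor interface."""

    def __call__(self, graphs: Iterable[GraphType]) -> np.ndarray: ...


def node_count_descriptor(graphs: Iterable[nx.Graph]) -> np.ndarray:
    """One feature per graph: its number of nodes (hypothetical example)."""
    return np.array([[graph.number_of_nodes()] for graph in graphs], dtype=float)


# Structural typing: any callable with a matching signature satisfies the
# protocol; no inheritance from the protocol class is needed.
descriptor: GraphDescriptorSketch[nx.Graph] = node_count_descriptor
features = descriptor([nx.path_graph(3), nx.complete_graph(5)])
print(features.shape)  # (2, 1)
```

Because the protocol is structural, plain functions and objects defining __call__ can both serve as descriptors.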

__call__(graphs)

Compute features of graphs.

Parameters:
  • graphs (Iterable[GraphType]) –

    Iterable of networkx graphs or rdkit molecules

Returns:
  • Union[ndarray, csr_array]

    Features of graphs. Dense numpy array or sparse matrix of shape (n_graphs, n_features).
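
The sparse return type is useful when most features are zero. The following sketch, which assumes a hypothetical descriptor named sparse_degree_counts and an arbitrary choice of 1024 bins, builds a scipy.sparse.csr_array of unnormalized degree counts from coordinate triplets.

```python
from typing import Iterable

import networkx as nx
from scipy.sparse import csr_array


def sparse_degree_counts(
    graphs: Iterable[nx.Graph], n_bins: int = 1024
) -> csr_array:
    """Unnormalized degree counts as a sparse matrix (hypothetical example)."""
    rows, cols, data = [], [], []
    n_graphs = 0
    for i, graph in enumerate(graphs):
        n_graphs = i + 1
        for degree, count in enumerate(nx.degree_histogram(graph)):
            if count == 0:
                continue  # store only the nonzero entries
            if degree >= n_bins:
                raise ValueError("maximum degree exceeds n_bins - 1")
            rows.append(i)
            cols.append(degree)
            data.append(float(count))
    # Build the matrix from (value, (row, column)) triplets.
    return csr_array((data, (rows, cols)), shape=(n_graphs, n_bins))


features = sparse_degree_counts([nx.path_graph(4), nx.star_graph(3)])
print(features.shape)  # (2, 1024)
print(features.nnz)    # number of stored nonzero entries
```

Only four entries are stored for these two graphs, although the matrix has 1024 columns.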