Graph Kernels
Kernel functions for comparing graphs via descriptor functions.
This module implements various kernel functions that measure similarities between graphs. Each kernel is initialized with a descriptor function that converts graphs into vector representations. The kernel then computes similarities between these feature vectors.
Descriptor functions accept iterables of graphs (networkx graphs or rdkit molecules) and return descriptions as dense or sparse arrays.
Specifically, they should either return a dense numpy.ndarray or sparse scipy.sparse.csr_array of shape (n_graphs, n_features).
Available kernels
RBFKernel: Standard Gaussian/RBF kernel using \(\ell^2\) distanceLaplaceKernel: Exponential kernel using \(\ell^1\) distanceGaussianTV: Non-positive definite Gaussian kernel using \(\ell^1\) distanceAdaptiveRBFKernel: RBF kernel with data-dependent bandwidth adaptationLinearKernel: Simple dot product kernel
Example
import numpy as np
from polygraph.utils.kernels import RBFKernel
import networkx as nx
from typing import Iterable
def my_descriptor(graphs: Iterable[nx.Graph]) -> np.ndarray:
hists = [nx.degree_histogram(graph) for graph in graphs]
hists = [
np.concatenate([hist, np.zeros(128 - len(hist))], axis=0)
for hist in hists
]
hists = np.stack(hists, axis=0)
return hists / hists.sum(axis=1, keepdims=True) # shape: (n_graphs, n_features)
ref_graphs = [nx.erdos_renyi_graph(10, 0.3) for _ in range(100)]
gen_graphs = [nx.erdos_renyi_graph(10, 0.3) for _ in range(100)]
# Create kernel with multiple bandwidths
kernel = RBFKernel(my_descriptor, bw=np.array([0.1, 1.0, 10.0]))
ref_desc = kernel.featurize(ref_graphs) # shape: (n_ref_graphs, n_features)
gen_desc = kernel.featurize(gen_graphs) # shape: (n_gen_graphs, n_features)
# Compare reference and generated graph descriptors
ref_vs_ref, ref_vs_gen, gen_vs_gen = kernel(ref_desc, gen_desc)
polygraph.utils.kernels.DescriptorKernel
Bases: ABC, Generic[GraphType]
Abstract base class for kernel functions that operate on graph descriptors.
This class defines the interface for kernels that compute similarity between graphs based on their descriptors. It handles both the featurization of graphs and the computation of kernel matrices.
| Parameters: |
|
|---|
__call__(ref, gen)
Computes the full kernel comparison between reference and generated graphs.
| Parameters: |
|
|---|
| Returns: |
|
|---|
adapt(blocks)
May adapt kernel parameters based on the computed kernel matrices.
This method may, e.g., scale the bandwidth of a Gaussian kernel based on the typical distance between reference and generated graphs.
| Parameters: |
|
|---|
| Returns: |
|
|---|
featurize(graphs)
Converts graphs into descriptor representations.
| Parameters: |
|
|---|
| Returns: |
|
|---|
pre_gram(ref, gen)
Computes all kernel matrices between reference and generated samples.
| Parameters: |
|
|---|
| Returns: |
|
|---|