Core Concepts
AutoFragment is built on a graph-based representation of molecular systems and a set of chemistry-aware utilities, so that fragmentation algorithms operate on physically meaningful structures.
Molecular Graph
At the heart of the library is the MolecularGraph class, which wraps a NetworkX graph to represent atoms as nodes and bonds as edges.
from autofragment.core.graph import MolecularGraph
from autofragment.core.types import Atom
# Methane example
atoms = [
Atom("C", [0.0, 0.0, 0.0]),
Atom("H", [0.6, 0.6, 0.6]),
# ... more hydrogens
]
# Create graph and infer bonds automatically
mg = MolecularGraph.from_molecules([atoms], infer_bonds=True)
print(f"Atoms: {mg.n_atoms}, Bonds: {mg.n_bonds}")
Features
Bond Inference: Automatically detects bonds based on inter-atomic distances and covalent radii.
Ring Detection: Identifies rings (cycles) crucial for aromaticity and fragmentation rules.
Bridge Detection: Finds “bridges” or cut-edges, which are often good candidates for fragmentation points.
Subgraph Extraction: Efficiently extract specific fragments or regions as independent graphs.
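The distance-based bond inference described above can be illustrated with a minimal standalone sketch. It does not use AutoFragment itself: the function name infer_bonds, the tolerance of 0.4 angstroms, and the small radii table are assumptions made for this example, and the library's actual criteria may differ.

```python
from itertools import combinations
from math import dist

# Approximate single-bond covalent radii in angstroms (illustrative subset).
COVALENT_RADII = {"H": 0.31, "C": 0.76, "N": 0.71, "O": 0.66}


def infer_bonds(symbols, coords, tolerance=0.4):
    """Return index pairs (i, j) closer than r_i + r_j + tolerance.

    Hypothetical helper for illustration; not the AutoFragment API.
    """
    bonds = []
    for i, j in combinations(range(len(symbols)), 2):
        cutoff = COVALENT_RADII[symbols[i]] + COVALENT_RADII[symbols[j]] + tolerance
        if dist(coords[i], coords[j]) <= cutoff:
            bonds.append((i, j))
    return bonds


# Methane: four hydrogens roughly 1.09 angstroms from the central carbon.
symbols = ["C", "H", "H", "H", "H"]
coords = [
    (0.0, 0.0, 0.0),
    (0.63, 0.63, 0.63),
    (-0.63, -0.63, 0.63),
    (-0.63, 0.63, -0.63),
    (0.63, -0.63, -0.63),
]
bonds = infer_bonds(symbols, coords)
print(bonds)  # [(0, 1), (0, 2), (0, 3), (0, 4)]
```

The hydrogens sit about 1.78 angstroms from each other, well above the H-H cutoff, so only the four C-H bonds are reported.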
# Check for rings
rings = mg.find_rings()
# Check if a bond is a bridge (removing it would disconnect its graph component)
# atom1_idx and atom2_idx are the indices of the two bonded atoms
is_bridge = mg.is_bridge(atom1_idx, atom2_idx)
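To make the notion of a bridge concrete, here is a minimal sketch of cut-edge detection using a depth-first search with low-link values. It is a generic graph algorithm written in plain Python, not the implementation behind mg.is_bridge; the function name find_bridges is hypothetical, and the sketch assumes at most one bond per atom pair.

```python
def find_bridges(n_atoms, bonds):
    """Return the bonds whose removal disconnects their component."""
    adjacency = {i: [] for i in range(n_atoms)}
    for a, b in bonds:
        adjacency[a].append(b)
        adjacency[b].append(a)

    discovery = {}  # order in which each atom was first visited
    low = {}        # earliest visit order reachable from the atom's DFS subtree
    bridges = []

    def visit(node, parent):
        discovery[node] = low[node] = len(discovery)
        for neighbor in adjacency[node]:
            if neighbor == parent:
                continue
            if neighbor in discovery:
                # Back edge: the neighbor was reached earlier by another path.
                low[node] = min(low[node], discovery[neighbor])
            else:
                visit(neighbor, node)
                low[node] = min(low[node], low[neighbor])
                # No back edge from the subtree climbs above this bond.
                if low[neighbor] > discovery[node]:
                    bridges.append(tuple(sorted((node, neighbor))))

    for atom in range(n_atoms):
        if atom not in discovery:
            visit(atom, None)
    return sorted(bridges)


# Toluene heavy-atom skeleton: ring atoms 0-5, methyl carbon 6 bonded to atom 0.
skeleton = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0), (0, 6)]
bridges = find_bridges(7, skeleton)
print(bridges)  # [(0, 6)]
```

Ring bonds are never bridges because each ring atom can still be reached the other way around the ring. The recursive form is fine for small molecules; very large systems would need an iterative traversal to avoid Python's recursion limit.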
Chemistry Utilities
AutoFragment includes a built-in chemistry engine (autofragment.core.chemistry) to handle fundamental chemical properties.
Key Capabilities
Periodic Table Data: Masses, valence electrons, electronegativity, and covalent radii.
Bond Order Inference: Infers Single, Double, Triple, and Aromatic (1.5) bond orders from geometry.
Aromaticity Detection: Uses Hückel's rule heuristics and bond order analysis to detect aromatic systems (e.g., benzene).
Formal Charge: Estimates formal charges on atoms based on connectivity and valence.
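As an illustration of the formal charge estimate, the standard textbook formula is valence electrons minus non-bonding electrons minus the sum of bond orders. The sketch below applies it directly. The helper name formal_charge and its arguments are assumptions made for this example; how the library derives lone-pair counts from connectivity is not shown here.

```python
# Valence electron counts for a few main-group elements.
VALENCE_ELECTRONS = {"H": 1, "C": 4, "N": 5, "O": 6}


def formal_charge(symbol, bond_order_sum, lone_electrons):
    """Formal charge = valence electrons - lone electrons - bond order sum.

    Hypothetical helper for illustration; not the AutoFragment API.
    """
    return VALENCE_ELECTRONS[symbol] - lone_electrons - bond_order_sum


ammonium_n = formal_charge("N", bond_order_sum=4, lone_electrons=0)
hydroxide_o = formal_charge("O", bond_order_sum=1, lone_electrons=6)
water_o = formal_charge("O", bond_order_sum=2, lone_electrons=4)
print(ammonium_n, hydroxide_o, water_o)  # 1 -1 0
```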
from autofragment.core import chemistry
order = chemistry.infer_bond_order("C", "C", distance=1.40)
# Returns 1.5 for aromatic C-C bond
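One simple way to think about geometry-based bond order inference is a nearest-reference lookup: compare the measured distance against typical bond lengths and pick the closest. The sketch below does this for carbon-carbon bonds only. It is an assumption about the general approach, not the algorithm used by chemistry.infer_bond_order, and the name nearest_cc_bond_order is hypothetical.

```python
# Typical carbon-carbon bond lengths in angstroms, keyed by bond order.
CC_REFERENCE_LENGTHS = {1.0: 1.54, 1.5: 1.40, 2.0: 1.34, 3.0: 1.20}


def nearest_cc_bond_order(distance):
    """Return the bond order whose reference length is closest to distance."""
    return min(
        CC_REFERENCE_LENGTHS,
        key=lambda order: abs(CC_REFERENCE_LENGTHS[order] - distance),
    )


orders = [nearest_cc_bond_order(d) for d in (1.53, 1.40, 1.33, 1.21)]
print(orders)  # [1.0, 1.5, 2.0, 3.0]
```

A production implementation would need per-element-pair reference data and would usually cross-check the result against valence and ring membership, since distance alone cannot separate a conjugated single bond from an aromatic one.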
Data Structures
ChemicalSystem
ChemicalSystem is the canonical representation of a full system: all atoms,
optional bonds, metadata, and lattice information. Public APIs accept a
ChemicalSystem for system-level operations.
Molecule
Molecule is a lightweight helper for isolated fragments (e.g., single waters)
and geometry utilities. Conversions between ChemicalSystem and molecule lists
are explicit via system_to_molecules and molecules_to_system.
Fragment
A Fragment represents a subset of a molecular system. It carries:
Symbols & Geometry: The atomic make-up.
Child Fragments: Optional fragments field for hierarchical nesting. A leaf fragment (is_leaf == True) holds atoms directly; a non-leaf fragment contains child Fragment objects.
Metadata: molecular_charge, molecular_multiplicity, and methods for layer (QM/MM) assignment.
Graph Awareness: Can map back to the original MolecularGraph.
The n_atoms property recurses through children, so it works correctly at any level of the hierarchy.
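The recursive behavior of n_atoms can be pictured with a small stand-in class. FragmentSketch below is hypothetical and mirrors only the fragments field, is_leaf, and n_atoms named above; the real Fragment carries more data and may be implemented differently.

```python
from dataclasses import dataclass, field


@dataclass
class FragmentSketch:
    """Hypothetical stand-in for Fragment; not the library's class."""

    symbols: list = field(default_factory=list)    # atoms held by a leaf
    fragments: list = field(default_factory=list)  # children of a non-leaf

    @property
    def is_leaf(self):
        return not self.fragments

    @property
    def n_atoms(self):
        # Leaves count their own atoms; non-leaves sum over their children.
        if self.is_leaf:
            return len(self.symbols)
        return sum(child.n_atoms for child in self.fragments)


def make_water():
    return FragmentSketch(symbols=["O", "H", "H"])


dimer = FragmentSketch(fragments=[make_water(), make_water()])
cluster = FragmentSketch(fragments=[dimer, make_water()])
print(cluster.n_atoms)  # 9
```

Because the count is computed from the children on each access, it stays consistent when fragments are regrouped into a deeper hierarchy.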
FragmentTree
The FragmentTree is the primary output container. It holds:
Fragments: The list of resulting fragments (flat or hierarchical).
Interfragment Bonds: Explicit records of bonds that were cut, allowing for detailed analysis or restoration (e.g., capping).
Provenance: Metadata about the source file and algorithm used.
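The idea of interfragment bonds can be sketched without the library: given the bond list and the atom indices assigned to each fragment, a bond is "cut" when its two atoms belong to different fragments. The function name interfragment_bonds and the plain tuple representation are assumptions for this example; the records stored by FragmentTree may hold more detail.

```python
def interfragment_bonds(bonds, fragment_atoms):
    """Return the bonds whose endpoints lie in different fragments."""
    owner = {}
    for fragment_index, atoms in enumerate(fragment_atoms):
        for atom in atoms:
            owner[atom] = fragment_index
    return [(a, b) for a, b in bonds if owner[a] != owner[b]]


# Butane carbon skeleton 0-1-2-3, split into two two-carbon fragments.
chain_bonds = [(0, 1), (1, 2), (2, 3)]
cut_bonds = interfragment_bonds(chain_bonds, [[0, 1], [2, 3]])
print(cut_bonds)  # [(1, 2)]
```

Keeping these records is what makes later steps such as hydrogen capping possible, since each cut bond identifies the two atoms that lost a neighbor.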
For hierarchical trees (produced by tiered partitioning), FragmentTree provides:
n_primary: Number of top-level fragments.
_is_hierarchical: Whether any fragment has children.
n_fragments: Total count across all hierarchy levels.
Fragmentation Scheme
FragmentationScheme holds the algorithm configuration used to generate a
FragmentTree.
Fragmentation Rules
The autofragment.rules module provides a Rules Engine that determines which bonds can be broken during fragmentation. Rules encode chemical knowledge to ensure fragments remain chemically meaningful.
Rule Actions
Rules return one of four actions (most to least restrictive):
MUST_NOT_BREAK: Bond must never be broken (aromatic rings, double bonds)
PREFER_KEEP: Prefer keeping, but can break if necessary (peptide bonds)
ALLOW: No preference (default)
PREFER_BREAK: Good fragmentation point (alpha-beta carbon bonds)
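The ordering of the four actions suggests a natural way to resolve conflicts when several rules apply to the same bond. The sketch below encodes the actions as an ordered enum and lets the most restrictive one win. Both the class name RuleActionSketch and the "most restrictive wins" policy are assumptions for illustration; the rules engine may combine results differently, for example by rule priority.

```python
from enum import IntEnum


class RuleActionSketch(IntEnum):
    """Hypothetical mirror of the four actions; lower value is more restrictive."""

    MUST_NOT_BREAK = 0
    PREFER_KEEP = 1
    ALLOW = 2
    PREFER_BREAK = 3


def combine(actions):
    """Pick the most restrictive action; ALLOW when no rule applies."""
    return min(actions, default=RuleActionSketch.ALLOW)


# A bond flagged as a good cut point by one rule but aromatic by another.
verdict = combine([RuleActionSketch.PREFER_BREAK, RuleActionSketch.MUST_NOT_BREAK])
print(verdict.name)  # MUST_NOT_BREAK
```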
Built-in Rules
Common: AromaticRingRule, DoubleBondRule, MetalCoordinationRule, FunctionalGroupRule
Biological (configurable): PeptideBondRule, DisulfideBondRule, ProlineRingRule, HydrogenBondRule
Materials Science: SiloxaneBridgeRule, MOFLinkerRule, MetalNodeRule, PerovskiteOctahedralRule
See the Rules Documentation for complete details on creating and using fragmentation rules.