Fragmentation Rules
The Rules Engine is the core component that controls which bonds can or cannot be broken during molecular fragmentation. Rules encode chemical knowledge to ensure fragments remain chemically meaningful.
Overview
The rules system provides:
Flexible rule definitions via the
FragmentationRuleabstract base classPriority-based evaluation where critical rules are evaluated first
Conflict resolution where the most restrictive action wins
Pre-built rules for common chemistry, biology, and materials science
Quick Start
from autofragment.rules import (
RuleEngine,
RuleAction,
AromaticRingRule,
DoubleBondRule,
PeptideBondRule,
)
# Create an engine with rules
engine = RuleEngine([
AromaticRingRule(), # Never break aromatic rings
DoubleBondRule(), # Never break double/triple bonds
PeptideBondRule(), # Prefer keeping peptide bonds
])
# Evaluate a specific bond
action = engine.evaluate_bond((0, 1), chemical_system)
if action == RuleAction.MUST_NOT_BREAK:
print("This bond cannot be broken")
# Get all breakable bonds in a system
breakable = engine.get_breakable_bonds(chemical_system)
Rule Actions
Rules return one of four actions, ordered from most to least restrictive:
Action |
Description |
Use Case |
|---|---|---|
|
Bond must never be broken |
Aromatic rings, double bonds |
|
Prefer keeping, can break if necessary |
Peptide bonds, metal coordination |
|
No preference (neutral) |
Default when no rules apply |
|
Good fragmentation point |
Alpha-beta carbon bonds |
When multiple rules apply to the same bond, the most restrictive action wins.
Priority System
Rules have integer priorities (higher = evaluated first):
Constant |
Value |
Typical Use |
|---|---|---|
|
1000 |
Aromatic rings, proline rings |
|
800 |
Metal coordination, peptide bonds |
|
500 |
General rules (default) |
|
200 |
Hydrogen bonds |
# Override default priority
rule = AromaticRingRule(priority=999)
print(rule.priority) # 999
Built-in Rules
Common Chemical Rules
These rules apply universally to all molecular systems.
AromaticRingRule
Never break bonds within aromatic ring systems (benzene, naphthalene, pyridine, etc.).
rule = AromaticRingRule()
# Priority: CRITICAL
# Action: MUST_NOT_BREAK
DoubleBondRule
Never break double or triple bonds. Configurable minimum bond order.
rule = DoubleBondRule(min_order=1.5) # Catches aromatic bonds too
# Priority: CRITICAL
# Action: MUST_NOT_BREAK
MetalCoordinationRule
Preserve metal-ligand coordination bonds. Supports custom metal sets.
rule = MetalCoordinationRule(
metals={"Fe", "Cu", "Zn"},
rule_action=RuleAction.MUST_NOT_BREAK
)
# Priority: HIGH
FunctionalGroupRule
Keep common functional groups intact (carboxyl, nitro, phosphate, etc.).
rule = FunctionalGroupRule()
# Default groups: carboxyl, nitro, sulfonate, phosphate, amino
Biological Rules
Designed for proteins and nucleic acids. These rules are configurable to support different fragmentation strategies.
PeptideBondRule
Handle peptide bonds between amino acid residues.
# Traditional FMO: keep peptide bonds
rule = PeptideBondRule(rule_action=RuleAction.PREFER_KEEP)
# Residue-based fragmentation: break at peptide bonds
rule = PeptideBondRule(rule_action=RuleAction.PREFER_BREAK)
DisulfideBondRule
Handle disulfide bridges (Cys-Cys S-S bonds).
# Preserve structure
rule = DisulfideBondRule(rule_action=RuleAction.MUST_NOT_BREAK)
# Allow breaking for domain separation
rule = DisulfideBondRule(rule_action=RuleAction.ALLOW)
AlphaBetaCarbonRule
Mark C-alpha to C-beta bonds as preferred break points (sidechain separation).
rule = AlphaBetaCarbonRule()
# Default action: PREFER_BREAK
ProlineRingRule
Never break the proline pyrrolidine ring (5-membered ring with N).
rule = ProlineRingRule()
# Always: MUST_NOT_BREAK
HydrogenBondRule
Handle hydrogen bonds. Defaults to ALLOW because H-bonds often need to be broken.
# For water clusters: allow breaking (default)
rule = HydrogenBondRule(rule_action=RuleAction.ALLOW)
# For secondary structure: prefer keeping
rule = HydrogenBondRule(rule_action=RuleAction.PREFER_KEEP)
Materials Science Rules
For solid-state and extended materials systems.
SiloxaneBridgeRule
Handle Si-O-Si siloxane bridges in silica and zeolites.
rule = SiloxaneBridgeRule(rule_action=RuleAction.PREFER_BREAK)
MOFLinkerRule
Preserve organic linker molecules in Metal-Organic Frameworks.
rule = MOFLinkerRule()
# Protects C-C, C-N, C-O bonds within linkers
# Does NOT protect metal-linker bonds
MetalNodeRule
Preserve metal node clusters (Zn4O, Cu paddlewheels, etc.).
rule = MetalNodeRule(cluster_distance=4.0) # Angstroms
# Groups metals within 4Å as a cluster
PerovskiteOctahedralRule
Preserve MO₆ octahedra in perovskite structures.
rule = PerovskiteOctahedralRule(
b_site_metals={"Ti", "Zr", "Nb"},
allow_corner_breaking=True
)
ZeoliteAcidSiteRule
Protect Brønsted acid sites (Al-O(H)-Si) in zeolites.
rule = ZeoliteAcidSiteRule()
# Action: MUST_NOT_BREAK
SilanolRule
Preserve surface silanol groups (Si-OH).
rule = SilanolRule()
# Action: MUST_NOT_BREAK
MetalCarboxylateRule
Identify and allow breaking of metal-carboxylate (M-O) bonds to separate nodes from organic linkers.
rule = MetalCarboxylateRule(rule_action=RuleAction.PREFER_BREAK)
PolymerBackboneRule
Heuristic for identifying polymer backbone bonds.
rule = PolymerBackboneRule()
Creating Custom Rules
Extend FragmentationRule to create custom rules:
from autofragment.rules import FragmentationRule, RuleAction
class MyCustomRule(FragmentationRule):
"""Custom rule for my specific system."""
name = "my_custom_rule" # Required: unique identifier
def __init__(self, priority=None):
super().__init__(priority=priority or self.PRIORITY_MEDIUM)
def applies_to(self, bond, system):
"""Return True if this rule applies to the bond."""
graph = system.to_graph()
atom1 = graph.get_atom(bond[0])
atom2 = graph.get_atom(bond[1])
# Your logic here
return atom1["element"] == "X" and atom2["element"] == "Y"
def action(self):
"""Return the action when this rule applies."""
return RuleAction.MUST_NOT_BREAK
RuleEngine API
The RuleEngine manages rules and evaluates bonds:
engine = RuleEngine()
# Add/remove rules
engine.add_rule(AromaticRingRule())
engine.remove_rule("aromatic_ring") # By name
engine.clear_rules()
# Query rules
rules = engine.get_rules() # Sorted by priority
print(len(engine)) # Number of rules
# Evaluate bonds
action = engine.evaluate_bond((0, 1), system)
all_actions = engine.get_bond_actions(system)
# Find fragmentable bonds
breakable = engine.get_breakable_bonds(system) # Not MUST_NOT_BREAK
preferred = engine.get_preferred_breaks(system) # PREFER_BREAK only
Best Practices
Start with critical rules: Always include
AromaticRingRuleandDoubleBondRuleBe explicit about biological rules: Choose appropriate actions for your fragmentation strategy
Layer rules by priority: Use CRITICAL for absolute constraints, HIGH for strong preferences
Test with simple systems: Verify rule behavior on small molecules before applying to large systems
Log rule decisions: Check which rules are triggering for debugging
Example: Protein Fragmentation
from autofragment.rules import (
RuleEngine,
AromaticRingRule,
DoubleBondRule,
PeptideBondRule,
DisulfideBondRule,
ProlineRingRule,
AlphaBetaCarbonRule,
)
# FMO-style: fragment at alpha carbons, keep peptide bonds
fmo_engine = RuleEngine([
AromaticRingRule(), # Protect aromatic sidechains
DoubleBondRule(), # Protect double bonds
ProlineRingRule(), # Protect proline rings
DisulfideBondRule(), # Keep disulfides
PeptideBondRule(rule_action=RuleAction.PREFER_KEEP),
AlphaBetaCarbonRule(), # Break at C-alpha
])
# Get fragmentation points
breakable = fmo_engine.get_breakable_bonds(protein_system)
preferred = fmo_engine.get_preferred_breaks(protein_system)
Example: MOF Fragmentation
from autofragment.rules import (
RuleEngine,
AromaticRingRule,
DoubleBondRule,
MetalCoordinationRule,
MOFLinkerRule,
MetalNodeRule,
)
mof_engine = RuleEngine([
AromaticRingRule(), # Protect aromatic linkers
DoubleBondRule(), # Protect carboxylate C=O
MOFLinkerRule(), # Keep organic linkers intact
MetalNodeRule(), # Keep metal clusters intact
# Metal-linker bonds will be the break points
])