Custom Rules Authoring Guide
This guide explains how to create custom fragmentation rules to control where bonds can be broken during molecular fragmentation.
Rule System Overview
Rules encode chemical knowledge about fragmentation:
What to protect: Aromatic rings, functional groups
Where to break: Weak bonds, natural boundaries
Priority handling: Conflict resolution between rules
Rule Actions
Rules return one of four actions:
| Action | Description |
|---|---|
| | Bond must never be broken |
| RuleAction.PREFER_KEEP | Prefer keeping, can break if necessary |
| | No preference (neutral) |
| RuleAction.PREFER_BREAK | Good fragmentation point |
When multiple rules apply, the most restrictive action wins.
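The "most restrictive wins" rule can be sketched with an ordered enum. This is a standalone illustration, not autofragment's implementation: the Action enum is a stand-in for RuleAction, and the member names MUST_NOT_BREAK and ALLOW, the numeric values, and the ALLOW fallback for "no rule applied" are all assumptions made for the example. Only PREFER_KEEP and PREFER_BREAK appear elsewhere in this guide.

```python
from enum import IntEnum


class Action(IntEnum):
    """Stand-in for RuleAction; a larger value means more restrictive (assumed)."""
    PREFER_BREAK = 0    # good fragmentation point
    ALLOW = 1           # hypothetical name: no preference
    PREFER_KEEP = 2     # prefer keeping, can break if necessary
    MUST_NOT_BREAK = 3  # hypothetical name: bond must never be broken


def resolve(actions):
    """Return the most restrictive action, or ALLOW if no rule applied."""
    return max(actions, default=Action.ALLOW)


print(resolve([Action.PREFER_BREAK, Action.PREFER_KEEP]).name)  # PREFER_KEEP
```

With this ordering, a single protective rule is enough to override any number of rules that merely prefer breaking the bond.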
Base Classes
autofragment provides two easy ways to create rules:
BondRule: A simple rule based on the element symbols of the two atoms in a bond.
FragmentationRule: An abstract base class for creating rules with custom logic.
Creating Bond Rules
Use BondRule to mark bonds between two given elements as bonds to keep or as preferred break points:
from autofragment.rules import BondRule, RuleAction
# Allow breaking single C-C bonds
cc_break = BondRule(
    name="single_cc_breakable",
    atom1_elem="C",
    atom2_elem="C",
    action=RuleAction.PREFER_BREAK,
    priority=5
)
# Protect C-N bonds
cn_keep = BondRule(
    name="protect_cn",
    atom1_elem="C",
    atom2_elem="N",
    action=RuleAction.PREFER_KEEP,
    priority=8
)
Creating Custom Rule Classes
For more complex logic, extend the FragmentationRule base class:
from autofragment.rules import FragmentationRule, RuleAction
class DistanceRule(FragmentationRule):
    """Prefer breaking bonds longer than a threshold."""

    name = "distance_rule"

    def __init__(self, threshold=2.0, priority=None):
        # Fall back to the medium priority only when none is given,
        # so that an explicit priority of 0 is not silently replaced
        if priority is None:
            priority = self.PRIORITY_MEDIUM
        super().__init__(priority=priority)
        self.threshold = threshold

    def applies_to(self, bond, system):
        # bond is a tuple of (atom1_idx, atom2_idx)
        dist = system.get_distance(bond[0], bond[1])
        return dist > self.threshold

    def action(self):
        return RuleAction.PREFER_BREAK
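The threshold logic can be checked in isolation before wiring the rule into a rule set. The sketch below uses a hypothetical ToySystem that exposes only get_distance(i, j). It assumes that method returns the Euclidean distance between two atoms in the same length unit as the threshold, which this guide does not confirm.

```python
import math


class ToySystem:
    """Hypothetical stand-in exposing only get_distance(i, j)."""

    def __init__(self, coords):
        self.coords = coords  # list of (x, y, z) tuples

    def get_distance(self, i, j):
        return math.dist(self.coords[i], self.coords[j])


def exceeds_threshold(bond, system, threshold=2.0):
    """Same comparison as DistanceRule.applies_to, without the base class."""
    return system.get_distance(bond[0], bond[1]) > threshold


system = ToySystem([(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (4.0, 0.0, 0.0)])
print(exceeds_threshold((0, 1), system))  # distance 1.5 is not above 2.0
print(exceeds_threshold((1, 2), system))  # distance 2.5 is above 2.0
```

Because the comparison is strict, a bond whose length equals the threshold exactly is not flagged.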
Advanced: SMARTS Patterns
While SMARTS matching is not built into the core (to avoid dependency overhead), you can easily wrap RDKit or other libraries:
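from autofragment.rules import FragmentationRule

try:
    from rdkit import Chem
except ImportError:
    Chem = None  # RDKit is optional

class SMARTSRule(FragmentationRule):
    def __init__(self, pattern_smarts, **kwargs):
        if Chem is None:
            raise ImportError("SMARTSRule requires RDKit")
        super().__init__(**kwargs)
        self.pattern = Chem.MolFromSmarts(pattern_smarts)
        if self.pattern is None:
            raise ValueError(f"Invalid SMARTS: {pattern_smarts!r}")

    def applies_to(self, bond, system):
        # Placeholder: implement the mapping between autofragment and RDKit atom indices
        return False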
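One way to fill in the mapping step is to precompute which atom-index pairs are covered by a SMARTS match. The sketch below uses only RDKit calls (MolFromSmiles, MolFromSmarts, GetSubstructMatches, GetBonds). The helper name smarts_bond_pairs is hypothetical, and nothing here shows how to build an RDKit molecule from an autofragment system.

```python
try:
    from rdkit import Chem
except ImportError:
    Chem = None  # RDKit is optional


def smarts_bond_pairs(mol, pattern):
    """Return the atom-index pairs in mol that are matched by bonds of pattern."""
    pairs = set()
    for match in mol.GetSubstructMatches(pattern):
        # match[i] is the molecule atom index matched by pattern atom i
        for pbond in pattern.GetBonds():
            a = match[pbond.GetBeginAtomIdx()]
            b = match[pbond.GetEndAtomIdx()]
            pairs.add(frozenset((a, b)))
    return pairs


if Chem is not None:
    mol = Chem.MolFromSmiles("CC(=O)NC")              # N-methylacetamide
    amide = Chem.MolFromSmarts("[$([CX3]=O)]-[NX3]")  # amide C-N bond only
    print(smarts_bond_pairs(mol, amide))
```

If the autofragment system and the RDKit molecule share the same atom ordering (an assumption that has to be verified for your input pipeline), applies_to could reduce to a membership test such as frozenset(bond) in pairs.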
Building a RuleSet
Combine multiple rules into a RuleSet for organization:
from autofragment.rules import RuleSet, AromaticRingRule
my_rules = RuleSet(name="custom_polymer")
my_rules.add(AromaticRingRule())
my_rules.add(cc_break)
my_rules.add(DistanceRule(threshold=1.8))
# Or create from list
my_rules = RuleSet.from_rules([
    cc_break,
    cn_keep,
    DistanceRule()
])
Using Custom Rules
Pass the rules to a RuleEngine to evaluate them:
from autofragment.rules import RuleEngine
engine = RuleEngine(my_rules.rules)
# Check a specific bond
action = engine.evaluate_bond((0, 1), system)
Priority System
Higher-priority rules are evaluated first. If multiple rules apply to the same bond, the most restrictive action wins.
| Constant | Value | Typical Use |
|---|---|---|
| | 1000 | Aromatic rings, multiple bonds |
| | 800 | Metal coordination |
| PRIORITY_MEDIUM | 500 | Default user rules |
| | 200 | Weak preferences |
# Override default priority
rule = DistanceRule(priority=999)
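The effect of priority on evaluation order can be illustrated with a plain sort. This is a toy model, not the RuleEngine itself: ToyRule and the level names in LEVELS are hypothetical, and only the numeric values come from the table above.

```python
from dataclasses import dataclass

# Values from the priority table; the level names are hypothetical labels
LEVELS = {"critical": 1000, "high": 800, "medium": 500, "low": 200}


@dataclass
class ToyRule:
    """Hypothetical stand-in for a rule: just a name and a priority."""
    name: str
    priority: int = LEVELS["medium"]


def evaluation_order(rules):
    """Sort rules so that higher priorities come first (ties keep input order)."""
    return sorted(rules, key=lambda r: r.priority, reverse=True)


rules = [
    ToyRule("backbone_cc"),                   # default: 500
    ToyRule("aromatic", LEVELS["critical"]),  # 1000
    ToyRule("protect_cn", 8),                 # raw value, as in the BondRule example
    ToyRule("distance_rule", 999),            # explicit override
]
print([r.name for r in evaluation_order(rules)])
# ['aromatic', 'distance_rule', 'backbone_cc', 'protect_cn']
```

Under this reading, a raw priority such as 8 sorts after every level in the table, which is worth keeping in mind when mixing small integers with the named levels.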
Example: Polymer Rule Set
from autofragment.rules import (
    RuleSet, BondRule, RuleAction, AromaticRingRule
)
polymer_rules = RuleSet(name="polymer")
# Protect aromatic rings (built-in)
polymer_rules.add(AromaticRingRule())
# Prefer breaking backbone C-C
polymer_rules.add(BondRule(
    name="backbone_cc",
    atom1_elem="C",
    atom2_elem="C",
    action=RuleAction.PREFER_BREAK,
    priority=500
))
Debugging Rules
Check which rules affect specific bonds:
engine = RuleEngine(my_rules.rules)
for bond in system.bonds:
    action = engine.evaluate_bond(bond, system)
    triggered = engine.get_triggered_rules(bond, system)
    print(f"Bond {bond}: {action}, rules: {[r.name for r in triggered]}")
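For larger systems, a per-bond printout gets long, and a grouped summary can be easier to scan. The helper below is a hypothetical sketch that works with any callable returning a hashable action. The stub evaluator only stands in for a real engine so that the example runs on its own.

```python
from collections import defaultdict


def group_bonds_by_action(bonds, evaluate):
    """Map each action returned by evaluate(bond) to the bonds that received it."""
    groups = defaultdict(list)
    for bond in bonds:
        groups[evaluate(bond)].append(bond)
    return dict(groups)


# Stub evaluator for illustration: pretend bonds touching atom 0 are protected
def fake_evaluate(bond):
    return "PREFER_KEEP" if 0 in bond else "PREFER_BREAK"


summary = group_bonds_by_action([(0, 1), (1, 2), (0, 3)], fake_evaluate)
for action, bonds in summary.items():
    print(f"{action}: {len(bonds)} bond(s) {bonds}")
```

With the engine from the loop above, evaluate could be lambda bond: engine.evaluate_bond(bond, system), assuming the returned actions are hashable, as enum members are.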
Best Practices
Start with critical rules: Always include aromatic and double-bond protection
Use appropriate priorities: Critical > High > Medium > Low
Test on small molecules: Verify behavior before applying rules to large systems
Log rule decisions: Use the log to debug unexpected fragmentation
Combine built-in and custom: Extend existing rules rather than replacing them