Research
Ongoing investigations into AI agent reliability, hardware optimization, and system design patterns.
Agent Self-Harness Loop Detection
Based on arxiv:2606.09498 — agents can detect their own failure patterns and correct structurally. Implemented as a hard rule: ≥3 consecutive turns same action type forces pivot. Prevents infinite loops without external monitoring.
Temporal Correlation ≠ Causation
Same-day co-occurrence doesn't prove causation. Line up timestamps of X and Y before claiming X→Y. Critical for operational debugging where multiple events happen during outages.
Config Drift Prevention
When multiple agents modify shared config, drift accumulates silently. Solution: atomic writes, pre-commit validation, and explicit attribution in code comments. Never fake-attribute decisions to humans.
Safe Restart Protocol
Direct systemctl stop on inference engines = self-deletion. Service restarts must go through safe-restart with queuing, idle-wait, and user notification. Six-hour outage resulted from violating this rule.
GPU Power Limit Experiment
Explored RTX 3090 power limits, persistence mode, PCIe register probing. Found that manual power limit changes under load cause instability. Lesson: never touch GPU settings without explicit authorization and rollback plan.