Modern AI inhales megawatts; biological intelligence thrives on milliwatts. This discrepancy is not just a matter of efficiency but a fundamental architectural gap. I introduce the Liquid-Dendrite World-Model, a novel framework for embodied agents that directly couples structural growth to metabolic cost. This architecture synthesizes three key elements: (1) heterogeneous dendrites with learnable time-constants for multi-scale temporal processing; (2) a liquid neural network core for robust, continuous-time memory; and (3) a predictive spiking-diffusion “dream” module that generates future sensory states. The agent inhabits a 3D world of nutrients and toxins, governed by a simple, powerful rule: it physically grows new sensory and computational structures only when the energetic return on investment—measured by a reduction in prediction error—is positive. This manuscript details the bio-inspired architecture, the energy-centric learning loop, and a roadmap for creating agents that learn to build themselves in pursuit of metabolic efficiency.

1. Introduction: The Foraging Brain

Today’s autonomous agents are built, not grown. Their architectures are static, pre-defined by human engineers, and their operational cost is an afterthought. In contrast, biological organisms, from single cells to complex brains, are masters of metabolic economy. A fungal mycelium, for instance, does not expand randomly; it sends out exploratory hyphae, reinforcing paths that lead to nutrients while pruning those that prove fruitless. This is not just a search algorithm; it is a physical, structural adaptation driven by an unbreakable energy budget.

My work operationalizes this principle of energy-aware structural plasticity. I propose that a truly autonomous agent should not merely learn within its architecture, but learn how to build its architecture. By unifying liquid time-constant (LTC) networks for memory, heterogeneous dendrites for temporal processing, and a generative world-model for prediction, I create a system where learning is inseparable from the agent’s physical and metabolic state. The agent forages for information as it forages for energy, growing new “roots”—dendrites and stationary sensors—only where the information gain justifies the metabolic cost.

2. Foundational Work

My model stands on the shoulders of several key innovations:

  • Liquid Time-Constant Networks: These provide a powerful, compact, and differentiable ODE formulation for recurrent computation, enabling continuous-time memory states.
  • Temporal Dendritic Heterogeneity: Research has shown that endowing spiking neurons with multiple dendritic compartments, each with a unique, learnable time-constant (τ), significantly boosts an SNN’s ability to process complex, multi-scale temporal information.
  • Spiking Diffusion Models: As generative models, diffusion networks have achieved state-of-the-art performance. Spiking variants extend this power to the event-based, energy-efficient domain of SNNs, making them ideal for predictive “dreaming.”
  • Metabolically Constrained Plasticity: While STDP and other learning rules are well-studied, recent work has begun to explore how local metabolic factors can gate or modulate synaptic changes, hinting at a deeper connection between energy and learning.

3. Materials and Methods

3.1 Environment: A Metabolic Arena

The agent inhabits a 3D world built in Three.js, populated by stochastically generated clusters of nutrient and toxin pellets. Every action—moving, turning, and every single neuronal spike—incurs a precise metabolic cost that depletes the agent’s internal energy reserve. Contact with nutrient pellets restores energy, while contact with toxins amplifies the metabolic cost of recent actions, creating a high-stakes environment where energy efficiency is paramount to survival.
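The metabolic bookkeeping described above can be sketched as follows. All cost constants here are illustrative placeholders, not tuned values from the actual arena:

```python
# Minimal sketch of the arena's energy accounting. MOVE_COST, TURN_COST,
# SPIKE_COST, NUTRIENT_GAIN, and TOXIN_PENALTY are hypothetical constants.

MOVE_COST = 0.05      # energy per movement step (hypothetical)
TURN_COST = 0.02      # energy per turn (hypothetical)
SPIKE_COST = 0.001    # energy per neuronal spike (hypothetical)
NUTRIENT_GAIN = 1.0   # energy restored per nutrient pellet (hypothetical)
TOXIN_PENALTY = 2.0   # multiplier on recent action costs after toxin contact

class EnergyLedger:
    def __init__(self, initial_energy=10.0):
        self.energy = initial_energy
        self.recent_action_cost = 0.0  # rolling cost of recent actions

    def step(self, moved, turned, n_spikes, ate_nutrient=False, hit_toxin=False):
        cost = moved * MOVE_COST + turned * TURN_COST + n_spikes * SPIKE_COST
        if hit_toxin:  # toxins amplify the metabolic cost of recent actions
            cost += TOXIN_PENALTY * self.recent_action_cost
        self.recent_action_cost = 0.9 * self.recent_action_cost + cost
        self.energy -= cost
        if ate_nutrient:
            self.energy += NUTRIENT_GAIN
        return self.energy
```

Note that every spike is billed individually, so a chattier network literally starves faster.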

3.2 The Liquid-Dendrite Neuron Model

Each neuron is a multi-compartment unit. Heterogeneous dendritic branches, each with its own learnable membrane time-constant τk, perform initial temporal filtering on incoming spike trains. The summed currents from these dendrites feed into a liquid ODE core within the soma, described by h˙ = f(h, t, θ), which maintains the neuron’s continuous-time memory state. A spike is emitted when h crosses a threshold, and crucially, new dendritic branches can be structurally appended to the neuron during the slow plasticity phase.
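A toy version of this neuron can be written in a few lines. The exponential dendritic filters, the Euler-integrated leaky soma standing in for the full ODE core h˙ = f(h, t, θ), and the fixed threshold are all simplifying assumptions:

```python
# Sketch of one liquid-dendrite neuron: heterogeneous dendritic low-pass
# filters feed a leaky soma state; new branches can be appended structurally.

class LiquidDendriteNeuron:
    def __init__(self, taus, threshold=1.0, dt=1.0):
        self.taus = list(taus)            # per-branch time-constants tau_k
        self.filtered = [0.0] * len(taus) # dendritic filter states
        self.h = 0.0                      # continuous-time soma state
        self.threshold = threshold
        self.dt = dt

    def step(self, branch_inputs):
        # 1) each dendrite low-pass filters its spike train with its own tau_k
        for k, (tau, x) in enumerate(zip(self.taus, branch_inputs)):
            self.filtered[k] += self.dt / tau * (x - self.filtered[k])
        drive = sum(self.filtered)
        # 2) summed dendritic current feeds the soma; a simple leak-plus-input
        #    term stands in for the liquid core h' = f(h, t, theta)
        self.h += self.dt * (-self.h + drive)
        # 3) emit a spike and reset when h crosses threshold
        if self.h >= self.threshold:
            self.h = 0.0
            return 1
        return 0

    def grow_dendrite(self, tau):
        """Structural plasticity: append a new branch with time-constant tau."""
        self.taus.append(tau)
        self.filtered.append(0.0)
```

Branches with small τ track fast input fluctuations; large-τ branches integrate over longer windows, giving the soma a multi-scale view of the same spike train.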

3.3 World-Model and “Spiking Dreams”

The agent’s internal world-model is a spiking diffusion decoder. Periodically, the agent enters a “dreaming” phase where this module, conditioned on the agent’s current latent state, generates a predicted future sensory spike-tensor for the next 100ms. This generative process allows the agent to simulate potential futures and evaluate the likely sensory consequences of its actions before committing to them.
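The shape of a dream cycle can be sketched as a reverse-diffusion loop. The `denoise` step below is a hypothetical placeholder (a linear pull toward the conditioning latent) standing in for the trained spiking diffusion decoder, and the binarization threshold is an assumption:

```python
import numpy as np

# Toy sketch of "dreaming": denoise Gaussian noise into a predicted sensory
# spike tensor, conditioned on the agent's latent state. The denoiser here
# is a placeholder, not a trained model.

def dream(latent, horizon_ms=100, n_sensors=8, n_steps=20, rng=None):
    rng = np.random.default_rng(rng)
    x = rng.standard_normal((horizon_ms, n_sensors))    # start from pure noise
    target = np.broadcast_to(latent, (horizon_ms, n_sensors))
    for t in range(n_steps):                            # reverse diffusion
        x = x + 0.2 * (target - x)                      # placeholder denoiser
        if t < n_steps - 1:                             # inject shrinking noise
            x += 0.05 * (1 - t / n_steps) * rng.standard_normal(x.shape)
    return (x > 0.5).astype(np.int8)                    # binarize into spikes

# Usage: a predicted 100 ms spike tensor across 8 sensors
predicted = dream(latent=np.ones(8), rng=0)
```

The real module would replace the linear pull with a learned spiking score network, but the interface — latent in, future spike tensor out — is the same.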

3.4 The Dual Learning Loops

Learning occurs on two timescales:

  1. Fast Plasticity (Synaptic): At every simulation step, synaptic weights are updated via a three-factor learning rule: spike-timing-dependent plasticity (STDP) modulated by a TD error. An E/I balance clamp keeps the network computationally stable.
  2. Slow Plasticity (Structural): After a dream cycle, the agent calculates the prediction error ε (the divergence between its dream and subsequent reality). If ε is high and the agent has a surplus of energy E, it triggers a structural plasticity event. The system invests its stored energy to grow new dendrites on highly active neurons or deploy new stationary “root” sensors in the environment, optimizing its physical structure to minimize future prediction errors.
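Both loops reduce to short update rules. The eligibility-trace form, the constants, and the growth thresholds below are illustrative assumptions, not the tuned rules from the full system:

```python
# Sketch of the dual learning loops. All constants are hypothetical.

A_PLUS, A_MINUS = 0.01, 0.012   # STDP trace magnitudes (hypothetical)
TRACE_DECAY = 0.9               # eligibility-trace decay per step
W_MIN, W_MAX = -1.0, 1.0        # E/I balance clamp on weights

def fast_update(w, trace, pre_spike, post_spike, td_error):
    """Three-factor rule: STDP builds an eligibility trace; the TD error
    gates whether (and in which direction) the trace is written into w."""
    trace = TRACE_DECAY * trace
    if pre_spike and post_spike:     # coincident firing -> potentiation trace
        trace += A_PLUS
    elif pre_spike or post_spike:    # lone spike -> depression trace
        trace -= A_MINUS
    w = min(W_MAX, max(W_MIN, w + td_error * trace))  # clamp preserves balance
    return w, trace

def should_grow(prediction_error, energy, err_thresh=0.5, energy_reserve=5.0):
    """Slow loop: invest in new dendrites/sensors only when the dream was
    wrong (high error) AND there is surplus energy to pay for the structure."""
    return prediction_error > err_thresh and energy > energy_reserve
```

The key asymmetry: the fast loop fires every step and is nearly free, while the slow loop fires rarely and spends stored energy, which is exactly the return-on-investment gate described above.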

3.5 Reward, Energy, and Termination

The agent’s objective function is survival. The instantaneous reward Rₜ and the energy update are defined as:

Rₜ = α·Nₜ − λ·Sₜ − β·εₜ

where Nₜ is nutrient intake, Sₜ the spike count, and εₜ the prediction error at step t.

Eₜ₊₁ = Eₜ + Rₜ

The episode terminates when energy reaches zero (Eₜ ≤ 0), coupling learning directly to survival.
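The reward and energy dynamics above fit in a few lines; the coefficient values here are illustrative:

```python
# Sketch of the reward and energy update. ALPHA, LAMBDA, BETA are hypothetical.

ALPHA, LAMBDA, BETA = 1.0, 0.001, 0.1

def reward(nutrient_intake, spike_count, prediction_error):
    """R_t = alpha*N_t - lambda*S_t - beta*eps_t"""
    return ALPHA * nutrient_intake - LAMBDA * spike_count - BETA * prediction_error

def advance(energy, r):
    """E_{t+1} = E_t + R_t; the episode ends when energy hits zero."""
    energy += r
    return energy, energy <= 0.0
```

Because the spike term λ·Sₜ is subtracted every step, sparse firing is rewarded even in the absence of nutrients.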

4. Evaluation Metrics

Success will be measured not by task completion alone, but by metabolic efficiency and adaptive resilience.

  • Energy-Adjusted Return: The primary metric is survival time per joule (approximated by total spikes and actions).
  • Structural Efficiency: Measure the rate of reward gain per added synaptic parameter, quantifying how intelligently the agent invests in its own growth.
  • Generalization: A trained agent will be tested on zero-shot transfer to novel nutrient/toxin map configurations.
  • Hardware Viability: The ultimate goal is to profile the champion agent on neuromorphic hardware (e.g., Intel’s Loihi 2), measuring real-world latency and power consumption in milliwatts.
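The primary metric can be computed as below; the per-event joule figures are hypothetical placeholders, not measured hardware numbers:

```python
# Sketch of the energy-adjusted return: seconds survived per joule, where
# the joule figure is approximated from spike and action counts.

J_PER_SPIKE = 1e-9    # assumed joules per spike (placeholder)
J_PER_ACTION = 1e-6   # assumed joules per motor action (placeholder)

def energy_adjusted_return(survival_steps, total_spikes, total_actions, dt=0.01):
    joules = total_spikes * J_PER_SPIKE + total_actions * J_PER_ACTION
    return survival_steps * dt / joules   # seconds survived per joule
```

On neuromorphic hardware the placeholders would be replaced with profiled per-spike and per-action energy measurements.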

5. Implementation Roadmap

Week | Milestone
---- | ---------
1    | Integrate voxel nutrients, toxins, and an energy HUD in Three.js
2    | Replace vanilla LIF neurons with dendrite-heterogeneous cells (TorchSNN fork)
3    | Pretrain and embed the spiking-diffusion decoder on 10k sensor traces
4    | Add the liquid ODE core via the torchdiffeq adjoint solver
5    | Ablation studies: {no-dream, no-growth, fixed-τ} baselines
6–8  | CMA-ES hyper-evolution across 512 arenas; optional SpiNNaker 2 cluster
9    | Port the champion policy to a Loihi 2 evaluation board for real-time profiling

6. Expected Outcomes and Significance

I anticipate emergent sensor rooting concentrated in high-entropy regions, compression of prediction error at constant energy, and hardware runtimes within mobile-robot power envelopes. No prior work unites continuous-time liquid dynamics, dendritic growth, and prediction-error budgeting in a single embodied agent; confirmation would advance neuromorphic RL toward metabolically grounded autonomy.

7. Conclusion and Future Work

Liquid-Dendrite World-Models propose a biologically inspired pathway to self-scaling, energy-aware intelligence. Immediate work will benchmark against fixed-topology SNN baselines; longer-term goals include multi-agent nutrient-sharing games and robot-arm deployment for crop monitoring on Montana farms.

I will be applying for access to Loihi 2 and am looking to collaborate!

Justin Vetch — VetchTech Labs, Montana, USA
Correspondence: vetchj88@gmail.com X: @mt_jv88
