Open Access

Zwanzig-Mori projection operators and EEG dynamics: deriving a simple equation of motion

PMC Biophysics20092:6

DOI: 10.1186/1757-5036-2-6

Received: 22 March 2009

Accepted: 13 July 2009

Published: 13 July 2009


We present a macroscopic theory of electroencephalogram (EEG) dynamics based on the laws of motion that govern atomic and molecular motion. The theory is an application of Zwanzig-Mori projection operators. The result is a simple equation of motion that has the form of a generalized Langevin equation (GLE), which requires knowledge only of macroscopic properties. The macroscopic properties can be extracted from experimental data by one of two possible variational principles. These variational principles are our principal contribution to the formalism. Potential applications are discussed, including applications to the theory of critical phenomena in the brain, Granger causality and Kalman filters.

PACS code: 87.19.lj

1. Introduction

The electrical activity of the brain has intrigued scientists since the invention of the electroencephalogram (EEG) [1, 2]. Scalp and intracranial EEG's are now in widespread clinical use in the diagnosis and management of epilepsy and other neurological disorders. These applications rely on empirical correlations between certain EEG patterns with specific neurological disorders. Intense interest exists in trying to understand EEG dynamics at a deeper level, so as to extract ever more information about brain health and function. These efforts fall into two classes: those which are largely empirical, based on traditional correlations between EEG patterns and clinical observations, and those which are theory-based, where one has in mind a certain model of brain dynamics and then one tries to interpret EEG patterns in terms of the theoretical model. In the empirical class are recent efforts to correlate high frequency oscillations with epileptogenic tissue [3]. In the theory-based class, the most celebrated approach is the cable theory of Hodgkin and Huxley [4]. This theory can be scaled up using compartmental models to describe networks of thousands or even millions of neurons using high power computers. In the hands of a master, much insight can come from such simulations [5]. However, these methods are computationally intensive. They are not easily scaled to truly macroscopic levels and they are not amenable to the clinical diagnostic situation where one wants to know, for specific EEG samples from specific individuals, whether a certain brain pathology is present.

Mesoscopic and macroscopic level theories of EEG dynamics have also been proposed, each based on a plausible basic postulate or mathematical model of the electrical activity of the brain [2, 68]. The methods of nonlinear dynamics (chaos theory) also fall into this class and have been applied to seizure prediction [9, 10].

It would be desirable to base a macroscopic theory of EEG dynamics on fundamental physical principles, for instance, on the laws of motion that govern atomic and molecular motion. In this paper, we discuss such a theory. The approach we take was developed by Zwanzig [11] and Mori [12] to explain thermodynamic irreversibility (why entropy always increases). The result is a simple equation of motion that has the form of a generalized Langevin equation (GLE), which requires knowledge only of macroscopic properties. The macroscopic properties can be extracted from experimental data by one of two variational principles. Potential applications are discussed, including applications to epilepsy research, the theory of critical phenomena in brains, Granger causality and Kalman filters.

2. Theory

EEG dynamics results from the motion of microscopic charged particles in the brain. The charged particles interact with other charged particles as well as with uncharged particles. The total number of particles is astronomical, on the order of 1023. A macroscopic EEG electrode either on the scalp or inserted into the brain detects voltages set up by the local distribution of the charged particles. The changes in these voltages reflect the local flux of charged particles. The forces exerted by the particles on each other are highly nonlinear. How can we possibly hope to derive an equation of motion for the macroscopic EEG voltages? The most elegant solution is to apply Zwanzig-Mori projection operators [11, 12]. We follow the discussion of Zwanzig [13]; see also Berne and Pecora [14] for more detailed discussions.

Let the voltage measured by electrode n at time t be represented as x(n, t) where n = 1 to N. If the reference ground is taken to be infinitely far away, then x(n, t) is given by:

where N0 is the total number of charged particles, Q i is the charge of particle i, is its position at time t, and is the position of the n th electrode. If the reference ground is chosen differently, then x(n, t) will be given by some linear combination of the terms on the right hand side of Eq (1). The choice of reference ground does not affect what follows. Next, let x(n, t) represent the n th component of an N-dimensional vector X(t). Hence, the voltage readouts from an N-channel EEG are the N components of the vector X(t). For convenience, we will take X(t) to represent the EEG voltage after the time average of the raw data has been subtracted out: X(t) = X raw (t) - 〈(X raw (t))〉.

In what follows, we will refer to the x(n, t)'s as the explicit degrees of freedom, while the 's and all other degrees of freedom are the implicit or "bath" degrees of freedom. At this point, readers who wish to skip some of the more mathematical discussion may jump to Eq. (17).

To continue, if the microscopic particles of the system all obey the laws of physics, then the dynamics of X(t) can be written:
Here L is the Liouville operator. The classical and quantum Liouville operators, respectively, are defined as:

Here and are the coordinate and momentum of the i th particle, H is the total Hamiltonian, ħ is Planck's constant, and the square brackets denote the quantum commutator. In what follows, one may choose the dynamics to be determined by either the classical or quantum Liouville operator.

The Hamiltonian H in Eq (3) contains kinetic and potential energy terms for all the microscopic particles in the system. The potential energy terms describe how every microscopic particle interacts with every other microscopic particle. These terms are in general highly nonlinear. We shall not need to know the details of these interactions. The Zwanzig-Mori approach only requires that the microscopic dynamics obeys the form of Eq (2). The realm of validity for what follows thus includes all of classical and quantum mechanics.

Equations (2)–(3) are formally correct but they are not useful for describing macroscopic systems in practice because we cannot possibly specify L for all N0 ≈ 1023 microscopic particles. If we could formally "average" or integrate Eq (2) over all the microscopic degrees of freedom and (that is, over all the implicit degrees of freedom) without integrating out the explicit degrees of freedom contained in X(t), then we can transform Eq (2) into a useful equation of motion for X(t). Here is where projection operators come in. Let Γ represent all the microscopic particle coordinates and momenta (i.e., it represents a point in phase space), and let . Define an inner product

where ρ(Γ) gives a microscopic distribution of Γ, not necessarily at equilibrium, and where the asterisk denotes a complex conjugate. Here A and B are explicit or implicit degrees of freedom which for the time being we will allow to be complex (i.e., they may have both real and imaginary components). We will later show how to recover convenient expressions for purely real observables, such as the EEG voltages and their time derivatives. Note that Eq (4) is essentially a phase space average of the product of A times B*.

Now imagine that we identify the vector A as the explicit degree of freedom, and we wish to project or integrate out all other degrees of freedom from Eq (2). The projection operator with respect to A is:
This operator integrates out all degrees of freedom of the system except those that affect A. Applying this projection operator on Eq (2) will result in an equation of motion for A only. We will not reproduce here the mathematical manipulations described in detail in Refs [1214]. One needs the operator identity:
We will take t0 to represent the initial time within an arbitrary time window. After some work, for times such that tt0, one arrives at the generalized Langevin equation (GLE):
Here Ω has the interpretation of a frequency, F(t) is a "random force" due to the dynamics of the implicit degrees of freedom, and Γ(τ) is referred to as a memory function that defines the effect of prior values of A on its current time rate of change. Equation (7) is the desired equation of motion for the explicit degrees of freedom. An important formal result is that

which states that the random force acting on the macroscopic observable A is not correlated with A. This result is rigorously true and it underlies the foundation of the Onsager regression hypothesis [14]. It will also suggest one of the variational principles by which we extract model parameters from experiment, as we will discuss shortly.

Other forms of the GLE may be derived from Eq (7). For convenience, let us rewrite Eq (7) as
where the open circle denotes a convolution and the terms in Ω and Γ(τ) have both been absorbed into the convolution:
The macroscopic observable A(t) thus far is allowed to have both real and imaginary parts. Now consider a purely real vector A(t) of the form:
We choose this form because the underlying dynamics is governed by the laws of physics, for which the coordinate and momentum (here represented as the velocity) are independent degrees of freedom. The coordinate X(t) is constrained such that

In practice, recall that X(t) represents the N channels of EEG voltages. Hence, V(t) is the first time derivative of X(t), which in practice can be constructed from experimental time series data by taking finite differences, e.g., V(t) ≈ [X(t + δt) - X(t - δt)]/(2δt), where δt is the EEG sampling time.

Taking the formal time derivative of Eq (14) and inserting Eq (12), we find
Applying the constraint of Eq (15), we have
where , K = W21, G = W22 and F R (t) = F2(t). This equation may be written in more expanded form as

Equation (17b) has the form of an equation of motion for a time-delayed damped harmonic oscillator with force constant matrix K, friction constant matrix G and random force F R (t). The purely real matrix K has convolution components K(m, n, τ) giving the influence of x(n, t - τ) on x(m, t) with time delay τ. That is, a fluctuation in the n th macroscopic coordinate of magnitude x(n) at a certain point in time, results in a force on the n th macroscopic coordinate of magnitude -K(m, n, τ)x n at a time τ later. Similarly, G(τ) is a purely real time-delayed friction matrix with components G(m, n, τ) giving the frictional force of on the m th macroscopic coordinate with time delay τ.

Recall that X(t) represents the EEG voltages. Hence Eq (17) tells us that we can think of EEG dynamics as being driven by three kinds of forces: one due to interactions between neurons at the sites of the electrodes (the K-term), another due to frictional forces (the G-term), and a third due to random environmental noise (the F R (t)-term). The influence of the enormous number of implicit degrees of freedom are hidden in microscopic expressions for the K(τ), G(τ) and F R (t) terms as expressed through Eqs (8)–(10). In certain highly idealized situations, one may be able to estimate these functions quantitatively. In the general case, however, Eqs (8)–(10) are computationally intractable. Nonetheless, the functions K(τ), G(τ) and F R (t) also represent macroscopic properties of the system, and it should be possible to extract them from experiment. What we need is a way to relate these functions to experimental data, preferably by a variational principle.

There are two obvious approaches to a variational principle by which K(τ), G(τ) and F R (t) may be extracted from experiment: (I) minimization of the amplitude of the random force F R (t) and (II) minimization of correlation of the random force with the macroscopic coordinate.

2.1. Variational principle I: minimization of random force amplitude

Equation (17) represents a way of dividing up the total ''force'' acting on the macroscopic observables into three components: (1) that due to explicit interactions between the macroscopic observables, represented by the term in K, (2) that due to frictional dissipation, represented by the term in G and (3) that due to environmental fluctuations, represented by the term in F R (t). One may plausibly argue that one should ascribe as much of the total force as possible to the explicit interactions between the macroscopic observables, and as little as possible to environmental fluctuations. In this case, one may define an error functional which adds up all the square amplitudes of F R (t) over the entire time interval under observation, t = [t a , t b ], with the goal of minimizing it:

To minimize this error functional, one may take each value of K(m, n, τ) and G(m, n, τ), for each m, n and τ, as an independent parameter, and vary them so as to make E I (t a , t b ) as small as possible. To do this, first insert Eq (17) into Eq (18), to express E I (t a , t b ) in terms of X(t), , , K(τ), and G(τ). Next, insert experimental values for X(t), and into E I (t a , t b ). One may then minimize E I (t a , t b ) with respect to K(τ) and G(τ) by standard algorithms [15]. Because E I (t a , t b ) is quadratic in K(τ) and G(τ), one is guaranteed a unique solution. This procedure represents our first variational principle.

For future reference, we list the derivatives of E I (t a , t b ) with respect to the K(m, n, τ)'s and G(m, n, τ)'s. These equations are useful in global minimization algorithms [15]:
Alternatively, after some experience, if one finds that K(m, n, τ) and G(m, n, τ) tend to decay exponentially, possibly with some oscillations, then one may try to save some computational effort by parameterizing K(m, n, τ) and G(m, n, τ), for example, using

The minimization of E I (t a , t b ) would then be with respect to A K (m, n), b K (m, n), ω K (m, n), etc. The drawback here is that the minimization algorithm would require an iterative algorithm suitable for nonlinear fits, and a unique solution is not guaranteed [15].

2.2. Variational principle II: minimization of random force correlation

To formulate our second variational principle, consider Eq (11), which states that the random force should not be correlated with any macroscopic observable when averaged over the phase space available to the microscopic degrees of freedom, as defined in Eq (4). One may hypothesize that this phase space average may be replaced by an average over time. This hypothesis is known as the ergodic hypothesis. Note that in Eq (4), we do not assume the phase space average is necessarily an equilibrium average. If for some possibly non-equilibrium distribution function, the ergodic hypothesis holds true, then we may take the assumption that the bath random force is not time-correlated with macroscopic observables as the basis for our second variational principle. There are subtleties in this approach which must be treated with care.

It is generally not possible to be certain whether an experimental system is ergodic or not. For instance, the explicit degrees of freedom may happen to be trapped in a portion of phase space that is separated from other regions of phase space by very high free energy barriers. Even if over time the explicit degrees of freedom sample all of the phase space available within this bounded region, we can never know that there are not other regions that would have been equally sampled, were it not for the impenetrable energy barriers separating the regions. In this case, we can still define the random force term in Eq (17) to be such that it is not time-correlated with the macroscopic observables. If the random force were correlated with a macroscopic observable, then of course it would not really be random.

In what follows, we will assume that the bath degrees of freedom are ergodic. We will be careful, however, not to invoke the ergodic hypothesis for correlations between two explicit degrees of freedom, because we wish to allow for the possibility that the explicit degrees of freedom are non-ergodic and far from equilibrium.

To proceed, let us define a time averaged correlation function with the time average being over a time period [t a , t b ], with t a t 0 t b :
Let C(n, m, t) represent the (n, m) matrix element of a matrix C(t). Evaluate Eq (17b) at time t + t0, multiply both sides by X(t0), and integrate over t0 over the range [t a , t b ]. We will assume that K(τ), G(τ) and F R (t) do not depend on the time t0, i.e., we assume these functions are stationary. The result is:
which may be written in more expanded form as

Here C RX (t) is the time correlation between the random force and the macroscopic observables, when averaged over time. If we take the bath random force to be ergodic, then the time averaged correlation C RX (t) from Eq (25) is equal to the phase space averaged correlation (F(t), X(t0)) from Eq (11), and therefore C RX (t) should be equal to zero. Minimizing the square amplitude of every element of the matrix C RX (t) over the time interval [t a , t b ] represents our second variational principle.

The careful reader will note that Eq (24) is not quite the same as the typical equation of motion for the time correlation function, as defined in standard texts [13, 14]. In the standard derivations, an equation of motion that is identical in form as Eq (24) is derived but the time correlation functions involve an average over phase space, not over time. One then invokes the ergodic hypothesis to equate these phase space averaged correlation functions with time averaged correlation functions. However, we wish to avoid making the ergodic hypothesis for the explicit degrees of freedom. In our derivation, we invoke the ergodic hypothesis only for the bath degrees of freedom.

To continue, define an error functional that consists of the sum of the square of every element of the matrix C RX (t), summed over all the elements of the matrix and over every instant in time in the time interval [t a , t b ]:

Here Tr signifies a matrix trace and T a matrix transpose. To minimize the error functional of Eq (26), one may take each value of K(m, n, τ) and G(m, n, τ), for each m, n and τ, as an independent parameter, and vary them so as to make E II (t a , t b ) as small as possible. To do this, first insert Eq (24) into Eq (26), to express E II (t a , t b ) in terms of C(t), , , K(τ), and G(τ). Next, insert experimental values for C(t), , into E II (t a , t b ). One may then minimize E II (t a , t b ) with respect to K(τ) and G(τ) by standard algorithms [15]. Because E II (t a , t b ) is quadratic in K(τ) and G(τ), one is guaranteed a unique solution. This procedure represents our second variational principle.

For future reference, we also list the derivatives of E II (t a , t b )with respect to the K(m, n, τ)'s and G(m, n, τ)'s. These equations are useful in global minimization algorithms [15]:
Alternatively, after some experience, if one finds that K(m, n, τ) and G(m, n, τ) tend to decay exponentially, possibly with some oscillations, then one may try to save some computational effort by parameterizing K(m, n, τ) and G(m, n, τ), for example, using

The minimization of E II (t a , t b ) would then be with respect to A K (m, n), b K (m, n), ω K (m, n), etc. The drawback here is that the minimization algorithm would require an iterative algorithm suitable for nonlinear fits, and a unique solution is not guaranteed [15].

3. Discussion

The mathematical or physical model that one chooses to represent EEG dynamics influences how one interprets experimental observations. The model itself acts as a filter which biases one's interpretations. It is desirable therefore that these models be based on fundamental physical principles. The model we present here is suitable for interpreting electrophysiological data acquired from extracellular recordings. It is a macroscopic level model which complements more microscopic Hodgkin-Huxley type models. The assumptions that we make are fourfold. First, we assume that the underlying dynamics obeys the laws of either classical or quantum physics. This assumption is quite safe for biological systems.

Second, we assume that the total system under consideration is isolated and free from external influences, which allows us to write the Liouville operator in Eq (2) as being free of an explicit time dependence, i.e., the overall system is stationary. If there is an external driving force, such as environmental stimuli driving a learning experience, then the Liouville operator will have an explicit time dependence and the operator identity of Eq (6) is no longer valid. On the other hand, one could always subsume the external degrees of freedom into the Liouville operator, to make the external degrees of freedom part of the total system under consideration. It does not matter how many degrees of freedom are subsumed into the Liouville operator in this way, since we integrate them out anyway. One could in principle claim to include the entire universe in one's Liouville operator, in which case, neglecting relativistic and other cosmological effects, one recovers the time independence of the Liouville operator and the derivation of Eq (17) proceeds as above.

Thirdly, we have made the ergodic assumption for the bath degrees of freedom, which assumes that the bath degrees of freedom can quickly (on time scales much faster than the time scales of the explicit degrees of freedom) explore all of the phase space available to the bath. Importantly, we have avoided making the ergodic hypothesis for the explicit degrees of freedom, and thus our principal theoretical result, Eq (17), is capable of describing systems far from equilibrium.

The fourth assumption is comprised of either variational principle I or II, either of which allows us to extract the model parameters from experimental data. Variational principle I tries to explain as much of the dynamics as possible using the macroscopic observables, ascribing as little as possible to the influence of random unseen forces. Variational principle II does not assume that the random forces are small in amplitude, but rather that they are not correlated with the dynamics of the macroscopic observables.

We feel that variational principle II is better justified, as there is no reason to suppose that random environmental noise should generally be small. Variational principle II requires us to make either the ergodic hypothesis for the bath random forces, or one must take the definition of a random force to mean that it is not correlated with macroscopic observables, either within a phase space average or in a time average. If experimentally we find that the random force is in fact correlated with a macroscopic observable, then we would know that there is an unidentified degree of freedom in the experimental system that is important to the dynamics of the identified macroscopic observables. Such a scenario can arise if there are neurons very far from the experimental electrodes that strongly drive the observed neurons, but that are too far from any electrode to be directly observed. In this case, there very well may be strong correlations between the macroscopic observables and the presumptive random force term. Thus, difficulty in minimizing the error functional E II (t a , t b ) may sometimes signify that we have not identified all the key macroscopic observables. We may need to insert a few more electrodes in other parts of the brain.

In this case, if it is not possible to insert more electrodes into the brain or if we do not know where to insert them, what we can do is to take shorter time intervals of observation, given by [t a , t b ], and to extract the K's and G's for each time interval individually, a different set of K's and G's for each time interval. If the time intervals are short enough, then it will be possible to make E II (t a , t b ) as small as one likes, simply because there will be fewer data points to be fitted with a given number of free fitting parameters. The K's and G's will then vary across time intervals. These variations reflect the influence of the unseen variables.

3.1. Nonlinear interactions

It may seem curious that Eq (7) is linear in the macroscopic variables X(t) and even though we have made no assumptions about linearity in the microscopic variables, the 's and 's. The reason is a subtle one but important to understand. In Eq (2), every component of the vector X(t), x(n, t), is itself a vector in Hilbert space which can be expanded in terms of an infinite basis set of eigenfunctions π j of the Liouville operator L:
Higher order functions, such as x(n, t)2, x(n, t)3, etc, have their own expansion coefficients in terms of these eigenfunctions, and the expansion coefficients will not necessarily have a simple relationship to those for x(n, t). What this means is that in Hilbert space, higher order functions of the macroscopic variable x(n, t) are considered additional dimensions in the abstract space. If they are important for describing the dynamics of X(t), then one will not be able to satisfy one's variational principle and E II (t a , t b ) will be in some sense too large. Difficulty in satisfying the variational principle thus can signal either that there are important unseen variables at play or that there are nonlinear dependences on the macroscopic variables. In the latter case, one remedy would be to define a vector Z(t) with components zN+k(n, t) = x(n, t) k . One can include higher order terms up to whatever order k one likes, in order to capture some of the nonlinear dependences. To generalize even further, one could begin including other functions as well, for example, sigmoidal functions for modeling nonlinear neuronal responses. Regardless of what one chooses, the vector Z(t) will evolve according to the Liouville equation:
The rest of the development follows as before, and one finds a corresponding equation of motion for Z(t):

One can still appeal to either variational principle I or II to extract the K's and G's from experiment data.

In general, there is no simple prescription for determining which macroscopic observables are important nor the best nonlinear forms for the interactions between the macroscopic observables. The choices that one makes will depend on the specifics of one's experimental setup (where one has inserted electrodes) and on one's intuition about how best to model explicit interactions. Nonetheless, the form of Eq (17) or Eq (33) constrains the mathematical model to one that is consistent with the laws of physics, and the variational principles I and II guarantee the best fit of one's model to experimental data.

As an example, consider the situation where there are two intracranial electrodes, and where the neurons near one electrode interact with neurons near the other through both a linear term and a sigmoidal term. There may also be a frictional force acting at each site. A reasonable set of equations of motion may then look like this:

One may then appeal to either variational principle to obtain the K's and G's, and even λ if one wishes. The Zwanzig-Mori approach thus allows for a very flexible approach to the mathematical modeling of macroscopic observables. One does one's best to set up an equation of motion, with a certain set of fitting parameters, and then one appeals to the variational principle of one's choice to extract these fitting parameters from experiment. The variational principle guarantees the best description of the experimental data, given one's choice of a mathematical model.

3.2. Piece-wise linear approximation

Alternatively, one can ignore all macroscopic observables that are not linear functions of the x(n, t)'s and simply take sequential time intervals [t a , t b ] to be short enough that E I (t a , t b ) or E II (t a , t b ) is satisfactorily small. How short this time interval has to be is however short is necessary to obtain a value of E I (t a , t b ) or E II (t a , t b ) that is below one's desired threshold. In this case, one is assuming that the dynamics is linear within each time interval [t a , t b ], but not necessarily linear across time intervals. This approach is the piece-wise linear approximation. It is worthwhile considering when it is valid.

If the underlying microscopic dynamics obeys classical dynamics (i.e., Newtonian physics), then there is no loss of generality in making the piece-wise linear approximation, at least as far as the validity Eq (17) is concerned. The reason is that the force exerted on a classical particle is proportional to the slope of a potential energy function at a single point on that surface, with that point being given by the instantaneous coordinate of the particle X(t). For a short enough period of time, one can perform a Taylor expansion of the potential energy surface about the instantaneous coordinate of the particle, expand to just the quadratic order term, and ignore the higher order terms. The force, since it is proportional to the slope of the potential energy surface, is then always piece-wise linear in X(t).

In contrast, the instantaneous force on a quantum wavepacket depends on the shape of the potential energy surface over the instantaneous spatial extent of the entire wavepacket. If this wavepacket extends over a substantial patch of the potential energy surface, then an accurate Taylor expansion of the potential energy surface may have to include terms higher than just the quadratic order term. Thus quantum dynamics may not always be accurately described by a piece-wise linear assumption.

Fortunately, biological systems are typically far from the quantum regime, and in this case, we can always simply take the time intervals [t a , t b ] shorter and shorter until we satisfy our variational criterion. Piece-wise linear analysis (also known as instantaneous normal modes) has been surprisingly successful in describing even highly nonlinear dynamics, including that of liquids [1620].

Once we have obtained the K's and G's, we can interpret interactions between groups of neurons with these functions. Which neuronal groups are linked by either the K's or G's? What are the time scales for the time delays? In particular, the eigenstates and eigenvalues of the K-matrix would be of high interest, as these eigenstates represent spatiotemporal patterns created by the interaction between the explicit degrees of freedom, i.e., these eigenstates may represent ''memory traces''.

To explore this idea, first ignore the convolutions in Eq (17) and consider the eigenstates of K where the eigenvalues are purely real and positive. These eigenstates are oscillatory states which reside in free energy "wells". In terms of the EEG, one will see "standing wave" oscillations distributed over the spatial distribution of the respective eigenstates. These are stable states that can be used to store information. In contrast, eigenstates of K where the eigenvalues are purely real and negative are unstable states which map onto free energy "barriers". In terms of the EEG, these states are evanescent states which do not recur, or at least not in a periodic way. These unstable states are not useful for storing information because they are transitory and one cannot reliably design a trajectory to return to such states.

Because K is purely real but not necessarily symmetric, it can also have pairs of complex-valued eigenstates which have eigenvalues that are also complex. In each pair, the eigenvalues and eigenstates are complex conjugates of each other. The EEG dynamics represented by these pairs is that of a "traveling wave" that travels between the real and imaginary parts of the eigenstates. For instance, if one eigenstate is with complex eigenvalue λ1 = λ R + I , then the other eigenstate is with eigenvalue λ2 = λ R - I . In terms of the EEG dynamics, one will see activations of EEG voltages that morph continuously between the spatial distribution represented by and that represented by . If λ R > 0, then these states are stable and can also be used to store information.

The presence of the convolution between K(τ) and X(t) in Eq (17) opens up the possibility of super-eigenstates that are not only spatially distributed, but also extended in time. To see how these arise, define a super-vector X(t) by concatenating (M+1) consecutive N-dimensional X-vectors:
The number M is determined by the number of time steps that contribute to the convolutions involving the K and G matrices. A super-vector F R (t) can be defined in an analogous way, as well as N(M + 1) × N(M + 1) dimensional super-matrices K and G. Equation (17) can then be written without convolution operators as:

The eigenstates of K now span not only space but also time over an interval given by τ M = Mδt. We suggest that those eigenstates of K that are stable represent spatiotemporal memory traces.

3.3. Criticality, neural and thermodynamic

Of increasing interest in neurodynamics is the idea that the brain may poise itself near a critical point [2123]. Near this neural critical point, neuronal elements may exhibit spatiotemporal patterns of correlated activation that span many length scales and time scales, from the length scales of just a few neurons to that encompassing macroscopic brain regions, and from time scales of individual action potentials (millisecs) to the much longer time scales of a completed thought (seconds to minutes, and possibly to even longer times). For neural systems, criticality can be defined in terms of the connectivity of the system [24, 25]. It has been shown that a neural system at critical connectivity exhibits maximal information storage capacity, and optimal information transmission and processing [21, 22, 24, 2628]. Because the survival of an animal depends on how well its brain processes, stores and retrieves information, we have hypothesized that biological neural systems must maintain themselves at or near critical connectivity [29]. If this hypothesis is true, then failure to remain at or near critical connectivity may represent neurological disease, for instance, epilepsy [30]. If this hypothesis is false, it cannot be too far wrong, and it would be important to characterize how far a neural system can be from critical connectivity before it fails to be useful as an information processing system.

It would be of great interest to have a reliable measure of connectivity in the intact brain, as measured by clinically available electrodes either on the scalp of a human subject, placed on the surface of the brain, or inserted into the brain. For practical reasons, all such systems monitor only a tiny subset of all neural activity in the intact brain. An important consequence is the subsampling problem, where monitoring of only a portion of the total system is likely to misrepresent the true state of the system, including the possibility of mistaking a critical system for a subcritical or supercritical one [31]. It appears that one needs to monitor on the order of 25% of the total system in order to have some hope of a reliable measurement of system connectivity, at least using current measures of connectivity (Priesemann V; Personal communication, 2009).

Can the projection operator approach afford a more reliable measure of criticality, and can it connect the neural concept of criticality to the more traditional thermodynamic concept [32]? We do not know as yet, but the conceptual framework of the projection operator approach is suggestive of an approach to this problem. The reasoning is as follows.

The vector of macroscopic observables, X(t), can be regarded as a generalized coordinate of an abstract neural ''super-particle'' that exists in a very high-dimensional space. The coordinate reflects the collective movement of a macroscopic number of microscopic charged ions and molecules in a very complicated, nonlinear way, as given by Eq (1). Notwithstanding the complexity of the microscopic dynamics, it is a characteristic of critical phenomena to span all length and time scales [32], and thus one may ask, if we define a heat capacity for the macroscopic neural super-particle, will it behave in the same way as the heat capacity of the microscopic system if either system is at or near its critical point? The answer is yes, it should, and therefore the critical point of the microscopic system should be identical to that of the macroscopic system. The heat capacity of the neural system should diverge at the critical point in the same way, with the same critical exponent.

To explore this idea, note that changes in the internal energy U(t) of the oscillator are driven by the force -KX(t), and so incremental changes in this energy are given by
If we take the kinetic energy to be and the temperature to be related to the kinetic energy through
where T(t) is the instantaneous temperature, N is the total number of EEG channels and k B is Boltzmann's constant [33], we can then define a dimensionless internal heat capacity as

Using Eq (40), we may then investigate divergences of the heat capacity near phase transitions including the critical point. Other thermodynamic quantities of interest may be similarly defined [34]. A caution is that more sophisticated definitions of temperature than Eq (39) may be needed for systems that are not at thermal equilibrium.

One can also begin to think of EEG dynamics in terms of established particle dynamics concepts. What does the free energy surface U(t) look like on which brain dynamics moves? Are there deep energy wells that trap neural super-particle trajectories? Are there frequent hops between energy wells? What proportion of time is spent inside a well vs on an energy barrier? How high are the energy barriers to transitions between free energy wells? Can one hope to bring into play such landmarks of physical chemistry as transition state theory and other reaction rate theories? What happens in disease, for example, in epilepsy? Do the energy wells become shallower? Is there less friction? We have previously taken small steps in investigating these questions [35], but a great deal more is possible. Our prior experience shows that the calculations required to utilize variational principle II are feasible.

3.4. Causality

According to Newtonian physics, the force exerted by one body on another acts instantaneously, with no time delay. Projection operator theory shows us that if the force exerted by one macroscopic body on another is mediated through many microscopic degrees of freedom, the response of the second body may be delayed in time. This time delay is represented by the convolution in Eq (17). The macroscopic time delay is intuitively obvious from our daily experience, and does not surprise us. The time delay is suggestive of but not proof of causality. Can causality be demonstrated using the Zwanzig-Mori equation of Eq (17)?

In this regard, it is instructive to consider Granger causality [36], which is finding increasing application in neuroscience and many other areas. In Granger causality, one assumes that the value of X(t + δt) at time t + δt depends on all prior values at earlier times. One then asks if X(t + δt) also depends on prior values of another vector Y(t). To answer this question, one can construct two time series:
One solves for G in Eq (41) by minimizing an error functional over some time interval [t a , t b ]:
Similarly one solves for H1 and H2 by minimizing the error functional

If E XY (t a , t b ) is significantly less than E X (t a , t b ) by some statistical measure, then one says that Y(t) Granger causes X(t).

The Zwanzig-Mori formulation can be applied to demonstrate causality in an analogous way. The Granger error terms R X (t) and R XY (t) correspond to random force fluctuations in the Zwanzig-Mori equation. The error functionals of Granger causality in Eqs (43) and (44) are equivalent to the Zwanzig-Mori error functional E I (t a , t b ), corresponding to our variational principle I. That the Zwanzig-Mori equation contains both a coordinate and a velocity term is not a major difference, as within the Granger formulation, one could simply take the velocity as another degree of freedom, in effect, another coordinate.

However, within the Zwanzig-Mori formalism, one also has the option of appealing to variational principle II. This variational principle may be more useful for noisy systems. For noisy systems, it is not justifiable to assume that the random force contribution to dynamics should be as small as possible. Random environmental forces may in fact be dominant. Nonetheless, one can still define the random force to have no time correlation with the macroscopic observables, for reasons discussed above. Thus, we feel that variational principle II has advantages over variational principle I.

3.5. Prediction and control: relation to Kalman filters

Kalman filters are also beginning to find application in engineering applications of neural prediction and control [37]. The idea here is to understand how a system responds to externally applied perturbations so that one can apply the perturbations in a rationally planned way so as to achieve a desired response.

Zwanzig-Mori theory also allows one to probe system response to external perturbations. Let the system observables be described by a vector X(t) and let the external perturbations be described by a time-dependent vector Y(t). On the experimental system, one may apply a variety of representative test perturbations Y(t). One then constructs a vector Z(t) containing both vectors X(t) and Y(t):
Going through all the steps as before, we find the Zwanzig-Mori equation for Z(t)

Appealing to either variational principle I or II, we can extract all the K's or G's in Eq (46) as a result of the test perturbations. There will be off-diagonal elements of the K's and G's that describe the effect of a given test perturbation Y(t) on the future dynamics of X(t). Knowing these elements allows one to predict the future response of the system to any variation of the test perturbations.

3.6. Related approaches

Hänggi and coworkers have also taken advantage of the Zwanzig-Mori formulation of the GLE to construct a hierarchy of statistical measures of system memory [38]. These measures have been applied to a patient with photosensitive epilepsy with striking results [39]. The Zwanzig-Mori approach has also been applied to construct a renormalized kinetic theory of dense fluids [40].

In terms of other theoretical approaches for deriving macroscopic equations of motion based on microscopic dynamics, a moment expansion method is also possible. In this approach, one defines moments of the macroscopic observables, e.g., where m and n are positive integers and where the angular brackets signify a phase space average over a distribution function, which need not be an equilibrium distribution function. Taking time derivatives and applying microscopic laws within the angular brackets results in coupled differential equations linking the dynamics of the different order moments of X(t). In general, the lower order moments will depend on higher order moments, and one needs a way of closing the moment expansion [e.g., see Refs [41, 42]].

Yet another method is to expand the distribution function in terms of a "basis set", and then derive equations of motion for the expansion coefficients. It is possible to allow the basis functions themselves to change in time by appealing to the Dirac-Frenkel variational principle [43], which resembles our variational principle II. Indeed, the Dirac-Frenkel variational principle was the motivation for variational principle II. The variationally optimized "mobile basis set" approach has been applied to quantum dynamics [44, 45] and can also be applied to statistical mechanics. A suggestion for workers who wish to try this mobile basis set approach is that one should expand the square root of the distribution function in a basis set, not the distribution itself, or else one would not be able to maintain normalization simultaneously with energy conservation.

3.7. Summary

The Zwanzig-Mori formalism is a simple and flexible mathematical framework for interpreting macroscopic dynamics even in the presence of significant environmental noise. Its range of applicability is very broad, including the dynamics of all natural systems governed by classical or quantum physics. It is based on sound physical principles, and it allows one to extract model parameters from experimental data using one of two variational principles. These variational principles are our principal contribution to the formalism.



DH was supported by NIH 1KL2RR025012-01 and by the NIH Loan Repayment Program. We thank Drs. John Straub, Tom Keyes and Viola Priesemann for helpful comments.

Authors’ Affiliations

Department of Neurology,, University of Wisconsin,


  1. Buzsaki G: Rhythms of the Brain. 2006, New York: Oxford University PressView ArticleGoogle Scholar
  2. Nunez PL: Neocortical Dynamics and Human EEG Rhythms. 1995, New York: Oxford University PressGoogle Scholar
  3. Engel J, Bragin A, Staba R, Mody I: Epilepsia. 2009, 50: 598-604. 10.1111/j.1528-1167.2008.01917.x.View ArticleGoogle Scholar
  4. Hodgkin AL, Huxley AF: J Physiol. 1952, 117: 500-544.View ArticleGoogle Scholar
  5. Traub RD, Miles R: Neuronal Networks of the Hippocampus. 1991, New York: Cambridge University PressView ArticleGoogle Scholar
  6. Freeman WJ: Neurodynamics: An exploration in mesoscopic brain dynamics. 2000, London: Springer VerlagView ArticleGoogle Scholar
  7. Ingber L, Nunez PL: Phys Rev E. 1995, 51: 5074-5083. 10.1103/PhysRevE.51.5074.View ArticleADSGoogle Scholar
  8. Wilson HR, Cowan JD: Biophys J. 1972, 12: 1-24. 10.1016/S0006-3495(72)86068-5.View ArticleADSGoogle Scholar
  9. Hughes JR: Epilepsy Behav. 2008, 12: 128-135. 10.1016/j.yebeh.2007.08.004.View ArticleGoogle Scholar
  10. Mormann F, Andrzejak RG, Elger CE, Lehnertz K: Brain. 2007, 130: 314-333. 10.1093/brain/awl241.View ArticleGoogle Scholar
  11. Zwanzig R: J Chem Phys. 1960, 33: 1338-1341. 10.1063/1.1731409.View ArticleADSMathSciNetGoogle Scholar
  12. Mori H: Prog Theor Phys. 1965, 33: 423-455. 10.1143/PTP.33.423.View ArticleADSGoogle Scholar
  13. Zwanzig R: Nonequilibrium statistical mechanics. 2001, New York: Oxford University PressGoogle Scholar
  14. Berne BJ, Pecora R: Dynamic Light Scattering. 1976, New York: WileyGoogle Scholar
  15. Press WH, Flannery BP, Teukolsky SA, Vetterling WT: Numerical recipes: The art of scientific computing. 1989, New York: Cambridge University PressGoogle Scholar
  16. Keyes T: J Phys Chem A. 1997, 101: 2921-2930. 10.1021/jp963706h.View ArticleGoogle Scholar
  17. La Nave E, Scala A, Starr FW, Sciortino F, Stanley HE: Phys Rev Lett. 2000, 84: 4605-4608. 10.1103/PhysRevLett.84.4605.View ArticleADSGoogle Scholar
  18. Adams JE, Stratt RM: J Chem Phys. 1990, 93: 1332-10.1063/1.459145.View ArticleADSGoogle Scholar
  19. Keyes T: Phys Rev E. 2000, 62: 7905-7908. 10.1103/PhysRevE.62.7905.View ArticleADSGoogle Scholar
  20. Buchner M, Ladanyi BM, Stratt RM: J Chem Phys. 1992, 97: 8522-10.1063/1.463370.View ArticleADSGoogle Scholar
  21. Chialvo DR: 2006, arXiv:q-bio/0610041v1
  22. Plenz D, Thiagarajan TC: Trends Neurosci. 2007, 30: 101-110. 10.1016/j.tins.2007.01.005.View ArticleGoogle Scholar
  23. Beggs JM: Philos Transact A Math Phys Eng Sci. 2008, 366: 329-343. 10.1098/rsta.2007.2092.View ArticleMathSciNetGoogle Scholar
  24. Beggs JM, Plenz D: J Neurosci. 2003, 23: 11167-11177.Google Scholar
  25. Beggs JM, Plenz D: J Neurosci. 2004, 24: 5216-5229. 10.1523/JNEUROSCI.0540-04.2004.View ArticleGoogle Scholar
  26. Kinouchi O, Copelli M: Nat Phys. 2006, 2: 348-352. 10.1038/nphys289.View ArticleGoogle Scholar
  27. Bertschinger N, Natschlager T: Neural Comput. 2004, 16: 1413-1436. 10.1162/089976604323057443.View ArticleGoogle Scholar
  28. Haldeman C, Beggs JM: Phys Rev Lett. 2005, 94: 058101-10.1103/PhysRevLett.94.058101.View ArticleADSGoogle Scholar
  29. Hsu D, Tang A, Hsu M, Beggs JM: Phys Rev E. 2007, 76: 041909-10.1103/PhysRevE.76.041909.View ArticleADSMathSciNetGoogle Scholar
  30. Hsu D, Chen W, Hsu M, Beggs JM: Epilepsy Behav. 2008, 13: 511-522. 10.1016/j.yebeh.2008.05.007.View ArticleGoogle Scholar
  31. Priesemann V, Munk MH, Wibral M: BMC Neurosci. 2009, 10: 40-10.1186/1471-2202-10-40.View ArticleGoogle Scholar
  32. Stanley HE: Introduction to Phase Transitions and Critical Phenomena. 1971, New York: Oxford University PressGoogle Scholar
  33. McQuarrie DA: Statistical Mechanics. 2000, Sausalito, CA: University Science BooksGoogle Scholar
  34. Allen MP, Tildesley DJ: Computer Simulation of Liquids. 1987, Oxford: Clarendon PressGoogle Scholar
  35. Hsu M, Hsu D: Neurocomputing. 2005, 65–66: 469-474. 10.1016/j.neucom.2004.11.003.View ArticleGoogle Scholar
  36. Granger CWJ: Econometrica. 1969, 37: 424-438. 10.2307/1912791.View ArticleGoogle Scholar
  37. Schiff SJ, Sauer T: J Neural Eng. 2008, 5: 1-8. 10.1088/1741-2560/5/1/001.View ArticleGoogle Scholar
  38. Mokshin AV, Yulmetyev RM, Hanggi P: Phys Rev Lett. 2005, 95: 200601-10.1103/PhysRevLett.95.200601.View ArticleADSGoogle Scholar
  39. Yulmetyev RM, Khusaenova EV, Yulmetyeva DG, Hanggi P, Shimojo S, Watanabe K, Bhattacharya J: Math Biosci Eng. 2009, 6: 189-206. 10.3934/mbe.2009.6.189.View ArticleMathSciNetGoogle Scholar
  40. Yip S: Ann Rev Phys Chem. 1979, 30: 547-577. 10.1146/annurev.pc.30.100179.002555.View ArticleADSGoogle Scholar
  41. Golden KI, Kalman G: Physical Review A. 1982, 26: 631-10.1103/PhysRevA.26.631.View ArticleADSGoogle Scholar
  42. Ma J, Hsu D, Straub JE: J Chem Phys. 1993, 99: 4024-4035. 10.1063/1.466098. []View ArticleADSGoogle Scholar
  43. Frenkel J: Wave Mechanics: Advanced General Theory. 1934, Oxford: Clarendon PressGoogle Scholar
  44. Hsu D, Coker DF: J Chem Phys. 1992, 96: 4266-4271. 10.1063/1.462820.View ArticleADSGoogle Scholar
  45. Kay KG: J Chem Phys. 1989, 91: 170-179. 10.1063/1.457631.View ArticleADSGoogle Scholar


© Hsu and Hsu 2009

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.