On the Value of Information and Mean Squared Error for Noisy Gaussian Models

The relationship between the age of information (AoI) and the mean squared error (MSE) in optimisation problems has been widely investigated for various Gaussian Markov models. Recently, we put the AoI in an information-theoretic context and proposed a mutual information-based value of information (VoI) framework to characterise how valuable the status updates are. In this letter, we consider general noisy Gaussian models and investigate the connection between the VoI and the MSE. We show that the VoI can be expressed as a logarithmic function of the MSE. We also study the rates of change of the two metrics with respect to the signal-to-noise ratio (SNR) and the AoI. The results hold for both discrete-time and continuous-time stationary Gaussian processes. This letter illuminates a useful relationship between information theory and signal processing theory in the presence of the AoI and additive noise, and provides a possible way to infer the information value by estimating the MSE in practical applications.


I. INTRODUCTION
The timeliness of data is of great importance for next-generation wireless communication systems. Innovative real-time applications largely rely on time-critical information for timely monitoring and control. A metric named the age of information (AoI) has been introduced to characterise data freshness at the receiver [1]. It is defined as the time difference between the current time and the generation time of the latest received data sample. The AoI and its applications have been widely investigated in various queueing models and wireless networks [2]-[5]. However, the AoI may not be useful in all applications since it is a process-independent metric that grows linearly with unit slope. This means that the AoI is unable to capture non-linearities caused by correlation properties of the underlying physical process.
For the reasons mentioned above, other measures have been proposed in the literature. One method is to use monotonic non-linear AoI functions to measure the freshness or staleness of data. Exponential, logarithmic and step functions have been commonly used as AoI-dependent metrics [6], [7]. The other method is to employ information theory or signal processing theory to characterise the importance of fresh information. The estimation error [8]-[13], mutual information [14] and conditional entropy [15] have been utilised as non-linear functions to characterise the information value. The relationship between the AoI and the mean squared error (MSE) for remote estimation has also been investigated. The authors in [9]-[11] showed that the MSE could be written as a function of the AoI for a Wiener process, an Ornstein-Uhlenbeck (OU) process and a linear time-invariant (LTI) system, respectively. Despite their novelty and contributions, existing works mostly focus on Markov models in which the system performance relates only to the most recent received data sample, and samples are assumed to be observable at the receiver. Since status updates can be negatively affected by noise or other impairments in the transmission system, it may not be reasonable to assume that they are directly visible. In [16], the authors explored the performance of the MSE in the presence of noise but only considered the latest single observation. However, in this case, all the past observations hold valuable information about the underlying random process. Therefore, we require a more appropriate metric to evaluate the information value for latent variable models.
In our previous work [17], [18], a mutual information-based value of information (VoI) framework was proposed to characterise how valuable the status updates are for latent variable models. We explored this framework in the context of a specific noisy Gaussian process (the Ornstein-Uhlenbeck process) and studied its relationship to the AoI.
In this work, we focus on more general noisy Gaussian models to investigate the VoI and its connection to MSE. In [19], an important relationship between MSE and the input-output mutual information was established in the absence of transmission delay. In that work, data samples are treated individually and the data value of each sample does not change with time. In this work, we extend this result to status update systems in the presence of transmission delay. We show that the VoI equals a logarithmic function of MSE, a result that holds for both discrete-time and continuous-time Gaussian processes. We also explore the connection between the rate of change of the VoI and MSE under high and low correlation conditions. This letter illuminates interesting relationships between the timeliness of information and remote estimation in an information-theoretic context, which provides a possible way to infer the information value by utilising the estimation error.

II. VOI AND MSE FORMULATION
We consider the single transmitter-receiver case in which the sensor device is deployed to observe a random process {X_t}. The sensor needs to continuously sample data to get the latest status update of a physical phenomenon and transmit data samples in a timely manner to the receiver. The system model is shown in Fig. 1. We denote X_{t_i} as the status update of the underlying random process and t_i as its generation time at the transmitter. Samples are observed through an additive noise channel. We denote Y_{t'_i} as the corresponding observation and t'_i as its receiving time at the receiver. The estimator uses the observations to provide the optimal estimate X̂_t of the current status of the underlying process. Due to the transmission delay and the noise in the transmission channel, we have t'_i > t_i and Y_{t'_i} ≠ X_{t_i}. For a given time instant t, we denote n as the index of the most recent received data sample (i.e., t > t'_n); n is also the total number of data samples observed at the receiver. In this case, the AoI is given as Δ = t − t_n.
For given time instants t, t_i and t'_i (1 ≤ i ≤ n), the value of information at time t is defined as [17]

v(t) = I(X_t; Y_{t'_n}, …, Y_{t'_{n−m+1}}).    (1)

Here, X_t is the current status of the random process at the transmitter, m represents the memory of the system, and {Y_{t'_n}, …, Y_{t'_{n−m+1}}} is the collection of the most recent m of the n observations captured at the receiver. The value of information is defined as the mutual information between the current status of the underlying hidden process and a sequence of past noisy measurements. This definition is interpreted as the reduction in the uncertainty of the current hidden status given some noisy observations. The VoI characterises how valuable the status updates are from the receiver's perspective.
For remote estimation of the underlying hidden process, the mean squared error at time t is defined as

mse(t) = E[(X_t − X̂_t)²],

where X̂_t is the estimator of the current status X_t. The minimum mean squared error (MMSE) is achieved when X̂_t = E[X_t | Y_{t'_n}, …, Y_{t'_{n−m+1}}]. The MSE measures the error of an estimator of the current unobserved status of the underlying process given some past noisy observations.

A. Noisy Gaussian Model
The underlying hidden process {X_t} is considered to be a stationary Gaussian process. We assume that the samples {X_{t_i}} are Gaussian variables with zero mean and constant variance σ_x². Due to the property of stationarity, we denote ρ_τ (0 < ρ_τ < 1) as the auto-correlation coefficient of two variables X_{t_i} and X_{t_j}, where τ = |t_j − t_i| (Theorem 6.8.2, [20]), i.e.,

ρ_τ = E[X_{t_i} X_{t_j}] / σ_x².

If X_{t_i} and X_{t_j} are less correlated, ρ_τ approaches 0; otherwise, it approaches 1. Let the vector X = [X_{t_n}, …, X_{t_{n−m+1}}]^T be the sequence of status updates of the underlying random process and let K_X be its auto-correlation matrix, i.e.,

[K_X]_{ij} = ρ_{|t_{n−i+1} − t_{n−j+1}|},  1 ≤ i, j ≤ m.

We assume that this random process is observed through an additive Gaussian noise channel. We denote N_{t'_i} as the noise sample induced by the receiver at t'_i. Then, the corresponding noisy measurement is

Y_{t'_i} = X_{t_i} + N_{t'_i}.

The samples {N_{t'_i}} are assumed to be independent and identically distributed (i.i.d.) Gaussian variables with zero mean and constant variance σ_n². Let

γ = σ_x² / σ_n²,

which represents the signal-to-noise ratio (SNR) over the Gaussian channel. Similarly, let the vectors Y = [Y_{t'_n}, …, Y_{t'_{n−m+1}}]^T and N = [N_{t'_n}, …, N_{t'_{n−m+1}}]^T be the noisy observations and noise samples, respectively, and let Σ_Y be the auto-covariance matrix of Y. The observations and covariance matrix can then be collectively represented by

Y = X + N,  Σ_Y = σ_x² K_X + σ_n² I,

where I is the identity matrix.
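To make the model concrete, the following sketch builds K_X and Σ_Y for a small example. The exponential autocorrelation ρ_τ = e^{−θτ}, the sampling times, and all parameter values are illustrative assumptions, not quantities fixed by this letter.

```python
import math

# Illustrative parameters (assumptions, not values from the letter).
theta = 0.5                    # decay rate of an assumed exponential autocorrelation
sigma_x2 = 1.0                 # signal variance sigma_x^2
sigma_n2 = 0.25                # noise variance sigma_n^2
gamma = sigma_x2 / sigma_n2    # SNR over the Gaussian channel

def rho(tau):
    """Autocorrelation coefficient rho_tau for lag tau (assumed exponential form)."""
    return math.exp(-theta * abs(tau))

# Generation times of the m = 3 most recent samples, newest first (hypothetical).
t_samples = [3.0, 2.0, 0.5]
m = len(t_samples)

# [K_X]_{ij} = rho_{|t_i - t_j|} and Sigma_Y = sigma_x^2 K_X + sigma_n^2 I.
K_X = [[rho(t_samples[i] - t_samples[j]) for j in range(m)] for i in range(m)]
Sigma_Y = [[sigma_x2 * K_X[i][j] + (sigma_n2 if i == j else 0.0)
            for j in range(m)] for i in range(m)]
```

By construction K_X is symmetric with unit diagonal, and Σ_Y adds the noise variance only on the diagonal.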

B. Value of Information
Let the m-dimensional vector u be given by

u = [ρ_{t−t_n}, ρ_{t−t_{n−1}}, …, ρ_{t−t_{n−m+1}}]^T.

Lemma 1: The VoI for the noisy Gaussian model is given as

v(t) = −(1/2) log( 1 − γ u^T (I + γ K_X)^{−1} u ).
Proof: See the appendix. Here, γ is the SNR in the transmission environment. The vector u represents the correlation between the current status and the m most recent observations. The first element ρ_{t−t_n} relates to the time difference t − t_n, which is the AoI. When the AoI is very small or the underlying random process is highly correlated, ρ_{t−t_n} is close to 1.
Lemma 1 gives the VoI expression and links it to the AoI for general noisy Gaussian models. The VoI relates the correlation of the underlying random process, the noise over the transmission channel, the number of observations, and the AoI. Similar results hold for specific Gaussian hidden Markov models; details can be found in our previous work [17], [18]. From the operational perspective, the VoI can be interpreted as how many bits of information are contained in a latent state by observing a sequence of visible noisy measurements.
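As a numerical illustration of Lemma 1, the sketch below evaluates the VoI expression for a small model. The correlation values and SNRs are illustrative assumptions; the small Gaussian-elimination solver simply stands in for a library linear solver.

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting (A small)."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(M[r][k]))
        M[k], M[p] = M[p], M[k]
        for r in range(k + 1, n):
            f = M[r][k] / M[k][k]
            for c in range(k, n + 1):
                M[r][c] -= f * M[k][c]
    x = [0.0] * n
    for k in range(n - 1, -1, -1):
        x[k] = (M[k][n] - sum(M[k][c] * x[c] for c in range(k + 1, n))) / M[k][k]
    return x

def voi(gamma, K_X, u):
    """Lemma 1: v(t) = -1/2 log(1 - gamma u^T (I + gamma K_X)^{-1} u)."""
    m = len(u)
    A = [[(1.0 if i == j else 0.0) + gamma * K_X[i][j] for j in range(m)]
         for i in range(m)]
    w = solve(A, u)                      # w = (I + gamma K_X)^{-1} u
    return -0.5 * math.log(1.0 - gamma * sum(a * b for a, b in zip(u, w)))
```

For m = 1 this reduces to −(1/2) log(1 − γρ²/(1+γ)), and the VoI grows with the SNR, as expected.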

C. Mean Squared Error
For remote estimation, the estimator is the conditional expectation given the noisy measurements, i.e., X̂_t = E[X_t | Y = y], which is given by (Theorem 11.1, [21])

X̂_t = σ_x² u^T Σ_Y^{−1} y.    (11)

Based on (11), we can state the following result.

Lemma 2: The MSE for the noisy Gaussian model is given by

mse(t) = σ_x² ( 1 − γ u^T (I + γ K_X)^{−1} u ).
Proof: Since (X_t, Y^T)^T is jointly Gaussian, the error of the conditional-mean estimator in (11) has variance mse(t) = σ_x² − σ_x⁴ u^T Σ_Y^{−1} u. Substituting Σ_Y = σ_x² K_X + σ_n² I and γ = σ_x²/σ_n² gives the stated expression.

The MSE in Lemma 2 shows a similar behaviour to the VoI in Lemma 1. It also relates the SNR γ, the auto-correlation matrix K_X, and the vector u. For the single observation case, it is easy to see that a large SNR or a highly correlated data source leads to a small MSE. Moreover, we have the following result regarding the connection between the MSE and the AoI in this case.
Remark 1: If the correlation coefficient ρ τ is non-increasing when τ increases, the MSE is a non-decreasing function of the AoI.
Proof: The derivative of the MSE with respect to the vector u is given as

∂mse(t)/∂u = −2σ_x² γ (I + γ K_X)^{−1} u.

For the single observation case, the vector u reduces to the scalar ρ_{t−t_n}, which is non-increasing in the AoI Δ = t − t_n, so we have

∂ρ_Δ/∂Δ ≤ 0.

Thus, the derivative of the MSE with respect to the AoI is given as

∂mse(t)/∂Δ = (∂mse(t)/∂ρ_Δ)(∂ρ_Δ/∂Δ) = −(2σ_x² γ ρ_Δ/(1 + γ)) ∂ρ_Δ/∂Δ ≥ 0.

Remark 1 shows that the MSE can be expressed as a non-decreasing function of the AoI for noisy Gaussian models. This remark holds for specific Gaussian processes (e.g., the Wiener process [9] and the OU process [10]). It also holds for exponentially correlated data sources in LTI systems [11].
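A quick numerical check of Remark 1 for the single observation case. The exponential autocorrelation and all parameter values below are illustrative assumptions; any non-increasing ρ_τ would do.

```python
import math

sigma_x2, gamma, theta = 1.0, 4.0, 0.7   # illustrative values (assumptions)

def mse_single(delta):
    """Lemma 2 with m = 1: mse(t) = sigma_x^2 (1 - gamma rho^2 / (1 + gamma)),
    where rho = exp(-theta * delta) is an assumed non-increasing rho_tau."""
    rho = math.exp(-theta * delta)
    return sigma_x2 * (1.0 - gamma * rho * rho / (1.0 + gamma))

# Remark 1: the MSE should be non-decreasing as the AoI grows.
aoi_grid = [0.1 * k for k in range(31)]
mses = [mse_single(d) for d in aoi_grid]
assert all(a <= b + 1e-12 for a, b in zip(mses, mses[1:]))
```

At Δ = 0 the MSE equals σ_x²/(1+γ); as Δ grows it approaches the prior variance σ_x², i.e., the observations become useless.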

IV. RELATIONSHIP BETWEEN VOI AND MSE
Based on Lemmas 1 and 2, we can state the following main result.
Proposition 1: For a fixed time instant t, the relationship between the VoI and the MSE for noisy Gaussian models satisfies

v(t) = −(1/2) log( mse(t) / σ_x² ).

This proposition links the VoI to a logarithmic function of the MSE. In practice, if we consider a communication system and try to determine how valuable the status updates are, it may not be easy to estimate the mutual information directly, but it is easy to estimate the MSE. Therefore, by estimating the MSE and using this link, it is possible to infer the value of information.
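The identity in Proposition 1 is easy to verify numerically in the single observation case; the parameter values below are illustrative assumptions.

```python
import math

# Single observation case (m = 1) with illustrative parameters.
sigma_x2, gamma, rho = 2.0, 3.0, 0.6
q = gamma * rho * rho / (1.0 + gamma)

voi = -0.5 * math.log(1.0 - q)        # Lemma 1 with m = 1
mse = sigma_x2 * (1.0 - q)            # Lemma 2 with m = 1

# Proposition 1: v(t) = -1/2 log(mse(t) / sigma_x^2).
assert abs(voi - (-0.5) * math.log(mse / sigma_x2)) < 1e-12
```

The normalisation by σ_x² matters: the VoI depends on the MSE only through the relative error mse(t)/σ_x².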
Proposition 1 illustrates the general relationship between the VoI and the MSE with multiple observations. In the following part, we consider the single observation case to explore this result further. The VoI and MSE can be calculated by replacing the m-dimensional vector u and the matrix K_X with the scalar ρ_{t−t_n} and 1, respectively. Then, we have the following corollaries.
Corollary 1: If the latest sample X_{t_n} and the current status X_t of the underlying random process are highly correlated (i.e., ρ_{t−t_n} → 1), the relationship between the VoI and the MSE can be written as

∂v(t)/∂γ = mse(t)/(2σ_x²).    (18)

If the latest sample X_{t_n} and the current status X_t of the underlying random process are uncorrelated (i.e., ρ_{t−t_n} → 0), we have

mse(t) = σ_x²,  ∂v(t)/∂γ = 0.    (19)

In this corollary, (18) links the derivative of the VoI with respect to the SNR to the MSE for highly correlated data samples. A high correlation condition means that either the AoI is very small (i.e., t − t_n → 0) or the underlying data source does not change quickly with time (e.g., the random process {X_t} is a constant). This relationship covers the result in [19], which links the derivative of the input-output mutual information to the MSE in the absence of transmission delay. For less correlated data samples, (19) shows that the MSE is independent of the SNR. Moreover, a low correlation condition yields zero rate of change of the VoI with increasing SNR.

Corollary 2: For the single observation case, the rate of change of the VoI with respect to the AoI Δ = t − t_n satisfies

∂v(t)/∂Δ = (σ_x² γ ρ_Δ / ((1 + γ) mse(t))) ∂ρ_Δ/∂Δ.    (20)

This corollary shows that the rate of change of the VoI is not equivalent to the MSE when the AoI increases. Based on Remark 1, it is easy to see that the VoI is a non-increasing function of the AoI. When the estimation error is large, the rate of change of the VoI approaches 0 and no longer depends on the AoI.
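The high- and low-correlation limits in (18) and (19) can be checked with a finite difference. The values of γ and ρ below are illustrative assumptions, with ρ taken close to 1 (respectively close to 0).

```python
import math

def voi(gamma, rho):
    """Lemma 1 with m = 1."""
    return -0.5 * math.log(1.0 - gamma * rho * rho / (1.0 + gamma))

def mse(gamma, rho, sigma_x2=1.0):
    """Lemma 2 with m = 1."""
    return sigma_x2 * (1.0 - gamma * rho * rho / (1.0 + gamma))

gamma, rho, h = 2.0, 0.9999, 1e-6

# (18): dv/dgamma approaches mse / (2 sigma_x^2) as rho -> 1.
dv = (voi(gamma + h, rho) - voi(gamma - h, rho)) / (2.0 * h)
assert abs(dv - mse(gamma, rho) / 2.0) < 1e-3

# (19): for rho -> 0 the VoI is nearly insensitive to the SNR.
dv0 = (voi(gamma + h, 1e-4) - voi(gamma - h, 1e-4)) / (2.0 * h)
assert abs(dv0) < 1e-6
```

With ρ = 1 exactly, v(t) = (1/2) log(1+γ) and mse(t) = σ_x²/(1+γ), so both sides of (18) equal 1/(2(1+γ)).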

V. CONCLUSION
In this letter, we studied the connection between the VoI and the MSE for general noisy Gaussian models. The proposed mutual information-based VoI framework represents the reduction in the uncertainty of the current status given some past noisy observations. This letter showed that the VoI can be expressed as a logarithmic function of the MSE, which gives a possible method to evaluate the information value by estimating the MSE in practical applications. Moreover, the rates of change of the VoI and the MSE with respect to the SNR and the AoI were investigated in high and low correlation regimes, which further illustrated the connections among the AoI, VoI and MSE. Future work can focus on exploring the effect of different autocorrelation functions on the VoI and its relationship to the MSE.

APPENDIX
The VoI definition given in (1) follows from the relation I(X_t; Y) = h(X_t) + h(Y) − h(X_t, Y). Since (X_t, Y^T)^T is multivariate Gaussian and stationary distributed, the VoI can be further written as (Example 8.5.1, [22])

v(t) = (1/2) log( σ_x² det Σ_Y / det Σ_{X_t,Y} ),    (21)

where Σ_{X_t,Y} is the covariance matrix of (X_t, Y^T)^T. The covariance between X_t and Y_{t'_i} is given as

E[X_t Y_{t'_i}] = E[X_t X_{t_i}] = σ_x² ρ_{t−t_i}.

Therefore, Σ_{X_t,Y} can be presented in terms of Σ_Y, i.e.,

Σ_{X_t,Y} = [[σ_x², σ_x² u^T], [σ_x² u, Σ_Y]].    (23)

Substituting (23) into (21), the VoI is given as

v(t) = (1/2) log( σ_x² det Σ_Y / det Σ_{X_t,Y} )
     (a)= −(1/2) log( 1 − σ_x² u^T Σ_Y^{−1} u )
     (b)= −(1/2) log( 1 − γ u^T (I + γ K_X)^{−1} u ),

where (a) is obtained by the block matrix determinant det Σ_{X_t,Y} = (σ_x² − σ_x⁴ u^T Σ_Y^{−1} u) det Σ_Y, and (b) is obtained by the determinant lemma.