On Parameter Estimation by Aggregated Poisson Observations

We consider the problem of parameter estimation from observations of inhomogeneous Poisson processes. The intensity function of the process is supposed to be a smooth function of the unknown parameter. We propose a Chi-square statistic on the basis of aggregated observations and define a Minimum Chi-square Estimator with the help of this statistic. We show that this estimator is consistent and asymptotically normal, and we discuss possible generalizations of the obtained results.


Introduction
Statistical inference for inhomogeneous Poisson processes provides interesting problems of mathematical statistics. It consists in inferring the unknown characteristics of the observed statistical model, which appear in the intensity function of the inhomogeneous Poisson process. We can mention, among others, Kutoyants [11,12,13], Karr [10], Kutoyants and Spokoiny [14], Aubry and Dabye [1], Ba and Dabye [2]. The most important aspects of inference are the theory of estimation and hypothesis testing. The method of Minimum Chi-square Estimation occupies an important place in this field. There are many applications in biology (Greenwood and Nikulin [6]), astrophysics (Mighell [15]), and nuclear physics (Harris and Kandji [7]), as well as in the references therein. The list is not exhaustive.
The Chi-square goodness-of-fit test was introduced in the work of Karl Pearson in the early twentieth century. Originally, it addressed a simple hypothesis testing problem, studied by means of the statistic built by Pearson that now bears his name. Many contributions have been made since. Over the years, given the changing contexts, this method was applied to many different statistical models and has led to many other interesting tests. In the work by Greenwood and Nikulin [6], one can find a detailed summary of the history of the development of this method by many researchers in this field.
The controversy between Pearson and Fisher regarding the distribution to be compared is one of the great stories in the history of statistics.
Nowadays we know that the results of this remarkable discovery are only true in the case of simple null hypotheses. In 1924, Fisher proved that in the case of composite hypotheses, where parameters need to be estimated, the limit distribution of Pearson's statistic is no longer valid. Moreover, this limit distribution depends on the method used to estimate the parameters. In particular, if the parameters are estimated by minimizing Pearson's statistic, or by any other asymptotically equivalent method, then the limit distribution is still Chi-square, but the number of degrees of freedom is reduced by the dimension of the estimated parameter. Chernoff and Lehmann proved in [4] that when the maximum likelihood estimator is computed from the raw (ungrouped) data, the limit distribution of Pearson's statistic changes dramatically: it depends on the unknown parameter and is therefore unusable for the test. Once again, the problem of constructing an asymptotically distribution-free statistic arises.
The method of Maximum Likelihood Estimation is widely used, and its sovereignty has been questioned by Berkson [3], who argues that Minimum Chi-square, and not Maximum Likelihood, is the basic principle of statistical inference. In any application there is a wide class of estimators that Berkson would classify as Minimum Chi-square estimators, whereas in the literature the definition of such a class is not unambiguous. Moreover, according to Berkson, the formulation of Pearson's Chi-square statistic requires data from a discrete distribution, or data from a continuous distribution grouped into intervals (or, more generally, subsets) of the sample space. There is no doubt that the minimization of Pearson's Chi-square statistic almost invariably requires tedious and complex procedures, but current authors believe that the preference for maximum likelihood is perhaps due more to prejudice than to real computational problems.
Without going so far as to advocate the universal application of estimation based on the minimization of Chi-square statistics, they feel that there are situations where it may be preferable to use other methods and, if necessary, to rely on modern powerful computational techniques.
Shortly after, Greenwood and Nikulin [6] established, under some regularity conditions and for any continuous probability law, a limit distribution, free of the unknown parameter, for the statistic constructed from the Maximum Likelihood Estimator (MLE). Also worth mentioning is Moore's significant contribution in [16], which summarized all these results. He showed, in particular, that the modifications made to Pearson's statistic derive from Wald's approach [17].
Many researchers preferred to focus their work on the tests rather than on Minimum Chi-square estimation. We also note that there are almost no works on Poisson processes in continuous time.
In this paper, after introducing our statistical model of continuous-time Poisson processes, we introduce a Chi-square type statistic and construct an estimator of the parameter ϑ with the help of this statistic. Subsequently, we study the asymptotic properties (consistency and asymptotic normality) of this estimator. Finally, we outline perspectives and propose possible generalizations of the obtained results.

Statistical model and the Minimum Chi-square Estimator
Suppose that we observe n independent samples X_j = (X_j(t), 0 ≤ t ≤ T), j = 1, …, n, of an inhomogeneous Poisson process whose intensity function depends on an unknown parameter ϑ ∈ Θ. The mean and intensity functions are denoted Λ(ϑ, ·) and λ(ϑ, ·) respectively, i.e.,

Λ(ϑ, t) = E_ϑ X_j(t) = ∫_0^t λ(ϑ, s) ds,  0 ≤ t ≤ T.
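For illustration, one path of such a process can be simulated by Lewis–Shedler thinning. This is only a sketch: the intensity λ(ϑ, t) = ϑt used below is a hypothetical example, not the model of the paper.

```python
import random

def simulate_path(lam, lam_max, T, rng):
    """One path of an inhomogeneous Poisson process on [0, T] by thinning:
    propose points from a homogeneous process of rate lam_max and accept
    each proposal t with probability lam(t) / lam_max."""
    t, events = 0.0, []
    while True:
        t += rng.expovariate(lam_max)          # next candidate point
        if t > T:
            return events
        if rng.random() * lam_max <= lam(t):   # thinning step
            events.append(t)

# hypothetical example: lambda(theta, t) = theta * t, so Lambda(theta, T) = theta * T^2 / 2
theta0, T = 2.0, 5.0
path = simulate_path(lambda t: theta0 * t, theta0 * T, T, random.Random(1))
```

The expected number of events on [0, T] is Λ(ϑ_0, T) = ϑ_0 T²/2 = 25 for these illustrative values.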
Our goal is to construct an estimator of ϑ from the aggregated data. We introduce a partition A of the interval [0, T] in the following way: we choose numbers a_0, …, a_L such that

0 = a_0 < a_1 < · · · < a_L = T,

and we obtain the intervals

A_l = (a_{l-1}, a_l],  l = 1, …, L.

Note that these intervals verify the following relations: they are disjoint, A_l ∩ A_m = ∅ for l ≠ m, and together they cover (0, T]. Here and in the sequel, in the notation "Λ(ϑ, A_l)", Λ is the intensity measure, i.e., Λ(ϑ, A_l) = Λ(ϑ, a_l) − Λ(ϑ, a_{l-1}). Let us denote as well

X_j^l = X_j(a_l) − X_j(a_{l-1}),  l = 1, …, L,  j = 1, …, n;

then X_j = (X_j^1, …, X_j^L), j = 1, …, n, are independent Poisson random vectors which verify E_ϑ X_j^l = Λ(ϑ, A_l). Our goal is to introduce the Chi-square type statistic and to construct an estimator of the parameter ϑ with the help of this statistic.
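A minimal sketch of the aggregation step: the event times of one path are grouped into the counts X_j^l over the intervals A_l = (a_{l-1}, a_l]. The breakpoints and event times below are arbitrary illustrative values.

```python
def aggregate(events, breakpoints):
    """Group the event times of one path into counts over the intervals
    A_l = (a_{l-1}, a_l], given breakpoints [a_0, a_1, ..., a_L]."""
    counts = [0] * (len(breakpoints) - 1)
    for t in events:
        for l in range(len(breakpoints) - 1):
            if breakpoints[l] < t <= breakpoints[l + 1]:
                counts[l] += 1
                break
    return counts

# illustrative partition of [0, 5] with L = 3 intervals
x_j = aggregate([0.3, 0.9, 1.1, 2.6, 4.9], [0.0, 1.0, 2.5, 5.0])  # one vector X_j
```

Each component of `x_j` is Poisson-distributed with mean Λ(ϑ, A_l) when the events come from the process above.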
Let us put

X̄^l = n^{-1} Σ_{j=1}^n X_j^l,  l = 1, …, L.

We have to estimate ϑ by the observations X^n = (X_1, …, X_n). We denote by ϑ_0 the true value of the parameter. The derivative of any function f(ϑ, t) (ϑ ∈ Θ, t ∈ [0, T]) with respect to ϑ will always be denoted ḟ(ϑ, t).
Let us introduce the Chi-square type statistic

T_n(ϑ) = n Σ_{l=1}^L (X̄^l − Λ(ϑ, A_l))^2 / Λ(ϑ, A_l),

where X̄^l is the empirical mean of the X_j^l over j. Define the Minimum Chi-square Estimator (MCE) ϑ_n, which minimizes the function T_n(ϑ), as a solution of the equation

T_n(ϑ_n) = inf_{ϑ ∈ Θ} T_n(ϑ).

If this equation has more than one solution, then any of them can be taken as the MCE. We are interested in the asymptotic properties of the MCE for large samples (n → ∞). We need the following conditions.
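As a concrete sketch, the statistic T_n can be minimized numerically. The normalization of T_n and the intensity λ(ϑ, t) = ϑt below are assumptions for illustration, and the minimization is a simple grid search over the parameter set Θ.

```python
def Lam(theta, a, b):
    # Lambda(theta, (a, b]) for the hypothetical intensity lambda(theta, t) = theta * t
    return theta * (b * b - a * a) / 2.0

def T_n(theta, xbar, n, bp):
    # assumed form: T_n(theta) = n * sum_l (xbar_l - Lambda_l)^2 / Lambda_l
    return n * sum((xbar[l] - Lam(theta, bp[l], bp[l + 1])) ** 2
                   / Lam(theta, bp[l], bp[l + 1])
                   for l in range(len(bp) - 1))

def mce(xbar, n, bp, grid):
    # Minimum Chi-square Estimator by grid search over Theta
    return min(grid, key=lambda th: T_n(th, xbar, n, bp))

bp = [0.0, 1.0, 2.5, 5.0]
# empirical means chosen equal to Lambda(2, A_l), so the minimizer is exactly 2
theta_hat = mce([1.0, 5.25, 18.75], 100, bp, [1.0 + 0.5 * k for k in range(5)])
```

A grid search stands in for the root of the estimating equation; any numerical minimizer over Θ could be used instead.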

Consistency
Theorem 1 Suppose that the conditions C_1 and C_2 hold. Then the MCE ϑ_n is consistent, i.e., for any γ > 0, P_{ϑ_0}(|ϑ_n − ϑ_0| > γ) → 0 as n → ∞. Proof. Using the properties of the norm, for any γ > 0 we can bound this probability; here and in the sequel, for a = (a_1, …, a_L), ‖a‖ denotes the Euclidean norm in R^L. The bound then follows from the Chebyshev inequality, and we obtain the stated convergence. The theorem is proved.
Remark 1 The proof of consistency requires no regularity condition on the model. This means that the obtained result remains valid for discontinuous intensity functions λ(ϑ, ·).
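A quick Monte Carlo check of consistency, under the same assumed setup as before (hypothetical intensity λ(ϑ, t) = ϑt, assumed chi-square form of T_n, grid search over Θ = [1, 3]): the estimation error is, on average, smaller for larger n.

```python
import math, random

def poisson_sample(mu, rng):
    # Knuth's multiplicative method (adequate for moderate mu)
    limit, k, p = math.exp(-mu), 0, 1.0
    while p > limit:
        k += 1
        p *= rng.random()
    return k - 1

def Lam(theta, a, b):
    # hypothetical intensity lambda(theta, t) = theta * t
    return theta * (b * b - a * a) / 2.0

def mce(xbar, bp, grid):
    # argmin of the (assumed) chi-square statistic; the factor n does not
    # change the minimizer, so it is omitted here
    def T(th):
        return sum((xbar[l] - Lam(th, bp[l], bp[l + 1])) ** 2
                   / Lam(th, bp[l], bp[l + 1]) for l in range(len(bp) - 1))
    return min(grid, key=T)

def estimate(n, theta0, bp, grid, rng):
    L = len(bp) - 1
    xbar = [0.0] * L
    for _ in range(n):                      # aggregate n independent vectors
        for l in range(L):
            xbar[l] += poisson_sample(Lam(theta0, bp[l], bp[l + 1]), rng) / n
    return mce(xbar, bp, grid)

rng = random.Random(7)
bp = [0.0, 1.0, 2.5, 5.0]
grid = [1.0 + 0.01 * k for k in range(201)]     # Theta = [1, 3]
est_small = estimate(20, 2.0, bp, grid, rng)
est_large = estimate(2000, 2.0, bp, grid, rng)
```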

Asymptotic normality
In this section we prove the weak convergence of the MCE ϑ_n to a normal law. By "⇒" we denote the convergence in distribution. The following theorem is valid.
Theorem 2 If the conditions C are fulfilled, then the MCE ϑ_n is asymptotically normal. Proof. Let us recall that the MCE ϑ_n is a solution of the equation defining the minimum of T_n(ϑ). The function T_n(ϑ) is differentiable with respect to ϑ. The MCE ϑ_n minimizes the function T_n(ϑ) and is consistent for ϑ_0 in the open set Θ. So the MCE lies in the open set Θ with a probability converging to 1, and with the same probability the derivative of T_n vanishes at ϑ_n. Calculating the derivative of T_n(ϑ) with respect to ϑ transforms equation (2), and, according to the weak law of large numbers, the empirical means converge to their expectations. As the MCE is consistent and the functions Λ̇(ϑ, A_l), Λ(ϑ, A_l) are continuous, equation (3) can be rewritten up to an o(1) term, where o(1) denotes convergence to zero in probability.
Let us introduce the random variable ζ_n(ϑ, A_l). Adding and subtracting Λ(ϑ_0, A_l) to ζ_n(ϑ_n, A_l) and using the Taylor expansion, we obtain a representation in which θ̃_n is an intermediate value between ϑ_0 and ϑ_n. Collecting these relations yields the asymptotic normality, and the theorem is proved.

Remark 2 According to the theorem, the limit variance of the MCE is the same as the limit variance of the MLE. This is an important property of the MCE.
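Theorem 2 can be illustrated numerically under the same assumed setup (hypothetical intensity λ(ϑ, t) = ϑt on [0, 5], assumed chi-square normalization, grid search): the sample standard deviation of √n(ϑ_n − ϑ_0) over replications should be close to the limit value D(ϑ_0)^{-1/2} = (Σ_l Λ̇_l²/Λ_l)^{-1/2}, which equals 0.4 in this example at ϑ_0 = 2.

```python
import math, random

def poisson_sample(mu, rng):
    # Knuth's multiplicative method
    limit, k, p = math.exp(-mu), 0, 1.0
    while p > limit:
        k += 1
        p *= rng.random()
    return k - 1

def Lam(theta, a, b):
    # hypothetical intensity lambda(theta, t) = theta * t
    return theta * (b * b - a * a) / 2.0

def mce(xbar, bp, grid):
    def T(th):
        return sum((xbar[l] - Lam(th, bp[l], bp[l + 1])) ** 2
                   / Lam(th, bp[l], bp[l + 1]) for l in range(len(bp) - 1))
    return min(grid, key=T)

def replicate(R, n, theta0, bp, grid, rng):
    # R independent values of sqrt(n) * (estimate - theta0)
    out = []
    for _ in range(R):
        L = len(bp) - 1
        xbar = [0.0] * L
        for _ in range(n):
            for l in range(L):
                xbar[l] += poisson_sample(Lam(theta0, bp[l], bp[l + 1]), rng) / n
        out.append(math.sqrt(n) * (mce(xbar, bp, grid) - theta0))
    return out

rng = random.Random(3)
bp = [0.0, 1.0, 2.5, 5.0]
grid = [1.0 + 0.01 * k for k in range(201)]
z = replicate(300, 200, 2.0, bp, grid, rng)
sd = math.sqrt(sum(v * v for v in z) / len(z))
# for this example D(theta0) = sum_l Lam_l / theta0^2 = 25/4, so D^(-1/2) = 0.4
```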
The detailed analysis of the proof shows that we have the convergence of moments too.

On asymptotic efficiency

Let us fix a partition A and denote by P_ϑ, ϑ ∈ Θ, the family of measures induced in the space R^{nL} by the observations X^n = (X_1, …, X_n). The likelihood ratio L(ϑ, X^n) is given by the standard formula for Poisson observations. We suppose that the conditions C are fulfilled and put φ_n = n^{-1/2}. Then the normalized likelihood ratio admits an expansion with a remainder ε_n → 0 (convergence in probability), and therefore the family of measures P_ϑ, ϑ ∈ Θ, is locally asymptotically normal at every point ϑ_0 ∈ Θ. Note that the MLE θ̂_n, defined by the relation L(θ̂_n, X^n) = sup_{ϑ ∈ Θ} L(ϑ, X^n), is one of the solutions of the maximum likelihood equation. This estimator is consistent, asymptotically normal, √n(θ̂_n − ϑ_0) ⇒ N(0, I(ϑ_0)^{-1}), and asymptotically efficient (see [11] and [9]). Thanks to the convergence (6), we can pass to the limit as n → ∞ and δ → 0; therefore, the MCE ϑ_n is asymptotically efficient, and it is asymptotically normal with the limit covariance matrix D(ϑ_0)^{-1}.

Let us consider the following problem. Suppose we have the possibility to choose the partition A = (A_l, l = 1, …, L). The following question naturally arises: how to choose it in an optimal way? We know that the limit variance of the MCE is D(ϑ_0, A)^{-1}, where

D(ϑ_0, A) = Σ_{l=1}^L Λ̇(ϑ_0, A_l)^2 / Λ(ϑ_0, A_l).

Therefore, in order to minimize D(ϑ_0, A)^{-1} we have to maximize D(ϑ_0, A). In other words, we have to find A* = (A*_l, l = 1, …, L) such that

D(ϑ_0, A*) ≥ D(ϑ_0, A)

for any choice of the partition A = (A_l, l = 1, …, L). Of course, the optimal choice A* = A*(ϑ_0) depends on the unknown parameter ϑ_0, and the partition A*(ϑ_0) is obtained as the solution of the corresponding optimization problem. A similar problem for the MLE was considered by Kutoyants and Spokoiny [14] (see also Kutoyants [13], Section 4.3).
As in these works, the solution is carried out in two steps. First, we construct a preliminary consistent estimator θ̄_N of the parameter ϑ_0 from one part of the observations, X^N = (X_1, …, X_N) with N < n; to do this we can take any partition A satisfying the conditions C. Then, using this estimator, we take the partition A*(θ̄_N) and prove the asymptotic "optimality" of this construction.
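A sketch of the second step under stated assumptions: with a hypothetical intensity λ(ϑ, t) = exp(ϑt) on [0, 2], L = 2 intervals, and a preliminary estimate θ̄_N = 0.5, the interior breakpoint a_1 is chosen to maximize D(θ̄_N, A); Λ and Λ̇ are computed by midpoint quadrature.

```python
import math

def Lam_dLam(theta, a, b, m=200):
    # midpoint-rule integrals of lambda and d lambda / d theta over (a, b]
    # hypothetical intensity: lambda(theta, t) = exp(theta * t)
    h = (b - a) / m
    Lam = dLam = 0.0
    for i in range(m):
        t = a + (i + 0.5) * h
        lam = math.exp(theta * t)
        Lam += lam * h
        dLam += t * lam * h      # d/dtheta exp(theta * t) = t * exp(theta * t)
    return Lam, dLam

def D(theta, bp):
    # D(theta, A) = sum_l dLambda(theta, A_l)^2 / Lambda(theta, A_l)
    s = 0.0
    for l in range(len(bp) - 1):
        Lam, dLam = Lam_dLam(theta, bp[l], bp[l + 1])
        s += dLam * dLam / Lam
    return s

def best_breakpoint(theta_bar, T, candidates):
    # second step: pick the interior breakpoint a_1 maximizing D(theta_bar, A)
    return max(candidates, key=lambda a1: D(theta_bar, [0.0, a1, T]))

a1_star = best_breakpoint(0.5, 2.0, [0.1 * k for k in range(1, 20)])
```

For an intensity linear in ϑ the criterion does not depend on the partition, which is why a nonlinear hypothetical intensity is used here.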

Another interesting problem is the choice of the partition in such a way that the contributions of the observations from the different intervals to the limit variance of the MCE are equal. This means that A is chosen so that

d(ϑ_0) = Λ̇(ϑ_0, A_1)^2 / Λ(ϑ_0, A_1) = Λ̇(ϑ_0, A_l)^2 / Λ(ϑ_0, A_l),  l = 2, …, L.
Once more, the "optimal" A depends on the unknown parameter ϑ_0. The solution of this problem can also be carried out in two steps: first, we construct, as above, a consistent estimator, and then we use it to define the sets.
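A sketch of the equal-contribution condition for L = 2, under the same hypothetical intensity λ(ϑ, t) = exp(ϑt): the contribution of (0, a_1] increases in a_1 while that of (a_1, T] decreases, so the equalizing breakpoint can be found by bisection.

```python
import math

def Lam_dLam(theta, a, b, m=400):
    # midpoint-rule integrals of lambda and d lambda / d theta over (a, b]
    # hypothetical intensity: lambda(theta, t) = exp(theta * t)
    h = (b - a) / m
    Lam = dLam = 0.0
    for i in range(m):
        t = a + (i + 0.5) * h
        lam = math.exp(theta * t)
        Lam += lam * h
        dLam += t * lam * h
    return Lam, dLam

def contribution(theta, a, b):
    # d = dLambda^2 / Lambda : one interval's contribution to the limit variance
    Lam, dLam = Lam_dLam(theta, a, b)
    return dLam * dLam / Lam

def equalizing_breakpoint(theta, T, tol=1e-6):
    # bisection on a_1: the contribution of (0, a_1] grows from 0 while that
    # of (a_1, T] shrinks to 0, so their difference changes sign exactly once
    lo, hi = tol, T - tol
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if contribution(theta, 0.0, mid) < contribution(theta, mid, T):
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

a1 = equalizing_breakpoint(0.5, 2.0)
```

With a preliminary estimate in place of the fixed θ = 0.5, this is exactly the two-step construction described above.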
4. The obtained results allow constructing the corresponding goodness-of-fit test with a basic parametric hypothesis. The test can be asymptotically distribution free, but it is consistent only against a special class of alternatives: as in the classical case, alternatives which yield the same values of Λ(ϑ_0, A_l), l = 1, …, L, cannot be detected by the test.