Pandemic life and death: WHO's estimates anyway?

The World Health Organization (WHO) recently estimated that around 4.7 million Indians died of Covid-related causes in 2020 and 2021. While these numbers, which are many times higher than India’s official figures, have made headlines, the true value of a statistical estimation exercise lies in its methodology.

Broadly, the WHO estimates the expected monthly mortality for 2020 and 2021 from historical data, compares it with the actual all-cause mortality (ACM) for 2020-2021, and interprets the difference as excess mortality attributable to Covid-19. This is fine in theory, except that for India, WHO does not have monthly ACM data for 2020-2021. To address this “data gap”, WHO “estimates” the monthly ACM for India.
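The arithmetic behind an excess-death figure can be illustrated with a minimal sketch. The function name and the monthly numbers below are hypothetical and chosen only for illustration; they are not WHO or Indian data, and the sketch does not reproduce the WHO's actual model.

```python
def excess_deaths(expected, observed):
    """Monthly excess deaths: observed all-cause mortality minus expected."""
    if len(expected) != len(observed):
        raise ValueError("both series must cover the same months")
    return [obs - exp for exp, obs in zip(expected, observed)]


# Hypothetical monthly figures, in thousands of deaths (illustrative only).
expected_acm = [800, 790, 810]   # projected from historical data
observed_acm = [820, 900, 1010]  # for India, this series is itself estimated

monthly_excess = excess_deaths(expected_acm, observed_acm)
total_excess = sum(monthly_excess)
print(monthly_excess, total_excess)
```

When the observed series is itself a model output, as it is for India, any error in that series passes one-for-one into the excess-death figure.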

Thus, the first thing to note is that the WHO estimate of excess deaths is based on estimates themselves and not on observed figures. In addition, WHO uses different models to estimate the ACM for different countries. In the absence of evidence that these models are comparable, which the WHO does not provide, this renders cross-country comparisons of the ACM and, by extension, of excess deaths completely uninformative.

Another point to note is that WHO uses data from 17 states to estimate the national ACM for India. However, the source of this data is shrouded in mystery. WHO does not disclose which states are part of the sample. A press release from the Government of India states that this data is unverified. WHO recognizes that the data cannot be treated as figures “officially produced by India”; it also acknowledges that, in addition to official data, it “uses data provided by journalists receiving death registration information through Right to Information requests.” From a statistical point of view, this is problematic because, as WHO recognizes, different sources process and record mortality data in different ways. This means that pooling data from different sources can introduce systematic biases into the sample.

Thus, it is important to note that the 4.7 million number is an estimate that is itself based on estimates that are based on unverified data.

The third problematic aspect of WHO’s estimation exercise is the statistical method it uses to estimate the ACM for India from state-level data. Its methodology is based on Karlinsky (2022), but is much more complex. This is because Karlinsky uses one province (Córdoba) to estimate the national ACM for Argentina, while the WHO uses 17 Indian states to estimate the national ACM for India. This is compounded by the fact that the number of states for which data is available varies by month. Not only does WHO not disclose which states are in its sample, it also does not state which months are observed for each of them.

Thus, in a nutshell, the WHO’s estimates are based on estimates drawn from unverified data, obtained using a methodology whose validity for India is questionable at best.

Going deeper into the methodology reveals even more troubling issues of implementation. WHO acknowledges that the accuracy of its statistical method depends on two important assumptions. First, that the distribution of the pandemic over time is the same for the states in its sample as for the country as a whole. Second, that the share of these states in the country’s total deaths has remained stable at its historical level during the pandemic.
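Why the second assumption matters can be seen in a deliberately simplified sketch. It scales deaths in the sampled states up to a national figure using a fixed historical share; this is a stand-in for illustration, not the WHO's actual model, and every name and number in it is hypothetical.

```python
def scale_to_national(sample_deaths, historical_share):
    """Estimate national deaths by dividing sample deaths by the sample's share."""
    if not 0 < historical_share <= 1:
        raise ValueError("share must lie in (0, 1]")
    return sample_deaths / historical_share


# Hypothetical inputs (illustrative only), deaths in thousands.
sample_deaths = 500       # deaths recorded in the sampled states
historical_share = 0.5    # sample's pre-pandemic share of national deaths
pandemic_share = 0.625    # assumed true share if sampled states were hit harder

estimate = scale_to_national(sample_deaths, historical_share)
true_national = scale_to_national(sample_deaths, pandemic_share)
overestimate = estimate - true_national
print(estimate, true_national, overestimate)
```

In this toy case, a shift in the share from 0.5 to 0.625 is enough to overstate national deaths by a quarter of the true figure, which is why evidence on the stability of the share is essential.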

While Karlinsky documented evidence for these assumptions in his study, WHO provides no evidence that they hold for the data used for India. Unlike Karlinsky, the WHO does not even present a rudimentary graph showing whether the states in the sample went through the peaks and troughs of the pandemic at the same time as the country as a whole. This is critical. Given India’s geographic expanse, and given that the spatial and temporal distribution of the pandemic varied widely across Indian states, using state-level data that does not reflect the national spread of the pandemic could inflate the ACM estimates.

Even more surprising is that the WHO claims to have validated this model in a simulation, for which it gives no details, but did not validate it empirically, even though empirical validation would have been easy. One, it could have applied the model to countries for which both sub-national and national data are available and verified whether the model-implied national ACM matches the actual observed ACM. Two, WHO could have applied the model to pre-pandemic sub-national and national data from India and verified whether the model-implied ACM matched the pre-pandemic ACM for India.
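An empirical check of this kind is simple to set up. The sketch below back-tests a share-based scaling rule, used here as a simplified stand-in for the WHO's model, against years in which both sub-national and national totals are known. The function name and all figures are hypothetical.

```python
def backtest_relative_errors(sample_deaths, national_deaths, share):
    """Relative error of the model-implied national total for each period."""
    errors = []
    for sample, national in zip(sample_deaths, national_deaths):
        implied = sample / share  # model-implied national deaths
        errors.append((implied - national) / national)
    return errors


# Hypothetical pre-pandemic data (illustrative only), deaths in thousands.
sample_by_year = [400, 420, 450]     # deaths in the sampled states
national_by_year = [800, 800, 1000]  # observed national deaths
assumed_share = 0.5                  # share the model takes as fixed

errors = backtest_relative_errors(sample_by_year, national_by_year, assumed_share)
max_abs_error = max(abs(e) for e in errors)
print(errors, max_abs_error)
```

Large or one-sided errors in such a back-test would indicate that the scaling assumption fails even before the pandemic, and reporting them would let readers judge the model for themselves.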

This study does not constitute an isolated case of statistical misadventure. It is part of a series of sordid events, including WHO’s inaction on Covaxin, its slow investigation into the origins of Covid-19, persistent allegations of under-reporting of data by the Indian government, an endless procession of ‘Indian-origin’ ‘scientists’ on television flaunting their ‘scientism’ with wrong predictions, and finally, hordes of ‘independent’ journalists scouring crematoriums and cemeteries with cameras and drones, ruthlessly trampling on the misery of bereaved families in search of those elusive bodies that might prove a massive undercount of the dead in India.

These are the personal views of the authors.

V. Ananth Nageswaran and Diva Jain are, respectively, Chief Economic Adviser to the Government of India and Director at Arjav.
