Statistical sources and African post-colonial economic history: Notes from the (digital) archives

ABSTRACT While interest in African economic history has grown rapidly in recent years, the continent’s post-colonial past remains understudied. This is at least in part because of the decline and fragmentation in the publication of economic statistics after decolonization, which has limited the type and breadth of quantitative analysis that can be undertaken. Nonetheless, this note argues that there are comparatively untapped post-colonial data sources that could enrich the study of the continent's economic history. The note surveys some of these sources and data repositories and provides advice, based on the author’s own experiences, on how to utilize them.


Introduction
African economic history is in ascendance. After a hiatus from the 1980s to the early 2000s, the discipline is undergoing a renaissance marked by a rise in publications and its growing integration into global economic history debates (Austin and Broadberry 2014; Fourie and Gardner 2014; Fourie 2019). Yet this African economic history resurgence suffers from a missing middle. While the colonial period is comparatively well documented by economic historians, the post-colonial era commonly features as an outcome of past conditions rather than a period of study in its own right. Many of the rich comparative studies of state development and living standards across African colonies produced in recent years end their data series in the early 1960s when statistical collection by the colonial metropoles ceased.
This note discusses some of the reasons for this lacuna in the African economic history literature, focusing on the changing nature of statistical publications after independence. It provides guidance, based on the author's own experience, which may help researchers interested in the post-colonial era to locate comparatively underutilized economic statistics covering the 1960s to the present.

'The compression of history'
African economic history came of age in the 1960s and 1970s as decolonization spurred scholars to engage with the continent's pre-colonial and colonial material past and chart how African economies were modernizing and transforming (see Hopkins 2019, for an overview). By the late 1980s, however, the discipline had lost influence, partly due to a decline in economic history writing generally, spurred by the epistemological and methodological divergence between the economics and history disciplines. But African economic history was particularly hard hit, as the economic crises affecting much of the African continent in this period reduced international interest in the region's economic past. Since Africa's economic fortunes rebounded in the 2000s, African economic history writing has flourished once more, marked by a growing research output, new academic networks dedicated to the sub-discipline and greater integration into global economic history debates.
One of the catalysts of this revived interest in African economic history was the bold theses and methods for studying long-term development pioneered by Acemoglu, Johnson, and Robinson (see discussion in Hopkins 2009). Drawing on new institutional economics, their work used regression analysis and instrumental variables to demonstrate causal relationships between historical events or institutional structures and contemporary economic outcomes. Their now-famous 'reversal of fortune' thesis, which argued that more densely populated regions were less likely to be settled under colonialism and were instead subjected to 'extractive institutions' that hindered long-term development, served as inspiration for a plethora of similar studies seeking to understand the historical roots of underdevelopment in Africa specifically (Acemoglu, Johnson, and Robinson 2002). Prominent examples include Nathan Nunn's (2008) research, which demonstrated a negative effect of the intensity of the slave trade on a country's current level of income, Nunn and Wantchekon's (2011) paper linking slave trade intensity with contemporary levels of mistrust, and Michalopoulos and Papaioannou's (2013) research linking pre-colonial state centralization to contemporary levels of development. These research methods have been applied prolifically to African topics. In a review of economic history 'persistence' papers that identify causal relationships between an event in the past and a contemporary outcome, Giovanni Federico found that 23 out of the 69 persistence papers published since 2000 were on African history, the biggest regional cluster (Federico 2017). Most examined the effect of events that took place more than a century before the contemporary outcome they were shown to influence. Fourie and Obikili (2019) refer to these as small T, large N studies, which furnish only a few observations across time, but across a large set of countries or regions.
This methodological approach has been criticized for its 'compression of history', as the methods used shed little light on the reasons for institutional persistence or how said institutions evolved between the two periods from which dependent and explanatory variables are drawn (Austin 2008; Jerven 2015). Moreover, closer data scrutiny and efforts to replicate some of the earlier studies have thrown doubt on the data quality and statistical robustness of many papers of this genre. Jerven (2013a, 2013b) has criticized the quality of the commonly used dependent variable in these studies (GDP per capita), showing that country rankings within Africa are unreliable and unstable. This point was also demonstrated by Frankema and van Waijenburg (2011) in a replication exercise which showed that results from one prominent study changed when they varied the outcome year for which GDP per capita was taken. Cogneau and Dupraz (2014) similarly conducted a replication exercise that demonstrated how a given set of results was driven by systematic biases in the measurement of population density based on satellite luminosity data. In another case, endogenous measurement error and omitted variable bias were found to undermine the reliability of an explanatory variable, namely mission location (Jedwab, Meier zu Selhausen, and Moradi 2019). Similarly, Cogneau and Dupraz (2015) have illustrated the problems of using Murdock's ethnographic atlas as a measure of pre-colonial institutional variation. Others have pointed to common statistical weaknesses in persistence studies, such as spatial autocorrelation of residuals (Kelly 2019). Austin (2008) and Lamoreaux (2015) have called for more detailed case study work that fills the gap between historical event and contemporary outcome and explains processes of change over time. Such approaches would strengthen our understanding of causal paths and institutional persistence or lack thereof, and guard against spurious results.
Other strands of recent economic history research have indeed focused on changing living standards and demographics in Africa, tracing change over time for a smaller set of countries. Frankema and van Waijenburg (2012) reconstructed real wages across a set of British colonies in Africa over the colonial period to examine changes in living standards over time. Several studies have looked in depth at the fiscal evolution of the colonial state (Gardner 2012; Frankema and van Waijenburg 2014; Cogneau, Dupraz, and Mesplé-Somps 2018). But partly because of the move towards greater quantification in economic history and thus greater demands for data, this type of research has focused heavily on the colonial period, drawing primarily on administrative data collected by the colonial state (Hopkins 2019; Fourie 2016). The British and French empires collected broadly the same types of statistics across each of their colonies, archived in central repositories, which has made the digitization of these sources relatively cost-effective, and enables comparison across time and space.
Many of these studies therefore end their analysis in the 1960s, when data collection fell to individual national statistical agencies. Consequently, the African economic history upsurge has been heavily skewed towards studies of the colonial era (roughly 1890s to 1960), while the pre- and post-colonial periods remain comparatively understudied. Of the 48 working papers published by the Africa Economic History Network since its foundation in 2012, for instance, nine include coverage of the pre-colonial period, 39 cover the colonial period, and 13 include some coverage of the post-colonial period (a paper may span more than one period). Of these 13, however, eight are bird's-eye papers that present new long-run time series data covering a century or more and lack an explicit focus on economic change in the post-colonial era. Coverage in Economic History of Developing Regions is similar. Since 2010, when the journal expanded its focus to developing regions, it has published 57 articles on African topics (excluding historiographical or theory pieces). Of these, 38 are on South Africa and focus primarily on the colonial or Apartheid eras or financial history topics. Of the 19 articles on the rest of the continent, four take a long-term perspective that spans the post-colonial, colonial, and in some cases pre-colonial period, while only two explore events specific to the post-colonial era.
To be fair, a limited focus on the later twentieth century is not unique to African economic history. Contemporary history has until recently occupied a precarious position in most history departments. Prior to the 1960s, history explicitly focused on periods outside living memory, on the grounds that contemporary history cannot be studied dispassionately (Brivati 1996). General economic history journals, with a global scope, tend to contain a comparatively small share of articles on the post-1960s era, although the periods covered are considerably broader than in the African sub-field, and much less focused on the twentieth century overall. But in comparison with general African history writing, the omission of the post-colonial period is more marked in the economic history field. In the Journal of African History, by comparison, over a third of research articles published in the past two years pertained to the post-colonial period. Furthermore, the study of post-colonial African economic history seems particularly salient given that so much recent African economic history writing is motivated by a desire to understand why African countries lag behind other regions of the world economically today (Akyeampong et al. 2014). While a growing literature has sought to identify the historical origins of underdevelopment, far fewer studies engage in detail with events of the post-colonial period, when African governments ought to have had the greatest possibility of altering these legacies, and the number of independent governments ought to have spurred the greatest plethora of development paths.
Consequently, many important recent economic ruptures that have influenced development paths, from the terms of trade shocks of the 1970s, economic crises of the 1980s, and structural adjustment reforms of the 1990s, to deindustrialization and later recovery in the 2000s, have received comparatively little attention from economic historians. These topics are studied by economists and political scientists, some no doubt with a strong understanding of the continent's economic history, but who are not aiming to explain these processes of economic change using the economic historian's frameworks and toolkits, or to explicitly place them in historical context. Many of these economic studies compare sub-Saharan Africa's unfavourable development performance to other parts of the world, but do less to show whether and when these performance gaps emerged. Understanding whether, why, and when politically independent countries broke or attempted to break with economic legacies of the colonial era, and how external factors conditioned country paths, is crucial to this knowledge project and requires us to take post-colonial economic policy and contexts seriously.

Underutilized statistical sources
A major reason for this lacuna is that the sources of economic statistics available to researchers changed in form and function after independence. In the 1960s and 1970s statistical collection fell to dozens of separate national statistical offices, with new interests, new lines of accountability, and different levels of technical capability. With the onset of the fiscal crises of the 1970s and 1980s, these statistical bureaus, like most government agencies, faced serious funding constraints. Consequently, post-colonial African economic data is harder to collect, patchier and of more varied quality (see Jerven 2013b). Given the importance placed on quantification and 'big data' to contemporary economic history writing (Fourie 2016), this has naturally pushed researchers towards topics and periods where the quantitative evidence is richest.
However, while recognizing these considerable constraints, the data deficit may not be quite as tragic as sometimes portrayed (see Devarajan 2013). By looking at a broader range of post-colonial sources and investing time in understanding how statistics collection changed in a given country after independence, progress can be made. Below I discuss several sources of economic data for post-colonial African countries, organized by source or repository, which offer possibilities for the construction of time series data of common economic variables. This list builds on the author's own research experience and is not exhaustive. It seeks merely to provide some possible starting points for researchers interested in the quantitative analysis of post-colonial Sub-Saharan Africa.
Government statistical publications are a first port of call. While statistics are no longer published in colonial blue books, the break in government statistical collection in Africa after independence is not always as sharp as it is made out to be. In the first decade of independence, many countries continued to collect much of the same data as during the colonial era and often increased the production of national accounts and data on industrial output and prices, collated in new statistical series.
Beyond national archives and libraries on the African continent, many libraries in the former colonial metropoles hold considerable collections of these post-colonial statistical bulletins, abstracts, and reports. In London, for instance, the British Library, Senate House Library, LSE, and SOAS shelve a large collection of statistical publications from Anglophone Africa, although the coverage declines in the 1970s. The library aggregation search tool, Library Hub, provides an easy way of searching across the catalogues of most of the research libraries in the United Kingdom (https://discover.libraryhub.jisc.ac.uk/). The British Library has put together useful guides to the content of its African government publication collections, which can help to furnish the user with publication titles, although these sometimes change over time (https://www.bl.uk/collection-guides/african-government-publications). World Bank reports (discussed below) can also be a good guide to country publications as they often review and discuss the data production in a given sector.
A few countries, notably Kenya, have digitized their annual statistical abstracts dating back to the late 1950s and make them available from their statistical bureau websites. These abstracts summarize statistics from across sectors and will often provide references to the underlying surveys or reports on which they build. For more recent decades, many countries publish at least some budget documents and audit reports on the respective ministry of finance or audit office websites, and often a smattering of publications on the statistical bureau websites. These websites are worth browsing carefully. The headers and folders used on the main agency page are often poor guides to their actual content, and a review of all main subfolders sometimes throws up more material than expected. However, these country-specific sources can be time-consuming to collect, particularly for studies that rely on large country samples, as they require familiarity with each country's statistical output and nomenclature.
Usefully, many of the basic statistical series produced by these national bureaus were republished in contemporary IMF and World Bank reports. In the past decade these two institutions have digitized a large swathe of their document archives. Consequently, troves of statistical data are buried in hundreds of thousands of PDFs, now available for download from their respective websites. These repositories contain far greater coverage and more detail than the series available from the World Development Indicators and other curated databases. Neither institution advertises these archives with great fanfare. Their websites are designed to showcase current projects and initiatives, and search algorithms will lead researchers first and foremost to country pages, recent reports, or showcased projects, rather than the document archives.
For economic history research, the World Bank 'documents and reports' archive is particularly useful, and contains material dating back to the late 1950s (http://documents.worldbank.org/curated/en/home). This repository contains intermediate, or semi-processed, goods: rather than final research papers, it contains the country-level descriptive reports that form part of the World Bank's general monitoring or provide the basis for lending decisions and programme design. Some familiarity with the World Bank's production of reports and studies is helpful for navigating this trove of material. The World Bank's repository, for instance, allows searches by type of publication. By limiting the search to 'publications or research' or 'economic and sector work' (former World Bank speak for background studies and country-level research products), it is possible to filter out the myriad project reports and legal documents that otherwise clog the search results, and then manually review the remaining results for relevant material. Many of these reports, particularly the World Bank's flagship research products such as public expenditure reviews and country economic memoranda, contain statistical appendices that compile data from the national accounts and government accounts. They often collect broadly the same sets of statistics in each new study. Depending on the focus of the study, many also offer sector-specific statistics, be it on agriculture, education or infrastructure.
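The filtering step described above can be sketched in code for researchers who keep their own list of search results. This is a minimal illustration only: the record fields (`title`, `doc_type`, `year`) and the sample entries are made up for the example and are not the repository's actual export format.

```python
# Hypothetical catalogue records. Field names and titles are invented for
# illustration; they do not reproduce the World Bank repository's schema.
records = [
    {"title": "Country A - Public expenditure review", "doc_type": "Economic and Sector Work", "year": 1992},
    {"title": "Country A - Loan agreement", "doc_type": "Agreement", "year": 1992},
    {"title": "Country A - Country economic memorandum", "doc_type": "Economic and Sector Work", "year": 1983},
    {"title": "Country A - Road project appraisal", "doc_type": "Project Document", "year": 1979},
    {"title": "Country A - Sector study", "doc_type": "Publications or Research", "year": 1975},
]

# The two publication types named in the text, normalized to lower case.
KEEP_TYPES = {"economic and sector work", "publications or research"}


def filter_research_products(records, keep_types=KEEP_TYPES):
    """Keep background studies and research products, oldest first."""
    kept = [r for r in records if r["doc_type"].strip().lower() in keep_types]
    return sorted(kept, key=lambda r: r["year"])


shortlist = filter_research_products(records)
for r in shortlist:
    print(r["year"], r["title"])
```

The shortlist still needs to be reviewed by hand, as the text notes; the filter only removes the project and legal documents.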
The IMF's archive catalogue search function is less advanced and less user friendly than that of the World Bank (https://archivescatalog.imf.org/search/simple). For purposes of harvesting economic data, be it on government revenue and expenditure, GDP, inflation, or banking, an effective approach is to collect issues of a given type of IMF publication for a given country by searching for the series title or key words in the title search bar, such as 'Article IV', 'Recent Economic Developments', or 'Statistical Appendix'. For economic data, the IMF's 'Recent Economic Developments' reports are particularly useful. They were produced regularly on a country-by-country basis between the 1960s and 1980s, using a roughly standardized template. IMF staff reports on the Article IV consultations and monitoring programmes, where IMF staff report on general developments and economic risks and assess programmatic performance, are likewise valuable and often the only source of consistent basic economic data for the more recent period. Compilations of statistics are also often available in sporadic releases of country statistical appendices, often appended to 'selected issues' papers (thus commonly titled 'selected issues and statistical appendix'). Many of these publications follow a standard format and collect much the same information year in and year out. The IMF reports also have the added advantage that they provide a running commentary on economic developments in the given country and a contemporary assessment of the reliability of the available statistics. One practical difficulty is that a given report often comes with supplements, appendices, and notifications that carry similar document titles and are filed as separate entries, so the user may need to sift through multiple records to locate the main file.
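Because successive reports repeat much the same tables, a time series can be assembled by splicing the tables from several report vintages. The sketch below shows one simple way to do this, under the assumption (mine, not a rule stated by the IMF or World Bank) that the figure from the most recent report should be preferred where reports overlap, since it is likely to be the revised estimate. All numbers are invented.

```python
def splice_vintages(vintages):
    """Combine year -> value tables from successive reports into one series.

    `vintages` is a list of (report_year, {year: value}) pairs. Where two
    reports give a figure for the same year, the later report wins.
    Returns a list of (year, value, report_year) tuples sorted by year,
    so the source of each observation stays traceable.
    """
    series = {}
    source = {}
    for report_year, table in sorted(vintages, key=lambda v: v[0]):
        for year, value in table.items():
            series[year] = value
            source[year] = report_year
    return [(y, series[y], source[y]) for y in sorted(series)]


def missing_years(spliced):
    """List the years between the first and last observation with no data."""
    present = {y for y, _, _ in spliced}
    return [y for y in range(min(present), max(present) + 1) if y not in present]


# Invented figures from three hypothetical reports, given out of order.
vintages = [
    (1975, {1972: 100.0, 1974: 130.0}),
    (1972, {1969: 80.0, 1970: 85.0, 1971: 93.0, 1972: 98.0}),
    (1978, {1974: 128.0, 1975: 141.0, 1976: 150.0}),
]

spliced = splice_vintages(vintages)
gaps = missing_years(spliced)
```

Keeping the report year alongside each value makes it easy to check later whether a jump in the series reflects a real change or a revision between reports.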
Another useful institutional repository is provided by the International Labour Organization (ILO), which makes much of its archive of reports and studies available online through the portal labordoc (https://labordoc.ilo.org), including country studies on labour laws, labour relations, manpower, labour statistics, and more, dating back to the 1960s. Many are ad hoc studies, rather than standardized and regular reports, and therefore the most useful starting point may simply be to sift through everything published on a given country. Within these reports, data from recent surveys are often republished in tabular form in appendices.
Household and enterprise surveys are a further valuable source of statistics. The International Household Survey Network, an umbrella organization for international organizations and donors that support survey programmes in developing countries, provides a large catalogue of surveys from across the world, organized by country and year (http://catalog.ihsn.org/index.php/catalog). Some records simply document past surveys for which microdata is not publicly available; in other cases the site provides links to the data supplier, which usually allows researchers access to the microdata after registering with the relevant authority or organization.
Household surveys from the pre-1980s period are few and of mixed quality, but since the late 1980s the pace of survey production has accelerated, presumably due to falling costs of computing power that made surveys cheaper to administer and analyse. Many household budget surveys, demographic and health surveys, and censuses are today available to academic researchers in microdata form, which vastly increases the uses to which the data can be put, and they have been used extensively as cross-sectional measures of contemporary variation in living standards. While these sources are primarily designed to measure demographic trends and living standards, they also offer an important alternative source of information on access to public services, infrastructure, and employment. Because they are based on the testimonies of randomly sampled households rather than records of government departments, they provide a means of cross-checking the veracity of some administrative sources, be it on public sector employment, wages, or access to services and infrastructure.
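The cross-checking idea can be illustrated with a short sketch that compares a weighted survey estimate of public sector employment with an administrative headcount. The field names (`weight`, `public_employee`), the survey rows, and the administrative figure are all hypothetical; a real survey would also require attention to its sampling design and to differences in definitions between the two sources.

```python
def weighted_total(rows, flag):
    """Sum the survey weights of respondents for whom `flag` is true."""
    return sum(r["weight"] for r in rows if r[flag])


def relative_gap(survey_estimate, admin_count):
    """Gap between survey estimate and administrative figure, as a share of the latter."""
    return (survey_estimate - admin_count) / admin_count


# Invented respondents; each weight is the number of people the respondent represents.
rows = [
    {"weight": 1200.0, "public_employee": True},
    {"weight": 950.0, "public_employee": False},
    {"weight": 1100.0, "public_employee": True},
    {"weight": 1300.0, "public_employee": False},
    {"weight": 700.0, "public_employee": True},
]

survey_estimate = weighted_total(rows, "public_employee")
admin_count = 2500  # invented figure standing in for a payroll or establishment record
gap = relative_gap(survey_estimate, admin_count)
print(f"survey: {survey_estimate:.0f}, admin: {admin_count}, gap: {gap:.0%}")
```

A large gap does not show which source is wrong, but it flags where definitions, coverage, or record-keeping deserve closer scrutiny.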
Within the household survey landscape, housing and population censuses deserve special mention. The Minnesota Population Center's Integrated Public Use Microdata Series (IPUMS) project has constructed a central repository for census microdata from around the world. Usually a 10% sample of the census is available for download. Over 70 African censuses from 27 countries have been made available to date. Most are from the 1990s and 2000s, but in a few cases the censuses date back to the 1970s (Benin, Cameroon, Lesotho, and Liberia) or 1960s (Kenya and Togo). Many contain variables on educational attainment and place of birth, in addition to basic demographics. This allows spatial studies at a far more granular level than most alternative sources.
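As a rough illustration of the kind of spatial tabulation such microdata permits, the sketch below computes the share of people who completed primary school by place of birth. The field names and records are hypothetical and do not follow the IPUMS variable naming; the default expansion factor of 10 simply reflects the assumption of a flat 10% sample, and any person weight supplied with the data should be used instead.

```python
from collections import defaultdict


def attainment_share_by_birthplace(records, default_weight=10.0):
    """Weighted share completing primary school, grouped by place of birth.

    Each record counts for `default_weight` persons (a flat 10% sample)
    unless it carries its own "weight" field.
    """
    totals = defaultdict(float)
    completed = defaultdict(float)
    for r in records:
        w = r.get("weight", default_weight)
        totals[r["birthplace"]] += w
        if r["completed_primary"]:
            completed[r["birthplace"]] += w
    return {place: completed[place] / totals[place] for place in totals}


# Invented census sample records for two hypothetical districts.
sample = [
    {"birthplace": "District A", "completed_primary": True},
    {"birthplace": "District A", "completed_primary": True},
    {"birthplace": "District A", "completed_primary": False},
    {"birthplace": "District B", "completed_primary": True},
    {"birthplace": "District B", "completed_primary": False},
]

shares = attainment_share_by_birthplace(sample)
```

With a flat expansion factor the weights cancel out of the shares, but keeping them in the calculation means the same function works unchanged when records carry unequal weights.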
Depending on the research area, other UN agencies and institutes offer some historical sources and statistical repositories, for instance on agriculture (FAO) and industry and production (UNIDO). However, the breadth of the online archive and usability of the search engines vary significantly by agency. In addition, some bilateral donors provide archives of studies and reports commissioned by them. USAID's Development Experience Clearinghouse (https://dec.usaid.gov/dec/content/search.aspx), for instance, contains reports dating back to the 1960s.
Many of these sources have been used extensively by scholars in disciplines other than economic history. Surveys are commonly designed with policy research questions in mind, but the growing stock of household microdata from Africa has also been put to unconventional uses, with the Demographic and Health Surveys (DHS), for instance, spurring a literature on the political economy of ethnicity (Franck and Rainer 2012) and class (Shimles and Ncube 2015). Historical data on road expenditure in Kenya have been used to study relationships between democracy and ethnic patronage (Burgess et al. 2015).
Some research has also made use of these sources with the explicit aim of challenging or enriching debates in economic history. With a focus on Kenya, Tanzania, and Uganda, for instance, Simson (2019) showed that many features of the post-colonial state that have been invoked to explain why the fiscal crises of the 1980s proved so enduring, such as the excessive size of the civil service and government budget, look different when analysed across time. Drawing primarily on World Bank and IMF reports, this work showed that public spending declined soon after the crises of the 1970s set in. With the exception of debt service payments, domestic recurrent spending and later staff numbers came under considerable pressure. This lends greater support to theories that emphasize external rather than internal drivers of the crises. Albers and Suesse (2015), in a paper on tax intensity in Africa over the course of a century, found that the legacies of colonial taxation structures in Africa faded after independence and did not prove enduring. Bossuroy and Cogneau (2013) used household budget surveys from five African countries to provide the first measures of net intergenerational occupational mobility in the post-colonial era, showing important country variations in farm to non-farm mobility that may have colonial roots. While not strictly an economic history paper, Alesina et al. (2019) have used the IPUMS International census data to map educational mobility across the continent and shed light on the drivers of the rate of educational mobility, including historical ones.
Although these sources of post-colonial economic statistics may require a greater time investment to collect and interpret than some of the colonial-era sources, they offer opportunities to enrich key debates in African economic history about institutional persistence or its corollary. They can shed light on how state structures and policy altered after independence and the causes and consequences of the economic crises and structural adjustment reforms of the 1980s and 1990s. By using economic history methods and lenses to explain changes in the more recent past, they can also make African economic history more relevant to African policymakers and commentators today.

Disclosure statement
No potential conflict of interest was reported by the author.

Notes on contributor
Dr Rebecca Simson is the David Richards Junior Research Fellow in Economic History at Wadham College, University of Oxford.