UvA-DARE (Digital Academic Repository) A re-appraisal of the migration-development nexus: Testing the robustness of the migration transition hypothesis

The Policy Research Working Paper Series disseminates the findings of work in progress to encourage the exchange of ideas about development issues. An objective of the series is to get the findings out quickly, even if the presentations are less than fully polished. The papers carry the names of the authors and should be cited accordingly. The findings, interpretations, and conclusions expressed in this paper are entirely those of the authors. They do not necessarily represent the views of the International Bank for Reconstruction and Development/World Bank and its affiliated organizations, or those of the Executive Directors of the World Bank or the governments they represent.


Introduction
Globalization has facilitated physical mobility and, as a result, enabled the number of international migrants to rise from 92 million in 1960 to 244 million in 2017.¹ The traditional view that the root cause of these rising migration flows has been a lack of economic development in origin countries has resurfaced in the past few years within the policy debates of both sending and receiving countries.² Would-be migrants, the argument goes, decide to move primarily in search of higher wages and income abroad. In this framework, exogenous non-economic factors such as natural disasters and conflicts at origin are secondary.
The direct relation between income differentials and emigration originates from the neoclassical theory of migration.³ This theory posits that a higher domestic reservation wage reduces the relative expected returns on emigration, as opposed to staying at home. This implies that the larger the income and wage differentials between countries, the stronger the migration pull factors are. Consequently, emigration is predicted to decrease as income gaps between origin and destination countries close.
An important policy implication of this theory is that high-income countries can decrease immigration through policies that help low-income countries raise their average incomes and development levels. When income differentials decline as a result, so will the migration flows from low-income to higher-income countries. This will also relieve strained borders and stem the brain drain that negatively affects developing countries (Caselli, 2019). Accordingly, since the 1990s, policy makers, academics and development NGOs have advocated a triad of policies aimed at fostering development in emigration countries through aid, trade liberalization, and temporary and return migration (De Haas, 2007).
However, although these models are intuitively appealing, they do not adequately explain observed patterns of migration. Empirical evidence shows that migration depends not only on economic factors such as income and wages, but also on migrant networks abroad, foreign immigration policies, and demographic transitions (Clemens, 2014). The migration transition hypothesis developed in Zelinsky's (1971) seminal paper, on the other hand, accounts for both these economic and non-economic migration determinants. This yields a richer relationship between migration and income levels. In particular, this hypothesis predicts a nonlinear inverted-U relationship between development and migration. Emigration first rises as development increases in a given origin country, until a so-called migration transition turning point is reached, after which emigration starts declining. As discussed in Clemens (2014), this phenomenon can be explained by factors such as rising inequality, gradually easing credit constraints, and structural labor market changes leading to worker dislocation, all of which may accompany the economic development process.
There is an extensive literature on the determinants of migration that has tested Zelinsky's hypothesis. Using cross-section data, many studies find the inverted U-shaped relationship between levels of GDP per capita and the share of emigrants, even after controlling for other determinants of migration (Djajic et al., 2016; Dao et al., 2018; Idu, 2019). However, testing for a migration hump using cross-section data leaves important considerations unaccounted for, such as reverse causality and the migration transition's longitudinal dimension, as the transition takes place over an extended time period in a given origin country. Other studies have tested for a hump shape using panel data (Mayda, 2010; Bertoli and Huertas-Moraga, 2013). However, these papers use a limited number of country-time points, which restricts the empirical strength of their results. Other papers test the inverted-U relationship using solely migration flows to OECD destinations (Llull, 2016; Benček and Schneiderheinze, 2019). These studies, however, exclude the possibility that migrants from low-income countries can also migrate to other low- or medium-income countries. Since the average share of migration from all origins to non-OECD destinations is 50% over the 1960-2017 period,⁴ we include such migration flows in order to incorporate all migration corridors in the analysis.
The aim of this paper is to test for the inverted U-shape between emigration and development using a large panel database. We employ a comprehensive global panel data set with 180 origin and destination countries over a 50-year timeframe.⁵ This allows us to empirically test for bilateral migration dynamics not only across countries but also across time with a relatively large number of observations. Because of its large longitudinal dimension, it is well suited for testing the migration transition hypothesis' central prediction, which is a long-run phenomenon per origin country (De Haas, 2010). Our empirical specification is based on the random utility-maximization (RUM) model, which provides the microfoundations for a migration version of the gravity model.⁶ We employ a gravity-migration specification with a large number of fixed effects, which control for several observed and unobserved origin-, destination-, time- and country-pair-specific characteristics deemed to influence migration. We introduce both a linear and a squared term for GDP per capita at origin (our proxy for development levels) to test for the non-linear inverted-U shape. GDP per capita is instrumented using its period-to-period lag in order to tackle reverse causality. The data set presented in this paper further extends that of Llull (2016), who employs a similar panel data set including bilateral migration flows for the 1960-2000 time period but does not test for the inverted U-shaped relationship between development and emigration.
To our knowledge, this is the first paper on the migration transition hypothesis that tests the RUM model on a global panel data set that extends over a period of 50 years and includes bidirectional flows for 180 origin and destination countries. This comprehensive database accounts for all potential migration flows, and not merely flows to OECD destinations. As stated above, about half of all international migration, on average, was to non-OECD destinations. Merely including OECD destinations would therefore leave out a large portion of all migration flows. Furthermore, we reduce the bias due to the presence of zeros in the dependent variable using a Poisson Pseudo-Maximum-Likelihood estimator with High-Dimensional Fixed Effects (PPML-HDFE), and not by simply omitting them or resorting to data aggregations. Lastly, we conduct additional alternative tests of an inverted U-shaped relationship, while previous studies have generally merely run quadratic model estimations and hence run the risk of incorrectly finding an extremum.
Our results confirm the existence of an inverted U-shaped relationship between development and emigration within a cross-country (panel) setting. This result is robust to the inclusion of additional control variables and the estimation of the empirical model on alternative time subsamples. It is also robust to the inclusion of an interaction term between geographical distance and income at origin, and several additional tests for the existence of the inverted-U relationship.
However, we cannot conclude that our findings yield evidence of a causal link between development at origin and emigration flows. The reason is that multilateral resistance to migration (i.e., that the attractiveness of a given country depends on the latent attractiveness of other potential destinations) is not fully accounted for. The only viable way to adequately correct for this is to also include origin-time fixed effects next to other (origin, destination-time, time, and country-pair) fixed effects that we do include. However, like all other papers in the existing literature on this topic, our econometric model does not allow for the inclusion of origin-time fixed effects as these would be perfectly collinear with our origin-time-varying variable of interest: GDP per capita at origin. With this endogeneity issue remaining unsolved we cannot claim that our results establish a causal relationship.
We perform several robustness analyses to test whether an initial increase in economic development indeed leads to higher emigration. To this end, we test, for a subsample of countries that have actually transitioned from the low-income to the middle-income category, whether their emigration has increased with development, by applying both a linear and a quadratic version of our regression model. From this and several other robustness tests, we do not find that the inverted-U relation between development and emigration based on panel data also implies such a relation for an individual low-income country over time. Accordingly, drawing the conclusion that the inverted-U relationship between economic development and emigration is causal seems unfounded.
Several authors (e.g., De Haas, 2019; Clemens and Postel, 2018) have concluded from the migration transition hypothesis that as low-income countries develop, their emigration will tend to increase before declining after the turning point and that development aid is therefore not a proper instrument to reduce emigration from low-income countries. Our findings do not imply this conclusion. On the contrary, for a subsample of countries that transitioned from low to middle-income (excluding China and India), we find that, as these countries developed economically, their emigration actually declined. This obviously has important policy implications for development cooperation: it suggests that development programs can actually reduce emigration from low-income countries if they are successful at promoting local economic development.
The remainder of this paper is structured as follows. In Section 2, we review the theories that might explain the existence of a migration-development inverted U-shaped 'life cycle' in any given country, as well as the current empirical evidence for them. Section 3 describes the data we use and provides a descriptive analysis. Section 4 outlines our empirical methodology and Section 5 presents our results, also including several robustness analyses. Section 6 concludes.

The migration-development 'life cycle'
This section presents a literature review on the migration transition theories as well as the existing empirical evidence of the inverted U-shaped relationship between development and migration.

Theory
The migration transition hypothesis (Zelinsky, 1971; Gould, 1979) holds that economic, demographic, and socio-political forces, which co-occur with development, might also influence migration decisions. Under certain assumptions, such factors can jointly explain an inverted U-shaped relation between migration and development levels.
Following De Haas (2010), these factors affecting migration decisions can be grouped into migration capabilities and migration aspirations.
On the one hand, migration capabilities (MC) can be expected to monotonically increase with development indicators such as income and education, as well as with the creation of migrant networks abroad. First, income growth implies that potential migrants are better able to finance migration (Vanderkamp, 1971; Faini and Venturini, 2010). This effect can be compounded by the impact of remittances from migrant communities abroad. Second, improvements in education and human capital raise the number of feasible migration destinations by increasing the number of visa classes (which are usually skilled-employment work visas) that migrants can obtain (Flahaux and De Haas, 2016; Ortega and Peri, 2013). Third, would-be migrants' relationships with previous migrants already abroad may improve their ability to integrate in a given destination country, thereby further increasing migration capabilities (Massey, 1988). Yet, when the migrant population abroad grows, the positive network externalities generated by it may eventually disappear, due to the formation of a localized culture, gradually eroding the link between the established foreign network and potential domestic migrants (Epstein, 2008). Overall, with development, the rise in disposable income, human capital levels and migrant communities abroad leads to an increase in capabilities to emigrate. These capabilities can be expected to grow at an accelerating pace at first, because of the compounding impact of migrant networks and remittances, and to decelerate later, due to the formation of a localized culture with weakening links to potential migrants in origin countries. This initial acceleration and later deceleration of migration capabilities with development is shown as the S-shaped MC curve in Figure 1.
On the other hand, migration aspirations (MA) are more likely to have an inverted U-shape. Migration aspirations are a function of several factors, all of which are likely to first rise and later decrease with a country's economic development (Clemens, 2014). These factors include:
(i) Population growth initially increases with development due to declining mortality rates, and at some point starts decreasing with further development due to declining fertility rates. The initial increased population growth generates labor market pressures at home and thus increases demand for emigration, while at some point reduced population growth reduces emigration aspirations (Zelinsky, 1971).
(ii) Opportunity costs of migration for capital owners initially decrease with development and stop falling once the relative prices of production factors have adjusted to the economy's opening to international trade (Samuelson, 1948;⁷ Martin and Taylor, 1996).
(iii) Rising domestic inequality with development can, for some subset of the population, increase the gap between expected and actual income, leading to an initial rise in migration aspirations (Stark, 2006).⁸ Once the subset of the population with the highest gap between expected and actual income has migrated, this gap is on average reduced in the total population, causing a fall in aggregate migration aspirations.⁹
Figure 1 illustrates the hump-shaped line for migration aspirations and the S-shaped curve for migration capabilities. At development levels $D_{low}$ and $D_{medium}$, we assume that one's aspiration to migrate is the same, at $MA_1 = MA_2$. Yet migration capabilities at $D_{low}$ are much lower than at the higher development stage $D_{medium}$. For an equal aspiration to migrate, this difference in capabilities is expected to be the reason why poorer individuals tend to migrate less. Conversely, possessing both a strong willingness to migrate and sufficient capabilities to act upon it, medium earners are most likely to emigrate. On the other hand, since high-income individuals possess the required ability but lack the willingness to migrate, their propensity to do so will be lower.
Figure 1. The migration transition hypothesis at the individual level
At the country level and over time, we therefore expect emigration to first rise as domestic development rises, until a certain 'turning point' at which migration aspirations and capabilities are both relatively high. From this point onwards, capabilities grow just marginally with development, while migration aspirations fall, gradually pulling aggregate emigration rates downwards. Migration transition theories therefore collectively predict that emigration has an inverted U-shaped 'life cycle' that is a function of the stage of development in the source country (Hatton and Williamson, 2011).
⁸ […] relative income for the poorest, and thus raises income expectations. Since migrating abroad may be a way to achieve this new level of expected income due to inter-country income differences, this can foster migration aspirations at the lower end of the income distribution.
⁹ In reality, this phenomenon generally does not generate a clear inverted U-shaped relationship between development and migration aspirations. Inequality in a country can rise and fall more than once as development increases. Nevertheless, inequality has a clear impact on the gains from migration attained by workers at different points in the income distribution and in time: as inequality rises, migration aspirations are thought to increase in tandem, and vice-versa (Borjas, 1987).

Empirical evidence
The inverted U-shaped relationship between migration and development has recently been observed in cross-sectional nonparametric regressions (Clemens, 2014; Dao, Docquier, Parsons and Peri, 2018). The turning point is graphically found to lie at a gross domestic product (GDP) per capita level varying from $4,000 to around $10,000 (in 2019 US dollars, adjusted for purchasing power parity (PPP)). Countries with medium levels of development are associated with the highest emigration rates, while both underdeveloped and highly developed countries exhibit comparatively low rates of emigration. Clemens (2014) and Dao et al. (2018) report that both the (initially) positive and the (later) negative relationships between emigration and GDP per capita levels were statistically significant. Clemens (2014) found that this cross-sectional, hump-shaped association holds for every decade since 1960 and becomes more pronounced with time. The turning point in GDP per capita remains at the same level, whereas the corresponding emigration rate increases over time. De Haas (2010) showed that the same cross-sectional inverted U-shaped relationship holds when using the human development index (HDI) instead of GDP per capita values.
It is not sufficient to merely observe that migration traces an inverted U-shaped pattern with development for a given year across countries. There are a number of studies that test for the existence of the migration hump using parametric regressions in such a cross-sectional setup, such as Djajic et al. (2016), Dao et al. (2018) and Idu (2019). However, this leaves at least three important considerations unaccounted for.
First, the migration transition hypothesis' central prediction is that this relationship ought to hold on average over time in any given country, and not merely in a given year across countries. That is, it is expected to hold in the longitudinal rather than in the cross-sectional dimension (Hatton and Williamson, 2011).
Second, development can be expected to affect migration flows, but rising migration also affects development levels, for instance through the remittances it generates. This can lead to reverse causality problems, which cannot be adequately tackled in a cross-sectional set-up.
Third, it can be expected that migration decisions strongly depend on observed or unobserved idiosyncratic characteristics of origin and destination countries. Examples of these factors are migration policies or individual preferences for migration, or drivers affecting pairs of countries, such as geographical distance or linguistic proximity. It is important to consider and correct for all costs and benefits related to every possible migration channel available to a would-be migrant.
One paper that employs a methodology similar to ours is Llull (2016). His paper exploits a relatively new database of bilateral migrant stocks and finds heterogeneous effects of income gains on migration prospects depending on distance. Like our paper, he uses a gravity-migration specification which is tested using panel data. Moreover, Llull (2016) employs a similar bilateral data set, although the data we present in this paper cover a longer time span.
Despite the similarities, this paper differs from Llull (2016) in three important ways. First, Llull (2016) does not test for the existence of a hump-shaped relationship between emigration and development. Second, he uses migrant stocks instead of migration flows as the dependent variable, which is not in line with the specification's micro-foundation (Beine et al., 2016). Third, Llull (2016) does not use the PPML-HDFE technique and instead employs the Ordinary Least Squares (OLS) technique. OLS is known not to perform well when the proportion of zeros in the dependent variable is high, which is the case here. It also yields relatively high biases in the presence of heteroscedasticity (Silva and Tenreyro, 2006, 2011).
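To illustrate why the treatment of zeros matters, the following minimal sketch fits a Poisson pseudo-maximum-likelihood model with an intercept and a single regressor by Newton-Raphson. It is not the PPML-HDFE estimator used in this paper: it contains no fixed effects, the data are made up, and the function name `ppml_fit` is hypothetical. It only shows that zero flows remain in the estimation sample, whereas a log-linear OLS regression would have to discard them because the logarithm of zero is undefined.

```python
import math

def ppml_fit(x, y, iterations=50):
    """Poisson pseudo-maximum-likelihood fit of E[y | x] = exp(a + b * x).

    Solved by Newton-Raphson; observations with y == 0 stay in the sample.
    """
    a, b = math.log(sum(y) / len(y)), 0.0
    for _ in range(iterations):
        mu = [math.exp(a + b * xi) for xi in x]
        # Score vector of the Poisson log-likelihood.
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for xi, yi, mi in zip(x, y, mu))
        # Information matrix (negative Hessian), a symmetric 2x2 matrix.
        h00 = sum(mu)
        h01 = sum(mi * xi for xi, mi in zip(x, mu))
        h11 = sum(mi * xi * xi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        # Newton step: parameters += inverse(information) * score.
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b

# Made-up flows with two zero observations.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [0.0, 0.0, 1.0, 3.0, 4.0, 12.0]
a, b = ppml_fit(x, y)

# A log-linear OLS regression could only use the strictly positive flows.
log_ols_sample = [(xi, yi) for xi, yi in zip(x, y) if yi > 0]
```

In this toy sample the Poisson fit uses all six observations, while the log-linear alternative is left with four.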
A second, and as far as we know the only other, paper that is similar to ours is Benček and Schneiderheinze (2019), who more recently tested systematically for the existence of the migration hump. They find a negative relationship between income and emigration that is independent of the origin country's initial income level. Similar to this paper, they investigate the existence of the hump shape not only in cross-section but also over time.
Our methodology and data differ from Benček and Schneiderheinze (2019) in three ways. First, we explore all bilateral migration flows, whereas Benček and Schneiderheinze (2019) only focus on unilateral emigration flows to OECD countries. Second, we employ an estimation method that limits the estimation bias due to the large number of zeros in our migration flow variable without having to exclude these observations. We do not make such sample selections, as they might generate bias due to the exclusion of many potential destination countries. Third, we include a complete set of origin and destination-time fixed effects, which reduces, although does not fully eliminate, the potential endogeneity issues.

Data
For the empirical analysis, we compiled an extensive panel data set comprising bilateral migration flows of 180 origin and destination countries for each decade from 1970 to 2020 (using 2019 as a proxy for 2020).
The dependent variable is the bilateral migration flow in each of the five decades from 1970 to 2020.¹⁰ Each of the explanatory variables we include in our model varies in the origin-country and time dimensions only. We also employ fixed effects that vary in the destination, time and country-pair dimensions. All explanatory variables are averaged over decades, from $t-10$ to $t-1$.
Following Beine and Parsons (2015), migration flows are computed as the decade-to-decade difference in stocks: if $M_{ijt}$ represents the stock of migrants from country i living in destination j at time t, the migration flow in period t is defined as the change in this stock since the previous decade.
To measure this variable, we merge two migrant stock databases produced by the World Bank and the UN. For the years 1960-2000, we use the World Bank's Global Bilateral Migration database compiled by Özden et al. (2011). This is based on raw data from the Global Migration Database of the Population Division of the United Nations Department of Economic and Social Affairs (UN DESA, 2008). It contains migrant stock data by country of origin compiled from a collection of 3,500 censuses spanning 230 migrant destinations, for every decade from 1960 to 2000.¹¹ For the years 2010 and 2020 we combine this World Bank database with the Trends in International Migrant Stocks data from UN DESA (2019), which contains data for the following reference years: 1990, 1995, 2000, 2005, 2010, 2015 and 2019. This methodology is akin to Özden et al. (2011). The year 2019 is used as a proxy for the year 2020.
In these databases, migrants are defined as foreign-born individuals who have moved to a different country.¹² As explained in Özden et al. (2011) and UN DESA (2019), this has advantages over defining them by their citizenship. The latter definition does not provide a consistent measure of international migrant stocks because of differing citizenship laws across nations, and because people in some countries can acquire citizenship after having been a migrant for a number of years. The foreign-born definition also better captures the concept of migration as a "movement of a person or a group of persons, either across an international border, or within a State" (International Organization for Migration, 2011).
Both databases are based on the same underlying migration data and share many of the same processing methods. In both cases, the UN's Population Division census data is used to compile the database. The same country list is employed for both databases, although the UN DESA data contains six more countries than the World Bank's. In our merged data set, we only count those countries included in both databases. The original data suffers from a substantial number of missing observations because many countries do not release national census data every 10 years. Censuses may be prohibitively expensive in terms of labor intensity, can be abandoned because of exogenous factors, such as civil unrest or conflict, or are never released for political reasons. The authors of both databases chose to minimize the number of gaps in the data through interpolations. For the 'in-between' years (1970, 1980, 1990 for the 1960-2000 World Bank data and 2000 and 2010 for the UN DESA data), they do so by assuming a linear trend before and after missing data points. Where data are lacking for the beginning or end decades, they use growth rates in migration, taken from the UN Total Migrant Stock database (2006), to estimate bilateral migrant stocks. It is important to note, however, that since both databases use interpolations and predictions to fill in for missing values, our compiled bilateral database will also include a number of predicted values.¹³ As a result, our estimation results are partially based on using predicted values as independent variables, which leads to increased uncertainty on the results.
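The gap-filling rule for 'in-between' years can be illustrated with a short sketch. It assumes that a linear trend means straight-line interpolation between the nearest observed census rounds on either side of a gap; the function name `interpolate_linear` and the numbers are hypothetical, and gaps at the beginning or end of a series are deliberately left untouched because the source databases treat them with growth rates instead.

```python
def interpolate_linear(series):
    """Fill interior gaps (None) in a {year: stock} mapping by linear interpolation."""
    years = sorted(series)
    known = [year for year in years if series[year] is not None]
    filled = dict(series)
    for year in years:
        if series[year] is not None:
            continue
        before = [k for k in known if k < year]
        after = [k for k in known if k > year]
        if not before or not after:
            continue  # edge gap: no observation on one side, so leave it missing
        y0, y1 = before[-1], after[0]
        weight = (year - y0) / (y1 - y0)
        filled[year] = series[y0] + weight * (series[y1] - series[y0])
    return filled

# Made-up bilateral stock series with one interior and one trailing gap.
example = {1960: 1000.0, 1970: None, 1980: 2000.0, 1990: None}
filled = interpolate_linear(example)
```

Here the 1970 value is placed halfway between the 1960 and 1980 stocks, while the 1990 value stays missing.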
The UN DESA (2019) database differs from the World Bank database (Özden et al., 2011) in two ways. First, UN DESA (2019) also adds data on refugees where available. Second, UN DESA (2019) used nationally representative surveys to complement the international migrant stock estimates based on population censuses and registers used in both databases.
We follow Rojas-Romagosa and Bollen (2018) by appending the data sets using the most recent UN international migrant stock data for the year 2019. Employing decadal data enables us to closely map our data to the population census rounds, which are done every decade. As in Beine and Parsons (2015), we set negative flow values to zero.
To our knowledge, this is the most extensive panel data set used so far in the literature to test for the existence of the migration hump. First, the large time dimension (50 years) has not been used to test the migration transition hypothesis before and it is well-suited to capture migration's long-run dynamics. Second, the large set of origins and destinations (180 countries, see Appendix Table A.1) enables us to test the model on every possible migration direction, and not just South-North flows. Appendix Table A.2 contains the definitions and sources for all variables used in this paper.
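The construction of flows from stocks described above can be sketched as follows. The data layout (a dictionary keyed by origin, destination and census year), the ten-year step and the function name `decadal_flows` are assumptions made for illustration only; the sketch merely differences consecutive decadal stocks and sets negative differences to zero.

```python
from typing import Dict, Tuple

Key = Tuple[str, str, int]  # (origin, destination, census year)

def decadal_flows(stocks: Dict[Key, float], step: int = 10) -> Dict[Key, float]:
    """Flow over the decade ending in `year`: stock(year) - stock(year - step), floored at zero."""
    flows: Dict[Key, float] = {}
    for (origin, dest, year), stock in stocks.items():
        previous = stocks.get((origin, dest, year - step))
        if previous is None:
            continue  # no earlier census round, so no flow can be computed
        flows[(origin, dest, year)] = max(stock - previous, 0.0)
    return flows

# Made-up stocks for a single corridor from country "A" to country "B".
stocks = {
    ("A", "B", 1960): 100.0,
    ("A", "B", 1970): 150.0,
    ("A", "B", 1980): 120.0,  # the stock fell, so the negative difference becomes zero
}
flows = decadal_flows(stocks)
```

The first census round yields no flow because there is no earlier stock to difference against.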

Descriptive statistics
Table A.4 in the Appendix shows the summary statistics for our dependent variable of interest (migration flows), migration rates (migrant stocks over population), our explanatory variable of interest (GDP per capita) and all other explanatory variables used in this study. Notably, the migration flow and rate distributions are highly positively skewed, with most observations concentrated near zero and a long right tail. This reflects the large number of migration directions with small or zero flows of migrants.¹⁴ The share of migration to OECD countries is equal to around 50% on average.¹⁵
The evolution of bilateral migration over time from each income quartile of the distribution of GDP per capita country-time points in our data set to all other quartiles is depicted in Figure 2. A similar graph using the World Bank's classification of countries into low-, lower-middle-, upper-middle- and high-income groups, presenting bilateral emigration rates over time for these income groups to all other income groups, can be found in Figure A.1 in the Appendix. As shown by both figures, migration rates are generally higher for lower- and upper-middle-income countries than for low- and high-income countries in each time period shown, in line with the cross-sectional migration hump.
Figure 2. Bilateral emigration rates over time for countries in each quartile of the income distribution to countries in the other quartiles, in the 1960-2020 timeframe
Note: Income groups were made by partitioning our PPP-adjusted GDP per capita country-time points into four equally sized groups (quartiles, n = 285 each). Emigration rates are computed as the ratio of the total number (stock) of migrants from a given income quartile country group residing in the destination income quartile country group to the total population in the origin income quartile country group. The low-, lower-middle-, upper-middle- and high-income quartiles respectively correspond to countries in the $392-$2207, $2207-$5708, $5708-$14943 and $14943-$279498 GDP per capita ranges (in PPP-adjusted constant 2011 US dollars).

The canonical RUM model
The Random Utility-Maximization (RUM) model has recently been used in the migration literature, see Beine et al. (2016).This approach allows us to rigorously micro-found a migration version of the gravity model that is more commonly employed in the trade literature since Tinbergen's (1962) seminal contribution.The RUM expression of the location-decision problem faced by a would-be migrant (which translates into a simple utility-maximization problem) includes country-pair-specific utility components which call for the inclusion of bilateral (gravity) variables into the empirical model.Let us consider the location-decision problem faced by an individual h that considers migrating from a given country i to country j at time t.RUM models describe the utility derived from this move as: where w ijt denotes a deterministic component of utility and c ijt represents the cost of migrating from i to j at time t.These can both be modelled as a function of observable variables, which should capture anything increasing or reducing the attractiveness of a particular destination and should include location-or countrypair-specific elements (Bertoli and Huertas-Moraga, 2013).
Conversely, since $\theta_{hijt}$ is an individual-specific stochastic term, it cannot be observed. As has been repeatedly done in the migration literature, we assume that $\theta_{hijt}$ follows an independent and identically distributed extreme value type 1 distribution à la McFadden and Zarembka (1974). Applied to equation (1), the expected share of individuals residing in i who move to j at time t, $E(p_{ijt})$, can then be written as:
$$E(p_{ijt}) = \frac{e^{w_{ijt} - c_{ijt}}}{\sum_{l \in D} e^{w_{ilt} - c_{ilt}}} \qquad (2)$$
where D is the set of all countries the individual can choose from, l represents any country in this choice set, and $p_{ijt} \in [0,1]$ is the actual share of individuals residing in i who move to j at time t. By definition, the expected scale of the migration flow from country i to country j at time t is $E(m_{ijt}) = E(p_{ijt}) s_{it}$, where $s_{it}$ represents the size of the population residing in country i at time t. We can thus rewrite expression (2) as:
$$E(m_{ijt}) = \frac{e^{w_{ijt} - c_{ijt}}}{\sum_{l \in D} e^{w_{ilt} - c_{ilt}}} \, s_{it} \qquad (3)$$
RUM models usually assume that the deterministic component of utility does not change with the origin country i. This allows us to rewrite equation (3) as:
$$E(m_{ijt}) = \Phi_{ijt} \frac{y_{jt}}{\Omega_{it}} s_{it} \qquad (4)$$
where $\Phi_{ijt} = e^{-c_{ijt}}$, $y_{jt} = e^{w_{jt}}$, and $\Omega_{it} = \sum_{l \in D} \Phi_{ilt} y_{lt}$. In this expression, migration depends on the accessibility $\Phi_{ijt}$ of destination j, its attractiveness $y_{jt}$, and the capacity of the origin country i to send out migrants, proxied by its total population $s_{it}$. It is inversely related to $\Omega_{it}$, the utility derived from migrating to other destinations $l \in D$ or staying in the home country. Expression (4) is similar to other canonical gravity specifications, such as that used in the context of trade in Baier et al. (2019).
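The step from the logit shares in (2) to the gravity form in (4) can be checked numerically. The following is a minimal sketch, assuming a hypothetical three-location choice set with made-up utilities and costs (none of these numbers come from the paper); the function names are illustrative only.

```python
import math

def expected_shares(w, c):
    """Equation (2): logit shares over the choice set D.

    w[l] is the deterministic utility of location l and c[l] the cost of
    moving there from the origin; both are hypothetical inputs here.
    """
    weights = {l: math.exp(w[l] - c[l]) for l in w}
    total = sum(weights.values())
    return {l: weights[l] / total for l in weights}

def expected_flow_gravity(w, c, dest, s_it):
    """Equation (4): accessibility * attractiveness / Omega * population."""
    phi = {l: math.exp(-c[l]) for l in c}   # accessibility, Phi_ilt
    y = {l: math.exp(w[l]) for l in w}      # attractiveness, y_lt
    omega = sum(phi[l] * y[l] for l in w)   # Omega_it
    return phi[dest] * y[dest] / omega * s_it

# Hypothetical choice set; "home" stands for staying in the origin country.
w = {"home": 1.0, "A": 2.0, "B": 1.5}
c = {"home": 0.0, "A": 1.5, "B": 0.5}
s_it = 1_000_000  # hypothetical origin population

shares = expected_shares(w, c)
flow_eq3 = shares["A"] * s_it                      # equation (3)
flow_eq4 = expected_flow_gravity(w, c, "A", s_it)  # equation (4)
print(round(flow_eq3), round(flow_eq4))
```

Because utilities here are indexed by destination only, the two expressions coincide, which is exactly the assumption under which (3) can be rewritten as (4).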

Main migration-gravity econometric specification
As is commonly done in the literature, we use GDP per capita levels (at PPP) as our measure of development levels at origin. To compute it, we use expenditure-side national GDP, which is most suitable for comparing living standards over time and across countries (Feenstra et al., 2015), divided by total population size. We include both a linear and a squared origin-country GDP per capita variable in order to test for the hypothesized nonlinearity in the impact of development at origin on subsequent emigration flows. These are our two variables of interest. Some econometric studies, however, claim that using merely a squared term to test for an (inverted) U-shaped relationship might lead to false conclusions (Lind and Mehlum, 2010; Haans et al., 2016). Therefore, before we conclude that there truly is an inverted U-shaped relationship, we apply the three-step procedure of Lind and Mehlum (2010) and test our model fit when adding a cubic term to the empirical specification, as suggested in Haans et al. (2016).
To conform with the theory behind the RUM model (equation 4), we also control for population size.Within the RUM framework, population size measures the capacity that a given origin country has to send out migrants.Naturally, when a country has a larger population, it also has potentially higher migration flows in absolute numbers.
Following Rojas-Romagosa and Bollen (2018), we include country-pair fixed effects (FE) in our estimation.This is needed in order to account for all observable or unobservable bilateral time-invariant migration cost components, such as cultural or geographical distance, or any other time-invariant factor that might affect one's choice of destination j.
Taking logs of the RUM expression (4) above yields the following econometric specification:
$$\ln(m_{ijt}) = \beta_1 \ln(GDPpc_{i,t-10}) + \beta_2 [\ln(GDPpc_{i,t-10})]^2 + \beta_4 \ln(s_{it}) + I_{ij} + I_{jt} + I_i + \varepsilon_{ijt} \qquad (5)$$
where $m_{ijt}$ represents migration flows from country i to country j at time t; $GDPpc_{i,t-10}$ is the 10-year lag of GDP per capita at origin; $s_{it}$ is the population size at origin at time t; $I_{ij}$, $I_{jt}$ and $I_i$ are, respectively, pair, destination-time and origin FE; and $\varepsilon_{ijt}$ is the error term.
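For reference, the turning point implied by a quadratic specification of this kind follows from setting the elasticity of migration with respect to lagged GDP per capita to zero. A short derivation, assuming $\beta_2 < 0$ so that the extremum is a maximum (the starred symbol is introduced here for illustration only):

```latex
\frac{\partial \ln(m_{ijt})}{\partial \ln(GDPpc_{i,t-10})}
  = \beta_1 + 2\beta_2 \ln(GDPpc_{i,t-10}) = 0
\quad\Longrightarrow\quad
\ln(GDPpc^{*}) = -\frac{\beta_1}{2\beta_2},
\qquad
GDPpc^{*} = \exp\!\left(-\frac{\beta_1}{2\beta_2}\right)
```

With $\beta_1 > 0$ and $\beta_2 < 0$, emigration rises with income below $GDPpc^{*}$ and falls above it.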
Estimating the log-linear specification (5) directly would risk biased estimates because of the large number of zeros in our data set. Given the logarithmic form of the dependent variable, all pairwise observations with zero migration would be dropped, as in log-linearized models estimated using OLS (e.g. Ortega and Peri, 2013; Llull, 2016). To avoid this, we estimate specification (5) using a Poisson pseudo-maximum-likelihood estimator with high-dimensional fixed effects (PPML-HDFE). As shown in Silva and Tenreyro (2006, 2011), PPML estimations perform well even when the proportion of zeros in the dependent variable is high, which justifies this approach given our data set. When compared to log-linearized gravity models, PPML estimations also yield relatively small biases in the presence of heteroscedasticity.
To estimate the above model, we employ the estimator by Correia et al. (2019). This estimator allows for a large set of different high-dimensional fixed effects structures. Exponentiating expression (5), our PPML migration specification can be expressed as follows:
$$m_{ijt} = \exp\left[\beta_1 \ln(GDPpc_{i,t-10}) + \beta_2 [\ln(GDPpc_{i,t-10})]^2 + \beta_4 \ln(s_{it}) + I_{ij} + I_{jt} + I_i + \varepsilon_{ijt}\right] \qquad (6)$$
We use robust heteroscedasticity and autocorrelation consistent (HAC) standard errors, clustered by country of origin. This is because the errors may be heteroscedastic and are probably correlated over time within origin countries.
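To illustrate why a Poisson pseudo-maximum-likelihood estimator can keep zero flows in the sample, the following toy sketch fits an exponential mean model by Newton-Raphson on made-up data. It is only a stand-in under strong simplifications: one regressor, an intercept, no fixed effects and no clustered standard errors. It is not the PPML-HDFE estimator of Correia et al. (2019), and the function name and data are hypothetical.

```python
import math

def fit_poisson(x, y, iterations=50):
    """Poisson pseudo-maximum-likelihood for E[y|x] = exp(a + b*x).

    Plain Newton-Raphson on the Poisson log-likelihood; no fixed effects,
    so this is only a toy stand-in for a PPML-HDFE estimator.
    """
    a = math.log(sum(y) / len(y))  # start at the log of the mean flow
    b = 0.0
    for _ in range(iterations):
        mu = [math.exp(a + b * xi) for xi in x]
        # Score vector (gradient of the log-likelihood).
        g_a = sum(yi - mi for yi, mi in zip(y, mu))
        g_b = sum((yi - mi) * xi for yi, mi, xi in zip(y, mu, x))
        # Information matrix (negative Hessian), a 2x2 system.
        h_aa = sum(mu)
        h_ab = sum(mi * xi for mi, xi in zip(mu, x))
        h_bb = sum(mi * xi * xi for mi, xi in zip(mu, x))
        det = h_aa * h_bb - h_ab * h_ab
        # Newton update: parameters += inverse(information) * score.
        a += (h_bb * g_a - h_ab * g_b) / det
        b += (h_aa * g_b - h_ab * g_a) / det
    return a, b

# Hypothetical flows, including zeros that a log-linear OLS would drop.
x = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
y = [0, 1, 0, 3, 4, 9]

a_hat, b_hat = fit_poisson(x, y)
fitted = [math.exp(a_hat + b_hat * xi) for xi in x]
print(f"a = {a_hat:.3f}, b = {b_hat:.3f}")
```

The dependent variable enters in levels, so the two zero observations contribute to the estimate. With an intercept, the first-order conditions also force the fitted flows to sum to the observed flows.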

Dealing with endogeneity
A serious issue in the literature concerns potential endogeneity, in particular the possible reverse causality between development at origin and migration flows. The RUM expression (3) above does not make any specific assumptions about the direction of causality between the prospective net utility of moving, $w_{ijt} - c_{ijt}$, and expected migration flows $E(m_{ijt})$. The former can impact the latter, but the reverse may also plausibly hold. Development at origin might affect one's migration aspirations and capabilities, and thus overall migration flows, through the channels mentioned in section 2. However, migration outflows can also affect development levels at origin. This could happen either directly (through remittances, modifications in consumption patterns, changes in asset accumulation at home, and brain drain) or indirectly (for instance, through changes in the prices of local production factors and goods, or thanks to migrants encouraging investments into their areas of origin). 16
One way in which the literature (imperfectly) accounts for endogeneity is by assuming that current migration outflows may only affect present and future development levels, while past levels of income per capita can affect future levels of emigration (Mayda, 2010; Ortega and Peri, 2013; Idu, 2019). That is, migration flows in year t, $m_{ijt}$, can only impact GDP per capita at t, t + 1, t + 2, …, while income in previous periods t - 1, t - 2, … may impact contemporaneous and future migration flows. Following the literature, we therefore relate current migration flows to lagged values of GDP per capita in our estimations. This reverse causality problem is likely to be less severe in our case, as we use 10-year lags in GDP per capita. 17
Another potential concern is the so-called multilateral resistance to migration (MRM). This is defined in Bertoli and Fernández-Huertas Moraga (2013) as the confounding influence that all potential alternative destinations $l \in D$ might have on one's choice to migrate to country j. This is encapsulated in the term $\Omega_{it}$ in equation (4). Ignoring this 'third country effect' has been shown to lead to omitted variable bias (Bertoli and Fernández-Huertas Moraga, 2013).
Existing strategies used in the literature to control for MRM do not work in this case. For example, Ortega and Peri (2013) control for heterogeneous preferences for migration across countries, which induce MRM, by employing origin-time fixed effects. These are nonetheless perfectly collinear with any vector of time-varying origin variables $w_{it}$ and therefore do not allow for the inclusion of development at origin, our variable of interest, in the model. A more general and less restrictive approach is the common correlated effects (CCE) estimator, which allows for consistent estimation in the case of spatially and serially correlated error structures. This estimator was proposed by Pesaran (2006) and employed in Bertoli and Fernández-Huertas Moraga (2013). However, with only six time periods, our data set does not have a sufficiently large longitudinal dimension for the CCE estimator to be used here. Following Mayda's (2010) approach and the arguments put forth in Beine et al. (2016), we (partially) control for MRM by introducing origin and destination-time fixed effects. These absorb time-invariant and time-varying unobserved country-specific effects, respectively. They also serve as a proxy for MRM induced by time-invariant aspects of heterogeneous preferences for migration at origin or by the temporally fluctuating attractiveness of alternative destinations (Beine and Parsons, 2015). Origin FE are not collinear with GDP per capita at origin, which varies over time, and can thus be included in the estimation model. This is analogous to the standard trade-gravity specification of Anderson and van Wincoop (2003), which incorporates importer and exporter fixed effects to account for multilateral resistance to trade.
Adequately accounting for MRM would require including origin-time fixed effects in our model along with destination-time and country-pair-varying fixed effects, as is done in state-of-the-art trade-gravity specifications, such as Baier et al. (2019). However, this would cause collinearity issues with respect to our variable of interest, which varies in both the country and time dimensions. For this reason, we cannot fully account for MRM and therefore cannot eliminate the endogeneity bias from our estimation. Accordingly, our findings regarding the migration-development nexus cannot be argued to represent a causal relationship.

Main results
Table 1 shows the results from our main specification. The significant coefficients on the linear and squared GDP per capita terms have a positive and a negative sign, respectively; see column (2). These results provide empirical evidence of an inverted U-shaped relationship between GDP per capita at origin and emigration flows.
Moreover, they confirm the existence of the hump not only in the case of South-North flows, which had largely been the focus of past research on the topic, but for all combinations of origin and destination countries. By focusing on South-North flows, usually by leaving out non-OECD destinations from their analysis, previous studies have excluded about half of total international migration over the 1970-2020 period. By including such flows, we can therefore provide a more accurate test of the migration transition hypothesis, which is expected to hold for every origin globally.
The results from our model estimation on alternative time subsamples suggest that the migration hump holds both before and after 2000. Table 1 (columns (3) and (4)) shows the results for both the 1970-2000 period and the 2000-2020 period. As can be seen in the table, the coefficients on the linear and squared GDP per capita terms are again significant and have a positive and a negative sign, respectively. While the size of the two coefficients is lower for the latter timeframe, this decline is not significant. The finding of a migration hump remains robust when estimating the model separately for each decade within the 1970-2020 timeframe. This is illustrated in Figure 3, which shows the results of our nonparametric cross-country regressions of emigrant stocks on GDP per capita at origin (PPP-adjusted), for each of the five decades within the 1970-2020 period. Emigration rates are computed as the ratio of the total number (stock) of migrants from a given country residing in a foreign country to the total population in the origin country. These regressions depict an inverted U-shaped relationship between development at origin and emigration for each of these decades. Our results confirm those found in Clemens (2014) and Dao et al. (2018).

Figure 3
Non-parametric regression of the migration-development nexus in cross-section for each year in the 1970-2020 time period
Note: The dark red lines depict second-order Gaussian continuous kernel non-parametric regressions. Countries with emigration rates that are higher than 1 per year are omitted. The Cayman Islands and Kuwait are omitted from the regressions as well.

Robustness analyses
In order to test the robustness of our results, we perform two sets of robustness analyses. First, in Section 5.2.1, we test the robustness of the hump shape as a whole by conducting the analysis with several alternative specifications, which all support the finding of the hump shape. Second, in Section 5.2.2, we specifically test the robustness of the finding that emigration initially rises when a low-income country begins to develop, corresponding to the upward-sloping 'left hand side' of the migration hump at the lower end of the income distribution. There, we do not find support for the initial increase of emigration with development.

Robustness analyses of the hump shape
To test the robustness of the hump shape, we use several alternative specifications. These include: (i) the addition of several origin-time control variables (their definitions and sources can be found in Appendix Table A.2, along with GDP per capita and population at origin), in order to prevent omitted variable bias in the origin country-time dimension; (ii) the inclusion of an interaction term between geographical distance and income at origin; and (iii) additional tests of the existence of an inverted U-shape between GDP per capita at origin and emigration flows.

(a) Controlling for demographic and other origin-time variables
In the first set of robustness checks, we augment our base model with several socio-demographic control variables.These serve to enrich our model by capturing more of the variation in the origin-time dimension and effectively reduce potential omitted variable bias issues.
First, demographic factors at origin can significantly influence migration patterns through their impact on the domestic labor market structure. On a global scale, inter-country differentials in demographic structures might affect the directionality of migration flows, whereby countries with a large inactive population demand more labor from abroad in order to support the economy, while residents of countries with a relatively large labor force are more willing to emigrate. Also, higher population density can make people more willing to emigrate, as it limits the amount of available resources per person. In this light, we introduce the age dependency ratio and population density at origin (both defined in Appendix Table A.2) as controls. We expect a positive sign on the coefficient on population density: an increase entails higher pressure on a country's resources, potentially leading to higher rates of emigration. The coefficient on age dependency could be either positive (e.g. a higher elderly dependency could lead to more emigration among pensioners, while a higher youth dependency could lead to more pressure on parents to look for better income opportunities abroad) or negative (e.g. higher elderly dependency may require more immigrants in elderly care), since several mechanisms are at play here.
Moreover, political instability or poor governance may catalyze emigration, sometimes by forcing it.The landscape of politically driven emigration can range from people fleeing a war or a genocide to those seeking better living conditions, in the form of secured property rights or the freedom of expression.In order to capture the influence of these factors on emigration, we introduce the Polity IV index at origin, along with the number of months the origin country has been in any sort of conflict (genocides, politicides, and ethnic and revolutionary wars).We expect a negative sign for the former, as one's willingness to migrate in a relatively democratic country is expected to be low.The coefficient on our conflict variable is expected to be positive.
Populations can be displaced by natural disasters as well, which might destroy means of living in the origin country and thereby force people to flee appalling conditions at home to seek higher material wealth abroad.We account for these in an alternative specification through the number of natural disasters that occurred in a country during the time period considered.The coefficient on this variable is expected to be positive, as the rise in natural disaster occurrences in a given time period should lead to more outward migration.
In order to prevent potential collinearity issues, only control variables that have an absolute correlation of 0.4 or lower with GDP per capita are included in the estimation. 18 Further, since natural disaster occurrences are highly correlated with the natural logarithm of population at origin (correlation > 0.4), we do not include them simultaneously in the estimation. The same goes for the Polity IV index at origin and the age dependency ratio. Therefore, we first add each control variable to the main model separately in order to test its significance with no influence from other potential factors. We then include all controls at the same time, excluding some variables to avoid collinearity. 19
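The screening rule described above can be sketched as follows. This is a minimal illustration on made-up numbers: the variable names, the five observations and the helper functions are hypothetical, and only the 0.4 threshold is taken from the text.

```python
import math

def pearson(u, v):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    var_u = sum((a - mean_u) ** 2 for a in u)
    var_v = sum((b - mean_v) ** 2 for b in v)
    return cov / math.sqrt(var_u * var_v)

def screen_controls(reference, controls, threshold=0.4):
    """Keep controls whose |correlation| with the reference is <= threshold."""
    return [name for name, values in controls.items()
            if abs(pearson(reference, values)) <= threshold]

# Hypothetical data: five origin-period observations.
ln_gdp_pc = [1.0, 2.0, 3.0, 4.0, 5.0]
controls = {
    "control_a": [2.0, 4.0, 6.0, 8.0, 10.0],   # perfectly correlated
    "control_b": [1.0, -1.0, 0.0, -1.0, 1.0],  # uncorrelated by design
}

kept = screen_controls(ln_gdp_pc, controls)
print(kept)
```

The same helper could be applied pairwise among the controls themselves, which is the logic behind not including two highly correlated controls in the same specification.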

(b) Estimating alternative time subsamples
Global growth rates in GDP per capita were positive on average between 1970 and 2020, which led to a rightward shift of the world per capita income distribution. An increasing number of countries now lie in the middle- to high-income per capita group. This can have an impact on the existence of the migration transition. If the migration turning point lies at relatively low GDP per capita levels, then the hump will be more pronounced for earlier periods, assuming that the turning point remains constant over time. Otherwise, if the turning point does move to the right over time, this effect does not occur.
In order to test whether the hump shape became less pronounced, we subdivide our country-time sample into two distinct timeframes, taking advantage of the panel structure of our data set. The two timeframes chosen are 1970-2000 and 2000-2020 (the year 2000 cutoff was chosen arbitrarily). We then estimate model (6) on these two subsamples. This also gives us a better idea of where the actual migration transition point lies, and thus which income levels actually drive the migration transition.
(c) Controlling for interactions between geographical distance and income at origin
Furthermore, the impact of a change in income at home on emigration might differ depending on the distance to the potential destination considered by a would-be migrant. For instance, the effect of a positive income shock on one's decision to move might be more pronounced if the destination considered is closer to home. This can be due to the fact that migrants considering a faraway destination might focus more on long-run income prospects than on fluctuating income shocks in their migration decision. Moving farther away implies less flexibility to move back and forth to one's home country to benefit from wage fluctuations. Following Llull (2016), the interaction between geographic distance and income at origin, both expressed in deviations from their sample means, is included in model (6): for $\bar{x}$ the sample mean of variable $x$, we define $\tilde{x} \equiv x - \bar{x}$. This yields estimation model (7).

(d) Additional tests for the existence of an inverted U-shape
Lastly, we conduct further statistical tests of the inverted U-shaped relationship between development at origin and emigration. As argued in Lind and Mehlum (2010), merely adding a quadratic term to an otherwise linear specification can be too weak a criterion to test for such a nonlinear relationship if the latter is either convex or monotone. In that case, one might commit a type I error: the null hypothesis of linearity is wrongly rejected because the fitted quadratic has an extreme point, from which an inverted U-shape is inferred.
To account for this potential issue, we follow Lind and Mehlum's (2010) three-step procedure. 20 First, we verify that $\beta_2$ in specification (6) is significantly negative. Second, we check whether the slopes at both ends of the data range, to the left and to the right of the optimum, are significantly positive and negative, respectively. Third, the turning point should lie well within the data range.
With regard to the first step, we use the results from the estimation of the empirical specification (6). The second and third steps are done using the Sasabuchi test (Sasabuchi, 1980; Lind and Mehlum, 2010). This test checks the robustness of an inverted U-shaped relationship by testing whether the slopes to the left and the right of the turning point are significantly positive and negative, respectively. We also test the fit of our model when adding a cubic term, thus allowing the curve to take an S-shape rather than a U-shape.
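The three checks can be written down compactly. The sketch below is an illustration in the spirit of Lind and Mehlum (2010), not their exact test: the coefficients, their variances and covariance, the data bounds, and the one-sided 5% normal critical value of 1.645 are all hypothetical assumptions, and the function names are made up for this example.

```python
import math

def slope_t_stat(b1, b2, var_b1, var_b2, cov_b12, x):
    """t-statistic of the slope b1 + 2*b2*x of a quadratic at point x."""
    slope = b1 + 2.0 * b2 * x
    variance = var_b1 + 4.0 * x * x * var_b2 + 4.0 * x * cov_b12
    return slope / math.sqrt(variance)

def inverted_u_check(b1, b2, var_b1, var_b2, cov_b12, x_low, x_high,
                     critical=1.645):
    """Three conditions in the spirit of Lind and Mehlum (2010)."""
    t_low = slope_t_stat(b1, b2, var_b1, var_b2, cov_b12, x_low)
    t_high = slope_t_stat(b1, b2, var_b1, var_b2, cov_b12, x_high)
    turning_point = -b1 / (2.0 * b2)
    return {
        # Step 1: the quadratic coefficient is significantly negative.
        "quadratic_negative": b2 / math.sqrt(var_b2) < -critical,
        # Step 2: slope significantly positive at the lower bound and
        # significantly negative at the upper bound.
        "rising_at_lower_bound": t_low > critical,
        "falling_at_upper_bound": t_high < -critical,
        # Step 3: the turning point lies inside the data range.
        "turning_point_inside": x_low < turning_point < x_high,
        "turning_point": turning_point,
    }

# Hypothetical estimates and data range (not the paper's results).
result = inverted_u_check(b1=3.0, b2=-0.2,
                          var_b1=0.25, var_b2=0.001, cov_b12=-0.015,
                          x_low=6.0, x_high=12.0)
print(result)
```

The slope variance uses the delta-method formula for a linear combination of the two coefficients, which is why the covariance between them is needed in addition to their individual standard errors.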

Results of robustness checks
The first set of robustness checks shows that the main result remains unchanged when the main model is augmented with a set of socio-demographic controls. As shown in Table 2, in terms of significance and sign, our main result regarding the two GDP per capita coefficients remains unchanged when the age dependency ratio, the Polity IV index, the number of natural disaster occurrences, conflict duration and population density are individually added to the main model. Networks and population density, which are the only variables with significant coefficients, both have the expected positive sign. This suggests that these variables, either through an increase of a country's diaspora population or through a negative impact on resource availability, might foster emigration.

Table 2
Results from the estimation of the base model, augmented with selected origin-time control variables (specification is sometimes changed to avoid multicollinearity)
20 See also Haans et al. (2016).
Including all of these control variables in the main model, while changing the specification to avoid multicollinearity issues, 21 shows that the main results do not change. As Table 3 shows, both the linear and squared GDP per capita terms remain highly significantly positive and negative, respectively. Moreover, population density and our network variable are both significant across most specifications. This points to the role of demographic pressure and of the size of the diaspora in affecting one's propensity to migrate. Population density keeps its expected positive sign, while the coefficient on the network variable on occasion unexpectedly turns negative.
The results from the estimation of model (7) are shown in Table 4. While both the linear and the squared GDP per capita terms are highly significant and have the expected positive and negative signs, respectively, the added interaction term is only weakly significant. In accordance with Llull (2016), we therefore find some evidence suggesting that income shocks might have a heterogeneous impact on emigration depending on distance to destination.
The results of the Sasabuchi test for the (inverted) U-shape can be found in Appendix Table A.5. The slopes at both ends of the data range are significant and of the expected signs: positive at the lower bound and negative at the upper bound. The overall test for the presence of an inverted U-shape between GDP per capita and migration flows also enables us to reject the null hypothesis that emigration evolves linearly with GDP per capita, and thus further confirms the existence of the inverted-U shape. Moreover, the extremum point, at $\ln(GDPpc_{it}) = 7.85788$, lies well within the ln GDP per capita range, which goes from 6.126 to 12.541 (see Appendix Table A.4 for summary statistics). Finally, adding a cubic term to the empirical model does not improve model fit, as Table A.7 in the Appendix shows: the linear, squared and cubed GDP per capita terms are all insignificant. Given these results and the ones above, the migration-development nexus is thus more likely to follow an inverted-U shape than an S-shape.
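As a quick arithmetic check, the reported extremum in logs can be converted back to a GDP per capita level and located within the reported data range. The snippet below uses only the three numbers quoted in the text; the variable names are illustrative.

```python
import math

# Values reported in the text.
ln_turning_point = 7.85788             # extremum of ln(GDP per capita)
ln_gdp_min, ln_gdp_max = 6.126, 12.541  # observed range of ln(GDP per capita)

turning_point_usd = math.exp(ln_turning_point)
inside_range = ln_gdp_min < ln_turning_point < ln_gdp_max

# Relative position of the extremum within the observed log range.
position = (ln_turning_point - ln_gdp_min) / (ln_gdp_max - ln_gdp_min)

print(f"Turning point: about US$ {turning_point_usd:,.0f}")
print(f"Inside data range: {inside_range}; position in range: {position:.2f}")
```

The extremum corresponds to a per capita GDP of roughly US$ 2,586 and sits in the lower third of the observed log-income range, so it is inside the range but closer to its lower end.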

Table 4
Results from the estimation of model (7)
Pseudo R-squared: 0.904

Robustness analyses of the initial increase of migration with development
All the findings of our robustness tests in sections 5.1 and 5.2.1 seem to suggest that there is strong empirical support for the migration transition hypothesis' prediction of a migration hump: an inverted-U relationship between development levels and emigration. This finding is consistent with our Figures 2 and 3, which show that middle-income countries tend to have higher emigration rates than either low-income or high-income countries. Recently, several authors (e.g. De Haas (2019), Clemens and Postel (2018)) have concluded from this inverted-U relation that, as low-income countries develop, their emigration will tend to increase first and decline only after some threshold level of income.
If this conclusion holds for individual countries, then it could have serious implications for development programs. In particular, it would imply that development cooperation, to the extent that it contributes to economic development, contributes to increased emigration from low-income to high-income countries. As the authors mentioned above have pointed out, development cooperation in that case is not a proper instrument to reduce emigration from low-income countries.
However, even if the migration hump finding is as robust as it seems to be, can we actually conclude that all individual countries will follow this inverted-U pattern as they grow richer? In other words, will emigration from an individual low-income country indeed rise as it starts developing economically, and fall after some threshold middle-income level? The answer is that this does not necessarily follow from the finding of an inverted-U relation between development and emigration based on cross-country or panel data. Benček and Schneiderheinze (2019) are therefore critical of any causal interpretation of the migration hump.
One reason why the cross-sectional evidence for the hump shape does not necessarily demonstrate an individual country's transition path is that, while middle-income countries experience higher emigration than low-income (and high-income) countries, this is not necessarily due to their income differences. It may also be due to fundamental heterogeneity between the different country income groups that simultaneously affects both economic development and migration (Lucas, 2019). If such omitted variables are driving the inverted-U relationship, then the migration hump is misinterpreted as being a result of economic development.
This point is relevant not only for evidence of an inverted-U relationship based on cross-section data, but also for evidence of a migration hump based on panel data, such as we use in this paper. The reason is as follows. By using panel data, we exploit both the variation over time and the variation across countries. The variation over time for each country across the income distribution is, however, limited: even though we use a long 50-year timeframe from 1970 to 2020, no country has covered the whole income distribution over this period by developing from a low-income to a high-income country. Despite substantial economic growth for many countries within this period, countries have still moved within a limited range of the income distribution. This implies that, even though we are using panel data and exploiting some income variation over time for each country, we are still to a large extent relying on the cross-section variation in the data for our finding of an inverted-U relation between emigration and economic development. That means that this finding is to an important extent still driven by the fact that middle-income countries experience higher emigration than low- or high-income countries. So again, the conclusion that income levels are driving the inverted-U relation between development and migration will not necessarily hold if there is systematic heterogeneity across countries in these income groups. This is particularly the case if there is heterogeneity with respect to factors that affect both development and migration, and if these factors are not properly controlled for.
Fully controlling for all relevant factors affecting both income and emigration is complicated in all panel data studies on emigration and development, for the following reason. Even though our panel data analysis above applies a very extensive set of controls, including several origin-time control variables and destination-time, country-pair, and origin fixed effects, there might still be some origin-time factors that affect both emigration and development and hence require additional controls. While such factors could in principle be controlled for by using origin-time fixed effects, such fixed effects are perfectly collinear with any origin-time varying variables and hence cannot be included in the specification together with development at origin, which is our variable of interest. As indicated, this issue is relevant for all panel data studies on emigration and development. Therefore, additional robustness checks are required in order to test whether low-income countries, as they develop, indeed initially experience an increase in emigration due to economic growth. This will be done in the next section.
Using the higher emigration levels of middle-income countries compared to low-income countries is inappropriate when fundamental differences between the two groups, which may drive the result of an initial increase in emigration with development, cannot be fully controlled for. To avoid this issue, we perform several robustness tests in this subsection that all relate to the upward-sloping part of the migration hump, in order to test whether emigration tends to increase initially as low-income countries grow.
The first test we perform is for the subsample of 46 countries that actually transitioned from low-income to middle-income status in the period 1970-2020, according to the World Bank income classification. 22 We test whether emigration from these countries increased with development by applying our base regression model to this subgroup only. In this case, the included middle-income countries are the same as the included low-income countries (only at a later point in time), and hence there is no heterogeneity between the two income groups within this subsample. If emigration initially increases with economic development until low-income countries reach some middle-income threshold level, then we would expect a positive and significant coefficient on our linear GDP per capita variable. We perform the regression both with and without a squared term for our GDP per capita variable. The results for this subsample are shown in columns (1) and (2) of Table 5. In neither case do we obtain a significantly positive coefficient on our GDP per capita variable, and hence we cannot conclude that economic development has resulted in an increase in emigration for this group of countries. We have also performed the regression for several subsets of this group of 46 countries, and the results are similar in that they show no evidence of an increase in emigration with development.
Next, we perform the same test for the same sample of countries that have transitioned from the low-income to the middle-income category, but now excluding China and India. These two countries are outliers in terms of population and country size, which may have important implications for emigration, and they have also experienced relatively high economic growth. The results for this subsample are presented in columns (3) and (4) of Table 5. The results in column (4), for the regression including the quadratic term for our GDP per capita variable, show no significance for the coefficients on either the linear or the squared GDP per capita variable. However, the results in column (3), for the regression including only the linear GDP per capita variable, show a highly significant and negative coefficient on our GDP per capita variable.
Limiting our analysis to this sample of 44 countries that have actually developed from being a low-income country to becoming a middle-income country, the finding is thus that emigration has not increased but rather declined with economic development. By focusing solely on the countries that actually made the transition from low-income to middle-income status, we avoid the issue of inappropriately using the higher emigration levels of middle-income countries compared to low-income countries while not being able to fully control for fundamental differences between the two groups of countries. For this relevant subsample, the results show that, when low-income countries develop economically, their emigration declines. This has important policy implications, as it refutes the recent belief that development programs contribute to rising emigration when promoting economic development.
In addition to this subsample of countries that each developed from low-income to middle-income status, we also test whether there is an increase in emigration with development for the subsample of all African countries. This is also an interesting subsample because these countries have grown in the covered 50-year period from being mostly low-income to being mostly lower-middle-income, with fewer than half of the countries still being low-income countries in 2020 and a few countries transitioning to the upper-middle-income category. Mean GDP per capita for African countries increased substantially from US$ 1,738 to US$ 4,798 during this period. 23 The results for our subsample of African origin countries confirm that GDP per capita growth does not give rise to emigration from African countries. The results are presented in columns (5) and (6) of Table 5. They show that, despite the substantial increase in GDP per capita among African countries, there is no sign of a significant positive relation between GDP per capita and emigration, as some authors in the migration literature have suggested. Instead, the relationship is negative, though not significant. As shown in the table, population at origin does show a significant and positive coefficient. This indicates that population growth may have been driving higher emigration levels for African countries.

The next robustness check is to test whether, for the lower part of the income distribution, there is an upward-sloping 'left hand side' of the migration hump, in other words, whether there is a significantly positive relation between development and emigration up to a certain point. We first check this for our base model applied to the full sample of countries, for which the result was presented in column (2) of Table 1. The corresponding extreme point for this base model result lies at a per capita GDP of US$ 2,586. We therefore now test our migration base model (both with only a linear term and with an additional quadratic term for GDP per capita) on all observations in our data set up to this extreme point for GDP per capita. The results are presented in columns (1) and (2) of Table 6 and do not show the significantly positive coefficient on our linear GDP per capita term that we would expect if an increase in income led to more emigration in this lower part of the income distribution, up to the extreme point of the hump. The coefficient on squared GDP per capita is also insignificant.
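As an illustration of how such an extreme point can be derived and used to restrict the sample, consider the sketch below. It assumes that GDP per capita enters the model in logs, as a linear and a squared term, so that the peak satisfies ln(y*) = -b1 / (2 b2). The function name and the coefficient values are hypothetical and are not the estimates reported in the tables.

```python
import math


def turning_point(beta_linear, beta_squared):
    """GDP per capita at which b1*ln(y) + b2*ln(y)**2 reaches its maximum.

    Returns None when the coefficients do not describe a hump
    (the shape requires b1 > 0 and b2 < 0).
    """
    if beta_linear <= 0 or beta_squared >= 0:
        return None
    return math.exp(-beta_linear / (2.0 * beta_squared))


# Illustrative coefficients only (assumed, not taken from the paper).
b1, b2 = 2.0, -0.125
peak = turning_point(b1, b2)  # equals exp(8), roughly 2,981

# Keep only the observations to the left of the extreme point.
gdp_per_capita = [500.0, 1500.0, 2900.0, 3500.0, 12000.0]
left_side = [y for y in gdp_per_capita if y <= peak]
```

Re-estimating the model on `left_side` only is the kind of test described above: if the hump reflected a within-country pattern, the linear term should be positive and significant on this restricted sample.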
We perform a similar test for the highest extreme point of GDP per capita that we found across all other specifications used in sections 4.2 and 4.3.1, namely the one obtained for the time subsample 2000-2020, for which the results were shown in column (4) of Table 1. The extreme point corresponding to this result lies at a GDP per capita of US$ 5,693. We again test our base model of emigration, both with only a linear term and with an additional quadratic term for GDP per capita, on all observations below this turning point of US$ 5,693. The results are presented in columns (3) and (4) of Table 6 and show that, also when using this extreme point, the coefficients on our GDP per capita variable are insignificant, indicating that there is no significant relation between income per capita and emigration in this part of the income distribution. We also tested the extreme points for all other specifications used in section 4.3.1. Since these all lie to the left of the above extreme point of US$ 5,693, the results from estimating our base emigration model on the observations to the left of these respective extreme points are all similar in that they do not show a positive and significant relation between our GDP per capita variable and emigration.

Next, we test our base model on all observations in the first and second quartiles of the income distribution. Again, if emigration initially increases with development, we would expect to find a significant and positive coefficient on our GDP per capita variable. The results can be found in columns (1) and (2) of Table 7 and show no significance for either the linear or the squared GDP per capita variable.
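Restricting the sample to the first and second quartiles amounts to keeping the observations at or below the median of GDP per capita. The following sketch shows this step under assumed data; the helper `lower_half` and the values are hypothetical, and the exact quartile definition used in the analysis may differ.

```python
import statistics


def lower_half(observations):
    """Keep (country, gdp_per_capita) pairs at or below the median income."""
    cutoff = statistics.median(gdp for _, gdp in observations)
    kept = [(country, gdp) for country, gdp in observations if gdp <= cutoff]
    return kept, cutoff


# Illustrative observations (assumed values, not the paper's data).
sample = [
    ("A", 600.0), ("B", 1200.0), ("C", 2500.0),
    ("D", 7000.0), ("E", 15000.0), ("F", 40000.0),
]
kept, cutoff = lower_half(sample)
```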
Finally, we apply the PPML base model to all observations with a maximum GDP per capita of US$ 9,999, which happens to be the mean of GDP per capita across all upper-middle-income countries. Columns (3) and (4) of Table 7 show the results and indicate no significance for either the GDP per capita variable or the hump shape of the relation between development and emigration.

The conclusion from these robustness checks is twofold. On the one hand, the finding of a hump-shaped relationship between emigration and development levels is highly robust in panel data settings, using data for 180 countries and a 50-year timeframe. On the other hand, it is not correct to conclude that, in any given country, emigration initially increases with economic development before it starts to fall. In particular, the 'left hand side' of the migration hump does not withstand any of the robustness tests that we performed. On the contrary, when we focus on low-income countries that actually transitioned to middle-income status, we find evidence that emigration actually declined with economic development. This suggests that the inverted U-shaped relationship between economic development and migration cannot be interpreted as a causal relationship.

Concluding remarks
This paper has rigorously tested the migration transition hypothesis, according to which emigration follows an inverted U-shaped relationship with economic development. The migration transition hypothesis suggests that emigration first increases as countries move from low to middle-income levels of development, and subsequently decreases again as countries grow richer. As predicted by several migration transition theories, such a non-linear pattern could emerge from various factors at play, including financial constraints that diminish over time, migrant networks abroad that increase with migration, or a demographic transition.
In order to test this hypothesis, we applied a migration version of the gravity model, micro-founded by the Random Utility-Maximization (RUM) model, to a global panel data set comprising 180 origin and destination countries and a 50-year timeframe. This is the most extensive panel data set used so far in the literature to test for the existence of the migration hump. We used GDP per capita at origin as a proxy for development levels and included a linear and a squared term to account for the nonlinearities predicted by migration transition theories. We used the recent PPML estimator and, following the literature, controlled for the influence of alternative destinations on one's decision to migrate (so-called multilateral resistance to migration). We did so by incorporating several origin-time control variables and various fixed effects structures controlling for unobserved origin-, destination-time, time and country-pair characteristics potentially affecting migration flows.
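To make the estimation idea concrete, the sketch below fits a Poisson pseudo-maximum likelihood (PPML) model with a linear and a squared income term using Newton's method. It is a deliberately stripped-down illustration under stated assumptions: it omits the bilateral structure, the control variables, and all fixed effects of the actual specification; the income variable is a centered log value `z`; and the data are synthetic, generated without noise from assumed coefficients. The helpers `solve` and `ppml` are hypothetical names written for this example.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ppml(X, y, iterations=50, tol=1e-10):
    """Estimate E[y | x] = exp(x'b) by maximizing the Poisson pseudo-likelihood."""
    k = len(X[0])
    beta = [0.0] * k
    for _ in range(iterations):
        mu = [math.exp(sum(b * x for b, x in zip(beta, row))) for row in X]
        # Score X'(y - mu) and negative Hessian X' diag(mu) X.
        grad = [sum(row[j] * (yi - mi) for row, yi, mi in zip(X, y, mu))
                for j in range(k)]
        hess = [[sum(row[i] * row[j] * mi for row, mi in zip(X, mu))
                 for j in range(k)] for i in range(k)]
        step = solve(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    return beta


# Synthetic data: z is a centered log income measure (an assumption).
z_values = [-2.0 + 0.5 * i for i in range(9)]
X = [[1.0, z, z * z] for z in z_values]
true_beta = [0.5, 0.3, -0.2]  # assumed coefficients describing a hump
y = [math.exp(sum(b * x for b, x in zip(true_beta, row))) for row in X]

beta_hat = ppml(X, y)
peak_z = -beta_hat[1] / (2.0 * beta_hat[2])  # location of the hump in z
```

A positive linear coefficient combined with a negative squared coefficient is what produces the inverted-U shape; the peak location follows directly from the two estimates.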
Based on this panel data analysis, we find strong empirical support for an inverted-U relationship between emigration and development levels. Our results are robust to (a) the addition of several origin-time control variables, (b) the use of different time and country subsamples (with and without non-OECD countries), (c) the inclusion of an interaction term between geographical distance and income at origin, and (d) several additional tests of the existence of an inverted-U shaped relation between GDP per capita at origin and emigration flows.
However, the finding of an inverted U-shaped relation between economic development and emigration is mainly driven by cross-country heterogeneity in factors other than income, and therefore the migration hump cannot be interpreted as a causal relation. In several additional robustness analyses we found that, for a given low-income country, an increase in economic development does not lead to higher emigration. On the contrary, for a subsample of 44 countries that actually transitioned from low-income to middle-income status (excluding China and India), we even found evidence that emigration declined with economic development. Drawing the conclusion that the inverted-U relationship is causal therefore seems unfounded.

This new finding, supported by various robustness checks, has important policy implications. In contrast with what other authors (e.g. De Haas, 2019 and Clemens and Postel, 2018) have concluded based on cross-sectional findings, we can no longer conclude that, as low-income countries develop, their emigration will tend to increase before declining after a certain middle-income turning point. While we do find empirical evidence of an inverted U-relation between economic development and emigration using the full sample of 180 countries over 50 years, it seems that this finding is driven by the underlying cross-sectional pattern of middle-income countries having higher emigration rates than either low- or high-income countries. These differences in emigration rates are likely caused by fundamental differences between countries in different income categories that make a causal inference from the inverted-U relation invalid.
Moreover, akin to other papers in the existing literature on this topic, we are not able to fully control for the potential endogeneity arising from reverse causality between migration and GDP, nor from the multilateral resistance to migration, i.e. the unobserved impact of the attractiveness of alternative destinations on one's willingness to emigrate. Due to these issues, any causal interpretation of the migration hump is unfounded. Although our analysis does not eliminate the bias due to endogeneity, we are able to reduce it by including a decade-to-decade lag of GDP per capita at origin as an instrument in order to tackle reverse causality, and by using country-, destination-time and country-pair-varying fixed effects in order to partially account for multilateral resistance to migration.
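The construction of a decade-to-decade lag can be sketched as follows. This only illustrates how each observation is paired with the same country's value one decade earlier; how the lag then enters the estimation is not shown. The panel layout, the helper `add_decade_lag`, and the values are assumptions.

```python
# Hypothetical panel keyed by (country, decade start year) with
# GDP per capita as the value; the numbers are illustrative.
gdp = {
    ("A", 1990): 1000.0, ("A", 2000): 1400.0, ("A", 2010): 2100.0,
    ("B", 2000): 800.0, ("B", 2010): 900.0,
}


def add_decade_lag(panel, step=10):
    """Map each key to (current value, value one decade earlier).

    Observations without an earlier decade are dropped, which is why the
    first decade of every country is lost when a lag is used.
    """
    lagged = {}
    for (country, year), value in panel.items():
        previous = panel.get((country, year - step))
        if previous is not None:
            lagged[(country, year)] = (value, previous)
    return lagged


lagged_gdp = add_decade_lag(gdp)
```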
We circumvent the remaining endogeneity problem, which arises from fundamental differences between countries in different income categories that we cannot fully control for, by estimating the model solely for those countries that actually transitioned from low-income to middle-income status. In this case, the included middle-income countries are the same as the included low-income countries (only at a later point in time), and hence there is no heterogeneity between the two income groups when using this subsample. The results for this subsample (which excludes China and India) show that emigration actually declines as low-income countries develop economically. This has important policy implications: it suggests that development programs can in fact promote economic development in low-income countries without encouraging emigration.

Table A.2
Overview of the main variables used in the analyses, their definitions and sources
Columns: Variable, Definition, Source

GDP per capita
The ratio of Purchasing Power Parity (PPP)-adjusted total Gross Domestic Product (GDP) in constant 2011 US dollars, to the total population count.

Age dependency ratio
The ratio of the number of people younger than 15 or older than 64 (dependents) to the working-age population (ages 15-64).
World Development Indicators, World Bank.

Population density
Midyear population divided by land area in square kilometers.
World Development Indicators, World Bank.

Population (total)
The mid-year estimate of all residents, regardless of legal status or citizenship.

World Development Indicators, World Bank.

Polity IV index
This index (Marshall, Gurr, & Jaggers, 2018) considers a nation as strongly democratic if citizens have the ability to express their preferences about policies and leaders through institutions and procedures, executive power is institutionally constrained, and civil liberties are guaranteed. 'Strong' autocracies, on the other hand, are characterized by the presence of sharp restrictions on, or suppression of, competitive political participation. This index ranges from -10 (strongly autocratic) to +10 (strongly democratic).
Center for Systemic Peace (CSP).

Figure A.1
Mean bilateral emigration rates over time for countries in each income group (as defined by the World Bank) to countries in all other income groups, in the 1960-2020 timeframe

Table 1
Results from base model, full sample and time subsamples

Table 3
Results from the estimation of the base model with different combinations of control variables
21 Variables with an absolute correlation of more than 0.4 are not included together in the same specification. A correlation matrix can be found in Table A.3 in the Appendix.

Table 5
Results from base model, countries that transitioned from LIC to MIC and African countries

Table 6
Results from the estimation of the base model, up to various extreme points found

Table 7
Results from the estimation of the base model, up to various income thresholds

Table A.3
Correlation matrix of selected origin-time variables, including log GDP per capita

Base model estimated on a migration data set excluding small island states
Robust clustered standard errors in parentheses. *** p<0.01, ** p<0.05, * p<0.1
Note: Small island states are defined as islands with a population of less than 3 million.

Table A.7
Results from the estimation of the base model with a cubic term