We used a range of administrative and spatial data sources. The core dataset containing primary care workforce information is provided by NHS Digital; we obtained the data as of 30 September 2016, published on 29 March 2017. At the practice level, information is available on geography (Clinical Commissioning Group (CCG) and NHS region), patient list size by age group, and the number and full-time equivalent (FTE) of GPs by country and region of qualification. At the individual level, the same geographical information is available, as well as staff type (e.g. GP, nurse, administrator), role (e.g. GP partner, junior doctor), country and region of qualification (for GPs only), age, sex and FTE. To protect anonymity, individual records are not linked to practices.
Additional information at the practice level included the average deprivation of patients, overall morbidity as measured by the Quality and Outcomes Framework (QOF), and the average payment made to primary care per patient. NHS payments to general practice for the financial year 2015/2016, covering the whole of England, were used to calculate average pay per patient net of prescribing and dispensing fee payments.
Deprivation was quantified with the 2015 release of the Index of Multiple Deprivation (IMD), an aggregate measure that is widely used to quantify deprivation and affluence. The measure quantifies relative deprivation across the following seven domains: income, employment, education and skills, health and disability, crime, barriers to housing and services, and living environment. Deprivation scores are calculated and assigned to small geographical units (Lower Super Output Areas, LSOAs), and the overall IMD is calculated as a weighted mean across the seven domains, with income and employment deprivation given the largest weights (22.5% each), followed by health and education deprivation (13.5% each), and with the other three domains given equal weights (9.3% each). To calculate the average deprivation level of the practice population, rather than of the practice location, we used a dataset linking practice populations to these small-area geographies, which allowed us to calculate a patient-weighted average of deprivation for each practice.
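As a minimal sketch of the practice-level weighting step, the Python below (illustrative only; the analysis itself was carried out in Stata) computes a patient-weighted mean of area deprivation for each practice. The function name, the row layout (practice code, LSOA code, number of registered patients) and the example codes and scores are hypothetical and are not taken from the source datasets.

```python
from collections import defaultdict


def practice_weighted_imd(rows, imd_by_lsoa):
    """Patient-weighted mean IMD score per practice.

    rows: iterable of (practice_code, lsoa_code, n_patients) tuples
          (assumed layout of the practice-to-LSOA linkage data).
    imd_by_lsoa: dict mapping LSOA code to its IMD score.
    """
    weighted_sum = defaultdict(float)
    patient_total = defaultdict(float)
    for practice, lsoa, n_patients in rows:
        # Skip rows that cannot contribute to the average.
        if lsoa not in imd_by_lsoa or n_patients <= 0:
            continue
        weighted_sum[practice] += n_patients * imd_by_lsoa[lsoa]
        patient_total[practice] += n_patients
    return {p: weighted_sum[p] / patient_total[p] for p in patient_total}


# Hypothetical example: practice P1 draws patients from two areas.
rows = [("P1", "E01", 300), ("P1", "E02", 100), ("P2", "E02", 50)]
imd_by_lsoa = {"E01": 10.0, "E02": 30.0}
practice_imd = practice_weighted_imd(rows, imd_by_lsoa)
```

In this toy example P1 has 300 patients in an area scoring 10 and 100 in an area scoring 30, so its weighted mean is (300 × 10 + 100 × 30) / 400 = 15, which reflects where its patients live rather than where the practice is located.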
To quantify overall morbidity at the practice level, we used 2015/2016 data from a national primary care pay-for-performance programme, the QOF. The programme has underpinned high-quality recording in primary care, and under its umbrella, the recording, management and treatment of a large number of clinical domains were financially and reputationally incentivised. In 2015/2016, there were 21 incentivised domains: Atrial Fibrillation, Asthma, Cancer, Cardiovascular Disease Primary Prevention, Coronary Heart Disease, Chronic Kidney Disease (for those aged 18 or older), Chronic Obstructive Pulmonary Disease, Dementia, Depression (18 or older), Diabetes both types (17 or older), Epilepsy (18 or older), Heart Failure, Hypertension, Learning Disability, Severe Mental Illness, Obesity (18 or older), Osteoporosis (50 or older), Peripheral Artery Disease, Palliative Care, Rheumatoid Arthritis (16 or older), Stroke. For each of the practices participating in the QOF, which together cover more than 99% of all registered patients, we calculated the total sum of all condition registers, a cumulative QOF register.
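The cumulative register is a simple sum across condition registers, which the following Python sketch illustrates (for exposition only; it is not the Stata code used in the analysis). The function name, the dictionary layout and the register counts are hypothetical.

```python
def cumulative_qof_register(registers_by_practice):
    """Sum all condition registers within each practice.

    registers_by_practice: dict mapping practice code to a dict of
    {condition name: number of patients on that register}.
    """
    return {
        practice: sum(counts.values())
        for practice, counts in registers_by_practice.items()
    }


# Hypothetical counts for two practices and three of the 21 domains.
registers = {
    "P1": {"Asthma": 420, "Hypertension": 980, "Stroke": 110},
    "P2": {"Asthma": 150, "Hypertension": 400, "Stroke": 35},
}
cumulative = cumulative_qof_register(registers)
```

Because the registers are summed, a patient who appears on several registers contributes once per register, so the total is a count of register entries rather than of distinct patients.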
Finally, 2016 spatial coordinates for NHS organisational units were obtained from the Office for National Statistics (ONS) Open Geography portal. We focused on two organisational levels: CCGs at the lower level (209 units) and NHS regions at the higher level (14 units).
For all aspects of data manipulation and analysis we used Stata v14.1. Whenever medians are reported, we also report the 25th and 75th centiles. Spatial maps were plotted using the spmap command. An alpha level of 5% was used throughout.
We quantified the characteristics of GPs for the whole of England and for each of the 14 NHS regions in 2016. For each region and overall, we estimated and report the following individual-level aggregates: number, percentage of males, median age and median FTE. All individual-level aggregates are reported overall and by GP country of qualification. We also report practice-level aggregates on median number and FTE (overall only) per 10,000 patients, per 1,000 patients aged 75 or older and per 10,000 counts on the cumulative QOF register. Finally, also at the practice level but both overall and by GP country of qualification, we present the median deprivation of the average practice patient's area of residence and the median of the average pay per patient (net of prescribing and dispensing costs). For deprivation, we first calculated the weighted deprivation mean within each practice, and then estimated its weighted median across practices (weighted by the practice list size for overall estimates, or by the product of the list size and the percentage of GPs qualified in each region for estimates by qualification region). The process was similar for pay, the only difference being the first step, where we used the average pay per patient in the practice. This weighting approach allowed us to estimate patient deprivation and pay medians by GP country of qualification (based on GP numbers, which were available at the practice level aggregated by country of qualification, whereas FTE was not).
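The two-step weighting can be sketched in Python as follows (illustrative only; the analysis used Stata, and weighted-median conventions differ between packages). This sketch assumes the "lower" weighted median, that is, the smallest value at which the cumulative weight reaches half of the total weight. The function name and all figures are hypothetical.

```python
def weighted_median(values, weights):
    """Lower weighted median: the smallest value whose cumulative
    weight reaches at least half of the total weight.
    Observations with zero or negative weight are ignored."""
    pairs = sorted((v, w) for v, w in zip(values, weights) if w > 0)
    total = sum(w for _, w in pairs)
    if total == 0:
        raise ValueError("no positive weights")
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= total / 2:
            return value


# Hypothetical practice-level inputs (step 1 already done):
mean_imd = [15.0, 30.0, 22.0]   # weighted mean deprivation per practice
list_size = [4000, 1000, 5000]  # registered patients per practice
pct_eea = [0, 50, 5]            # % of the practice's GPs qualified in the EEA

# Overall estimate: weight each practice by its list size.
overall = weighted_median(mean_imd, list_size)

# Estimate for one qualification region: weight by list size x % of GPs.
eea_weights = [n * p / 100 for n, p in zip(list_size, pct_eea)]
eea = weighted_median(mean_imd, eea_weights)
```

With these toy numbers the overall median is 22.0, whereas the EEA-weighted median is 30.0, because the practice with the highest share of EEA-qualified GPs is also the most deprived; practices with no EEA-qualified GPs receive zero weight and drop out of that estimate.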
In a second approach, we quantified the characteristics of practices (median FTE per 10,000 patients, per 1,000 patients aged 75 or older and per 10,000 counts on the cumulative QOF register; also average pay and patient residence location deprivation) at different levels of presence of overseas qualified GPs: 0%; above 0% and up to 20%; above 20% and up to 40%; and above 40%. To evaluate more robustly whether pay differed across strata of EEA and elsewhere qualification, we performed multiple linear regressions at the practice level, regressing average pay on the percentage of EEA, the percentage of elsewhere and the percentage of EEA or elsewhere qualification, adjusted for the percentage of patients aged 75 or older and the cumulative QOF register (i.e. adjusting for proxies of health need).
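The four presence bands have half-open boundaries (each band excludes its lower limit and includes its upper limit), which the short Python sketch below makes explicit. It is illustrative only and not the Stata code used in the analysis; the function name and band labels are hypothetical.

```python
def overseas_band(pct_overseas):
    """Assign a practice to a band by its percentage of overseas
    qualified GPs: 0%; above 0% and up to 20%; above 20% and up
    to 40%; above 40%."""
    if not 0 <= pct_overseas <= 100:
        raise ValueError("percentage must be between 0 and 100")
    if pct_overseas == 0:
        return "0%"
    if pct_overseas <= 20:
        return ">0% to 20%"
    if pct_overseas <= 40:
        return ">20% to 40%"
    return ">40%"


# Hypothetical practices with differing shares of overseas qualified GPs.
bands = [overseas_band(p) for p in (0, 12.5, 20, 33.3, 40, 75)]
```

A practice with exactly 20% overseas qualified GPs therefore falls in the second band, and one with exactly 40% in the third.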
Spatial graphs at the CCG level with additional information on NHS regions were plotted for various variables of interest for GPs, overall and by region of qualification: number and percentage aged 55 or older (FTE weighted), cumulative FTE, mean age, FTE per 10,000 patients and FTE per 10,000 counts on the cumulative QOF register. The primary aim of these graphs was to identify areas more dependent on overseas qualified GPs.