# Metapopulation ecology links antibiotic resistance, consumption, and patient transfers in a network of hospital wards

1. CIRI, Centre International de Recherche en Infectiologie, Université de Lyon, Inserm U1111, Ecole Normale Supérieure de Lyon, Université Lyon 1, CNRS, UMR5308, France
2. Département de Pharmacie, Hospices Civils de Lyon, France
3. P?le de Santé Publique, Département d'Information Médicale, Hospices Civils de Lyon, France
4. Institut des Agents Infectieux, Hospices Civils de Lyon, France
5. Laboratoire de Biométrie et Biologie Evolutive, UMR CNRS 5558, University of Lyon, France
6. Service des Maladies Infectieuses et Tropicales, Hospices Civils de Lyon, France
Research Article

## Abstract

Antimicrobial resistance (AMR) is a global threat. A better understanding of how antibiotic use and between-ward patient transfers (or connectivity) impact population-level AMR in hospital networks can help optimize antibiotic stewardship and infection control strategies. Here, we used a metapopulation framework to explain variations in the incidence of infections caused by seven major bacterial species and their drug-resistant variants in a network of 357 hospital wards. We found that ward-level antibiotic consumption volume had a stronger influence on the incidence of the more resistant pathogens, while connectivity had the most influence on hospital-endemic species and carbapenem-resistant pathogens. Piperacillin-tazobactam consumption was the strongest predictor of the cumulative incidence of infections resistant to empirical sepsis therapy. Our data provide evidence that both antibiotic use and connectivity measurably influence hospital AMR. Finally, we provide a ranking of key antibiotics by their estimated population-level impact on AMR that might help inform antimicrobial stewardship strategies.

## Introduction

Antimicrobial resistance (AMR) of pathogenic bacteria progresses worldwide, imposing a considerable burden of morbidity, mortality and healthcare costs (Laxminarayan et al., 2013; Cassini et al., 2019). AMR is increasingly recognized to emerge in various settings including agriculture (Johnson et al., 2016) or polluted environments (Lübbert et al., 2017; Venter et al., 2017). However, hospitals continue to be important hotspots for AMR in clinically-relevant pathogens (Chatterjee et al., 2018; David et al., 2019) due to the confluence of strong antibiotic selection pressure, fragile patients, and highly resistant pathogens that can disseminate between wards and facilities through patient transfers (Safdar and Maki, 2002; Snitkin et al., 2012). These conditions result in hospitals becoming reservoirs of resistant bacteria (Clarivet et al., 2016; Cohen, 1992; Pogue et al., 2013), which can later enter the community (Huttner et al., 2013).

The primary hospital-based strategies against AMR are antimicrobial stewardship, which aims to lower the antibiotic pressure, and infection control whose goal is to reduce the transmission of pathogens (Manning et al., 2018). The need for such strategies is widely accepted but because they are often implemented together, the relative importance of each is unclear (Chatterjee et al., 2018; Goff et al., 2017). The details of how to best implement these strategies are hotly debated (Lemmen and Lewalter, 2018), especially regarding which antibiotics should be restricted first as a part of antibiotic stewardship strategies (Chastain et al., 2018) or the risk-benefit balance of screening-based patient isolation procedures to reduce transmission (Karda?-S?oma et al., 2017; Robotham et al., 2016). Designing effective antibiotic stewardship strategies has been hindered by the paucity of evidence concerning which antibiotics exert the strongest selection pressure. Current rankings of antibiotics for de-escalation and sparing strategies rely on expert consensus with partial agreement (Weiss et al., 2015), themselves based on conflicting evidence (Acar, 1997; Huttner et al., 2016). This is further complicated by the fact that the association between antibiotic use and resistance is not uniform across pathogen species (Bell et al., 2014) or classes of antibiotics (Niehus et al., 2020). Moreover, antibiotic consumption can also have long-term effects on the carriage of resistant bacteria within patients (Niehus et al., 2020).

Understanding the drivers of AMR at the hospital level requires consideration not only of the selection pressure from antibiotics on individual patients, but also of the transmission and dissemination of drug-resistant pathogens in the patient population (Lipsitch, 2001; Lipsitch and Samore, 2002). However, we lack a quantitative understanding of the respective impacts of selection and transmission on the incidence of AMR infections in hospitals. Observational studies of AMR usually report on the proportion of resistant variants in a limited set of species, which conceals the overall AMR burden and can make interpretation difficult when resistant variants apparently increase in proportion while decreasing in incidence (Burton, 2009). Moreover, linking antibiotic use and AMR prevalence at the population level is difficult due to the confounding effects of bacterial transmission and the complexity of the ecological processes underlying AMR (review in Schechner et al., 2013). Thus, studies of AMR in hospitals could benefit from ecological frameworks able to simultaneously model the impact of antibiotic use and patient transfers on the incidence of infections with the most relevant pathogens. Metapopulation ecology is such a framework. It was introduced by Levins, 1969 to explain the persistence of agricultural pests across a set of habitat patches, such as crop fields, and refined by Hanski to account for the characteristics of patches, such as their size, and the connectivity between them (Hanski, 1998; Hanski, 1994). The metapopulation concept, in which populations of organisms are spread across inter-connected patches with varying characteristics, is useful to describe bacterial pathogens in a hospital network containing wards with different sizes and levels of antibiotic pressure, connected by the transfers of infected patients. Models using the metapopulation framework, beyond their frequent use in wildlife and conservation biology (Dolrenry et al., 2014; Heard et al., 2015; MacPherson and Bright, 2011), have recently provided theoretical grounds for pathogen persistence in the healthcare setting (Spagnolo et al., 2018). So far, however, metapopulation models of hospital AMR have been applied to simulated rather than empirical data (Spagnolo et al., 2018; Vilches et al., 2019).

Here, we used a metapopulation framework to model and isolate the population-level effects of antibiotic use and inter-ward connectivity on the incidence of infections with major pathogen species and their drug-resistant variants within a 357-ward hospital network, using detailed data collected?over the course of one year. Our objectives were: (1), to determine the respective impacts of antibiotic use and connectivity on the incidence of infections with resistant pathogens at the population level; and (2), to compare the impacts of the use of specific antibiotics on the ward-level incidence of AMR infections, after controlling for the effect of connectivity. Based on the association patterns between the incidence of 17 pathogen variants and the use of 11 antibiotic classes, our findings highlight both common patterns and species-specific behaviors of pathogens and provide a ranking of key antibiotics by their estimated population-level impact on AMR.

## Results

### Distribution of bacterial pathogens and antibiotic use in a hospital network

We analyzed pathogen isolation incidence in clinical samples, antibiotic use, and patient transfers in 357 hospital wards from the region of Lyon, France. Data for all three measures were collected during the same period from October 2016 to September 2017. The hospital network contained a total of 4,685 beds. The median ward size was 12 beds (interquartile range, 5 to 20).

Ward-level data were aggregated from 13,915 infection episodes, defined as ward admissions with?≥1 clinical sample positive for E. coli or one of the so-called ESKAPE pathogens (Enterococcus faecium, Staphylococcus aureus, Klebsiella pneumoniae, Acinetobacter baumannii, Pseudomonas aeruginosa and Enterobacter cloacae complex), collectively termed ESKAPE2 (Table 1). Pathogens were grouped into species-resistance pattern combinations, namely 3rd-generation cephalosporin (3GC)-resistant E. coli, E. cloacae complex and K. pneumoniae, carbapenem-resistant E. coli, E. cloacae complex, K. pneumoniae, P. aeruginosa and A. baumannii, vancomycin-resistant E. faecium and methicillin-resistant S. aureus (MRSA). Pathogen variants not falling into these resistance groups were collectively referred to as the less-resistant variants (Table 1). The median yearly incidence of infection episodes per ward was 24 (interquartile range, 7 to 55).

Table 1

Infection episodes most frequently involved the less-resistant variants, especially E. coli, which were also found in the largest number of wards (Table 1 and Figure 1—figure supplement 1). Resistant variants were consistently less frequent than their less-resistant counterparts in all species (Figure 1—figure supplement 1). In enterobacteria (E. coli, K. pneumoniae and E. cloacae), carbapenem-resistant variants were consistently less frequent than 3GC-resistant variants. Infections with vancomycin-resistant E. faecium and carbapenem-resistant A. baumannii were exceptional, with seven and twelve episodes respectively (Table 1).

To estimate the degree of concentration of each variant in the network, we calculated concentration indices defined as the probability that two random occurrences of the same variant originated from the same ward, analogous to the asymptotic Simpson index (see Materials?and?methods). The concentration index varies from 0% for a uniformly random distribution (each occurrence is in a different ward) to 100% for a maximally concentrated distribution (all occurrences are in the same ward). The concentration of infection episodes was weak (<5%) for all variants, indicating a global lack of clustering (Table 1). Concentration increased with resistance (~2-fold increase from the least to the most resistant variant) in E. coli, K. pneumoniae, P. aeruginosa and A. baumannii, suggesting an adaptation of resistant variants to more specific wards within the hospital network compared to their less-resistant, less-concentrated counterparts. This pattern was not found in E. cloacae complex, E. faecium or S. aureus, in which the concentration index remained comparable across resistance categories.

Over the same period (October 2016 to September 2017), antibiotics were prescribed in 86.3% of wards (Table 2), with a total consumption of 125.7 defined daily doses per year per bed (ddd/y/b). Antibiotics usually suspected to select for AMR in the selected variants were grouped into 11 classes (Table 2). Antibiotics with comparatively rare use (e.g., rifampicin) were excluded. The distribution of antibiotic use in the network was analyzed using the concentration index described above, here representing the probability that two random drug doses were delivered in the same ward. Antibiotic use was diffuse, with concentration indices?<?4%, ranging from 0.8% for cefotaxime-ceftriaxone and fluoroquinolones to 3.6% for oxacillin.

Table 2

### Antibiotic use and connectivity predict the incidence of drug-resistant infections

We used multivariable generalized linear models (GLMs) within the metapopulation framework to disentangle the influences of antibiotic pressure, connectivity, ward size, and ward type on the incidence of infections with the selected pathogens and their resistant variants.

Connectivity quantifies the incoming flux of each pathogen variant in a downstream ward via the transfer of infected patients from upstream wards. Practically, we estimated connectivity for each variant and downstream ward as the sum of the direct transfers from each upstream ward multiplied by the variant’s prevalence in that ward (see Materials?and?methods). Mean connectivity ranged from 168.2 estimated introductions per year for less-resistant E. coli to 0.09 for vancomycin-resistant E. faecium. Connectivity was always higher for the less-resistant variants compared to the resistant variants, consistent with the higher prevalence of the former (Supplementary file 1 Table 1a).

Wards were characterized by their size (no. of beds) and type, representing patient fragility. Ward type was coded as one of the following categorical variables: intensive care and blood cancer units, progressive care units, and other wards. ‘Other wards’ was considered the reference category in all models. In several pathogen variants, namely carbapenem-resistant E. coli, carbapenem-resistant A. baumannii, and vancomycin-resistant E. faecium, the small sample size in one or several ward categories prevented the inclusion of ward type as a model covariate.

We also considered that the distribution of infections across wards was a source of bias that required a specific adjustment procedure (see Materials?and?methods). The local prevalence of specific infections (e.g., respiratory tract infections) in a ward influences both the antibiotic use and the observed incidence of infections with a given pathogen, which might confound the relationship between antibiotic use and incidence (Figure 1—figure supplement 2). However, the distribution of infections would be difficult to represent as an adjustment covariate with a sufficiently small number of categories. We used a proxy method to circumvent this issue. We assumed that the distribution of infections directly influences the frequency and specimen types (e.g., respiratory vs. urinary tract specimens) of microbiological samples in each ward. Under this assumption, we replaced the unrepresentable distribution of infections with a proxy variable summarizing the distribution of microbiological samples. This proxy variable, which we refer to as the incidence control, was defined as the ward-level incidence of a pathogen variant predicted by patterns of microbiological sampling alone. As expected, the incidence control correlated with both antibiotic use and the incidence of infections in all prevalent variants (Figure 1—figure supplements 3 and 4). Of note, the incidence of each variant also correlated with both antibiotic use (Figure 1—figure supplement 5) and connectivity (Figure 1—figure supplement 6) in bivariate analyses, except for the very rare A. baumannii and E. faecium resistant variants. Hereafter, all models included the incidence control covariate to adjust for the confounding effect of the distribution of infections.

The incidence of each pathogen variant was modeled in a separate multivariable quasi-Poisson GLM (Figure 1). In these GLMs, global antibiotic use was associated with infection incidence in seven pathogen variants independent of connectivity, ward size, and ward type (Figure 1), including five resistant variants (3GC-resistant E. coli, 3GC-resistant K. pneumoniae, carbapenem-resistant K. pneumoniae, 3GC-resistant E. cloacae, and carbapenem-resistant E. cloacae) and two less-resistant variants (P. aeruginosa and E. faecium). The largest effect size was found in carbapenem-resistant K. pneumoniae, in which every doubling of antibiotic use predicted a 47% increase in incidence (95% confidence interval, 19% to 90%).

Figure 1 with 6 supplements see all

Connectivity predicted a higher incidence in all variants of P. aeruginosa and E. faecium, with a stronger effect size in the resistant variants (11.5 and 43.7%, respectively) compared to their less-resistant counterparts (6.1 and 15.3%). A significant association with connectivity was also found for the less-resistant E. coli, although with a much smaller effect size (2.6%).

Ward characteristics only weakly predicted infection incidence compared to antibiotic use and connectivity. Ward type, or patient fragility, predicted incidence in several variants, although with large uncertainty margins. Interestingly, associations of incidence with intensive care and blood cancer units were negative (in E. coli, 3GC-resistant E. coli, 3GC-resistant E. cloacae, and MRSA) while associations with progressive care units were positive (in 3GC-resistant K. pneumoniae and P. aeruginosa). Ward size did not predict incidence in any variant.

### Do associations between antibiotic use and resistance represent AMR selection?

The metapopulation models illustrated in Figure 1 identified positive associations between total antibiotic use in hospital wards and increased incidences of infections with the more resistant variants of several species. Yet, a correlation with AMR does not necessarily establish a selective role of antibiotics. For instance, a high incidence of resistant infections in a ward can increase antibiotic use through prolonged or combined therapies (Schechner et al., 2013). Conversely, the prescription of antibiotics always inactive against a variant is unlikely to be motivated by this variant’s incidence and such antibiotics are more likely to provide a direct benefit to the resistant variant. Based on this rationale, we propose stringent criteria to identify whether an association between the use of an antibiotic and the incidence of a variant possibly reflects AMR selection (hereafter, possibly selective associations; see Materials?and?methods). Under the hypothesis that antibiotic use is either a consequence of AMR or spuriously correlated with AMR, the strength of an association between the use of an antibiotic and the incidence of a variant should not depend on whether the association fulfills the criteria for possible selection.

To test this hypothesis, we identified possibly selective associations in our data and we examined whether they were equally likely to be positive compared to other associations. We constructed quasi-Poisson multivariable regression models where the total antibiotic use was replaced with the use of specific antibiotics, along with the incidence control and connectivity covariates (Figure 2a and b). Ward type and size, which were weak predictors of incidence in Figure 1 models, were excluded to avoid introducing additional noise. The 17 variant-specific models each included all antibiotic groups as predictors, including antibiotic classes not expected to exert any direct selection pressure on the variant, such as aminoglycosides on E. coli. The models yielded 187 coefficients of which 19 (10.2%) represented possibly selective associations (see Materials?and?methods). The mean percent change in incidence for every doubling of consumption volume in possibly selective associations was 6.5%, compared to 0.9% in other associations (95% CI of the difference of the means, 0.6 to 10.6%). Three of the four strongest possibly selective associations involved cefotaxime-ceftriaxone, which selected for 3GC-resistant E. coli, K. pneumoniae and P. aeruginosa; the other involved carbapenems selecting for carbapenem-resistant P. aeruginosa. Overall, the larger magnitude of the coefficients of possibly selective associations suggests that the local, ward-level selection of drug-resistant variants by antibiotics is measurably pervasive throughout our hospital network.

Figure 2

### Quantifying the drivers of resistance to first-line sepsis therapy

From a clinical standpoint, the most immediate consequence of AMR is the failure to control sepsis with empirical antibiotics, mainly carbapenems and the non-antipseudomonal 3GCs cefotaxime and ceftriaxone. Because such failure can equally result from acquired or intrinsic resistance, the incidence of intrinsically resistant pathogens such as E. faecium is of equal clinical importance as that of pathogen variants with acquired resistance mechanisms. To examine the impact of antibiotics on both intrinsic and acquired resistance, we modeled the cumulative incidence of infections with 3GC- and/or carbapenem-resistant variants of the selected pathogens (see Materials?and?methods).

In these models, antibiotic use was the strongest predictor of the incidence of both carbapenem-resistant (6.5% increase in incidence for every doubling of consumption volume, 95% CI, 2.5 to 11.0%) and 3GC-resistant infections (5.1% increase, 95% CI, 2.7 to 7.5%; Figure 3a). Connectivity better predicted the incidence of carbapenem-resistant infections (4.8%, 95% CI, ?0.4 to 10.7%) compared with 3GC-resistant infections (1.0%, 95% CI, ?2.3 to 4.5%), in line with the comparatively stronger association of connectivity with the incidence of individual carbapenem-resistant variants (Figure 1). Ward size had no measurable effect in either model. Carbapenem-resistant infections were not associated with ward type, while 3GC-resistant infections were positively associated with progressive care units and negatively associated with intensive care and blood cancer units (Figure 3a). These findings provide an unambiguous link between population-level antibiotic use and global resistance to empirical sepsis therapy that was robust to confounding by connectivity, microbiological sampling patterns, and other ward characteristics.

Figure 3

To identify the antibiotics whose use was most strongly associated with global carbapenem and cefotaxime-ceftriaxone resistance, we examined the effect of replacing the total antibiotic use in our models with?individual antibiotic classes, similar to Figure 2 models. Besides the consumption of cefotaxime-ceftriaxone and carbapenems, the consumption of piperacillin-tazobactam was a strong predictor of the incidence of both 3GC- or carbapenem-resistant infections (Table 3). To visualize the independent associations of incidence with the use of cefotaxime-ceftriaxone, carbapenems and piperacillin-tazobactam, we plotted the average ward-level infection incidence predicted by variations in the consumption volumes in models including the connectivity and incidence control covariates (Figure 3b). The incidence of 3GC-resistant infections was predicted by the consumption of cefotaxime-ceftriaxone (4.8% increase, 95% CI, 1.5 to 8.2%) but not carbapenems (1.4% increase, 95% CI, ?1.7 to 4.6%). In the same vein, the use of carbapenems, but not 3GCs, predicted the incidence of carbapenem-resistant infections, although with a wide uncertainty margin (5.1% increase, 95% CI, ?0.6 to 11.1%). Strikingly, the use of piperacillin-tazobactam predicted the incidence of both 3GC-resistant (5.9% increase, 95% CI, 2.9 to 9.0%) and carbapenem-resistant infections (10.3% increase, 95% CI, 5.0 to 16.0%). In both models, the amplitude of the piperacillin-tazobactam coefficient outweighed those of all other antibiotics (Table 3). Overall, these results indicate a specific association of cefotaxime-ceftriaxone and, to a slightly lesser extent, carbapenem use with resistance to the same antibiotic group, but not other groups, and identify a major role of piperacillin-tazobactam consumption in predicting the incidence of both 3GC- and carbapenem-resistant infections. To propose a unified ranking of the population-level impact of antibiotics on 3GC and carbapenem resistance, a final model was constructed by pooling all 3GC- and carbapenem-resistant variants together (Table 3). In this model, piperacillin-tazobactam and cefotaxime-ceftriaxone had the largest positive coefficients while 1GC/2GC and coamoxiclav had the largest negative coefficients (Table 3).

Table 3

## Discussion

Understanding the respective impacts of antibiotic use and connectivity on the incidence of drug-resistant infections is essential for optimizing interventions against AMR. Pathogens whose incidence is strongly predicted by antibiotic use might be most effectively targeted by antibiotic restrictions. On the other hand, infection control interventions might be most relevant against pathogens whose incidence is predicted by connectivity. By applying a metapopulation framework to explain variations in infection incidences across a large network of hospital wards, we found that both antibiotic use and connectivity independently contribute to ward-level AMR in several pathogen species. Our study also provides the first quantitative ranking of the predicted impact of several key antibiotics on the global burden of drug-resistant infections in a hospital network.

While previous theoretical work based on modeling and simulation has predicted how patient transfers contribute to AMR prevalence through pathogen dissemination, they have typically considered only a single resistant variant or used a generic simulated pathogen (Donker et al., 2017; Donker et al., 2010; Vilches et al., 2019). Our study provides empirical evidence to support these predictions in general, while identifying substantial variation in the response to connectivity between different species and variants (Figure 1). In our closed network model of the hospital, the influence of connectivity on the incidence of a pathogen variant is expected to be higher if the variant is endemic to the hospital, its prevalence varies across wards, and changes from one variant category to another are rare. If a variant is frequently introduced from outside of the network, the contribution of within-network movements, hence of connectivity, to the prediction of the local incidence will be diminished. The measured influence of connectivity should also be reduced if a variant frequently undergoes transitions between resistance categories (e.g., by evolving or losing resistance) because local emergence will affect incidence more strongly than inter-ward introduction.

Consistent with this theoretical interpretation of connectivity, we found that its influence was strongest in the hospital-endemic pathogens P. aeruginosa and E. faecium (Blanc et al., 2007; Wurster et al., 2016; Zhou et al., 2020). Intriguingly, MRSA incidence was not predicted by connectivity. This is at odds with the classical perception of MRSA as typically nosocomial and, by extension, hospital-endemic. However, available evidence does not strongly support the qualification of MRSA as a hospital-endemic pathogen in our setting. In France, the proportions of MRSA among S. aureus are comparable in community and hospital settings, at about 10% (ECDC, 2019; ONERBA France, 2018; Santé Publique France, 2019). MRSA population structures in hospitals and the community are comparably dominated by the so-called ST8 Lyon clone which is equally found in in- and outpatients (Dauwalder et al., 2008). Finally, MRSA infections were especially diffuse in our network, with a concentration index even lower than that of the less-resistant S. aureus infections (Table 1). Collectively, this does not support the conclusion that MRSA concentrates in French hospitals compared to the community, which might explain why connectivity did not predict MRSA incidence in our study.

The incidence of global carbapenem-resistant infections was strongly predicted by connectivity (Figure 3), with an effect size comparable to that of antibiotic use, while connectivity did not predict global 3GC-resistant infections. This suggests that reducing connectivity with infection control interventions could be more effective at preventing the spread of carbapenem-resistant pathogens between wards compared to 3GC-resistant pathogens. Contrasting with hospital-endemic pathogens, connectivity was a comparatively weaker predictor for community-associated variants (3GC-resistant E. coli and K. pneumoniae) that enter from outside the hospital network, and in variants whose resistance is selected locally because they can frequently shift between susceptible and resistant categories. Indeed, the weak association of connectivity with the incidence of resistant E. cloacae complex variants, for instance, might be explained by the plasticity of their resistance profile, facilitating their local selection. While resistance in E. coli and K. pneumoniae typically requires gene acquisition (Manges et al., 2019; Wyres and Holt, 2016), E. cloacae complex can resist cephalosporins and carbapenems through increased AmpC beta-lactamase and decreased porin expression (Pavez et al., 2016; Babouee Flury et al., 2016; Lee et al., 2017). Such resistance emerges through adaptation and de novo mutations that are rapidly selected from the local reservoir of susceptible progenitors under antibiotic selection pressure (Hawken et al., 2018; Moradigaravand et al., 2016).

Ecological studies have repeatedly identified associations between the use of antibiotics and AMR prevalence, but such associations do not necessarily reflect AMR selection (Schechner et al., 2013). Specific associations at the level of antibiotic and variant pairs are not all equally likely to result from selection by antibiotics or from increased antibiotic use in response to AMR. Based on medical and biological reasoning, we identified associations more likely to represent possible selection and showed that the average strength of these associations outweighed the others (Figure 2). Noteworthy, several associations with a sizeable strength could not be formally classified as possibly selective but might reflect co-selection. The association of cefotaxime-ceftriaxone with carbapenem-resistant variants in K. pneumoniae, but not in E. coli, possibly reflected selection for the highly frequent 3GC-resistance in carbapenemase-producing K. pneumoniae. This contrasts with E. coli Oxa48 producers that frequently remain susceptible to 3GCs but resistant to piperacillin-tazobactam (Huang et al., 2014), consistent with the strong association between the use of piperacillin-tazobactam and the incidence of carbapenem-resistant E. coli infections (Figure 2a). These findings suggest that the associations between antibiotic use and resistance preferentially reflected AMR selection by antibiotics rather than adaptation of antibiotic use in response to AMR. However, this observation does not eliminate temporal ambiguity between antibiotic use and resistance in our models. Further research using, for instance, multi-state models or time-series analyses, may better clarify this ambiguity.

Because intrinsic and acquired resistances to an antibiotic equally lead to treatment failure, we modeled the pooled incidences of infections with 3GC- or carbapenem-resistant variants of the ESKAPE2 pathogens, including those with intrinsic resistance. This approach allowed us to rank antibiotics by their potential association strength with global AMR. The use of piperacillin-tazobactam and cefotaxime-ceftriaxone predicted 3GC resistance and the use of piperacillin-tazobactam predicted carbapenem resistance (Figure 3). The positive association of the use of piperacillin-tazobactam with both 3GC- and carbapenem-resistant infections deserves further attention. Based on its in vitro efficacy against extended-spectrum beta-lactamase (ESBL)-producing enterobacteria, piperacillin-tazobactam has been repeatedly considered as an alternative drug of choice in carbapenem-sparing strategies (Harris et al., 2015; Peterson, 2008). The strategy of replacing carbapenems with piperacillin-tazobactam assumes: (1), that piperacillin-tazobactam is clinically as effective as carbapenems on piperacillin-tazobactam-susceptible pathogens; and (2), that AMR selection under piperacillin-tazobactam pressure is weaker than under carbapenem pressure, as reflected by a recent consensus-based ranking of beta-lactams for de-escalation therapy (Weiss et al., 2015). Yet, in several recent reports including a multicenter randomized clinical trial, piperacillin-tazobactam treatment of sepsis with ESBL-producing enterobacteria was associated with poorer outcomes compared to carbapenem treatment (Harris et al., 2018; Ofer-Friedman et al., 2015; Tamma et al., 2015). From an epidemiological standpoint, studies of the respective associations of piperacillin-tazobactam and carbapenem use with AMR yielded conflicting results. At the population level, the incidence of carbapenem-resistant enterobacteria was negatively associated with the use of piperacillin-tazobactam in a 5 year, single-hospital trend analysis study (McLaughlin et al., 2013). At the patient level, however, the exposure to piperacillin-tazobactam was associated with the acquisition of carbapenem-resistant P. aeruginosa in a meta-analysis (Raman et al., 2018) and carbapenem-resistant Gram-negative bacilli in a single-hospital prospective cohort study (Marchenay et al., 2015). If confirmed in other settings, our finding that piperacillin-tazobactam use correlates with both 3GC- and carbapenem-resistant infections might call for a reevaluation of the rationale of recommending piperacillin-tazobactam over other drugs for ecological reasons. As a note of caution, our observation that piperacillin-tazobactam use predicted a higher incidence of 3GC- and carbapenem-resistant variants does not imply selection for acquired resistance through any specific mechanism such as carbapenemase production. We also note that the link between piperacillin-tazobactam use and global resistance resulted from the accumulation of small, positive associations with most 3GC- and carbapenem-resistant variants, including those with intrinsic resistance (Figure 2). Because of this, links between piperacillin-tazobactam use and 3GC- or carbapenem-resistant infections might go undetected in studies focusing on individual pathogen variants.

Our study has several limitations. First, we did not consider the role of healthcare workers in pathogen transmission, nor the role of direct patient admissions from the community. Second, although our 3-level categorical coding of ward types captures variations in patient fragility within our study system, such a coarse-grained classification could potentially leave residual confounding. Third, the small sample sizes of resistant variants of E. faecium and A. baumannii limited our ability to draw robust inferences regarding these variants and further studies are required to confirm these results (Arias and Murray, 2012; Hsu et al., 2017). Finally, our findings reflect the AMR ecology of a Western European area, with generally lower prevalences of carbapenemase-producing pathogens, vancomycin-resistant E. faecium, and MRSA than in other regions of the world.

To conclude, the modeling of the incidence of infections with seven major bacterial species and their drug-resistant variants in hospital wards using a metapopulation framework indicates that both antibiotic use and inter-ward connectivity may predict the burden of AMR in a variant-specific fashion. This supports the need to tailor strategies against AMR to the targeted pathogen. Along with novel hospital-level insights into the drivers of AMR, our work illustrates the application of the methodological framework of metapopulation ecology to the problem of hospital AMR. Applying this framework to other healthcare settings could help inform the local and regional antibiotic stewardship and infection control strategies.

## Materials and methods

### Data collection and compilation

Request a detailed protocol

We obtained data on infection incidence from the information system of the Institut des Agents Infectieux, the clinical microbiology laboratory of the Hospices Civils de Lyon, a group of university hospitals serving the Greater Lyon urban area (~1.4 million inhabitants) of France. For each ward from October 1st, 2016 to September 30th, 2017, we extracted the number of clinical samples, after exclusion of screening samples, positive for at least one of the ESKAPE2 species (as determined using Vitek MS MALDI-ToF identification, bioMérieux), falling into one of the resistance variant categories defined in Table 1. Resistance was based on available results for susceptibility to, where applicable, ceftriaxone, cefotaxime, ceftazidime, cefepime, imipenem, meropenem, oxacillin, and vancomycin. Samples were deduplicated per patient, ward, and pathogen variant, so that only the first sample positive for the same variant in the same ward was considered for each patient. Patients positive for multiple variants and/or sampled from multiple wards were considered as multiple, distinct infection episodes. Antibiotic use in defined daily doses (ddd) of all systemic antibacterial drugs (ATC classification term J01), as well as of specific (groups of) molecules defined in Table 2, were extracted from the pharmacy department information system. For each pair of wards, the number of patient transfers was extracted from the hospital information system along with, for each ward, the number of beds, the type of medical activity and the number of patient admissions. Because of the aggregated nature of the data, informed consent was not sought, in accordance with French regulations. Our main response variable was the number of patients per ward infected with each pathogen variant, expressed as incidence over 1y.

### Controlling for microbiological sampling patterns

Request a detailed protocol

The estimation of a causal effect between antibiotic use and the incidence of infections with a given pathogen variant is biased by variations in the distribution of infections across wards, because this distribution influences both antibiotic prescriptions and microbiological sampling efforts. A causal network representation of this situation is shown in Figure 1—figure supplement 2. The distribution of infections is difficult to determine and to represent as a model covariate, preventing its direct inclusion in our models. To circumvent this issue, the unrepresentable distribution of infections was replaced with a proxy variable that we called the incidence control, designed to capture variations of microbiological sampling frequencies and of the sampled anatomic sites (e.g. urinary vs. respiratory tract) across wards. Of note, ward size and type were also assumed to correlate with the infection distribution and to contribute to the proxy adjustment. However, the inclusion of ward size and type had a negligible impact on model fit compared to the incidence control value, suggesting that this latter variable captures most of the signal.

The incidence control was defined as the expected incidence of a pathogen variant explained by microbiological sampling alone, under the assumption that the pathogen incidence is conditionally independent of the ward given the sampled anatomic site. Anatomic sites were assigned to seven site groups, namely, skin and soft tissues, respiratory tract, urinary tract, digestive tract, vascular access devices, sterile sites (such as cerebrospinal fluid and peripheral blood cultures) and other sites.

The incidence control was computed as follows for each pathogen variant in each ward. In this section, wards are indexed by $i=1,…, 357$, groups of anatomical sites are indexed by $j=1, …,7$, and pathogen variants are indexed by $k=1,…,17$. For conciseness, we introduce the special case $k=0$ to denote a sample not positive for any of the considered pathogen variants. Contrary to the incidence calculation described in the previous section, all samples were considered without deduplication to account for how repeated sampling increases the likelihood of pathogen detection.

First, for each pathogen variant $k$ and group of anatomic sites $j$, we computed the average probability of variant detection aggregated for all wards in the network as

$P\left(Variant=k|Site=j\right)=\frac{\sum _{i=1}^{357}N\left(Variant=k,Site=j,Ward=i\right)}{\sum _{i=1}^{357}N\left(Ward=i,Site=j\right)}$

where $NVariant=k, Site=j, Ward=i$ is the number of samples in ward $i$ taken from site $j$ positive for the variant $k$ (or, if $k=0$, negative for all variants) and the denominator denotes all samples taken irrespective of their result. For simplicity, we assumed independence between the pathogen variants isolated from a same sample and between samples collected from a same patient. Of note, most patients (n = 22,646 / 26,064; 86.9%) were sampled only once.

To account for repeated sampling, we considered the probability that $M$ samples from a site $j$ in a patient remained negative for pathogen variant $k$,

$P\left(Variant\ne k|M,Site=j\right)=\text{?}{\left[1?P\left(Variant=k|Site=j\right)\right]}^{M}.$

Sampling from multiple sites was represented using the vector notation $M={M1,…,M7}$ to denote the respective numbers of samples taken from site groups 1 to 7. For instance, $M={1,0,2,0,0,0,0}$ denotes 1 skin or soft tissue sample and 2 urinary samples. Using this notation, the probability that all samples from all sites remained negative is,

$P\left(Variant\ne k|\mathbit{M}\right)=\prod _{j=1}^{7}P\left(Variant\ne k|{M}_{j},Site=j\right),$

and the probability that at least one sample was positive for variant $k$ is,

$P\left(Variant=k|\mathbit{M}\right)=1-P\left(Variant\ne k|\mathbit{M}\right).$

This relationship was used to calculate the expected incidence of variant $k$, that?is the number of patients with at least one sample positive for $k$, by considering in each ward $i$ the number $N(i,M)$ of patients with the same number of samples from each site $M$ and the probability of being tested positive given $M$. The expected incidence of variant $k$ in ward $i$ was then defined as,

$N\left(Variant=k|Ward=i\right)=\sum _{\mathbit{M}\in \mathbf{\Omega }}N\left(i,\mathbit{M}\right)×P\left(Variant=k|\mathbit{M}\right)$

where $Ω$ denotes the set of possible sample combinations.

Clearly, variations in?the incidence control value between wards depend only on the number and sites of microbiological samples taken, thus reflecting the incidence and types of bacterial infections at ward-level independent of between-ward variations of pathogen community structure. Under our assumption that the incidence control is a valid proxy to the unrepresentable distribution of infections in each ward, the incidence control should correlate both with antibiotic use and the incidence of infections. Bivariate analyses confirmed that the incidence control correlated with the observed cumulative incidence of all bacteria (R2=0.96, 95% CI, 0.95 to 0.96, Figure 1—figure supplement 3) and, to a lesser extent, with the total antibiotic use (R2=0.34, 95% CI, 0.25 to 0.40, Figure 1—figure supplement 4). These correlations remained substantial for most pathogen variants and specific antibiotics. The incidence control was added as an adjustment covariate in all models predicting infection incidence. The adjusted models, thus, predicted the incidence of infections in excess of what would be expected based on variations in sampling intensity alone.

### Connectivity and other ward characteristics

Request a detailed protocol

In ecology, habitat quality refers to the resources and conditions that allow individuals and populations to persist in a location (Hall et al., 1997), such as food or cover from predators (Johnson, 2005). We described the habitat quality of hospital wards for bacterial pathogens using explanatory variables adapted from Hanski’s metapopulation models (Hanski, 1998; Hanski, 1994), namely patch size and connectivity, along with additional variables capturing patient fragility and antibiotic selection pressure. We considered each ward within the hospital system as a distinct habitat patch, i. We used the number of beds both as a measure of patch size (capturing the number of patients available for colonization) and as a proxy for contact opportunities between patients within the same ward.

The connectivity estimate was implemented as a proxy to the unobservable number of introductions of each variant in each ward during the study period (Donker et al., 2017). To estimate this quantity, we measured directional, partial connectivity Sj,i from ward j to ward i as the number of patients transferred directly from j to i, times the inferred probability that each patient tested positive for the variant. Partial connectivity, thus, was the expected number of positive patients transferred from j to i. Finally, connectivity for ward i was the sum of all directional connectivities, Si =?∑jSj,i. This estimation procedure relies on simplifying assumptions, namely that all positive cases are detected and remain positive upon transfer to the downstream ward; and that the probability and destination ward of a patient transfer does not depend on infection.

Along with size and connectivity, wards were characterized by ward type based on patient fragility, and antibiotic consumption. Ward type was coded as a categorical variable with the following three levels: general wards, intermediate (progressive) care units, and intensive care and blood cancer units. Antibiotic use was normalized by dividing by?the number of beds in each ward and expressed in ddd/bed/y.

### Statistical analysis

Request a detailed protocol

The statistical unit was the individual ward (n?=?357) in all analyses. We used the asymptotic Simpson index (Simpson, 1949), also known as the Hunter-Gaston index (Hunter and Gaston, 1988), to determine the probability that two random isolates of a given variant were isolated in the same ward or that two random doses of a given antibiotic were delivered in the same ward. The index is defined as [∑ni (ni-1)]/[N (N-1)] where ni is the number of infection episodes (or antibiotic doses) detected (or delivered) in ward i and Ni =?∑ni is the total number of infection episodes (or antibiotic doses). In ecology, the Simpson index is typically used to measure biodiversity by estimating the probability that two individuals from a sample belong to the same species (Simpson, 1949). Here, we use this index to examine the distribution of sampling locations relative to the taxa, measuring the probability that two infection episodes occur, or two antibiotic doses are consumed, in the same ward. Hence, we used the term ‘concentration index’ to avoid confusion with a diversity measure. The iNext R package was used to determine bootstrap-based 95% confidence intervals of the concentration index (Hsieh et al., 2016).

Models of infection incidence were constructed using multivariable quasi-Poisson regression of the form

$\mathrm{E}\left(Incidence\text{?}|\mathbit{x}\right)=\mathrm{exp}?\left(Intercept+{\beta }_{1}{x}_{1}+{\beta }_{2}{x}_{2}+\dots \right),\text{?}\mathrm{V}\mathrm{a}\mathrm{r}\left(Incidence\right)=\theta \text{?}\mathrm{E}\left(Incidence\text{?}|\mathbit{x}\right)$

where $E(?)$?denotes expectation, $x$ is the vector of predictors, such as antibiotic consumptions, the $β$’s are model coefficients, $Var(?)$ denotes variance and $θ$ is the overdispersion parameter of the quasi-Poisson model. The overdispersion parameter is used to relax the Poisson assumption of equality of the expectation and variance of the response variable. We used the quasi-Poisson distribution because we found evidence of both under- and overdispersion, as evidenced by fitted quasi-Poisson dispersion parameters ranging from 0.24 (strong underdispersion, found for carbapenem-resistant A. baumannii in Figure 1 model) to 2.5 (moderate overdispersion, found for less-resistant E. faecium) (Supplementary file 1 Table 1b). We favored the quasi-Poisson model over the alternative, negative binomial model because the latter gives greater weight to smaller sites, whereas the former gives greater weight to the larger sites (Ver Hoef and Boveng, 2007). In our application, greater weight should be given to wards with greater incidence rather than to those with less.

We constructed a model for each variant. The response variable was the deduplicated patient counts. All non-categorical explanatory variables including the incidence control, ward size, connectivity and antibiotic use were log2-transformed before further analyses. To avoid negative infinity values from this transformation, all zeroes were first converted to half the minimum non-zero value. This transformation was associated with better model fit (using the model structure of Figure 1), in terms of Akaike information criterion (with Poisson distribution), compared with: (1) replacing zeroes with the minimum non-zero value before taking logs; (2) adding one to zero values before taking logs; or (3) avoiding log transformation. The log2 transformed data were used for all subsequent analyses.

To ease interpretation of model coefficients, we converted the raw coefficients $β^$ from all models to percent changes in infection incidence, equal to $100×(eβ^ -1)$. All analyses used R software version 3.6.0.

### Possibly selective associations between antibiotic use and resistance

Request a detailed protocol

We examined criteria to identify a priori possibly selective associations between antibiotic use and resistance. The criteria were based on medical and biological considerations, namely, that antibiotics inactive against a variant are unlikely to be prescribed in response to this variant’s prevalence; and that antibiotics are most likely to select for a variant when resistance provides a specific advantage, hence, when the variant is not resistant to more potent antibiotics (e.g., CTX/CRO is more likely to select for PA than for CRPA in which carbapenem resistance provides no additional benefit under CTX/CRO pressure). This rationale led to the following criteria: (1) the variant is always resistant to the antibiotics of interest; (2) the variant is not resistant to antibiotics more potent (in terms of spectrum or efficacy) than the antibiotics of interest; and (3) the antibiotics of interest can be plausibly used against the variant in empirical therapy. A total of 19 associations fulfilled the criteria for possibly selective associations: 3GCREC with CTX/CRO; CREC with IPM/MEM; KP with AMX; 3GCRKP with CTX/CRO; CRKP with IPM/MEM; EB with AMC; C3GREB with CTX/CRO and TZP; CREB with IPM/MEM; PA with CTX/CRO; CRPA with IPM/MEM; SA with AMX; MRSA with C1G/C2G, OXA and AMC; AB with CTX/CRO; CRAB with IPM/MEM; EF with CTX/CRO; and VREF with VAN/TEC. Of note, the associations not fulfilling the criteria for possible selection can be interpreted as negative controls in our models, and their coefficients are expected to be distributed around zero (null distribution) in the absence of residual confounding. In line with this interpretation, the coefficients of most of the negative control associations followed a near-zero-centered distribution (Figure 2c), suggesting that residual confounding was negligible in the adjusted models.

### Pooled analysis of CTX/CRO- and IPM/MEM-resistant variants

Request a detailed protocol

To model the cumulative incidences of 3GC- and carbapenem-resistant infections, pathogen variants were pooled into resistance categories. When resistance to CTX/CRO or IPM/MEM was not determined by design (such as 3GC resistance in 3GCREC) or by intrinsic resistance (such as 3GC resistance in E. faecium), variants were classified as resistant when the proportion of resistance in our setting was above 80%. This less-stringent resistance criterion, compared to the criterion used to determine possibly selective associations, was chosen to avoid the exclusion of variants that are mostly resistant to an antibiotic group, which would bias pooled analyses. Applying the 80% threshold for the proportion of resistance led to classifying CRKP and CREB as 3GC-resistant (91% and 93% 3GC resistance, respectively) but not CREC (61% 3GC resistance); and EF and VREF as carbapenem-resistant (84 and 100% carbapenem resistance, respectively, inferred from ampicillin resistance [Weinstein, 2001]). Overall, the 3GCR category included 3GCREC, 3GCRKP, CRKP, 3GCREB, CREB, PA, CRPA, AB, CRAB, EF, VREF, and MRSA; and the CR category included CREC, CRKP, CREB, CRPA, CRAB, EF, VREF, and MRSA.

### Data and software code availability

Request a detailed protocol

All data and software code that support the findings of this study are available at: Shapiro, 2020, https://github.com/rasigadelab/metapop (copy archived at https://github.com/elifesciences-publications/metapop).

## References

1. 1
2. 2
3. 3
4. 4
5. 5
6. 6
7. 7
8. 8
9. 9
10. 10
11. 11
12. 12
13. 13
14. 14
15. 15
16. 16
17. 17
European Centre for Disease Prevention and Control Surveillance of Antimicrobial Resistance in Europe (2019)
?Annual report of the European Antimicrobial Resistance Surveillance Network (EARS-Net).
18. 18
19. 19
The habitat concept and a Plea for standard terminology
(1997)
Wildlife Society Bulletin 25:173–182.
20. 20
A practical model of metapopulation dynamics (1994)
The Journal of Animal Ecology 63:151–162.
https://doi.org/10.2307/5591
21. 21
22. 22
23. 23
24. 24
25. 25
26. 26
27. 27
28. 28
29. 29
30. 30
31. 31
De-constructing de-escalation (2016)
Clinical Microbiology and Infection 22:958–959.
https://doi.org/10.1016/j.cmi.2016.09.024
32. 32
Habitat quality: a brief review for wildlife biologists
(2005)
Transactions of the Western Section of the Wildlife Society 41:31–41.
33. 33
34. 34
35. 35
36. 36
37. 37
38. 38
39. 39
40. 40
41. 41
42. 42
43. 43
44. 44
45. 45
46. 46
47. 47
48. 48
49. 49
50. 50
51. 51
52. 52
53. 53
54. 54
55. 55
56. 56
57. 57
58. 58
59. 59
metapop, version bf3f075 (2020)
GitHub.
60. 60
61. 61
62. 62
63. 63
64. 64
65. 65
66. 66
67. 67
68. 68
69. 69
70. 70
71. 71

## Decision letter

1. Ben S Cooper
Reviewing Editor; Mahidol University, Thailand
2. Eduardo Franco
3. Tjibbe Donker
Reviewer
4. Rene Niehus
Reviewer; Harvard University, United States

In the interests of transparency, eLife publishes the most substantive revision requests and the accompanying author responses.

Acceptance summary:

A metapopulation is a group geographically separated sub-populations of a species where there are interactions between the sub-populations. The concept has been usefully applied to understand the impact of patient movements on the spread of drug-resistant bacteria in healthcare settings on a number of previous occasions, but this paper represents the first application of the concept to a large data-set including multiple bacterial species and detailed data on antibiotic usage, another key driver of the spread of resistance. The analysis provides important new insights about how patterns of patient movement within healthcare settings and antibiotic exposures both contribute to the spread of antimicrobial resistance.

Decision letter after peer review:

Thank you for submitting your article "Metapopulation ecology links antibiotic resistance, consumption and patient transfers in a network of hospital wards" for consideration by eLife. Your article has been reviewed by two peer reviewers, and the evaluation has been overseen by a Reviewing Editor and Eduardo Franco as the Senior Editor. The following individuals involved in review of your submission have agreed to reveal their identity: Tjibbe Donker (Reviewer #1); Rene Niehus (Reviewer #2).

The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission.

As the editors have judged that your manuscript is of interest, but as described below that additional analysis may be required before it is published, we would like to draw your attention to changes in our revision policy that we have made in response to COVID-19 (http://www.asadoresteatre.com/articles/57162). First, because of disruptions to normal working life for many, we will give authors as much time as they need to submit revised manuscripts. We are also offering, if you choose, to post the manuscript to bioRxiv (if it is not already there) along with this decision letter and a formal designation that the manuscript is 'in revision at eLife'. Please let us know if you would like to pursue this option. (If your work is more suitable for medRxiv, you will need to post the preprint yourself, as the mechanisms for us to do so are still in development.)

Summary:

In this work, the authors study the very important question of how population level antibiotic resistance in clinically important bacterial taxa is driven by ward-level antibiotic use and patient transfer between wards. For this the authors use clinical specimen data, antibiotic use data and ward-connectivity data aggregated across one year. They perform a between ward comparison by applying ecological metapopulation methods to correct for confounding. They find, in agreement with previous evidence, that nosocomial and carbapenem resistance pathogen incidence is positively associated with connectivity, and that antibiotics have varying effects depending on the pathogen variant, with Piperacillin-tazobactam appearing to select most strongly for pathogens resistant to empirical treatment. More broadly, this would mean that different pathogen variants require either antibiotic stewardship or patient isolation interventions.

Essential revisions:

Currently the description of the methods is incomplete and makes it difficult to assess the validity of the results. Specifics are described below.

1) Is this incidence control value (Incidence(Variant|Ward) ) calculated for each of the 357 wards, or for each of the 3 ward types? This should be made clearer in the text, and also by introducing indexing into the mathematical expressions.

2) The authors build a method to account for

i) variants being associated with diff anatomical sites P(Variant|Location)

ii) different wards differing in which anatomical sites are typically sampled, and in how many samples are typically taken per patient P( No Variant | N, Ward )

From the definition of P( No Variant | N, Ward ) it is not clear whether the authors also account for the fact that samples from one patient are not independent (in terms of the sampled anatomical site) and a definition of N(I,Ward) is missing. Is this the number of samples specific from a patient with index i, being disaggregated into location and ward, or is it the total number of samples in the ward disaggregated into the different anatomical sites? The difference is important: if N(I,ward) is the total number of ward samples disaggregated into anatomical sites, then the expression P(No Variant|N, Ward) assumes that samples taken from a patient i are independent.

3) In Data collection and compilation the authors say that clinical samples are deduplicated, resulting in one sample per patient, but in the section after it seems that multiple samples per patient are used for to compute incidence control. It is correct that for the incidence control value samples where used without deduplication, but for the regression model fit were deduplicated counts used? This should be made clearer.

4) Using a priori expected selective antibiotic-variant pairs the authors show that the model manages to pick up more of those pairs than others. This is expected: even when substantial confounding increases all individual antibiotic-variant associations, a direct causal effect in a abxA-variantX pair is expected to make it easier to pick up this association. It is not clear from the Materials and methods whether a multivariate model was used here. Please provide the regression equation in the Materials and methods to allow the reader to follow what was done. If it is univariate, it is to be expected that looking into different individual antibiotics will introduce a lot of new confounding from association with other antibiotic treatments. What would be useful here is a multivariate analysis of individual antibiotics (asking about the effect of each, given the effect of all the other treatment), and importantly including a few additional predictors (treatments) that one would expect to NOT be correlated with AMR (negative control). A zero-centred distribution of the negative control β parameters would then indicate lack of confounding.

5) A Poisson model was used. Typically Negative Binomial models will fit these type of data better as there's usually some overdispersion. The effect will be wider confidence intervals. It would be helpful to compare fits of Poisson and Neg Bin and go with the best fitting model.

6) The repeated dichotomising between "significant" and "non-significant findings" in the paper is at odds with current statistical thinking (see, for example, the American Statistical Association consensus statement on p-values https://www.amstat.org/asa/files/pdfs/P-ValueStatement.pdf). We strongly suggest the authors drop this terminology and instead of reporting results as significant or not, report point estimates and confidence intervals for everything.

7) The exclusion of rarely used antibiotics from the analysis makes sense, but it wasn't clear why narrow spectrum antibiotics or those used in combination were also excluded. We think including these in the analysis would improve the paper.

8) There was a consensus that the conclusions were too strong e.g. "Our data establish that.…". As these are observational data, with lots of unmeasured confounding likely, the conclusions should better reflect the uncertainty.

9) "Hospital-based antibiotic stewardship is a cornerstone strategy to combat AMR, based on the assumption that AMR evolves in hospitals. Yet, there is surprisingly limited evidence to support this assumption47-49"

It's not completely clear what this statement means (there are lots of examples where antibiotic use in hospitalised patients clearly selects for resistant variants e.g. fusidic acid resistance in staphylococci), but if it's referring to population wide selection mediated by antibiotic use there are several additional papers that could be cited ( eg several papers by José-María López Lozano and Timothy Lawes, such as their Nature Micriobiology 2019 paper or more recently Tom Crellen's 2019 paper on Klebsiella in eLife).

[Editors' note: further revisions were suggested prior to acceptance, as described below.]

Thank you for submitting your article "Metapopulation ecology links antibiotic resistance, consumption, and patient transfers in a network of hospital wards" for consideration by eLife. Your article has been reviewed by two peer reviewers, and the evaluation has been overseen by a Reviewing Editor and a Senior Editor. The following individuals involved in review of your submission have agreed to reveal their identity: Tjibbe Donker (Reviewer #1); Ben S Cooper (Reviewer #3).

The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission.

The manuscript is much improved both methodologically and in terms of readability, and key points raised in the first round of reviews have been addressed. There are, however some remaining concerns which are important to address.

Essential revisions:

1) In many places the paper refers to multivariate (or bivariate) models. Though this terminology is widely misused, multivariate (still) means models with multiple dependent variables. Models with multiple independent variables but a single dependent variable are *multivariable* models. The models used here seem to be multivariable and not multivariate.

2) In most places the authors have rightly been cautious about reaching causal conclusions based on the associations found in the observational data. However there are still a small number of places where results are presented using causal language (i.e. language that many readers are likely to interpret as implying a direct causal relationship) that is not fully justified e.g "with every doubling of antibiotic use increasing incidence by 49%", "connectivity had a larger effect on the incidence". The manuscript would be improved by rephrasing results to report these as associations, and the discussion can then consider the plausibility of a causal interpretation (as has already been done in most cases).

3) "This nearly zero-centered distribution of coefficients in associations not expected to represent direct selection suggests that residual confounding in the models was negligible….": We would suggest revising this sentence to make it easier to understand.

4) "If the resistant infections in a ward mostly result from the admission of already colonized patients, antibiotic restrictions may have a limited impact on AMR compared to infection control measures that prevent the further dissemination of the pathogens" This makes sense if "limited impact on AMR" refers to colonisation with AMR (though it's not clear that it does given that the paper is concerned with clinical infection rather than carriage data), but antibiotics can also act to select for resistance within hosts increasing (or reducing) the chance of clinical infection with resistant organisms (see, for example, Niehus et al., 2020 and references therein). We strongly encourage the authors to rephrase to acknowledge that antibiotics can also have a big impact on within host dynamics and that antibiotic restrictions (or other changes) can also have an important impact even if most infections occur in already colonised patients.

5) "the influence of connectivity on the incidence of a pathogen variant is expected to be higher if the variant is endemic to the hospital". This seems intuitive but if the level of endemicity is so high that all patients are colonised (as we would expect for E coli) then surely we would expect connectivity to have no effect on the incidence of infection with the pathogen variant. So is this statement generally true?

6) "Consistent with this theoretical interpretation of connectivity, we found that its influence was strongest in the typical nosocomial pathogens P. aeruginosa and E. faecium." Though isn't it also the case that there was little evidence of an association between connectivity and MRSA infection incidence, but MRSA (or, at least, many lineages of MRSA) also represents a "typical nosocomial pathogen". It seems as important to highlight results that don't fit with this theoretical interpretation as those that do.

7) "Further research based on time-series analyses is required.." This seems too restrictive. Several analytical frameworks not commonly referred to as "time-series analysis" (e.g. various flavours of multistate models) would also be appropriate. As written this could be mis-interpreted as saying that ARIMA models are the only way to go here.

8) Subsection “Sampling bias control” "prevalence" or incidence?

9) Subsection “Sampling bias control” third paragraph "from a same sample"

10) Fourth paragraph , shouldn't "M6" be "M7"?

11) Final paragraph in subsection “Sampling bias control”. The logic of this is not clear. This seems to be saying that the incidence control value (A) is correlated with sampling effort (B), and that A is also correlated with antibiotic usage (C), and this will lead to spurious correlation between incidence and antibiotic use in unadjusted models. This may be correct (if we interpret spurious correlation to mean correlation between two variables in the absence of a direct causal relationship), but it is not clear from the text why it should be so. Can this be rephrased to make it clearer. A DAG may also help here (see, for example, Judea Pearl's The Book of Why for an introduction to DAGs in representing causal relationships) as unless there is a backdoor path, there won't be "spurious correlation".

12) Is an R2 of 0.34 really "strong correlation"?

13) Subsection “Connectivity and other ward characteristics”. The coding of fragility is confusing. The text says that an ordinal scale was used implying that it is treated as a categorical variable but with a natural ordering (but with no values associated with the distances between levels). However, it is then reported that numerical values were assigned to the levels. If it is an ordinal scale, the coding is arbitrary, but, as written, it sounds as though a numeric scale might actually have been used here. This needs to be clarified, and if a numeric scale has been used (i.e. the assumption has been made that patients in intensive care units have precisely twice the "fragility" as those in intermediate care units ) this either needs to be justified or (in the absence of clear rationale) the analysis revised, e.g. treating ward type as a categorical variable and not making any strong assumption in how patients differ in their vulnerability to infection.

14) "…exponentiated coefficients can be interpreted as the percentage increase of the incidence for every doubling of the predictor, when all other predictors are held at their reference value". Taken literally, this sentence is not accurate, even in the special case where the independent variable is log2 of the predictor. The sentence after this one is accurate and has all the information needed, so deleting the above sentence would be fine. Note that it is more usual to report results of a Poisson regression as incidence rate ratios (exponentiating the regression coefficients) which have a simple interpretation and will be familiar to many readers, and it's not clear why the conversion reported in the second sentence is preferred here.

15) Subsection “Pooled analysis of CTX/CRO- and IPM/MEM-resistant variants” The sentence beginning "The rationale.…" is very hard to make sense of, and we strongly suggest it is rewritten to make the meaning clearer.

16) The authors removed the ESKAPE2 acronym from the Abstract. It could also be removed from the conclusion in a similar way.

https://doi.org/10.7554/eLife.54795.sa1

## Author response

Essential revisions:

Currently the description of the methods is incomplete and makes it difficult to assess the validity of the results. Specifics are described below.

1) Is this incidence control value (Incidence(Variant|Ward) ) calculated for each of the 357 wards, or for each of the 3 ward types? This should be made clearer in the text, and also by introducing indexing into the mathematical expressions.

We agree that the section describing the incidence control value lacked clarity. We completely rewrote this section, using indexing of expressions where applicable. The calculation method was also slightly modified to take into account comment Q2 of the reviewers.

2) The authors build a method to account for

i) variants being associated with diff anatomical sites P(Variant|Location)

ii) different wards differing in which anatomical sites are typically sampled, and in how many samples are typically taken per patient P( No Variant | N, Ward )

From the definition of P( No Variant | N, Ward ) it is not clear whether the authors also account for the fact that samples from one patient are not independent (in terms of the sampled anatomical site) and a definition of N(I,Ward) is missing. Is this the number of samples specific from a patient with index i, being disaggregated into location and ward, or is it the total number of samples in the ward disaggregated into the different anatomical sites? The difference is important: if N(I,ward) is the total number of ward samples disaggregated into anatomical sites, then the expression P(No Variant|N, Ward) assumes that samples taken from a patient i are independent.

We thank the reviewers for pointing out the lack of clarity regarding our simplifying assumptions. In the original version of our analysis, we assumed independence of the samples from a same patient in terms of both the probability of detection of a pathogen variant and the probability of being sampled from one anatomical site if already sampled from another site. Although the first independence assumption remained necessary to obtain tractable expressions, the second assumption was unnecessary and could be relaxed by considering the combinations of samples taken from a same patient.

We improved our analysis of the incidence control value by considering the probability that a patient is detected positive for a given variant, considering all samples taken in combination, rather than considering anatomical sites separately. Consequently, the Materials and methods section describing the incidence control value has been completely rewritten and all results of models involving the incidence control value have been updated. The simplifying assumptions are explicit in text. Noteworthy, this change of the computation of the incidence control value had a negligible impact on the results of our models and did not impact any of our findings. This suggests that the additional simplifying assumption in the original version of the manuscript was acceptable; we believe, however, that relaxing this assumption in the revised version strengthens the rigor of our approach.

3) In Data collection and compilation the authors say that clinical samples are deduplicated, resulting in one sample per patient, but in the section after it seems that multiple samples per patient are used for to compute incidence control. It is correct that for the incidence control value samples where used without deduplication, but for the regression model fit were deduplicated counts used? This should be made clearer.

Again, we thank the reviewers for this opportunity to clarify our methods. The deduplication of the clinical samples considered each unique combination of pathogen variant, patient and ward, so that a same patient can be counted multiple times if sampled in several wards or positive with several variants – but a same patient in a same ward with a same pathogen is counted only once. This deduplication was not applied when computing the incidence control value because we feel it is necessary to account for repeated sampling of the same patient, which increases the probability of pathogen detection. We clarified the text to describe the deduplication scheme used for observed incidence, and to explain why deduplication was not used to compute the incidence control value.

4) Using a priori expected selective antibiotic-variant pairs the authors show that the model manages to pick up more of those pairs than others. This is expected: even when substantial confounding increases all individual antibiotic-variant associations, a direct causal effect in a abxA-variantX pair is expected to make it easier to pick up this association. It is not clear from the Materials and methods whether a multivariate model was used here. Please provide the regression equation in the Materials and methods to allow the reader to follow what was done. If it is univariate, it is to be expected that looking into different individual antibiotics will introduce a lot of new confounding from association with other antibiotic treatments. What would be useful here is a multivariate analysis of individual antibiotics (asking about the effect of each, given the effect of all the other treatment), and importantly including a few additional predictors (treatments) that one would expect to NOT be correlated with AMR (negative control). A zero-centred distribution of the negative control β parameters would then indicate lack of confounding.

We apologize for not making clear that all models were multivariate. We rewrote the Materials and methods section to clarify the model used (a quasi-Poisson model, see our response to comment Q5) and to provide a regression equation with the meaning of the overdispersion parameter. We also modified the legend of Figure 2 to clarify the multivariate nature of the analysis.

We fully agree with the reviewers on the interpretation of a zero-centered distribution of negative control coefficients as indicating negligible residual confounding. This interpretation was implicit in the construction of Figure 2, which clearly shows that the coefficients of associations not considered as “selective”, interpreted as negative controls, have a zero-centered distribution, contrary to the coefficients of possibly selective associations. We added a sentence in the Results section line 236-237 to clarify this interpretation.

5) A Poisson model was used. Typically Negative Binomial models will fit these type of data better as there's usually some overdispersion. The effect will be wider confidence intervals. It would be helpful to compare fits of Poisson and Neg Bin and go with the best fitting model.

Overdispersion was absent from most models and at most moderate (<3-fold) in some of them, which is why we reported Poisson regression results in the original version of the manuscript. For the sake of rigor, however, we examined under- and overdispersion in the models of Figure 1 by comparing the log-likelihoods of Poisson vs. negative binomial models and by examining the overdispersion parameter optimized under a quasi-Poisson model. This analysis is available in script F04_S1_overdispersion.R in the github repository. We include these results in Supplementary file 1 (Table 1B) and mention them the Materials and methods but we refrained from including them in details in the main manuscript. This analysis indicates: (1) that negative binomial models improved likelihood in few cases; and (2) that underdispersion (theta < 1) was present in several models and that its maximal magnitude was stronger than that of overdispersion (minimal theta 0.24 vs maximal theta 2.5).

To avoid any risk of underestimating the width of confidence intervals, we took dispersion into account in all models of the revised manuscript. Based on the findings above, we used quasi-Poisson rather than negative binomial models because: (1) underdispersion, which was prevalent, is better fit by quasi-Poisson models than by negative binomial models which are overdispersed by design; and (2) more importantly, negative binomial models weight towards the smaller sites (wards) whereas a quasi-Poisson weights toward the larger sites. As we consider the effects of wards with higher incidence to be more relevant to our scientific question than those of wards with lower incidence, it is more appropriate in this case to weight towards ward with greater incidence and to use quasi-Poisson models.

We modified the Materials and methods section accordingly and all results of regression models, including all figures, and Table 3. The changes did not modify any of our conclusions.

6) The repeated dichotomising between "significant" and "non-significant findings" in the paper is at odds with current statistical thinking (see, for example, the American Statistical Association consensus statement on p-values https://www.amstat.org/asa/files/pdfs/P-ValueStatement.pdf). We strongly suggest the authors drop this terminology and instead of reporting results as significant or not, report point estimates and confidence intervals for everything.

We agree with the reviewers on this point and, on a more personal note, we appreciate being encouraged (for the second occasion this year) to abandon P-values whose increasingly formal and rigid interpretation, especially in the medical literature, has diverged from their original intended use as an informal index. We have replaced all references to “significance” and P-values with confidence intervals in text. For readability, we conserved a Mann-Whitney test significance result in Figure 2C.

7) The exclusion of rarely used antibiotics from the analysis makes sense, but it wasn't clear why narrow spectrum antibiotics or those used in combination were also excluded. We think including these in the analysis would improve the paper.

Thank you for this suggestion. Some antibiotics had been excluded to lower the risk of multicollinearity in multivariate models in preliminary analyses with a smaller dataset. However, including them in analyses of the final, larger dataset did not hurt the models, so we are happy to include them. We have revised our analyses to include narrow spectrum antibiotic and those used in combination (specifically amoxicillin and aminoglycosides). We have updated the Materials and methods and Results accordingly, as well as Figure 2 and Table 3. This addition does not modify our conclusions.

8) There was a consensus that the conclusions were too strong e.g. "Our data establish that.…". As these are observational data, with lots of unmeasured confounding likely, the conclusions should better reflect the uncertainty.

We agree with this point. We have edited our manuscript throughout (particularly the Abstract and Discussion) to better reflect the uncertainty of our conclusions.

9) "Hospital-based antibiotic stewardship is a cornerstone strategy to combat AMR, based on the assumption that AMR evolves in hospitals. Yet, there is surprisingly limited evidence to support this assumption47-49"

It's not completely clear what this statement means (there are lots of examples where antibiotic use in hospitalised patients clearly selects for resistant variants e.g. fusidic acid resistance in staphylococci), but if it's referring to population wide selection mediated by antibiotic use there are several additional papers that could be cited ( eg several papers by José-María López Lozano and Timothy Lawes, such as their Nature Micriobiology 2019 paper or more recently Tom Crellen's 2019 paper on Klebsiella in eLife).

We thank the reviewer for pointing out this sentence, which was both unclear and not necessary. We have removed this sentence.

[Editors' note: further revisions were suggested prior to acceptance, as described below.]

Essential revisions:

1) In many places the paper refers to multivariate (or bivariate) models. Though this terminology is widely misused, multivariate (still) means models with multiple dependent variables. Models with multiple independent variables but a single dependent variable are *multivariable* models. The models used here seem to be multivariable and not multivariate.

The reviewers are correct. We have changed “multivariate” to “multivariable” throughout the manuscript and the Supplementary File.

2) In most places the authors have rightly been cautious about reaching causal conclusions based on the associations found in the observational data. However there are still a small number of places where results are presented using causal language (i.e. language that many readers are likely to interpret as implying a direct causal relationship) that is not fully justified e.g "with every doubling of antibiotic use increasing incidence by 49%", "connectivity had a larger effect on the incidence". The manuscript would be improved by rephrasing results to report these as associations, and the discussion can then consider the plausibility of a causal interpretation (as has already been done in most cases).

We agree with the reviewers. We have rephrased our results to eliminate this causal language:

“The largest effect size was found in carbapenem-resistant K. pneumoniae, in which every doubling of antibiotic use predicted a 47% increase in incidence (95% confidence interval, 19 to 90%).”

“Connectivity better predicted the incidence of carbapenem-resistant infections (4.8%, 95% CI, -0.4 to 10.7%) compared with 3GC-resistant infections (1.0%, 95% CI, -2.3 to 4.5%), in line with the comparatively stronger association of connectivity with the incidence of individual carbapenem-resistant variants (Figure 1).”

“As a note of caution, our observation that piperacillin-tazobactam use predicted a higher incidence of 3GC- and carbapenem-resistant variants does not imply selection for acquired resistance through any specific mechanism such as carbapenemase production.”

“We also note that the link between piperacillin-tazobactam use and global resistance resulted from the accumulation of small, positive associations with most 3GC- and carbapenem-resistant variants, including those with intrinsic resistance (Figure 2).”

3) "This nearly zero-centered distribution of coefficients in associations not expected to represent direct selection suggests that residual confounding in the models was negligible.": We would suggest revising this sentence to make it easier to understand.

This sentence required more space for clarity. To avoid interrupting the flow of the Results section with a detailed explanation, we moved this sentence in the Materials and methods section and rephrased it as follows,

“Of note, the associations not fulfilling the criteria for possible selection can be interpreted as negative controls in our models, and their coefficients are expected to be distributed around zero (null distribution) in the absence of residual confounding. In line with this interpretation, the coefficients of most of the negative control associations followed a near-zero-centered distribution (Figure 2C), suggesting that residual confounding was negligible in the adjusted models.”

4) "If the resistant infections in a ward mostly result from the admission of already colonized patients, antibiotic restrictions may have a limited impact on AMR compared to infection control measures that prevent the further dissemination of the pathogens" This makes sense if "limited impact on AMR" refers to colonisation with AMR (though it's not clear that it does given that the paper is concerned with clinical infection rather than carriage data), but antibiotics can also act to select for resistance within hosts increasing (or reducing) the chance of clinical infection with resistant organisms (see, for example, Niehus et al., 2020 and references therein). We strongly encourage the authors to rephrase to acknowledge that antibiotics can also have a big impact on within host dynamics and that antibiotic restrictions (or other changes) can also have an important impact even if most infections occur in already colonised patients.

We agree with this point and we should have avoided this ambiguous phrasing. We have revised this section of the Discussion:

“Understanding the respective impacts of antibiotic use and connectivity on the incidence of drug-resistant infections is essential for optimizing interventions against AMR. Pathogens whose incidence is strongly predicted by antibiotic use might be most effectively targeted by antibiotic restrictions. On the other hand, infection control interventions might be most relevant against pathogens whose incidence is predicted by connectivity. By applying a metapopulation framework to explain variations of infection incidences across a large network of hospital wards, we found that both antibiotic use and connectivity independently contribute to ward-level AMR in several pathogen species. Our study also provides the first quantitative ranking of the predicted impact of several key antibiotics on the global burden of drug-resistant infections in a hospital network.”

We thank the reviewers for pointing us to the Niehus et al. study which is highly relevant to our study setting. We modified the Introduction section to include this reference and better contextualize our approach.

“This is further complicated by the fact that the association between antibiotic use and resistance is not uniform across pathogen species (Bell et al., 2014) or classes of antibiotics (Niehus et al., 2020). Moreover, antibiotic consumption can also have long-term effects on the carriage of resistant bacteria within patients (Niehus et al., 2020).”

5) "the influence of connectivity on the incidence of a pathogen variant is expected to be higher if the variant is endemic to the hospital". This seems intuitive but if the level of endemicity is so high that all patients are colonised (as we would expect for E coli) then surely we would expect connectivity to have no effect on the incidence of infection with the pathogen variant. So is this statement generally true?

The reviewers are correct: if endemicity (interpreted as the concentration of the pathogen in the hospital relative to the community) is so high that all patients become colonized when entering the hospital, connectivity will not predict incidence anymore. Reasoning further, connectivity will not predict incidence whenever the proportion of colonized patients is constant across wards. Hence, some variability is required for connectivity to influence incidence. We have edited our statement to reflect this need for variation or heterogeneity across hospital wards:

“In our closed network model of the hospital, the influence of connectivity on the incidence of a pathogen variant is expected to be higher if the variant is endemic to the hospital, its prevalence varies across wards, and changes from one variant category to another are rare.”

6) "Consistent with this theoretical interpretation of connectivity, we found that its influence was strongest in the typical nosocomial pathogens P. aeruginosa and E. faecium." Though isn't it also the case that there was little evidence of an association between connectivity and MRSA infection incidence, but MRSA (or, at least, many lineages of MRSA) also represents a "typical nosocomial pathogen". It seems as important to highlight results that don't fit with this theoretical interpretation as those that do.

We agree with the reviewers. We wrongly used the term “nosocomial” in place of the more accurate term “hospital-endemic”, which better reflects our ecological reasoning. We do not believe that the case of MRSA is against our theoretical interpretation because although MRSA is considered a nosocomial pathogen, it is less clearly a hospital-endemic pathogen in our setting. We clarified this point and added references:

“Consistent with this theoretical interpretation of connectivity, we found that its influence was strongest in the hospital-endemic pathogens P. aeruginosa and E. faecium (Blanc et al., 2007; Wurster et al., 2016; Zhou et al., 2020). Intriguingly, MRSA incidence was not predicted by connectivity. This is at odds with the classical perception of MRSA as typically nosocomial and, by extension, hospital-endemic. However, available evidence does not strongly support the qualification of MRSA as a hospital-endemic pathogen in our setting. In France, the proportions of MRSA among S. aureus are comparable in community and hospital settings, at about 10% (ECDC, 2019; ONERBA France, 2018; Santé Publique France, 2019). MRSA population structures in hospitals and the community are comparably dominated by the so-called ST8 Lyon clone which is equally found in in- and outpatients (Dauwalder et al., 2008). Finally, MRSA infections were especially diffuse in our network, with a concentration index even lower than that of the less-resistant S. aureus infections (Table 1). Collectively, this does not support the conclusion that MRSA concentrates in French hospitals compared to the community, which might explain why connectivity did not predict MRSA incidence in our study.”

7) "Further research based on time-series analyses is required.." This seems too restrictive. Several analytical frameworks not commonly referred to as "time-series analysis" (e.g. various flavours of multistate models) would also be appropriate. As written this could be mis-interpreted as saying that ARIMA models are the only way to go here.

We agree with the reviewers. We have revised this sentence:

“Further research using, for instance, multi-state models or time-series analyses, may better clarify this ambiguity.”

8) Subsection “Sampling bias control” "prevalence" or incidence?

We have changed this to “incidence.”

9) Subsection “Sampling bias control” third paragraph "from a same sample"

We have edited this:

“For simplicity, we assumed independence between the pathogen variants isolated from a same sample and between samples collected from a same patient.”

10) Fourth paragraph , shouldn't "M6" be "M7"?

Thank you for pointing out this mistake. We corrected the text

11) Final paragraph in subsection “Sampling bias control”. The logic of this is not clear. This seems to be saying that the incidence control value (A) is correlated with sampling effort (B), and that A is also correlated with antibiotic usage (C), and this will lead to spurious correlation between incidence and antibiotic use in unadjusted models. This may be correct (if we interpret spurious correlation to mean correlation between two variables in the absence of a direct causal relationship), but it is not clear from the text why it should be so. Can this be rephrased to make it clearer. A DAG may also help here (see, for example, Judea Pearl's The Book of Why for an introduction to DAGs in representing causal relationships) as unless there is a backdoor path, there won't be "spurious correlation".

We agree that our rationale for the use of the incidence control value as an adjustment covariate should be clarified because it is central to our approach. We used DAGs to reason about the latent and observed variables during the initial model preparation steps but we refrained from presenting them in the original manuscript because we felt that the causal graph framework would not appeal to a clinical readership. However, we fully agree that the link between “sampling bias” and antibiotic use was not explicit and, as a result, was difficult to understand. Thus, we included a DAG as supplementary data to clarify our approach.

Indeed, the biasing (backdoor) path between the exposure (antibiotic use) and the outcome (incidence) is not directly related to “sampling bias” but to variations of the incidence and distribution of infections across wards. In a more formal causal reasoning, the “sampling bias” is a proxy variable to the distribution of infections rather than a variable directly on the biasing path.

To clarify this point, we added a DAG representation in Figure 1—figure supplement 2 and we added or rephrased the text in several places. We clarified early in text that the incidence control is a proxy to the distribution of infections, using the terminology of epidemiology:

“We also considered that the distribution of infections across wards was a source of bias that required a specific adjustment procedure (see Materials and methods). The local prevalence of specific infections (e.g., respiratory tract infections) in a ward influences both the antibiotic use and the observed incidence of infections with a given pathogen, which might confound the relationship between antibiotic use and incidence (Figure 1—figure supplement 2). However, the distribution of infections would be difficult to represent as an adjustment covariate with a sufficiently small number of categories. We used a proxy method to circumvent this issue. We assumed that the distribution of infections directly influences the frequency and specimen types (e.g., respiratory vs. urinary tract specimens) of microbiological samples in each ward. Under this assumption, we replaced the unrepresentable distribution of infections with a proxy variable summarizing the distribution of microbiological samples. This proxy variable, which we refer to as the incidence control, was defined as the ward-level incidence of a pathogen variant predicted by patterns of microbiological sampling alone.”

In line with this more accurate explanation of our approach, we replaced all references to “sampling bias correction” with “inclusion of the incidence control covariate” to clarify that microbiological sampling was not the primary source of bias. The relevant Materials and methods section was renamed “Controlling for microbiological sampling patterns.” We detailed our causal assumptions and we amended the last paragraph of this section:

“Under our assumption that the incidence control is a valid proxy to the unrepresentable distribution of infections in each ward, the incidence control should correlate both with antibiotic use and the incidence of infections. Bivariate analyses confirmed that the incidence control correlated with the observed cumulative incidence of all bacteria (R2 = 0.96, 95% CI, 0.95 to 0.96, Figure 1—figure supplement 3) and, to a lesser extent, with the total antibiotic use (R2 = 0.34, 95% CI, 0.25 to 0.40, Figure 1—figure supplement 4).”

12) Is an R2 of 0.34 really "strong correlation"?

Thank you for pointing this. We have edited this sentence as follows:

“Bivariate analyses confirmed that the incidence control correlated with the observed cumulative incidence of all bacteria (R2 = 0.96, 95% CI, 0.95 to 0.96, Figure 1—figure supplement 3) and, to a lesser extent, with the total antibiotic use (R2 = 0.34, 95% CI, 0.25 to 0.40, Figure 1—figure supplement 4).”

13) Subsection “Connectivity and other ward characteristics”. The coding of fragility is confusing. The text says that an ordinal scale was used implying that it is treated as a categorical variable but with a natural ordering (but with no values associated with the distances between levels). However, it is then reported that numerical values were assigned to the levels. If it is an ordinal scale, the coding is arbitrary, but, as written, it sounds as though a numeric scale might actually have been used here. This needs to be clarified, and if a numeric scale has been used (i.e. the assumption has been made that patients in intensive care units have precisely twice the "fragility" as those in intermediate care units ) this either needs to be justified or (in the absence of clear rationale) the analysis revised, e.g. treating ward type as a categorical variable and not making any strong assumption in how patients differ in their vulnerability to infection.

The reviewers are correct. Although our intention was to treat patient fragility as an ordinal variable, a numeric scale was used in previous analyses. We do not assume that patients in intensive care have precisely twice the fragility as those in intermediate care. As the coefficients of ordinal variables can be difficult to interpret, we have revised our analysis treating ward type as a categorical variable as suggested. All models that include ward type as an explanatory variable have been rerun and the corresponding Figures 1 and 3A have been modified accordingly. We have also revised the Results and Materials and methods as necessary. Due to small sample size in some ward type strata, however, the coefficients for ward type could not be determined for some pathogen variants. We explain this point early in text:

“In several pathogen variants, namely carbapenem-resistant E. coli, carbapenem-resistant A. baumannii, and vancomycin-resistant E. faecium, the small sample size in one or several ward categories prevented the inclusion of ward type as a model covariate.”

Importantly, these changes do not affect our conclusions.

14) "…exponentiated coefficients can be interpreted as the percentage increase of the incidence for every doubling of the predictor, when all other predictors are held at their reference value". Taken literally, this sentence is not accurate, even in the special case where the independent variable is log2 of the predictor. The sentence after this one is accurate and has all the information needed, so deleting the above sentence would be fine. Note that it is more usual to report results of a Poisson regression as incidence rate ratios (exponentiating the regression coefficients) which have a simple interpretation and will be familiar to many readers, and it's not clear why the conversion reported in the second sentence is preferred here.

The reviewers are correct. We have deleted this sentence as suggested. Of note, we chose to convert the coefficients to percent changes in infection incidence because we felt this would be easier to interpret for readers, especially those who work in clinical settings.

15) Subsection “Pooled analysis of CTX/CRO- and IPM/MEM-resistant variants” The sentence beginning "The rationale.…" is very hard to make sense of, and we strongly suggest it is rewritten to make the meaning clearer.

We have edited this sentence for clarity:

“When resistance to CTX/CRO or IPM/MEM was not determined by design (such as 3GC resistance in 3GCREC) or by intrinsic resistance (such as 3GC resistance in E. faecium), variants were classified as resistant when the proportion of resistance in our setting was above 80%. This less-stringent resistance criterion, compared to the criterion used to determine possibly selective associations, was chosen to avoid the exclusion of variants that are mostly resistant to an antibiotic group, which would bias pooled analyses.”

16) The authors removed the ESKAPE2 acronym from the Abstract. It could also be removed from the conclusion in a similar way.

“To conclude, the modeling of the incidence of infections with 7 major bacterial species and their drug-resistant variants in hospital wards using a metapopulation framework indicates that both antibiotic use and inter-ward connectivity may predict the burden of AMR in a variant-specific fashion.”

https://doi.org/10.7554/eLife.54795.sa2

## Article and author information

### Author details

1. #### Julie Teresa Shapiro

CIRI, Centre International de Recherche en Infectiologie, Université de Lyon, Inserm U1111, Ecole Normale Supérieure de Lyon, Université Lyon 1, CNRS, UMR5308, Lyon, France
##### Contribution
Conceptualization, Software, Formal analysis, Visualization, Methodology, Writing - original draft, Writing - review and editing
##### Competing interests
No competing interests declared
2. #### Gilles Leboucher

Département de Pharmacie, Hospices Civils de Lyon, Lyon, France
##### Contribution
Data curation, Investigation, Writing - review and editing
##### Competing interests
No competing interests declared
3. #### Anne-Florence Myard-Dury

P?le de Santé Publique, Département d'Information Médicale, Hospices Civils de Lyon, Lyon, France
##### Contribution
Data curation, Investigation, Writing - review and editing
##### Competing interests
No competing interests declared
4. #### Pascale Girardo

Institut des Agents Infectieux, Hospices Civils de Lyon, Lyon, France
##### Contribution
Data curation, Investigation, Writing - review and editing
##### Competing interests
No competing interests declared
5. #### Anatole Luzzati

Institut des Agents Infectieux, Hospices Civils de Lyon, Lyon, France
##### Contribution
Data curation, Investigation, Writing - review and editing
##### Competing interests
No competing interests declared
6. #### Mélissa Mary

Institut des Agents Infectieux, Hospices Civils de Lyon, Lyon, France
##### Contribution
Data curation, Formal analysis, Investigation, Writing - review and editing
##### Competing interests
No competing interests declared
7. #### Jean-Fran?ois Sauzon

Institut des Agents Infectieux, Hospices Civils de Lyon, Lyon, France
##### Contribution
Data curation, Formal analysis, Investigation, Writing - review and editing
##### Competing interests
No competing interests declared
8. #### Bénédicte Lafay

Laboratoire de Biométrie et Biologie Evolutive, UMR CNRS 5558, University of Lyon, Lyon, France
##### Contribution
Validation, Writing - review and editing
##### Competing interests
No competing interests declared
9. #### Olivier Dauwalder

Institut des Agents Infectieux, Hospices Civils de Lyon, Lyon, France
##### Contribution
Investigation, Writing - review and editing
##### Competing interests
No competing interests declared
10. #### Frédéric Laurent

1. CIRI, Centre International de Recherche en Infectiologie, Université de Lyon, Inserm U1111, Ecole Normale Supérieure de Lyon, Université Lyon 1, CNRS, UMR5308, Lyon, France
2. Institut des Agents Infectieux, Hospices Civils de Lyon, Lyon, France
##### Contribution
Writing - review and editing
##### Competing interests
No competing interests declared
11. #### Gerard Lina

1. CIRI, Centre International de Recherche en Infectiologie, Université de Lyon, Inserm U1111, Ecole Normale Supérieure de Lyon, Université Lyon 1, CNRS, UMR5308, Lyon, France
2. Institut des Agents Infectieux, Hospices Civils de Lyon, Lyon, France
##### Contribution
Conceptualization, Writing - review and editing
##### Competing interests
No competing interests declared
12. #### Christian Chidiac

Service des Maladies Infectieuses et Tropicales, Hospices Civils de Lyon, Lyon, France
##### Contribution
Writing - review and editing
##### Competing interests
No competing interests declared
13. #### Sandrine Couray-Targe

P?le de Santé Publique, Département d'Information Médicale, Hospices Civils de Lyon, Lyon, France
##### Contribution
Data curation, Investigation, Writing - review and editing
##### Competing interests
No competing interests declared
14. #### Fran?ois Vandenesch

1. CIRI, Centre International de Recherche en Infectiologie, Université de Lyon, Inserm U1111, Ecole Normale Supérieure de Lyon, Université Lyon 1, CNRS, UMR5308, Lyon, France
2. Institut des Agents Infectieux, Hospices Civils de Lyon, Lyon, France
##### Contribution
Writing - review and editing
##### Competing interests
No competing interests declared
15. #### Jean-Pierre Flandrois

Laboratoire de Biométrie et Biologie Evolutive, UMR CNRS 5558, University of Lyon, Lyon, France
##### Contribution
Conceptualization, Investigation, Writing - review and editing
##### Competing interests
No competing interests declared

1. CIRI, Centre International de Recherche en Infectiologie, Université de Lyon, Inserm U1111, Ecole Normale Supérieure de Lyon, Université Lyon 1, CNRS, UMR5308, Lyon, France
2. Institut des Agents Infectieux, Hospices Civils de Lyon, Lyon, France
##### Contribution
Conceptualization, Software, Formal analysis, Supervision, Funding acquisition, Investigation, Methodology, Writing - original draft, Project administration, Writing - review and editing
##### Competing interests
No competing interests declared

### Funding

#### French Laboratory of Excellence project ECOFECT (ANR-11-LABX-0048)

• Julie Teresa Shapiro
• Gilles Leboucher
• Fran?ois Vandenesch

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

### Acknowledgements

The authors thank Arnaud Friggeri, Alain Lepape, and?Florent Wallet for fruitful discussion.?Julie Teresa Shapiro, Gérard Lina, Fran?ois Vandenesch, and Jean-Philippe Rasigade are part of the French Laboratory of Excellence project ECOFECT (ANR-11-LABX-0048).

### Senior Editor

1. Eduardo Franco, McGill University, Canada

### Reviewing Editor

1. Ben S Cooper, Mahidol University, Thailand

### Reviewers

1. Tjibbe Donker
2. Rene Niehus, Harvard University, United States

### Publication history

2. Accepted: October 12, 2020
3. Accepted Manuscript published: October 27, 2020 (version 1)
4. Version of Record published: November 26, 2020 (version 2)

? 2020, Shapiro et al.

## Metrics

• 657
Page views
• 99
• 0
Citations

Article citation count generated by polling the highest count across the following sources: Crossref, PubMed Central, Scopus.

A two-part list of links to download the article, or parts of the article, in various formats.

### Open citations (links to open the citations from this article in various online reference manager services)

1. Ecology
2. Physics of Living Systems

# Efficient mate finding in planktonic copepods swimming in turbulence

Fran?ois-Ga?l Michalec et al.
Research Article

Zooplankton live in dynamic environments where turbulence may challenge their limited swimming abilities. How this interferes with fundamental behavioral processes remains elusive. We reconstruct simultaneously the trajectories of flow tracers and calanoid copepods and we quantify their ability to find mates when ambient flow imposes physical constrains on their motion and impairs their olfactory orientation. We show that copepods achieve high encounter rates in turbulence due to the contribution of advection and vigorous swimming. Males further convert encounters within the perception radius to contacts and then to mating via directed motion toward nearby organisms within the short time frame of the encounter. Inertial effects do not result in preferential concentration, reducing the geometric collision kernel to the clearance rate, which we model accurately by superposing turbulent velocity and organism motion. This behavioral and physical coupling mechanism may account for the ability of copepods to reproduce in turbulent environments.

1. Ecology
2. Evolutionary Biology