# Belief updating in bipolar disorder predicts time of recurrence

1. Psychiatry Unit, Department of Medicine and Surgery, Università di Parma, Italy
2. Department of Experimental Psychology, University of Oxford, United Kingdom
3. Affective Brain Lab, Department of Experimental Psychology, University College London, United Kingdom
Research Article

## Abstract

Bipolar disorder is a chronic relapsing condition in which mood episodes are interspersed with periods of wellbeing (euthymia). Shorter periods of euthymia are associated with poorer functioning, so it is crucial to identify predictors of relapse to facilitate treatment. Here, we test the hypothesis that specific valence-dependent learning patterns emerge prior to the clinical manifestation of a relapse, predicting its timing. The ability to update beliefs in response to positive and negative information was quantified in bipolar patients during euthymia, who were then monitored for 5 years. We found that a reduced tendency to update beliefs in response to positive relative to negative information predicted earlier relapse. Less updating in response to positive information may generate pessimistic beliefs, which in turn can lead to more severe prodromal symptoms (e.g. sleep disturbance, irritability). The results suggest that measuring valence-dependent belief updating could facilitate risk prediction in bipolar disorder.

## Introduction

Bipolar disorder is estimated to affect approximately 3% of the general population (American Psychiatric Association, 2013). The disorder is characterized by successive periods of elation (mania) and depression interspersed with periods of euthymia, an asymptomatic phase in which patients are in clinical remission (Grande et al., 2016). Shorter periods of euthymia between relapses are associated with poorer functioning, increased odds of suicidality, disability, unemployment, and hospitalization (Peters et al., 2016). A major research goal is thus to identify predictors of upcoming relapse in order to facilitate timely treatment (Harrison et al., 2016). Yet, predicting relapse using existing clinical diagnostic tools or demographic information has proven largely ineffective in bipolar disorder (Bukh et al., 2016; de Dios et al., 2012; Vieta et al., 2018).

Here, we test the hypothesis that prior to the clinical manifestation of a relapse patients exhibit a specific learning pattern from self-relevant information, which makes them more vulnerable to relapse. In particular, it has been shown that healthy individuals tend to update their beliefs to a greater extent in response to unexpected positive information (e.g. learning that the likelihood of receiving a job offer is greater than expected) than negative information (e.g. learning it is lower than expected; Kappes et al., 2020; Kuzmanovic et al., 2016; Ma et al., 2016; Moutsiana et al., 2015; Sharot et al., 2011). Such asymmetry in valence-dependent belief updating leads to optimistic beliefs (Sharot et al., 2011). This is because a person who takes in positive information to a greater extent than negative information uses that information to update their beliefs, which eventually leads to more positive expectations.

In contrast, individuals with depression do not show a positivity bias in learning (Garrett et al., 2014; Korn et al., 2014). These patients update their beliefs less in response to positive information (Korn et al., 2014) and more in response to negative information (Garrett et al., 2014), leading to pessimistic views that may exacerbate symptoms.

However, it is unknown whether a lack of positivity bias in learning precedes symptoms of affective disorders. If it does, then measuring learning biases could facilitate early diagnosis. Indeed, there is a theoretical basis to believe that reduced learning from positive information and/or enhanced learning from negative information will occur before a relapse. In particular, a reduction in learning from positive information (or an enhancement in response to negative information) will cause individuals to take in less desirable information (or more undesirable information) from the world around them, which over time will tip beliefs in a pessimistic direction. Once a patient generates a negative set of beliefs, deleterious symptoms such as negative mood, sleep disturbance, and suicidal thoughts may appear.

Testing this possibility in bipolar patients is especially compelling for three reasons. First, bipolar patients tend to relapse frequently, with episodes separated by 18 months on average (Radua et al., 2017). If we were to examine how patients update their beliefs during euthymia to predict how soon they would relapse, within only a few years most patients would likely have relapsed, allowing us to test the reliability of those predictions. Second, whilst traditional diagnostic tools that measure symptom severity have proven useful in predicting relapse in unipolar depression (Dinga et al., 2018), these are largely ineffective in bipolar disorder (Bukh et al., 2016; de Dios et al., 2012). Third, by examining bipolar patients we could test whether valence-dependent belief updating predicts episodes of both polarities (that is, mania and depression) differently or in the same manner. On one hand, the one-dimensional model suggests that mania and depression lie on opposite ends of the same mood spectrum, such that manic or depressive symptoms arise only in the absence of the other (Bonsall et al., 2015; Bonsall et al., 2012; Chang and Chou, 2018; Daugherty et al., 2009; Eldar et al., 2018; Eldar et al., 2016; Eldar and Niv, 2015; Mason et al., 2017; Nana, 2009; Steinacher and Wright, 2013). We may thus expect a positivity bias in belief updating to predict manic episodes and a negativity bias to predict depressive episodes. Other models, however, conceptualize mania and depression as independent dimensions (Bystritsky et al., 2012; Cochran et al., 2018; Goldbeter, 2013; Goldbeter, 2011; Hadaeghi et al., 2015; Huber et al., 2000; Lopez, 2008). According to these models, perturbations can trigger episodes of either polarity. The degree to which patients update beliefs in response to positive and negative information may reflect such perturbations (or perhaps even cause them), predicting relapse of both polarities similarly.

To test whether a valence-dependent bias in belief updating, or the lack thereof, precedes relapse of bipolar episodes, we tested patients with bipolar disorder on the belief update task (Sharot et al., 2011) while they were in the euthymic stage. The patients were then followed up for 5 years and their time of relapse was recorded.

## Results

Forty-five patients diagnosed with bipolar disorder (see Supplementary file 1 for patients’ demographics and characteristics) performed the belief update task while in the euthymic phase (Figure 1). Nine patients dropped out of the study before 5 years elapsed. The rest - 36 patients - were monitored for symptoms approximately every 2 months for the following 5 years. The task allows quantification of belief change in response to information that is better or worse than expected. The task and analysis employed here have been used numerous times before (Chowdhury et al., 2014; Garrett et al., 2018; Garrett et al., 2014; Garrett and Sharot, 2016; Garrett and Sharot, 2014; Kappes et al., 2018; Korn et al., 2014; Kuzmanovic et al., 2016; Kuzmanovic et al., 2014; Moutsiana et al., 2013; Moutsiana et al., 2015; Sharot et al., 2012b; Sharot et al., 2012a; Sharot et al., 2011). Participants were presented with 40 adverse life events (e.g. robbery, card fraud) and asked to estimate how likely the event was to happen to them in the future (this is referred to as the first estimate). They were then presented with the base rate of the event in a demographically similar population (this is referred to as information). For each trial, an estimation error term was calculated as the difference between the probability presented (information) and participants’ first estimate on that trial. In a second session, participants were asked again to provide estimates of their likelihood of encountering the same events (this is referred to as second estimate). Trials were divided into those in which participants received good news (i.e. the probability presented of encountering an aversive event was lower than the subject’s first estimate of their own probability; see Figure 1a) or bad news (i.e. the probability presented was higher than the subject’s first estimate of their own probability; see Figure 1b). 
While information can be better or worse than expected, all stimuli are negative life events. Thus, the comparison is never between positive and negative stimuli, but between information that is better (thus subjectively positive) or worse (thus subjectively negative) than expected. Different methods of dividing trials in this task have been shown to yield similar results (Garrett and Sharot, 2014), thus we used the original task design here.

Figure 1

Participants' change in beliefs on each trial is referred to as the update. For good news trials it is calculated as the difference between participants' first and second estimates (i.e. pre- and post-information), whereas for bad news trials it is the difference between the second and first estimate. Thus, in both cases, positive numbers indicate a move toward the information.

For each subject, we then separately calculated the average update for good news and bad news trials. The difference between the two types of trials is the ‘Update Bias’, an explicit measure of valence-dependent belief updating. Positive update bias scores indicate greater updating in response to good news relative to bad news (this is an optimistic update bias) whilst negative scores indicate the reverse (a pessimistic update bias).
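The computation just described can be sketched in a few lines. The function below is illustrative only (hypothetical per-trial arrays, not the study's code), following the sign conventions defined above.

```python
import numpy as np

def update_bias(first_est, info, second_est):
    """Per-subject update bias from per-trial data (all values in %).

    Inputs are hypothetical arrays with one entry per trial.
    """
    first_est = np.asarray(first_est, dtype=float)
    info = np.asarray(info, dtype=float)
    second_est = np.asarray(second_est, dtype=float)

    good = info < first_est  # good news: base rate below own first estimate
    bad = info > first_est   # bad news: base rate above own first estimate

    # Signed so that positive values mean moving toward the information
    update_good = first_est[good] - second_est[good]
    update_bad = second_est[bad] - first_est[bad]

    # Positive bias = greater updating after good news than bad news
    return update_good.mean() - update_bad.mean()
```

For example, a subject who first estimates 40%, sees a base rate of 20% (good news) and revises to 25% has updated by 15; if they also estimate 40%, see 60% (bad news) and revise to 50%, they updated by 10, giving an optimistic update bias of 5.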

At the end of the task, participants were asked to recall the actual probability previously presented for each event. Memory errors were calculated as the absolute difference between the probability previously presented and the participants' recollection of that statistic. Participants also rated each event on a Likert scale according to: past experience, familiarity, negativity, vividness and emotional arousal (we call these 'subjective ratings', see Materials and methods).

The question we address here is whether the magnitude of the bias in belief updating (Update Bias) is associated with the patients’ prognosis. Patients were followed up for 5 years after they completed the update bias task and we logged their episode recurrence during this period. This enabled us to calculate the duration of time spent in euthymia for each participant from when they completed the task in a euthymic phase until their next episode (‘future time in euthymia’). We then examined if the future time in euthymia was predicted by their update bias score.

### Belief update bias is associated with future duration of euthymia

We found an association between the update bias and future time in euthymia, with a larger positive update bias related to longer time in euthymia. In particular, we implemented a linear regression with future time in euthymia as the dependent measure and update bias (our main measure of interest) as the independent measure. To control for possible confounds, we also added (1) prior beliefs (mean first estimate) and the difference between good news and bad news trials on the following measures: (2) estimation errors, (3) memory errors, (4) number of trials, (5-6) reaction times at the first and second estimate, and (7-11) all subjective ratings (that is, ratings of familiarity, prior experience, vividness, emotional arousal and negativity) (see Materials and methods). The analysis revealed a significant effect of belief update bias (Beta = 0.589, p=0.004, bootstrapping on 10,000 samples, 95% CI = 0.122–1.129), with a larger bias predictive of greater future time in euthymia (Figure 2, Supplementary file 2a). To test whether this effect may have been attributable to multicollinearity within the model, we calculated the Variance Inflation Factor (VIF) of the update bias. This revealed a low VIF (=1.93), which indicates that the variable does not display collinearity with the other variables in the model. Lower first estimates (optimistic priors) were also associated with longer future time in euthymia (Beta = −2.74, p=0.003), along with the difference in the number of good and bad news trials (Beta = 2.02, p=0.003).

Figure 2

To further test the robustness of the effect we used the 'model averaging' approach (Freckleton, 2011). This approach involves first running every possible combination of nested models given the independent variables: for example, a model with only two of the variables, only three, only four, and so on, each time with a different combination of independent variables. Our original model included 13 variables, so this involved running 4095 nested models. The betas of each variable are then averaged across all models, weighted by each model's BIC (Bayesian Information Criterion) (Freckleton, 2011). This exercise revealed a significant effect of update bias in explaining time in euthymia (weighted estimate = 0.47, 95% CI = 0.34, 0.60). Moreover, the eight best-fitting models (out of 4095), according to the lowest BIC score, all included the update bias.

To test whether the update bias predicted future time in euthymia differently for manic and depressive episodes, we added to the linear regression above the polarity of the next episode and its interaction with the update bias as predictors. Neither the next polarity (Beta = 0.242; p=0.233) nor the interaction (Beta = −0.72, p=0.803) were significant predictors, while the update bias remained a significant predictor (Beta = 0.490; p=0.052). This suggests that the update bias does not predict future time in euthymia differently based on the polarity of the future episode.

Finally, we examined whether the update bias predicts time in euthymia while controlling only for variables on which significant differences exist between good news and bad news trials, as those differences may introduce noise that can obscure the effects of update bias if not controlled for. To identify these variables, we tested with paired-sample t-tests whether any of the variables in Model 1 showed significant differences between good and bad news trials. This revealed significant effects for each subjective rating (vividness t(34) = 3.96, p<0.001; familiarity t(34) = 3.38, p=0.002; prior experience t(34) = 3.304, p=0.002; emotional arousal t(34) = 3.971, p<0.001; negativity t(34) = 2.414, p=0.021). No other differences were found (Supplementary file 1a). We thus ran a linear regression with future time in euthymia as the dependent measure and update bias (our main measure of interest) as the independent measure, controlling only for the subjective ratings above. We found that the update bias was significantly associated with time in euthymia (Beta = 0.351; p=0.047), with no multicollinearity (VIF = 1.098). In this model, ratings of emotional arousal (Beta = 0.775, p=0.016) and negativity (Beta = −0.576, p=0.026) were also significant predictors. Note, however, that these latter effects do not remain significant in the original model once other task variables are controlled for, whilst the update bias does.

#### Future time in euthymia is associated with greater updating in response to good news and less updating in response to bad news

Thus far, we have shown that patients with a greater, more optimistic, update bias experienced euthymia for longer, whilst those with a smaller bias relapsed faster. Since the update bias is a difference score (the difference between updating in response to good and bad news), the association could be due to variation in belief updating in response to good news, bad news, or both. To disentangle these possibilities, we ran two separate linear regressions predicting future time in euthymia. These were exactly as described above except that in one (Model 1a), update in response to good news was an independent variable in place of update bias, and in the other (Model 1b), update in response to bad news was the independent variable in place of update bias. We found that longer future time in euthymia was associated with greater updating from good news (Model 1a: good news Beta = 0.570, p=0.002, VIF = 1.673, Figure 3, Supplementary file 2b). Updating from bad news (Model 1b) showed a non-significant association in the opposite direction (Beta = −0.276, p=0.219, VIF = 2.038, Figure 4, Supplementary file 2c). When including both update for good news and update for bad news in the same linear regression, update for good news was again significantly associated with future time in euthymia (Beta = 0.543, p=0.004, VIF = 1.721) whilst update from bad news was not (Beta = −0.175; p=0.354, VIF = 2.097).

Figure 3
Figure 4

#### Update bias is associated with future time in euthymia beyond all other clinical indicators

To examine if traditional clinical indicators also predicted future time in euthymia, we ran the same linear regression again (including all task controls as described above), but this time we included clinical and demographic variables. Specifically: age, education, gender, bipolar type (I or II), being on mood stabilizers (lithium, anticonvulsants), antipsychotics, antidepressants, history of psychotic symptoms, duration of illness and Beck Depression Inventory scores (Altamura et al., 2015; Lähteenvuo et al., 2018; Radua et al., 2017; Tundo et al., 2018).

Update bias was again strongly related to future time spent in euthymia (Beta = 0.741, p=0.001). The VIF score again indicated no multicollinearity between update bias and the other variables (VIF = 2.87). Indeed, among all the independent variables included in Model 2, update bias was correlated only with the difference in memory errors between good and bad news (Supplementary file 3). Removing that variable from the model did not change the results in Model 2 (Beta = 0.680, p=0.001) or Model 1 (Beta = 0.544, p=0.002). The only other variable associated with future time in euthymia was whether patients were taking antidepressants, such that taking antidepressants was associated with a shorter time in euthymia (Beta = −0.429; p=0.040). No other clinical predictors were significant (Supplementary file 2d).

The update bias has been shown to be related to trait optimism (Sharot et al., 2011). We thus examined if trait optimism (measured via the standard LOT-R questionnaire, Scheier et al., 1994) was also related to time in euthymia. To that end, we added the LOT-R to Model 2 and found that while the update bias was still related to time in euthymia (Beta = 0.740; p=0.001), LOT-R was not (Beta = 0.208, p=0.148). This suggests that trait optimism is not a predictor of relapse while update bias is.

As the update bias is a predictor of future time in euthymia, one may wonder whether it is also related to the number of past episodes. To test this, we examined the correlation between the two and found no association (r = −0.107; p=0.53). Moreover, adding this variable into Model 2 did not change the results (Beta relating update bias to future time in euthymia = 0.717, p=0.005).

The above results indicate that the relationship between update bias and future time in euthymia cannot be accounted for by demographics, clinical indicators of illness course or personality traits. Moreover, as we always control for task variables, the effect cannot be explained by attention or memory of the information received (as these are controlled for by adding memory in the linear regression), or by how familiar or experienced participants were with the stimuli presented (as ratings of these were included in the linear regression), nor by how negative they found the stimuli to be (again, these ratings were included).

#### Leave-one-out validation

Next, we tested the predictive validity of the above model using a leave-one-out analysis, in which the abovementioned linear regression was run on all the data save for one participant, who was held out from the analysis. We then used the regression betas to predict relapse of the left-out participant. This process was repeated so that each participant's time to relapse was estimated from model parameters generated without using that participant to fit the data. Participants' actual time to relapse (data) and their predicted time to relapse (estimation) were then correlated and also compared using a paired-sample t-test. This analysis indicates whether the update bias is a good predictor of future time in euthymia.

We found a strong correlation between the predicted future time in euthymia from our model (estimate) and the actual future time spent in euthymia (data) (r = 0.522, p=0.001). The means of the two sets of values were not significantly different from one another (t(34) = −0.178, p=0.859). Running the leave-one-out cross-validation analysis using Model 2 but removing the update bias variable showed that the predicted and actual values were not correlated (r = 0.129; p=0.459), and the model's fit (SSE, sum of squared errors) was better for the model including the update bias (SSE = 61.80) than for the one without it (SSE = 97.10). This suggests that the update bias is a crucial variable in a model predicting future time in euthymia.

As Model 2 includes all clinical variables, we wanted to examine whether predictive validity would remain high in a model that included only the task variables (that is, the update bias and all task-related controls) and was blind to demographics and clinical variables. This leave-one-out analysis again showed a significant correlation between predicted future time in euthymia (estimate) and actual future time in euthymia (data) (r = 0.454, p=0.006), with no difference between the means of the two sets of values (t(34) = 0.127; p=0.898).

Overall, our results suggest that a greater update bias predicts a longer time in euthymia. This held true even when controlling for clinical variables.

## Discussion

Our findings show that the extent to which bipolar patients updated their beliefs in response to positive information, relative to negative information, was predictive of when they would relapse. In particular, euthymic patients who updated their beliefs to a greater extent in response to positive information were more likely to remain in euthymia for a longer duration. In other words, biased processing of information in a manner that supports positive outlook was linked to a more favorable course of bipolar illness.

These results are important from a clinical perspective due to the difficulty in predicting the course of bipolar disorder (Vieta et al., 2018). Maintenance treatment in bipolar disorder relies on mood stabilizers and psychological programs aimed at increasing treatment compliance, educating the patients and identifying prodromal sub-syndromic symptoms, which provide clues for an upcoming episode (Oud et al., 2016; Scott et al., 2007). Despite their efficacy, these approaches are highly dependent on the physician’s ability to identify and promptly treat symptoms as they arise, to reduce the likelihood of full-blown episodes. Being able to predict a relapse could inform patients and clinicians to step up vigilance to recognize prodromal symptoms and intervene where necessary. The way patients update beliefs could be introduced in the future as a risk prediction tool for bipolar disorder.

Importantly, our results demonstrate that the absence of an optimistic update bias precedes the clinical manifestation of relapse. This finding is important for understanding the relationship between valence-dependent learning and mood. While a relationship between mood and valence-dependent learning has previously been demonstrated in depressed patients (Garrett et al., 2014; Korn et al., 2014), it has been unclear whether changes in mood trigger changes in learning or whether changes in learning trigger changes in mood. While our data are correlational, the temporal order of events in our study supports the latter possibility. That is, we speculate that a reduction in learning in response to positive relative to negative information leads, over time, to a less optimistic perspective, which eventually provides fertile ground for clinical affective symptoms to manifest.

It is interesting that a reduction of the positivity bias in belief updating was predictive of both depressive and manic episodes. The results are consistent with multistable models suggesting that environmental or inner perturbations such as stressors could trigger the onset of an episode of either polarity (Cochran et al., 2018; Cochran et al., 2017). The update bias may be an indicator of the existence of such stressors and their associated prodromal symptoms. While the mood symptoms of a full-blown episode differ between depression and mania, both polarities are preceded by many of the same non-specific subsyndromal symptoms. These symptoms, known as prodromes, represent the main therapeutic target when treating recurrences (Vieta et al., 2018) and include sleep disturbances, irritability, and mood instability (Andrade-González et al., 2020). It is thus possible that a pessimistic view triggers these common subsyndromal symptoms.

We emphasize that our task carefully controls for an individual's perception of how likely they are to experience the aversive events, by including their first estimate of these likelihoods in the model. We also control for how often participants had experienced these events in the past and how familiar they are with the events from others' experiences. Thus, it is unlikely that the relationship between the update bias and future time in euthymia can be explained by some measure of 'disadvantage setting' of the patients. Indeed, the association between the update bias and time in euthymia was not explained by differences in memory, attention, experience with the negative life events used as stimuli, the perceived negativity of those stimuli, familiarity with the stimuli, or patients' initial predictions of how likely they were to experience the events in the future, as all of these were controlled for in our analysis. The association also remained strong when clinical measures, demographics, and other questionnaire measures, such as number of past episodes and trait optimism, were controlled for. Interestingly, no clinical measure was able to predict future time in euthymia, except for antidepressant medication (Hooshmand et al., 2019). Other medications, such as mood stabilizers, did not predict time in euthymia. However, because most of the participants (83.3%) were on mood stabilizers, it is impossible to reliably determine whether such medication is associated with time in euthymia and whether taking these medications affects the relationship between task performance and time in euthymia. We also note that the sample size, although in line with the a priori power analysis, is relatively small for assessing models with a large number of variables, and future studies with a larger sample will be helpful to verify the robustness of the current results.

The fact that general measures of cognitive ability such as memory errors and overall update did not predict time in euthymia suggests that the effect is valence-specific and cannot be explained by a general cognitive impairment, as this would have been better captured by overall update rather than the difference in update between good news and bad news trials. However, it is possible that other valence-dependent measures of learning, such as ones that can be quantified in a reinforcement learning task (Lefebvre et al., 2017), may be good predictors of future time of euthymia. This remains to be tested in future studies.

In sum, we found that greater belief updating in response to positive information was predictive of how long euthymic bipolar patients would remain in euthymia. This finding could inform the development of tools for patient assessment. Future studies should examine whether the same measure is predictive of the recurrence of other clinical disorders, most notably depression. Moreover, it would be informative to test at-risk patients to examine whether the update bias can be used as a tool for early diagnosis of affective disorders.

## Materials and methods

### Participants

Patients were recruited at the Psychiatric University Clinic of Parma, a public community-based mental health service, from the 31st of January 2013 to the 26th of November 2014. A psychiatrist (PO) evaluated patients using a structured socio-demographic interview to collect anamnestic data and information concerning therapy and illness course.

Sample size was based on a previous study (Korn et al., 2014), which found a negative correlation between the update bias and BDI scores of r = −0.50 (p<0.001). The required sample size based on this was n = 23 for an alpha of 0.05 and 80% power (Faul et al., 2007). As we expected up to 50% dropout over 5 years, we recruited 45 bipolar patients (19 female, 42.2%) aged between 20 and 80 (mean = 45.87; s.d. = 13.09). Fourteen of the patients had performed the belief update task once before, for another study approximately 4 months prior, but using different stimuli. No difference emerged between those who had performed the task before and those who had not.
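A power calculation of this kind (the original used G*Power; Faul et al., 2007) can be approximated with the Fisher z transform. The function below is a sketch under the assumption of a one-tailed test; for r = 0.5 it gives n = 24, close to the exact value of 23 reported here.

```python
import math
from scipy.stats import norm

def n_for_correlation(r, alpha=0.05, power=0.80, two_sided=False):
    """Approximate sample size needed to detect a correlation of size r,
    using the Fisher z approximation (close to G*Power's exact result)."""
    z_r = math.atanh(abs(r))                       # Fisher z of the effect
    z_a = norm.ppf(1 - alpha / (2 if two_sided else 1))
    z_b = norm.ppf(power)
    # Standard approximation: n = ((z_alpha + z_beta) / z_r)^2 + 3
    return math.ceil(((z_a + z_b) / z_r) ** 2 + 3)
```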

The study design is observational and prospective. The patients were followed up with psychiatric visits at the hospital approximately every 2 months (min = 0.25 months, max = 3 months) for 5 years from enrolment (until December 2019) by the same psychiatrist (PO), who is trained in the administration of the SCID-5. The follow-up visits were part of the clinical practice in an outpatient service specialized in bipolar disorder. The occurrence of new episodes, registered at the time of the visit, was discussed and confirmed by the clinicians according to the DSM-5 criteria. Error could occur if a patient relapsed and recovered between visits (i.e. within the 2-month gap) and the clinician failed to record these events. This, however, is unlikely, as the average duration of a mood episode is 4 months (Tondo et al., 2017). Patients for whom there was no contact for a period longer than 6 months were dropped from the study (n = 9) and their data were not included in the survival analysis. This resulted in a final sample of 36 subjects. All participants gave informed consent prior to testing. The local ethics committee (Comitato Etico per Parma) approved the study protocol in accordance with the Helsinki Declaration.

Participants were invited to participate in the study if they: (1) fulfilled the diagnostic criteria for Bipolar Disorder I or II at the Structured Clinical Interview for DSM-IV-TR Disorders (First et al., 2002) in full remission; (2) did not satisfy the criteria for rapid cycling (i.e. more than four episodes/year) (American Psychiatric Association, 2013); (3) did not have any other major axis I diagnosis (i.e. schizophrenia spectrum disorder, obsessive compulsive disorder, panic disorder, generalized anxiety disorder, post-traumatic stress disorder, anorexia nervosa, binge eating disorder); specific phobias and personality disorders were not considered because they are outside the study's aim; (4) did not report substance abuse/dependence or an addictive disorder in the previous three months; (5) did not show cognitive impairment, defined as a score of less than 25 on the Mini Mental State Examination (Folstein et al., 1975).

The behavioral task was adapted from our past study (Sharot et al., 2011). This task has been used numerous times in the literature (Chowdhury et al., 2014; Garrett et al., 2018; Garrett et al., 2014; Garrett and Sharot, 2016; Garrett and Sharot, 2014; Kappes et al., 2018; Korn et al., 2014; Kuzmanovic et al., 2016; Kuzmanovic et al., 2014; Ma et al., 2016; Moutsiana et al., 2015; Moutsiana et al., 2013; Sharot et al., 2012b; Sharot et al., 2012a; Sharot et al., 2011).

#### Stimuli


Forty short descriptions of negative life events (for example, domestic burglary, card fraud) were presented in a random order. The original English stimuli (Sharot et al., 2011) were translated into Italian by a native Italian speaker with English as a second language and verified by the author using a back-translation procedure. Very rare and very common events were not included; all event probabilities lay between 10% and 70%. To ensure that the range of possible overestimation was equal to the range of possible underestimation, participants were told that the range of probabilities lay between 3% and 77% and were only permitted to enter estimates within this range. Participants were randomly assigned one of three possible lists of stimuli (each list comprised a different set of 40 stimuli).

#### Procedure

All subjects completed a practice session of three trials before beginning the main experiment. On each trial, one of the 40 adverse life events was presented for 4 s. Participants were asked to estimate how likely the event was to happen to them in the future and had up to 6 s to respond. If participants had already experienced an event, they were instructed to estimate the likelihood of that event happening to them again in the future. If a participant failed to respond, that trial was excluded from all subsequent analyses (mean trials with missing response = 4.73, s.d. = 4.03). Following presentation of a fixation cross, participants were presented with the probability of the event occurring in a demographically similar population for 2 s, followed by a fixation cross. In a second session, immediately after the first, participants were asked again to estimate their likelihood of encountering the same events, so that we could assess how they updated their estimates in response to the information presented. Probabilities of the events occurring were not provided in this second session (Figure 1). After the task, participants completed the Beck Depression Inventory (BDI-II) (Beck et al., 1996) and the Revised Life Orientation Test (LOT-R) to evaluate patients' trait optimism (Scheier et al., 1994).

#### Memory control


To control for memory effects, participants were asked at the end of the experiment to provide the actual probability previously presented for each event. Memory errors were calculated as the absolute difference between the probability previously presented and the participants’ recollection of that statistic:

$\text{Memory Error} = |\text{Probability Presented} - \text{Recollection of Probability Presented}|$

#### Other controls


Following previous studies adopting the same task in clinically depressed participants (Garrett et al., 2014; Korn et al., 2014), after completing the task participants also rated all stimuli on five dimensions:

- Familiarity: "Regardless of whether this event has happened to you before, how familiar do you feel it is to you from TV, friends, movies, and so on?" (1, not at all familiar, to 6, very familiar)
- Prior experience: "Has this event happened to you before?" (1, never, to 6, very often)
- Vividness: "How vividly could you imagine this event?" (1, not at all vivid, to 6, very vividly)
- Emotional arousal: "When you imagine this event, how emotionally arousing do you find the image in your mind?" (1, not at all arousing, to 6, very arousing)
- Negativity: "How negative would this event be for you?" (1, not negative at all, to 6, very negative)

One subject interrupted the experiment during the final part, when memory and the other control variables were being recorded, resulting in missing values for these measures.

#### Statistical analysis


Trials were partitioned into either ‘good news’ or ‘bad news’ according to participants’ first estimates. A trial was defined as good news when the probability presented was lower than the first estimate of their own probability (Figure 1a). Similarly, when the probability presented was higher, the trial was classified as bad news (Figure 1b). Trials for which first estimates were equal to the information provided were excluded from subsequent analyses, as these could not be categorized into either condition.

Update was calculated for each trial such that positive updates indicated a change towards the probability presented and negative updates a change away from the probability presented:

$\text{Update (Good News)} = \text{First Estimate} - \text{Second Estimate}$
$\text{Update (Bad News)} = \text{Second Estimate} - \text{First Estimate}$

We then computed the average update for good and bad news separately for each participant, as done previously (Chowdhury et al., 2014; Garrett et al., 2018; Garrett et al., 2014; Garrett and Sharot, 2014; Kappes et al., 2018; Korn et al., 2014; Kuzmanovic et al., 2016; Kuzmanovic et al., 2014; Moutsiana et al., 2015; Moutsiana et al., 2013; Sharot et al., 2012a; Sharot et al., 2012b; Sharot et al., 2011). A participant’s update bias was then computed as the signed difference between these two scores:

$\text{Update Bias} = \text{Average Update (Good News)} - \text{Average Update (Bad News)}$

A score of 0 indicates no bias in updating in either direction; positive scores indicate an optimistic bias in updating such that participants change their beliefs to a greater degree for good news relative to bad news. Negative scores indicate a pessimistic bias in updating such that participants change their beliefs to a greater degree for bad news relative to good news.
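The classification and update-bias computation above can be sketched in a few lines of Python. The data and function name here are illustrative (hypothetical estimates in percentage points), not the authors' analysis code:

```python
# Sketch of the trial classification and update-bias computation described
# above (hypothetical estimates in percentage points, not study data).
def update_bias(first, presented, second):
    """Average update for good news minus average update for bad news."""
    good, bad = [], []
    for e1, p, e2 in zip(first, presented, second):
        if p < e1:                    # good news: risk lower than expected
            good.append(e1 - e2)      # positive = shift toward the information
        elif p > e1:                  # bad news: risk higher than expected
            bad.append(e2 - e1)
        # trials with p == e1 are excluded, as in the text
    mean = lambda xs: sum(xs) / len(xs)
    return mean(good) - mean(bad)

# A participant who shifts more for good news has a positive (optimistic) bias:
first     = [40, 30, 20, 50]
presented = [25, 45, 10, 60]
second    = [28, 28, 12, 49]
print(update_bias(first, presented, second))   # → 11.5
```

In this toy example the good-news updates average 10 and the bad-news updates average −1.5, giving an optimistic bias of 11.5.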

This bias is observed whether trials are classified according to participants' estimates of the base rate or of their self-risk (Garrett and Sharot, 2014; Kuzmanovic et al., 2014). Similarly, the bias is unaffected by whether participants estimate the likelihood of the event happening or not happening to them (Garrett et al., 2014; Garrett and Sharot, 2014; Sharot et al., 2011). Hence, here we used the classic approach, eliciting estimates of the event happening to the participant.

We also computed the differences between good and bad news trials on all the other task-related variables, which were included as covariates in all the following models.

$\Delta(X) = \text{Average } X_{\text{good news}} - \text{Average } X_{\text{bad news}}$

These deltas resulted in a set of new variables: Δ estimation error, Δ memory errors, Δ number of trials, Δ reaction times at the first and second estimate, Δ familiarity, Δ prior experience, Δ vividness, Δ emotional arousal and Δ negativity.

### Future time in euthymia


For each subject, we calculated the time in months they spent in euthymia following task completion until the onset of the next episode, up to 5 years. For those who did not relapse during the follow-up, time in euthymia was set to 60 months (n = 6).

Bipolar disorder is a recurrent disorder with a mean interval between episodes of approximately 1.5 years, with half of the patients relapsing in the following year (Radua et al., 2017). As future time in euthymia was not normally distributed (skewness = 1.362; Kolmogorov-Smirnov normality test = 0.301, p<0.001), we log-transformed this variable, resulting in a normal distribution of the values (skewness = −0.338; Kolmogorov-Smirnov normality test = 0.118; p>0.05).
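The effect of this normalisation step can be illustrated with a short sketch. The waiting times below are simulated (exponential-like, capped at 60 months as in the study), not the patients' data:

```python
import math
import random

# Sketch of the normalisation step: simulated, exponential-like times in
# euthymia (months, capped at 60 as in the study) are right-skewed, and a
# log transform makes the distribution roughly symmetric. Values here are
# illustrative, not the patients' data.
random.seed(0)
months = [min(random.expovariate(1 / 18), 60) for _ in range(36)]

def skewness(xs):
    """Population skewness (third standardized moment)."""
    n = len(xs)
    m = sum(xs) / n
    s = (sum((x - m) ** 2 for x in xs) / n) ** 0.5
    return sum((x - m) ** 3 for x in xs) / (n * s ** 3)

logged = [math.log(x) for x in months]
print(round(skewness(months), 2), round(skewness(logged), 2))
```

The raw times show a clear positive skew, which the log transform removes or reverses, mirroring the skewness values reported above.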

### Update bias and future time in euthymia

We conducted a linear regression with future time in euthymia as the dependent variable and update bias as our predictor of interest. To control for possible confounds, we also added as independent variables the mean first estimate and the difference between good news trials and bad news trials on the following measures: estimation errors, memory error, number of trials, reaction times at the first and second estimate, and the subjective ratings on familiarity, prior experience, vividness, emotional arousal and negativity. We also ran this regression with bootstrapping in SPSS (10,000 resamples with replacement).

#### Model 1


Log (Future time in euthymia) = b0 + b1 * update bias + b2 * mean first estimate + b3 * Δ estimation error + b4 * Δ memory errors + b5 * Δ number of trials + b6 * Δ RT first estimate + b7 * Δ RT second estimate + b8 * Δ familiarity + b9 * Δ prior experience + b10 * Δ vividness + b11 * Δ emotional arousal + b12 * Δ negativity + error.

Δ (delta) refers to the difference in each parameter between good and bad news trials.

The same linear regression was run two more times, substituting the update bias with the update in response to good news and the update in response to bad news, respectively. To visualize each of these effects (see Figures 2, 3 and 4), we generated scatterplots of the residuals of the dependent variable (logarithmic transformation of the future time in euthymia in months, to account for skewness) against the residuals of the independent variable of interest (update bias, update from good news, or update from bad news); this procedure accounts for the other control variables.
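One way to construct these residual scatterplots (a sketch with simulated data, not necessarily the authors' exact SPSS procedure) is to regress both the outcome and the predictor of interest on the control covariates and correlate the two residual vectors:

```python
import numpy as np

# Sketch of the residual-on-residual construction behind Figures 2-4
# (simulated data; in the paper y is log future time in euthymia and
# x is the update bias, with the remaining covariates as controls).
rng = np.random.default_rng(1)
n = 36
controls = rng.normal(size=(n, 3))                   # stand-ins for the deltas
x = controls @ [0.5, -0.2, 0.1] + rng.normal(size=n)
y = 0.6 * x + controls @ [0.3, 0.3, 0.0] + rng.normal(size=n)

def residualize(v, C):
    """Return v minus its least-squares projection on C (plus intercept)."""
    C1 = np.column_stack([np.ones(len(v)), C])
    beta, *_ = np.linalg.lstsq(C1, v, rcond=None)
    return v - C1 @ beta

rx = residualize(x, controls)
ry = residualize(y, controls)
slope = (rx @ ry) / (rx @ rx)   # slope of ry regressed on rx
print(round(slope, 2))
```

By the Frisch-Waugh-Lovell theorem, this slope equals the predictor's coefficient in the full multiple regression, which is why plotting `ry` against `rx` visualizes the adjusted effect.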

Because it is possible that the update bias predicts future time in euthymia differently for manic and depressive episodes, we added to the linear regression above the polarity of the next episode and its interaction with the update bias as predictors. Six subjects did not relapse during the follow-up period, hence this multiple linear regression was run on 30 subjects only. For this analysis, next episode polarity was coded as −1 for depressive episodes and +1 for manic episodes.

To test the strength of this model, we ran model averaging (Freckleton, 2011). Model averaging tests whether the effect of the update bias was contingent on entering a specific set of variables in the model (Fletcher, 2018). This approach involves running every possible combination of models given the independent variables. For example, a multiple linear regression with three predictors (a, b, c) results in seven different models: three with one predictor each (a, b, and c), three with the possible combinations of two predictors (a + b, a + c, and b + c) and one full model with all three predictors (a + b + c). The betas of each predictor are then averaged across all models, weighted by each model's BIC (Bayesian Information Criterion). Moreover, the nested models can be ordered according to their BIC scores, where a lower BIC corresponds to a better fit. This makes it possible to evaluate which independent variables are most relevant for the full model.
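The three-predictor example above can be sketched as follows. The data, weighting scheme (exp(−BIC/2)), and variable names are illustrative assumptions, not the authors' exact implementation:

```python
import itertools
import numpy as np

# Sketch of BIC-weighted model averaging with three predictors, mirroring
# the a/b/c example above (simulated data; weights and names illustrative).
rng = np.random.default_rng(2)
n = 40
X = rng.normal(size=(n, 3))          # columns: update_bias, a, b
y = 0.7 * X[:, 0] + rng.normal(size=n)

def fit_with_bic(Xsub, y):
    """OLS fit plus the model's BIC."""
    X1 = np.column_stack([np.ones(len(y)), Xsub])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    rss = np.sum((y - X1 @ beta) ** 2)
    bic = len(y) * np.log(rss / len(y)) + X1.shape[1] * np.log(len(y))
    return beta, bic

# All 7 non-empty subsets of the 3 predictors; average the update-bias beta
# across the models containing it, weighted by exp(-BIC / 2).
weighted_sum, weight_total = 0.0, 0.0
for k in range(1, 4):
    for cols in itertools.combinations(range(3), k):
        beta, bic = fit_with_bic(X[:, cols], y)
        if 0 in cols:
            w = np.exp(-bic / 2)
            weighted_sum += w * beta[1 + cols.index(0)]
            weight_total += w
avg_beta = weighted_sum / weight_total
print(round(avg_beta, 2))            # should sit near the true value of 0.7
```

With 13 predictors, the same enumeration yields the 4095 nested models mentioned in the author response.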

Because differences between good news trials and bad news trials in task-related variables and subjective ratings might introduce noise when evaluating the association between the update bias and time in euthymia, we controlled for these in a multiple linear regression. To identify the relevant variables, we tested whether any of the variables included in Model 1 differed significantly between good and bad news trials using paired-sample t-tests (e.g. memory error scores for good news versus bad news, vividness ratings for good news versus bad news, and so on for all variables in Model 1). We then ran a linear regression with future time in euthymia as the dependent measure and update bias (our main measure of interest) as the independent measure, controlling only for the measures that returned significant results in the above tests.

To control for all demographic and clinical variables, we then ran a second linear regression with the following controls: age, education, gender, bipolar type (I or II), in-range plasma levels of mood stabilizers (lithium, valproate, lamotrigine and carbamazepine), antipsychotic or antidepressant prescription, history of psychotic symptoms, duration of illness, and BDI-II scores.

#### Model 2


Log (Future time in euthymia) = b0 + b1 * update bias + b2 * mean first estimate + b3 * Δ estimation error + b4 * Δ memory errors + b5 * Δ number of trials + b6 * Δ RT first estimate + b7 * Δ RT second estimate + b8 * Δ familiarity + b9 * Δ prior experience + b10 * Δ vividness + b11 * Δ emotional arousal + b12 * Δ negativity + b13 * age + b14 * years of education + b15 * gender + b16 * bipolar type + b17 * depressive symptoms at BDI-II + b18 * mood stabilizers + b19 * antidepressants + b20 * antipsychotics + b21 * lithium + b22 * history of psychotic symptoms + b23 * duration of illness + error.

Δ (delta) refers to the difference in each parameter between good and bad news trials; gender was coded as 0 = male, 1 = female; mood stabilizers, antidepressants, antipsychotics, lithium and history of psychotic symptoms were coded as 0 = no, 1 = yes.

To test for multicollinearity, we calculated the Variance Inflation Factor (VIF) for our variable of interest using the SPSS collinearity diagnostics. The VIF quantifies the amount of multicollinearity in a set of regression variables. It is equal to the ratio of the overall model variance to the variance of a model that includes only that single independent variable. A high VIF indicates that the independent variable is highly collinear with the other variables in the model; a low VIF (less than 4) indicates that it is not.
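The VIF computation can be sketched via the standard auxiliary-regression definition (regress each predictor on the others and take 1 / (1 − R²)). The design matrix below is simulated to contrast a collinear pair with an independent predictor; this is not the study's data:

```python
import numpy as np

# Sketch of the VIF check: regress each predictor on the others and compute
# 1 / (1 - R^2). Simulated design matrix in which a and b are collinear
# while c is independent; a VIF below 4 is treated as unproblematic.
rng = np.random.default_rng(3)
n = 36
a = rng.normal(size=n)
b = a + 0.3 * rng.normal(size=n)     # nearly a copy of a -> high VIF
c = rng.normal(size=n)               # independent -> VIF near 1
X = np.column_stack([a, b, c])

def vif(X, j):
    """Variance Inflation Factor of column j of X."""
    others = np.delete(X, j, axis=1)
    O1 = np.column_stack([np.ones(len(X)), others])
    beta, *_ = np.linalg.lstsq(O1, X[:, j], rcond=None)
    resid = X[:, j] - O1 @ beta
    r2 = 1 - resid.var() / X[:, j].var()
    return 1 / (1 - r2)

print([round(vif(X, j), 1) for j in range(3)])
```

Here the collinear columns return large VIFs while the independent column stays near 1, below the threshold of 4 used in the text.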

### Leave-one-out validation


Next, we tested the robustness of Model 2 using leave-one-out cross-validation. Model 2 was rerun on all data except one subject, and the resulting regression betas were used to predict the time in euthymia of the left-out participant. This process was repeated for each participant, so that the leave-one-out procedure yielded two values per subject: the actual time in euthymia (data) and the predicted value (estimate). We then computed a Pearson correlation between the two sets of values and compared them using a t-test.
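The leave-one-out loop can be sketched as follows, with simulated stand-in data (column 0 playing the role of the update bias); this is an illustration of the procedure, not the authors' SPSS output:

```python
import numpy as np

# Sketch of the leave-one-out procedure: refit the regression n times, each
# time predicting the held-out subject, then correlate predictions with the
# observed outcomes. Data are simulated stand-ins, not study data.
rng = np.random.default_rng(4)
n = 36
X = rng.normal(size=(n, 4))
y = X @ [0.8, 0.3, 0.0, 0.2] + rng.normal(size=n)

def loo_predictions(X, y):
    preds = np.empty(len(y))
    for i in range(len(y)):
        keep = np.arange(len(y)) != i                    # drop subject i
        Xt = np.column_stack([np.ones(keep.sum()), X[keep]])
        beta, *_ = np.linalg.lstsq(Xt, y[keep], rcond=None)
        preds[i] = np.concatenate([[1.0], X[i]]) @ beta  # predict subject i
    return preds

preds = loo_predictions(X, y)
r = np.corrcoef(preds, y)[0, 1]      # out-of-sample correlation
print(round(r, 2))
```

A positive correlation between `preds` and `y` indicates the model generalizes to held-out subjects, which is the comparison the analysis relies on.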

Finally, we performed the same leave-one-out analysis as above, but without including the update bias. A non-significant correlation between the actual and predicted values of this new model would suggest a key role for our variable of interest in predicting time in euthymia.

## References

1. Diagnostic and Statistical Manual of Mental Disorders (DSM-5) (2013). Washington, DC, United States: American Psychiatric Association.
2. Mathematical models of bipolar disorder (2009). Communications in Nonlinear Science and Numerical Simulation 14:2897–2908. https://doi.org/10.1016/j.cnsns.2008.10.027
3. Mood as representation of momentum (2016). Trends in Cognitive Sciences 20:15–24. https://doi.org/10.1016/j.tics.2015.07.010
4. Structured Clinical Interview for DSM-IV-TR Axis I Disorders, Patient Edition (SCID-I/P) (2002). New York: Biometrics Research, Psychiatric Institute.
5. A model for the dynamics of bipolar disorders (2011). Progress in Biophysics and Molecular Biology 105:119–127. https://doi.org/10.1016/j.pbiomolbio.2010.11.007
6. Innovative approaches to bipolar disorder and its treatment (2016). Annals of the New York Academy of Sciences 1366:76–89. https://doi.org/10.1111/nyas.13048
7. Markov Models for Longitudinal Course of Youth Bipolar Disorder (2008). University of Pittsburgh.
8. Bifurcation analysis of parametrically excited bipolar disorder model (2009). Communications in Nonlinear Science and Numerical Simulation 14:351–360. https://doi.org/10.1016/j.cnsns.2007.10.002
9. Bipolar disorders (2018). Nature Reviews Disease Primers 4:18008. https://doi.org/10.1038/nrdp.2018.8

## Decision letter

1. Shelly B Flagel
Reviewing Editor; University of Michigan, United States
2. Michael J Frank
Senior Editor; Brown University, United States
3. Amy Louise Cochran
Reviewer; University of Wisconsin - Madison, United States
4. Catherine Harmer
Reviewer; University of Oxford, United Kingdom

In the interests of transparency, eLife publishes the most substantive revision requests and the accompanying author responses.

Acceptance summary:

This research highlights a predictor of mood states in individuals with bipolar disorder, above and beyond clinical predictors. Namely, a bias in belief updating when an initial prediction overstates prevalence (i.e. "good news" situation) relative to when an initial prediction understates an outcome (i.e. "bad news" situation) is predictive of future time spent euthymic in individuals with bipolar disorder. These findings could lead to novel therapeutic interventions for those with bipolar disorder.

Decision letter after peer review:

Thank you for submitting your article "Belief Updating in Bipolar Disorder Predicts Time of Recurrence" for consideration by eLife. Your article has been reviewed by three peer reviewers, and the evaluation has been overseen by Shelly Flagel as a Reviewing Editor and Michael Frank as the Senior Editor. The following individuals involved in review of your submission have agreed to reveal their identity: Amy Louise Cochran (Reviewer #1); Catherine Harmer (Reviewer #3).

The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission.

The editors have judged that your manuscript is of interest, but as described below noted that additional analyses and significant revisions are required before it is published. Thus, we would like to draw your attention to changes in our revision policy that we have made in response to COVID-19 (http://www.asadoresteatre.com/articles/57162). First, because many researchers have temporarily lost access to the labs, we will give authors as much time as they need to submit revised manuscripts. We are also offering, if you choose, to post the manuscript to bioRxiv (if it is not already there) along with this decision letter and a formal designation that the manuscript is "in revision at eLife". Please let us know if you would like to pursue this option. (If your work is more suitable for medRxiv, you will need to post the preprint yourself, as the mechanisms for us to do so are still in development.)

Summary:

Ossola et al. seek to predict future time spent euthymic in individuals with bipolar disorder (BP) based on how they update their beliefs. To predict time spent euthymic, they emphasize one variable, in particular, the bias in average updates when an initial prediction overstates actual prevalence (which they call a "good news" situation) vs. when an initial prediction understates actual outcomes (which they call a "bad news" situation). The principal finding is that the bias in average updates is predictive of future time spent euthymic after controlling for a number of possible confounds including subjective biases, task-related variables, and clinical variables. The study design is thoughtful and interesting, and the principal finding appears to be exceedingly strong. If the bias in updating is a robust predictor of future outcomes, above and beyond clinical predictors, then this variable should have utility in helping set expectations for individuals with BP as well as helping get at underlying cognitive deficits in BP.

Essential revisions:

Considering the small sample size (n = 36 outcomes) and the number of covariates (e.g., n = 24 in Model 3), a major concern is the possibility of an overinflation of p values for various reasons given below. Based on this concern, the authors need to address the following:

1) Did the authors perform a power analysis before study start? The small sample size should be raised as a limitation in the Discussion.

2) There may be multicollinearity among independent variables in the regression models that might lead to an overinflation of p values. Broadly, please address the issue of multicollinearity. Specifically, can you report univariate analyses relating update bias to time spent euthymic? And can you report (in the supplement) the correlation between independent variables in the regression model?

3) While cross-validation was used to show the prediction models would generalize to a larger population, it was not used to determine if the principal finding (i.e. relationship between update bias and time spent euthymic) would generalize. Can you use cross-validation to evaluate the generalizability of this finding, e.g., by comparing cross-validated prediction errors to a model that does not include update bias?

4) Based on the Materials and methods section, it appears that you are running a standard linear regression, but you mention that you construct general linear models (GLM). If the former, please simply state that. If the latter, can you provide more details in what exact analysis was performed? For example, using repeated measures could lead to lower p values, but would need to be described (though inappropriate for the given outcome which is only measured once per individual).

5) Please discuss the small sample size as a limitation in the Discussion. Assuming a standard linear regression was performed, there should be 5-10 outcomes per independent variable, but the presented analysis only had 2-3.

6) Given that the task has been analyzed elsewhere, please compare/contrast your analyses to analyses from prior papers? Among task-related variables (e.g., difference in number of good vs. bad trials) that were controlled for in the regression models, how many were used in prior papers?

7) Related to the last comment, in prior work using the task and in models of associative learning more broadly, it is common to analyze bias in learning rates (roughly the ratio between the update and the error) rather than the update bias. Can the authors comment on why they focused on bias in overall updates rather than bias in learning rates?

8) One might have expected that optimistic updates predict the transition only to manic and not depressives states. The authors should discuss this and other alternative hypotheses and whether their reported findings fit into any existing and coherent model of BP disorder.

9) The authors show that the belief update task can predict the time of the dysthymic transition. But they only have tested the belief update task. What if any task predicts the time of the dysthymic transition? On the other side, optimistic biases can be assessed in several ways. On one side of the spectrum there are standard personality measures, such as the LOT-R, on the other side of the spectrum there are reinforcement learning tasks, where the optimistic bias is frequently reported. Does any measure of optimism predict the time of the dysthymic transition? Or is it just the belief update task? In the Supplementary material it is mentioned that trait optimism (LOT-R) was assessed. Does this scale also predict the time spent in euthymia? Do the results hold if it is included in the GLMs?

10) Another important check that is lacking is whether the results hold if the number of previous dysthymic episodes is entered as a control. In such pathological states, it seems that crises become more and more frequent with time. A possible interpretation of the results (that cannot be ruled out presently) is that a higher number of previous dysthymic episodes makes a patient concomitantly less optimistic and more likely to have a shorter euthymic phase. A related question is whether the belief task is a better predictor compared to just the number of previous dysthymic episodes (note that even if correlated this factor is not completely taken into account by including the patient age in the regression).

11) Medication effects were controlled for statistically by including 0 or 1 for different classes of medication. However, this is unlikely to be a perfect control as few participants are medication free (>80% on a mood stabiliser). This should be discussed in more detail.

12) The inclusion of the different clinical and demographic factors that might influence the association is appreciated. One might wonder if this could just be a general cognitive effect e.g. the more "impaired" cognition (or difference to normal) associated with a worse illness profile. The authors go some way towards assessing this by looking for example at memory for the new information presented in the task. However, this justifies a bit more attention i.e. did the authors include relapse history in the model? Was memory (as opposed to belief updating) a predictor by itself?

13) What is the likely mechanism underpinning the failure to update beliefs on the basis of positive information? Is it that despite knowing the population average, the participant believes they have more specific information to refute that a positive update applies to them e.g. burglaries happen to 20% of people, but for some it's 30% because they live in an area where burglaries are more common. So, could it be that indirectly the task measures the disadvantaged setting of individuals more prone to relapse? Or is it in the mechanism by which this learning takes place?

https://doi.org/10.7554/eLife.58891.sa1

## Author response

Essential revisions:

Considering the small sample size (n = 36 outcomes) and the number of covariates (e.g., n = 24 in Model 3), a major concern is the possibility of an overinflation of p values for various reasons given below. Based on this concern, the authors need to address the following:

1) Did the authors perform a power analysis before study start? The small sample size should be raised as a limitation in the Discussion.

Sample size was based on a previous study (Korn et al., 2014), which found a negative correlation between the Update Bias and BDI scores of r = -0.50 (p < 0.001). The required sample size based on this was n = 23 for an alpha of 0.05 and 80% power (1 − β = 0.80) (Faul et al., 2007). We started with an N of 45 and, following drop out over the five years, ended with an N of 36. We now include this information in the Materials and methods section. We have also added a discussion of the small sample size.

2) There may be multicollinearity among independent variables in the regression models that might lead to an overinflation of p values. Broadly, please address the issue of multicollinearity. Specifically, can you report univariate analyses relating update bias to time spent euthymic? And can you report (in the supplement) the correlation between independent variables in the regression model?

To test for multicollinearity, we calculated the Variance Inflation Factor (VIF) for each independent variable. The VIF quantifies the amount of multicollinearity in a set of regression variables. It is equal to the ratio of the overall model variance to the variance of a model that includes only that single independent variable. A high VIF indicates that the independent variable is highly collinear with the other variables in the model. A low VIF (less than 4) indicates that it is not. The VIF of our measures of interest – the update bias – was 1.93 for model 1 and 2.87 for model 2, which is low and not considered problematic. We now report this in the revised manuscript.

Indeed, the only independent variable the update bias was significantly correlated with was difference in memory errors. Taking away that variable from the model did not change the results (Update bias Beta = 0.544, p = 0.002 for model 1; Beta = 0.680, p = 0.001 for model 2). We now report the correlation between all independent variables in the regression models as requested (see Supplementary Material).

We also tested whether the effect of the update bias was contingent on entering a specific set of variables in the model by running "model averaging" (Freckleton, 2011). This approach involves first running every possible combination of models given the independent variables; for example, models with only 2 of the variables, only 3, only 4, and so on, each time with a different combination of independent variables. With N = 13 (Model 1) this involved running 4095 nested models. The betas of each variable are then averaged across all models, weighted by each model's BIC (Bayesian Information Criterion). This revealed a significant effect of update bias in explaining "time in euthymia" (weighted estimates = 0.47, 95% CI = 0.34-0.60). The eight best fitting models (out of 4095), according to the lowest BIC score, all included the update bias. Running the same procedure for Model 2 would involve running 8,388,607 models and we lack the computational power needed for that exercise. However, given the significant effects of Model 1, the goal of Model 2 is simply to ensure that the results of Model 1, which includes fewer variables, are not explained away by additional clinical and demographic variables.

While multicollinearity is not driving the effects, our additional analysis reveals that there are differences between the two types of trials (good news trials and bad news trials) that introduce noise and can obscure the effects of update bias if not controlled for. In particular, testing whether any of the task variables showed significant differences between desirable and undesirable trials revealed significant effects in all subjective ratings: past experience, familiarity, vividness, arousal and negativity (p-values < 0.02), but not in any of the other variables (that is, no difference for memory errors, estimation errors, number of trials or RTs). Simply correlating update bias with time in euthymia without accounting for these significant differences reveals a trend (rho = 0.297, p = 0.078), and once the differences are accounted for by running a new model controlling only for these subjective measures, the relationship between Update Bias and "time in euthymia" is clearer (Beta = 0.351; p = 0.047). The VIF of the update bias in this model is low (VIF = 1.098), as were the VIFs of all other variables (all VIFs < 4). We now report all this in the revised manuscript.

3) While cross-validation was used to show the prediction models would generalize to a larger population, it was not used to determine if the principal finding (i.e. relationship between update bias and time spent euthymic) would generalize. Can you use cross-validation to evaluate the generalizability of this finding, e.g., by comparing cross-validated prediction errors to a model that does not include update bias?

Thank you for suggesting this confirmatory analysis. Running the LOO-CV analysis using the model that does not include update bias showed that the predicted values and actual values were not correlated (r = 0.129; p = 0.459). Compare this to the same model including update bias, for which we found both a strong correlation between predicted and actual values (r = 0.445, p = 0.007) and no significant difference between them (t(34) = 0.123; p = 0.901, paired sample t-test). The model's fit (SSE, sum of squared errors) was better for the model including the update bias (SSE = 61.80) than for the one that did not (SSE = 97.10). We have now added this analysis to the revised manuscript.
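The leave-one-out comparison works by holding out each participant in turn, refitting the regression on the remainder, and predicting the held-out case; the cross-validated errors of the two models are then compared. A hedged sketch on simulated data (the variable `bias` stands in for the update bias; effect sizes and sample values are invented, not the study's):

```python
# LOO-CV comparison of a model with vs. without a predictor of interest.
# All data are simulated for illustration only.
import numpy as np

def loo_predictions(X, y):
    """Leave-one-out predictions from OLS with intercept."""
    n = len(y)
    preds = np.empty(n)
    A = np.column_stack([np.ones(n), X])
    for i in range(n):
        mask = np.arange(n) != i           # drop observation i
        beta, *_ = np.linalg.lstsq(A[mask], y[mask], rcond=None)
        preds[i] = A[i] @ beta             # predict the held-out case
    return preds

rng = np.random.default_rng(2)
n = 36
nuisance = rng.normal(size=(n, 2))         # control variables
bias = rng.normal(size=n)                  # "update bias" analogue
y = 0.8 * bias + rng.normal(scale=0.6, size=n)

full = loo_predictions(np.column_stack([nuisance, bias]), y)
reduced = loo_predictions(nuisance, y)

sse_full = np.sum((y - full) ** 2)
sse_reduced = np.sum((y - reduced) ** 2)
print(sse_full < sse_reduced)  # including the predictor lowers LOO error
```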

4) Based on the Materials and methods section, it appears that you are running a standard linear regression, but you mention that you construct general linear models (GLM). If the former, please simply state that. If the latter, can you provide more details on what exact analysis was performed? For example, using repeated measures could lead to lower p values, but would need to be described (though inappropriate for the given outcome, which is only measured once per individual).

Thank you for noting this. We ran linear regressions, not general linear models. This typo has now been corrected.

5) Please discuss the small sample size as a limitation in the Discussion. Assuming a standard linear regression was performed, there should be 5-10 outcomes per independent variable, but the presented analysis only had 2-3.

Following the reviewer’s suggestion, we have now added a discussion of the limitation of the small sample size. In addition, we performed three additional analyses that help alleviate the concern about the number of outcomes per independent variable. First, we reran the regression analysis using bootstrapping, which enables us to construct confidence intervals with a small sample. Bootstrapping over 10,000 samples again showed that the update bias was significantly related to time in euthymia (Beta = 0.589, CI = 0.122-1.129). Second, we ran model averaging (Fletcher, 2018), fitting 4095 nested models, most of which include a smaller number of independent variables. This again revealed a significant effect of update bias in predicting “time in euthymia” (estimates weighted by log model evidence = 0.465, 95% CI = 0.34-0.60). Third, we now report the results of a new model with only 6 variables, that is, 6 outcomes per independent variable. We have added these additional analyses to the revised manuscript.
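The case-resampling bootstrap used for the confidence interval can be sketched as follows: resample participants with replacement, refit the regression each time, and take percentiles of the resampled coefficients. Data and the effect size are simulated for illustration, not taken from the study:

```python
# Bootstrapped 95% CI for a regression coefficient (case resampling).
# Simulated data; the true slope here is 0.6 by construction.
import numpy as np

def ols_beta(X, y):
    """Slope of the first predictor from OLS with intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta[1]

rng = np.random.default_rng(3)
n = 36
x = rng.normal(size=n)
y = 0.6 * x + rng.normal(scale=0.6, size=n)
X = x[:, None]

boots = np.empty(10_000)
for b in range(10_000):
    idx = rng.integers(0, n, size=n)       # resample cases with replacement
    boots[b] = ols_beta(X[idx], y[idx])

lo, hi = np.percentile(boots, [2.5, 97.5])
print(lo, hi)  # a 95% CI excluding zero indicates a reliable effect
```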

6) Given that the task has been analyzed elsewhere, please compare/contrast your analyses with those from prior papers. Among task-related variables (e.g., difference in number of good vs. bad trials) that were controlled for in the regression models, how many were used in prior papers?

Our approach was to include all the control variables that were included in the previous papers adopting the same task in clinically depressed participants (Korn et al., 2014; Garrett et al., 2014). We now mention this in the Materials and methods. To our knowledge, seven past papers (Chowdury et al., 2013; Korn et al., 2014, 2016; Garrett et al., 2014, 2018; Kuzmanovic et al., 2017; Ma et al., 2016) include the difference in number of good vs. bad trials as a control, and nine papers include RT (Sharot et al., 2011; Sharot et al., 2012b; Chowdury et al., 2013; Korn et al., 2014; Garrett et al., 2014; Moutsiana et al., 2013; Kuzmanovic et al., 2017, 2018; Ma et al., 2016). All papers from the Affective Brain Lab (N = 10) include memory and subjective ratings (i.e. familiarity, prior experience, vividness, emotional arousal and negativity) as controls, as do some papers not from the lab (e.g., Ma et al., 2016; Oganian et al., 2018; Schonfelder et al., 2017). Thus, the controls included in the paper are in line with past studies.

7) Related to the last comment, in prior work using the task and in models of associative learning more broadly, it is common to analyze bias in learning rates (roughly the ratio between the update and the error) rather than the update bias. Can the authors comment on why they focused on bias in overall updates rather than bias in learning rates?

We focused on the update itself rather than the ratio between update and error for the following reasons. First, of the 20 papers published thus far with the update task, 100% used update as a measure of interest (Sharot et al., 2011, 2012a, 2012b; Chowdury et al., 2013; Korn et al., 2014; Moutsiana et al., 2013, 2015; Garrett et al., 2014, 2017, 2018; Kappes et al., 2018; Oganian et al., 2018; Kuzmanovic et al., 2014, 2016, 2017, 2018, 2019; Ma et al., 2016; Schonfelder et al., 2017), while only 40% used the ratio between update and error (Sharot et al., 2011; Kuzmanovic et al., 2017, 2018; Kappes et al., 2018; Ma et al., 2016). Moreover, of the three papers that used this task to examine affective disorders, all three used update as a measure of interest (Garrett et al., 2014; Korn et al., 2014; Schonfelder et al., 2017) and only one used the ratio between update and error. Thus, to be consistent with past research we focused on the update measure. Second, the reason that past studies, in particular those looking at patients (Garrett et al., 2014; Korn et al., 2014, 2016; Kuzmanovic et al., 2019), focus on update rather than the ratio between update and error is that such populations usually show a general impairment in the latter but not the former. This makes the ratio a noisy measure to use in such populations. Indeed, the average ratio between update and error for good news in our clinical sample was 0.63, below that seen in healthy populations (0.72 in the original study by Sharot et al., 2011, and ranging from 0.67 in Kappes et al., 2018 to 0.77 in Ma et al., 2016).
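To make the distinction concrete, here is the arithmetic of the two measures on toy numbers for a hypothetical good-news trial (all values invented for illustration; only the 0.63 sample average cited above comes from the study):

```python
# Two candidate measures from the belief-update task, on toy numbers.
# "Update" is the shift between first and second estimates; the ratio
# (learning-rate-like measure) scales the update by the estimation error.
first_estimate = 40.0    # participant's initial risk estimate (%)
base_rate = 30.0         # average likelihood presented (good news: lower)
second_estimate = 34.0   # revised estimate after seeing the base rate

update = first_estimate - second_estimate        # 6.0 points toward good news
estimation_error = first_estimate - base_rate    # 10.0 points
ratio = update / estimation_error                # 0.6, near the 0.63 sample mean

print(update, estimation_error, ratio)
```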

8) One might have expected that optimistic updates predict the transition only to manic and not depressive states. The authors should discuss this and other alternative hypotheses, and whether their reported findings fit into any existing and coherent model of BP disorder.

Thank you for raising this question. Indeed, one hypothesis would be that greater update bias is associated specifically with transition to a manic episode. This prediction would be compatible with the notion, held by some, that mood is a one-dimensional construct with depression and mania as opposites on the same continuum (Chang and Chou, 2018). An alternative hypothesis (which our data support) is that greater update bias is associated with episodes of either polarity. Such a hypothesis would be consistent with multi-stable models suggesting that environmental or inner perturbations, such as stressors, could trigger the onset of an episode of either polarity (Cochran et al., 2017, 2018). The update bias may be an indicator of the existence of such stressors and their associated prodromal symptoms. These models consider mood a two-dimensional construct and suggest that mania and depression can vary independently and do not inhibit each other. We now detail these hypotheses in the revised manuscript.

9) The authors show that the belief update task can predict the time of the dysthymic transition. But they have tested only the belief update task. Would any task predict the time of the dysthymic transition? Moreover, optimistic biases can be assessed in several ways: on one side of the spectrum there are standard personality measures, such as the LOT-R; on the other there are reinforcement learning tasks, where the optimistic bias is frequently reported. Does any measure of optimism predict the time of the dysthymic transition? Or is it just the belief update task? In the Supplementary material it is mentioned that trait optimism (LOT-R) was assessed. Does this scale also predict the time spent in euthymia? Do the results hold if it is included in the GLMs?

Following the reviewer’s comment, we tested whether the LOT-R was associated with time in euthymia. It was not (rho = 0.164, p = 0.340). Moreover, including the LOT-R in our model showed that the results hold. That is, update bias was still related to time in euthymia (beta = 0.740; p = 0.001) while the LOT-R was not (beta = 0.208, p = 0.148). In addition, we note that other measures, such as memory bias (that is, memory errors for positive trials minus those for negative trials), also did not predict time in euthymia (beta = 0.13; p = 0.959). It may certainly be that other measures not tested here, such as a bias in learning in a reinforcement learning task (e.g., Lefebvre et al., 2017), are related to time in euthymia. That will require future testing. However, our results show that not all tasks/measures predict time in euthymia. We now include this information in the revised manuscript.

10) Another important check that is lacking is whether the results hold if the number of previous dysthymic episodes is entered as a control. In such pathological states, it seems that crises become more and more frequent with time. A possible interpretation of the results (that cannot be ruled out presently) is that a higher number of previous dysthymic episodes makes a patient concomitantly less optimistic and more likely to have a shorter euthymic phase. A related question is whether the belief task is a better predictor compared to just the number of previous dysthymic episodes (note that even if correlated, this factor is not completely accounted for by including patient age in the regression).

We thank the reviewer for prompting us to examine the effect of the number of previous episodes. We extracted the number of previous episodes from the clinical records and found that it was associated neither with time in euthymia (r = -0.013; p = 0.93) nor with the update bias (r = -0.107; p = 0.53). Adding this variable to the model did not change the results: update bias was still associated with time in euthymia (Beta = 0.066, p = 0.013). We thus conclude that update bias is a better predictor of future time in euthymia than the number of previous dysthymic episodes. We now add this information to the manuscript.

11) Medication effects were controlled for statistically by including 0 or 1 for different classes of medication. However, this is unlikely to be a perfect control as few participants are medication free (>80% on a mood stabiliser). This should be discussed in more detail.

It is true that, given that most participants were on mood stabilisers, it is impossible to reliably determine the effect of medication on the relationship between task performance and time in euthymia. This question could only be addressed if there were sufficient patients not on such medication. We now highlight this caveat in the Discussion.

12) The inclusion of the different clinical and demographic factors that might influence the association is appreciated. One might wonder if this could just be a general cognitive effect, e.g. more “impaired” cognition (or deviation from normal) associated with a worse illness profile. The authors go some way towards assessing this by looking, for example, at memory for the new information presented in the task. However, this justifies a bit more attention: did the authors include relapse history in the model? Was memory (as opposed to belief updating) a predictor by itself?

Following the reviewer’s suggestion, we looked at the association between the overall magnitude of memory errors and future time in euthymia and found no association (r = -0.008; p = 0.964). Nor was relapse history, calculated as the number of previous episodes, associated with future time in euthymia (r = -0.013; p = 0.939). Finally, we note that the association between update and time in euthymia is valence-dependent: greater update in response to positive information is associated with longer time in euthymia, whereas greater update in response to negative information is associated with less time in euthymia. We thus conclude that the association between the update bias and time in euthymia is unlikely to be explained by a general cognitive impairment, as this would have been better captured by overall update and overall memory performance. We now add this to the revised manuscript.

13) What is the likely mechanism underpinning the failure to update beliefs on the basis of positive information? Is it that, despite knowing the population average, the participant believes they have more specific information indicating that a positive update does not apply to them? E.g., burglaries happen to 20% of people, but for some it is 30% because they live in an area where burglaries are more common. So, could it be that the task indirectly measures the disadvantaged setting of individuals more prone to relapse? Or does the effect lie in the mechanism by which this learning takes place?

We thank the reviewer for raising this question. Our task carefully controls for people’s perceptions of how likely they are to experience these aversive events by including their first estimate of these likelihoods in the model. We also have two additional measures of the “disadvantaged setting” of the participants: we ask them how often they have experienced these events in the past and how familiar they are with the events through others’ experiences. Including the average rating on either of these measures in our model did not change the main result: update bias was still associated with future time in euthymia (Beta = 0.806; p = 0.003) while these measures were not (past experience Beta = -0.022, p = 0.929; familiarity Beta = 0.177, p = 0.416). It is possible that the failure to update in response to positive information is due to the participant perceiving the positive information as irrelevant to them.

https://doi.org/10.7554/eLife.58891.sa2

## Article and author information

### Author details

1. #### Paolo Ossola

Psychiatry Unit, Department of Medicine and Surgery, Università di Parma, Parma, Italy
##### Contribution
Conceptualization, Resources, Data curation, Formal analysis, Investigation, Methodology, Writing - original draft, Project administration
##### For correspondence
paolo.ossola@unipr.it
##### Competing interests
No competing interests declared
2. #### Neil Garrett

Department of Experimental Psychology, University of Oxford, Oxford, United Kingdom
##### Contribution
Conceptualization, Data curation, Software, Formal analysis, Methodology, Writing - review and editing
##### Competing interests
No competing interests declared
3. #### Tali Sharot

Affective Brain Lab, Department of Experimental Psychology, University College London, London, United Kingdom
##### Contribution
Conceptualization, Supervision, Validation, Methodology, Writing - original draft, Writing - review and editing
##### Competing interests
No competing interests declared
4. #### Carlo Marchesi

Psychiatry Unit, Department of Medicine and Surgery, Università di Parma, Parma, Italy
##### Contribution
Conceptualization, Resources, Data curation, Supervision, Funding acquisition, Methodology, Project administration, Writing - review and editing
##### Competing interests
No competing interests declared

### Funding

• Paolo Ossola

• Tali Sharot

#### Sir Henry Wellcome Postdoctoral Fellowship (209108/Z/17/Z)

• Neil Garrett

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

### Ethics

Human subjects: All participants gave informed consent prior to testing. The local ethics committee (Comitato Etico per Parma) approved the study protocol (protocol number 45411, approved on 7 December 2012) and the study was conducted in accordance with the Helsinki Declaration.

### Senior Editor

1. Michael J Frank, Brown University, United States

### Reviewing Editor

1. Shelly B Flagel, University of Michigan, United States

### Reviewers

1. Amy Louise Cochran, University of Wisconsin - Madison, United States
2. Catherine Harmer, University of Oxford, United Kingdom

### Publication history

1. Accepted: October 23, 2020
2. Version of Record published: November 10, 2020 (version 1)

© 2020, Ossola et al.

