
Good Resources for Learning Modeling of fMRI Data


As a neuroscience student who works with fMRI, I'm competent at the standard univariate analyses and resting-state analysis techniques in AFNI and MATLAB/SPM. However, I want to learn how to use models in fMRI data analysis from a start-to-finish perspective.

More specifically, I want to learn how to model prediction error as they did in this article, for example. The problem is, I have no one to teach or help me.

What are recommended resources (online or textbooks) for learning modelling of fMRI data in detail?


Frequently Asked Questions (FAQs) to Dr. Ahn

I seek to build a “happy” laboratory where lab members (including the PI) respect each other, feel they are growing intellectually, enjoy excellent support for research, and generate research outputs that will make them competitive for the next career steps.

Building such an environment and culture is very challenging, especially because each person comes from a different background and has different expectations and norms. But I try to achieve it by (1) fostering communication within the lab, (2) recruiting people who are effective team players and share similar visions, (3) individually tailoring training to each member’s strengths and interests, and (4) securing sufficient research funds.


INTRODUCTION

The study of cognition has flourished in recent decades because of the abundance of neuroimaging data that give access to brain activity in human subjects. Over the years, tools from various fields, such as machine learning and network theory, have been brought to neuroimaging applications in order to analyze data. These tools have their own strengths, such as predictability for machine learning. This article brings together recent studies based on the same whole-brain dynamic model in a unified pipeline that is consistent from the model estimation to its analysis; in particular, the implications of the model assumptions can be evaluated at each step. This allows us to naturally combine concepts from several fields, in particular the predictability and interpretability of the data. We stress that our framework can be transposed to other dynamic models while preserving the concepts underlying its design. In the following, we first review previous work on connectivity measures to set our formalism in context. After presenting the dynamic model (the multivariate Ornstein-Uhlenbeck process, or MOU), we discuss its optimization procedure to reproduce statistics of the fMRI/BOLD signals (spatiotemporal covariances), yielding a whole-brain effective connectivity estimate (MOU-EC). Then two MOU-EC-based applications are examined: machine learning to extract biomarkers and network analysis to interpret the estimated connectivity weights in a collective manner. While presenting the details of our framework, we provide a critical comparison with previous studies to highlight similarities and differences. We illustrate MOU-EC capabilities in studying cognition using a dataset in which subjects were recorded in two conditions, watching a movie and a black screen (referred to as rest). We also note that the same tools can be used to examine cognitive alterations due to neuropathologies.
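For intuition about the dynamic model, the MOU process can be simulated by discretizing dx = (−x/τ + C x) dt + σ dW, where C plays the role of the effective-connectivity matrix. The following is a minimal sketch under assumed toy parameters (the function name, the 3-node matrix C, and all numerical settings are illustrative; this is not the authors' estimation pipeline):

```python
import numpy as np

def simulate_mou(C, tau=1.0, sigma=0.5, dt=0.01, n_steps=10000, seed=0):
    """Euler-Maruyama simulation of a multivariate Ornstein-Uhlenbeck process,
    dx = (-x / tau + C @ x) dt + sigma dW,
    where C plays the role of the effective-connectivity matrix."""
    rng = np.random.default_rng(seed)
    n = C.shape[0]
    x = np.zeros(n)
    traj = np.empty((n_steps, n))
    for t in range(n_steps):
        drift = -x / tau + C @ x
        x = x + drift * dt + sigma * np.sqrt(dt) * rng.standard_normal(n)
        traj[t] = x
    return traj

# toy 3-node network with a single directed connection 0 -> 1
C = np.array([[0.0, 0.0, 0.0],
              [0.4, 0.0, 0.0],
              [0.0, 0.0, 0.0]])
traj = simulate_mou(C)
# zero-lag covariance of the simulated activity; spatiotemporal covariances
# like this are the statistics that MOU-EC estimation fits
cov = np.cov(traj.T)
```

Model estimation then amounts to tuning C (and the noise parameters) so that the model covariances match the empirical BOLD covariances.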


FMRI in Healthy Aging

From the behavioral point of view, it is known that some adults are able to maintain their cognitive capabilities at high levels, in contrast to others who show clear cognitive decline with advancing age. It has been hypothesized that this variability depends on neurofunctional resources. However, the exact mechanisms that lead to such wide differences are still unclear (Park and Reuter-Lorenz, 2009).

The use of task-fMRI in aging has revealed a complex pattern of brain activity changes, characterized by both decreases and increases in old subjects compared to young subjects (Grady, 2012). The diversity of findings depends on many variables, such as the cognitive tests used and their level of difficulty (Grady et al., 2006). Nonetheless, there is relative consensus that there is an age-related increase of brain activity in the prefrontal cortex (PFC; Turner and Spreng, 2012), while findings of reduced activation are localized more heterogeneously in the brain.

In this part, we review the main theories that have appeared in the attempt to explain the trajectories of brain changes and their relationship with cognition. It is important to note that whereas earlier or “more classical” views aimed to provide meaningful interpretations of a variety of isolated phenomena, such as increased or decreased regional brain activity in old compared with young subjects, more recent theories aim to provide a global, integrative interpretation of brain changes.

Classical Theories Derived from Task-fMRI Studies

In general, regional hyperactivation has been interpreted as compensation (or an attempt to compensate), whereas a failure to activate or reduced activation has typically been related to the cognitive deficits associated with aging. Two main hypotheses have been proposed to explain the nature of these age-related activity changes: the dedifferentiation hypothesis and the compensation hypothesis.

On the one hand, the term dedifferentiation describes the loss of functional specificity in the brain regions that are engaged during the performance of a task (Park et al., 2004; Rajah and D’Esposito, 2005). In neurobiological terms, it has been suggested that this pattern of changes is caused by a chain of processes that starts with a decline in dopaminergic neuromodulation, which produces increases in neural noise, leading to less distinctive cortical representations (Li et al., 2001).

On the other hand, the compensation hypothesis in aging states that older adults are able to recruit higher levels of activity than young subjects in some brain areas, to compensate for functional deficits located elsewhere in the brain. This increased activity is often seen in frontal regions (Park and Reuter-Lorenz, 2009; Turner and Spreng, 2012). The first studies suggesting compensatory mechanisms appeared early in the literature and used PET during the performance of visuospatial (Grady et al., 1994) or episodic memory (Cabeza et al., 1997; Madden et al., 1999) tasks. Later on, these findings were replicated with fMRI (Cabeza et al., 2002).

Furthermore, the different patterns of spatial localization of these compensation-related mechanisms led to the formulation of three main cognitive models:

(1) The Hemispheric Asymmetry Reduction in Older Adults (HAROLD) model (Cabeza, 2002) states that older adults use a less lateralized pattern of activity than young subjects during the performance of a task, and that this reduced lateralization is compensatory. Reduced lateralization was mainly observed in frontal areas during the performance of episodic memory and working memory tasks (Cabeza et al., 2002; Cabeza, 2004).

(2) The Compensation-Related Utilization of Neural Circuits Hypothesis (CRUNCH; Reuter-Lorenz and Cappell, 2008; Schneider-Garces et al., 2010) holds that older adults show greater neural recruitment at levels of cognitive demand that typically elicit lower brain activity in younger subjects. This effect has been observed in the PFC and also in the parietal cortex, specifically in the precuneus and posterior cingulate, both in episodic memory tasks (Spaniol and Grady, 2012) and in working memory tasks (Mattay et al., 2006; Reuter-Lorenz and Cappell, 2008).

(3) The Posterior-Anterior Shift with Aging (PASA) was experimentally demonstrated by Davis et al., who used two different tasks, visuoperceptive and episodic retrieval, and found that older subjects showed deficits in activating regions of the posterior midline cortex, accompanied by increased activity in the medial frontal cortex (Davis et al., 2008).

Global, Integrative Theories of Cognitive Function and the Aging Brain

Based on fMRI activity alone, and given the classification described above, which presents the models as mutually exclusive, it is difficult to discern which of the proposed models best explains the age-related changes in cognition.

More recently, an important contribution to the interpretation of these models has been given by multimodal studies that integrate structural and functional brain measures. For example, in some cases, it has been reported that reduced activity in task-related regions correlated positively with brain atrophy in the same brain regions (Brassen et al., 2009; Rajah et al., 2011), whereas other studies have reported correlations between the increased functional activity in the PFC and the preserved structural integrity of the entorhinal cortex and other medial temporal lobe (MTL) structures (Rosen et al., 2005; Braskie et al., 2009). Given this, some authors have theorized that while increased activity in the PFC may be triggered by the atrophy of frontal GM, which is a commonly reported feature in aging, the compensatory role of this increased activity may depend on the preserved structural integrity of distal regions mainly in the MTL (Maillet and Rajah, 2013).

Therefore, and mainly thanks to new advances in neuroimaging techniques, it has been suggested that cognitive function in aging results from a sum of processes, including structural and functional brain measures as well as external factors. In this regard, the scaffolding theory of aging and cognition (STAC) states that there is a process in the aging brain, called compensatory scaffolding, that entails the engagement of additional neural resources (in terms of network reorganization), providing support to preserve cognitive function in the face of structural and functional decline (Park and Reuter-Lorenz, 2009). This theory has recently been revised to incorporate more recent findings in the field, obtained mainly from longitudinal and interventional studies. As a result, the STAC-r is a conceptual model that extends the STAC by incorporating life-course influences that enhance, preserve, or compromise brain status, compensatory potential, and cognitive function over time (Reuter-Lorenz and Park, 2014).

In a similar vein, Walhovd et al. (2014) proposed a system-vulnerability view of cognition in aging. According to this view, age-associated cognitive decline is the result of a life-long accumulation of impacts that alter brain function and structure in a multidimensional way, affecting a wide range of neuroimaging markers such as structural integrity, functional activity and connectivity, glucose metabolism, and amyloid deposition. On this account, particular brain systems such as the hippocampus and posteromedial regions would be especially vulnerable to aging effects, owing to their central role in the mechanisms subtending lifetime brain plasticity (Fjell et al., 2014).

Finally, a complementary hypothesis that also emerged from the results of longitudinal studies is “brain maintenance,” which states that the lack of changes in brain structural and functional markers would allow some people to show little or no age-related cognitive decline. The conceptual idea of brain maintenance was motivated by the fact that increased functional activity in healthy aging does not necessarily imply up-regulation of functional networks over time. Therefore, according to maintenance, the best predictors of successful performance in aging would be the minimization of chemical, structural, and functional changes over time (Nyberg et al., 2012).


TASKS AND COMPUTATIONAL MODELS IMPLEMENTED IN hBayesDM

Table 1 shows the list of tasks and computational models currently implemented in the hBayesDM package (as of version 0.3.0). Note that some tasks have multiple computational models and that users can compare model performance within the hBayesDM framework (see Step-by-Step Tutorials for the hBayesDM Package). To fit models to a task, first the user must prepare trial-by-trial data as a text file (*.txt) in which each row (observation) contains the columns required for the given task (see Table 1). Users can also use each task’s sample dataset as a template.

Below, we describe each task and its computational model(s), briefly review its applications to healthy and clinical populations, and describe the model parameters. For brevity, we refer readers to the original articles for the full details of the experimental designs and computational models, and to the package help files for example code detailing how to estimate and extract the parameters from each model. The package help files can be found by issuing the following command within the R console:

?hBayesDM

The command above will open the main help page, from which one can then navigate to the corresponding task/model. Users can also directly look up the help file for each task/model by calling it in the form ?function_name (e.g., ?dd_cs; see Table 1 for a list of these functions). Each help file provides working code to run a concrete real-data example from start to finish.

The Delay-Discounting Task

The delay-discounting task (DDT; Rachlin, Raineri, & Cross, 1991) is designed to estimate how much an individual discounts temporally delayed larger outcomes in comparison to smaller–sooner ones. On each trial of the DDT, two options are presented: a sooner and smaller reward (e.g., $5 now) and a later and larger reward (e.g., $20 next week). Subjects are asked to choose which option they prefer on each trial.

The DDT has been widely studied in healthy populations (e.g., Green & Myerson, 2004; Kable & Glimcher, 2007), and delay discounting has been associated with cognitive abilities such as intelligence (Shamosh et al., 2008) and working memory (Hinson, Jameson, & Whitney, 2003). Steeper delay discounting is a strong behavioral marker for addictive behaviors (Ahn, Ramesh, Moeller, & Vassileva, 2016; Ahn & Vassileva, 2016; Bickel, 2015; Green & Myerson, 2004; MacKillop, 2013) and has also been associated with other psychiatric conditions, including schizophrenia (Ahn, Rass, et al., 2011; Heerey, Matveeva, & Gold, 2011; Heerey, Robinson, McMahon, & Gold, 2007) and bipolar disorder (Ahn, Rass, et al., 2011). The hBayesDM package currently contains three different models for the DDT:

dd_cs (constant-sensitivity model; Ebert & Prelec, 2007)

Exponential discounting rate (0 < r < 1)

Time sensitivity (0 < s < 10)

Inverse temperature (0 < β < 5)

dd_exp (exponential model; Samuelson, 1937)

Exponential discounting rate (0 < r < 1)

Inverse temperature (0 < β < 5)

dd_hyperbolic (hyperbolic model; Mazur, 1987)

Discounting rate (0 < r < 1)

Inverse temperature (0 < β < 5)

DDT: Parameter descriptions

In the exponential and hyperbolic models, temporal discounting of future (i.e., delayed) rewards is described by a single parameter, the discounting rate (0 < r < 1), which indicates how much future rewards are discounted. High and low discounting rates reflect greater and lesser discounting of future rewards, respectively. In the exponential and hyperbolic models, the value of a delayed reward is discounted in an exponential and hyperbolic form, respectively. The constant-sensitivity (CS) model has an additional parameter, called time sensitivity (0 < s < 10). When s is equal to 1, the CS model reduces to the exponential model. Values of s near 0 lead to a simple “present–future dichotomy” in which all future rewards are steeply discounted to a certain subjective value, irrespective of delays. Values of s greater than 1 result in an “extended-present” heuristic, in which rewards during the extended present are valued nearly equally, and future rewards outside the extended present have zero value.
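For concreteness, the three discount curves can be sketched as follows, using standard forms (exponential: exp(−r·d); hyperbolic: 1/(1 + r·d); constant-sensitivity: exp(−(r·d)^s), after Ebert & Prelec, 2007). The function names are ours, and setting s = 1 in the CS form recovers the exponential model:

```python
import numpy as np

def exponential_discount(amount, delay, r):
    # dd_exp-style value: V = A * exp(-r * delay), discounting rate 0 < r < 1
    return amount * np.exp(-r * delay)

def hyperbolic_discount(amount, delay, r):
    # dd_hyperbolic-style value: V = A / (1 + r * delay)
    return amount / (1.0 + r * delay)

def constant_sensitivity_discount(amount, delay, r, s):
    # dd_cs-style value: V = A * exp(-(r * delay) ** s); s = 1 gives the
    # exponential model, s near 0 a present-future dichotomy
    return amount * np.exp(-((r * delay) ** s))
```

For example, a $20 reward delayed by 7 days with r = 0.1 is worth 20·exp(−0.7) under the exponential model and 20/1.7 under the hyperbolic model.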

All models use the softmax choice rule with an inverse-temperature parameter (Kaelbling, Littman, & Moore, 1996; Luce, 1959), which reflects how deterministically individuals’ choices are made with respect to the strength (subjective value) of the alternative choices. High and low inverse temperatures represent more deterministic and more random choices, respectively.
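A minimal implementation of this choice rule (the function name is ours) makes the role of the inverse temperature explicit: β = 0 yields uniform random choice, and large β approaches deterministic choice of the highest-valued option.

```python
import numpy as np

def softmax_choice_prob(values, beta):
    """Probability of choosing each option given subjective values and an
    inverse temperature beta; higher beta -> more deterministic choices."""
    z = beta * np.asarray(values, dtype=float)
    z -= z.max()              # subtract max for numerical stability
    p = np.exp(z)
    return p / p.sum()
```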

The Iowa Gambling Task

The Iowa Gambling Task (IGT; Bechara, Damasio, Damasio, & Anderson, 1994) was originally developed to assess the decision-making deficits of patients with ventromedial prefrontal cortex lesions. On each trial, subjects are presented with four decks of cards. Two decks are advantageous (good) and the other two are disadvantageous (bad) in terms of long-term gains. Subjects are instructed to choose decks that maximize long-term gains, which they are expected to learn by trial and error. From a statistical perspective, the IGT is a four-armed bandit problem.

The IGT has been used extensively to study decision-making in several psychiatric populations (Ahn et al., 2014; Bechara & Martin, 2004; Bechara et al., 2001; Bolla et al., 2003; Grant, Contoreggi, & London, 2000; Vassileva, Gonzalez, Bechara, & Martin, 2007). The hBayesDM package currently contains three different models for the IGT:

igt_pvl_decay (Ahn et al., 2014; Ahn, Krawitz, Kim, Busemeyer, & Brown, 2011)

igt_pvl_delta (Ahn, Busemeyer, Wagenmakers, & Stout, 2008)

igt_vpp (Worthy, Pang, & Byrne, 2013)

Perseverance gain impact (−∞ < ϵp < ∞)

Perseverance loss impact (−∞ < ϵn < ∞)

Perseverance decay rate (0 < k < 1)

Reinforcement-learning weight (0 < ω < 1)

IGT: Parameter descriptions

The Prospect Valence Learning (PVL) model with delta rule (PVL-delta) uses a Rescorla–Wagner updating equation (Rescorla & Wagner, 1972) to update the expected value of the selected deck on each trial. The expected value is updated with a learning rate parameter (0 < A < 1) and a prediction error term, where A close to 1 places more weight on recent outcomes and A close to 0 places more weight on past outcomes; the prediction error is the difference between the predicted and experienced outcomes. The shape (0 < α < 2) and loss aversion (0 < λ < 10) parameters control the shape of the utility (power) function and the effect of losses relative to gains, respectively. Values of α greater than 1 indicate that the utility of an outcome is convex, and values less than 1 indicate that the utility is concave. Values of λ greater than or less than 1 indicate greater or reduced sensitivity, respectively, to losses relative to gains. The consistency parameter (0 < c < 5) is an inverse-temperature parameter (refer to DDT: Parameter descriptions for details).
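The utility transformation and the delta update described above can be sketched as follows (function names are ours, and parameter bounds are not enforced in this illustration):

```python
def pvl_utility(outcome, alpha, lam):
    """Prospect-style power utility: gains raised to the shape alpha,
    losses additionally scaled by the loss-aversion parameter lam."""
    if outcome >= 0:
        return outcome ** alpha
    return -lam * (abs(outcome) ** alpha)

def delta_update(expected, outcome_utility, A):
    """Rescorla-Wagner (delta) rule: move the expected value of the chosen
    deck toward the experienced utility, at learning rate A."""
    prediction_error = outcome_utility - expected
    return expected + A * prediction_error
```

With α = 0.5 (concave utility) and λ = 1.5, winning 4 yields utility 2, while losing 4 yields utility −3, reflecting greater sensitivity to losses.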

The PVL model with decay rule (PVL-decay) uses the same shape, loss aversion, and consistency parameters as the PVL-delta, but a recency parameter (0 < A < 1) is used for value updating. The recency parameter indicates how much the expected values of all decks are discounted on each trial.

The PVL-delta model is nested within the Value-Plus-Perseverance (VPP) model, which is a hybrid of PVL-delta and a heuristic perseverance strategy. The perseverance decay rate (0 < k < 1) decays the perseverance strengths of all choices on each trial, akin to how PVL-decay’s recency parameter affects the expected value. The parameters for the impacts of gain (−∞ < ϵp < ∞) and loss (−∞ < ϵn < ∞) on perseverance reflect how the perseverance value changes after wins and losses, respectively; positive values reflect a tendency to make the same choice, and negative values a tendency to switch choices. The reinforcement-learning weight (0 < ω < 1) is a mixing parameter that controls how much decision weight is given to the reinforcement-learning term versus the perseverance term. High versus low values reflect more versus less reliance on the reinforcement-learning term, respectively.
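The perseverance bookkeeping and the ω mixture can be sketched as follows (an illustrative simplification of the VPP mechanics; function names are ours):

```python
def perseverance_update(strengths, chosen, outcome, k, eps_pos, eps_neg):
    """Decay all perseverance strengths by k, then bump the chosen deck by
    the gain impact (eps_pos) after a win or the loss impact (eps_neg)
    after a loss."""
    strengths = [s * k for s in strengths]
    strengths[chosen] += eps_pos if outcome >= 0 else eps_neg
    return strengths

def vpp_decision_value(ev, perseverance, omega):
    """Mix the reinforcement-learning term (ev) with the perseverance term,
    weighted by omega (0 < omega < 1)."""
    return omega * ev + (1.0 - omega) * perseverance
```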

The Orthogonalized Go/No-Go Task

Animals use Pavlovian and instrumental controllers when taking action. The Pavlovian controller selects approaching/engaging actions with predictors of appetitive outcomes or avoiding/inhibiting actions with predictors of aversive outcomes. The instrumental controller, on the other hand, selects actions on the basis of the action–outcome contingencies of the environment. The two controllers typically cooperate, but sometimes they compete with each other (e.g., Dayan, Niv, Seymour, & Daw, 2006). The orthogonalized go/no-go (GNG) task (Guitart-Masip et al., 2012) is designed to examine the interaction between the two controllers by orthogonalizing the action requirement (go vs. no go) versus the valence of the outcome (winning vs. avoiding losing money).

Each trial of the orthogonalized GNG task has three events in the following sequence: cue presentation, target detection, and outcome presentation. First, one of four cues is presented (“Go to win,” “Go to avoid (losing),” “NoGo to win,” or “NoGo to avoid”). After some delay, a target (“circle”) is presented on the screen, and subjects must respond with either a go (press a button) or a no go (withhold the button press). Then subjects receive a probabilistic (e.g., 80%) outcome. See Guitart-Masip et al. (2012) for more details of the experimental design.

gng_m1 (M1 in Guitart-Masip et al., 2012)

Effective size of a reinforcement (0 < ρ < ∞)

gng_m2 (M2 in Guitart-Masip et al., 2012)

Effective size of a reinforcement (0 < ρ < ∞)

gng_m3 (M3 in Guitart-Masip et al., 2012)

Effective size of a reinforcement (0 < ρ < ∞)

gng_m4 (M5 in Cavanagh et al., 2013)

Effective size of reward reinforcement (0 < ρrew < ∞)

Effective size of punishment reinforcement (0 < ρpun < ∞)

GNG: Parameter descriptions

All models for the GNG task include a lapse rate parameter (0 < ξ < 1), a learning rate parameter (0 < ϵ < 1; refer to IGT: Parameter descriptions for details), and a parameter for the effective size of reinforcement (0 < ρ < ∞). The lapse rate parameter captures the proportion of random choices made regardless of the strength of the action probabilities. The ρ parameter determines the effective size of a reinforcement. The gng_m4 model has separate effective size parameters for reward (0 < ρrew < ∞) and punishment (0 < ρpun < ∞), allowing rewards and punishments to be evaluated differently.

Three GNG models ( gng_m2 , gng_m3 , and gng_m4 ) include a go bias parameter (−∞ < b < ∞). Go bias reflects a tendency to respond (go) regardless of the action–outcome associations; high or low values for b reflect a high or a low tendency to make a go (motor) response, respectively.

Two GNG models ( gng_m3 and gng_m4 ) include a Pavlovian bias parameter (−∞ < π < ∞). Pavlovian bias reflects a tendency to make Pavlovian-congruent responses: that is, to promote a go response if the expected value of the stimulus is positive (appetitive) and to inhibit it if the expected value is negative (aversive).
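Putting these parameters together, the go probability in a gng_m4-style model can be sketched as follows (a simplified illustration, not the package's exact implementation; the function name and arguments are ours). The go action weight adds the go bias b and the Pavlovian bias π·V(s) to the learned go value, and the lapse rate ξ mixes in uniform random responding:

```python
import math

def gng_action_prob(q_go, q_nogo, stimulus_value, b, pi, xi):
    """Probability of a go response: softmax over the two action weights,
    where the go weight includes a go bias b and a Pavlovian bias
    pi * V(stimulus), mixed with a lapse rate xi of random responding."""
    w_go = q_go + b + pi * stimulus_value
    w_nogo = q_nogo
    p_go = 1.0 / (1.0 + math.exp(-(w_go - w_nogo)))  # two-option softmax
    return (1.0 - xi) * p_go + xi / 2.0
```

With all weights at zero the go probability is 0.5; a positive go bias pushes it above 0.5, and a lapse rate of 1 forces fully random responding.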

Probabilistic Reversal-Learning Task

Environments often have higher-order structures, such as interdependencies between the stimuli, actions, and outcomes. In such environments, subjects need to infer and make use of the structures in order to make optimal decisions. In the probabilistic reversal-learning (PRL) task, higher-order structure exists such that the reward distributions of two stimuli are anticorrelated (e.g., if one option has a reward rate of 80%, the other option has a reward rate of [100 – 80]%, which is 20%). Subjects need to learn the higher-order structure and take it into account to optimize their decision-making and to maximize earnings.

In a typical PRL task, two stimuli are presented to a subject. The choice of a “correct” or good stimulus will usually lead to a monetary gain (e.g., 70%), whereas the choice of an “incorrect” or bad stimulus will usually lead to a monetary loss. The reward contingencies will reverse at fixed points (e.g., Murphy, Michael, Robbins, & Sahakian, 2003) or will be triggered by consecutive correct choices (Cools, Clark, Owen, & Robbins, 2002; Hampton et al., 2006).

The PRL task has been widely used to study reversal learning in healthy individuals (Cools et al., 2002; den Ouden et al., 2013; Gläscher et al., 2009). The PRL has also been used to study decision-making deficits associated with prefrontal cortex lesions (e.g., Fellows & Farah, 2003; Rolls, Hornak, Wade, & McGrath, 1994), as well as Parkinson’s disease (e.g., Cools, Lewis, Clark, Barker, & Robbins, 2007; Swainson et al., 2000), schizophrenia (e.g., Waltz & Gold, 2007), and cocaine dependence (Ersche, Roiser, Robbins, & Sahakian, 2008). The hBayesDM package currently contains three models for PRL tasks:

prl_ewa (experience-weighted attraction model; den Ouden et al., 2013)

Decay rate (0 < φ < 1)

Experience decay factor (0 < ρ < 1)

Inverse temperature (0 < β < 1)

prl_fictitious (fictitious-update model; Gläscher et al., 2009)

Learning rate (0 < η < 1)

Indecision point (0 < α < 1)

Inverse temperature (0 < β < 1)

prl_rp (reward–punishment learning model; den Ouden et al., 2013)

Reward learning rate (0 < Arew < 1)

Punishment learning rate (0 < Apun < 1)

Inverse temperature (0 < β < 1)

PRL: Parameter descriptions

All PRL models above contain learning rate parameters (refer to IGT: Parameter descriptions for details). The prl_rp model has separate learning rates for rewards (0 < Arew < 1) and punishments (0 < Apun < 1). In the prl_ewa model (Camerer & Ho, 1999), low and high values of φ reflect more weight on recent and on past outcomes, respectively. All PRL models also contain an inverse-temperature parameter (refer to DDT: Parameter descriptions for details).

The prl_ewa model proposed in den Ouden et al. (2013) contains a decay rate parameter (0 < ρ < 1). The experienced weight of the chosen option is decayed in proportion to ρ, and 1 is added to the weight on each trial. Thus, a higher value of ρ indicates slower decaying or updating of the experienced weight.
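The EWA update for the chosen option can be sketched as follows, assuming the standard experience-weighted attraction form (Camerer & Ho, 1999), in which the experience weight decays by ρ and gains 1, and the value becomes a φ-weighted running average of outcomes (the function name is ours):

```python
def ewa_update(value, weight, outcome, phi, rho):
    """One EWA step for the chosen option: the experience weight decays by
    rho and is incremented by 1; the value is the phi-decayed previous
    value averaged together with the new outcome."""
    new_weight = rho * weight + 1.0
    new_value = (phi * weight * value + outcome) / new_weight
    return new_value, new_weight
```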

The prl_fictitious model contains an indecision point parameter (0 < α < 1). This point reflects a subject’s amount of bias or preference toward an option. High or low values for α indicate a greater or a lesser preference for one option over the other.

Risk Aversion Task

The risk aversion (RA; Sokol-Hessner, Camerer, & Phelps, 2013; Sokol-Hessner et al., 2009) task is a description-based task (Hertwig, Barron, Weber, & Erev, 2004) in which the possible outcomes of all options and their probabilities are provided to subjects on each trial. In the RA task, subjects choose either a sure option with a guaranteed amount or a risky option (i.e., gamble) with possible gain and/or loss amounts. Subjects are asked to choose which option they prefer (or whether they want to accept the gamble) on each trial. In the RA task, subjects perform two cognitive regulation conditions (attend and regulate) in a within-subjects design: in the attend condition, subjects are asked to focus on each choice in isolation, whereas in the regulate condition, subjects are asked to emphasize choices in their greater context (see Sokol-Hessner et al., 2009, for the details). The data published in Sokol-Hessner et al. (2009) can be found using the following paths (these paths are also available in the RA model help files):

path_to_attend_data = system.file("extdata/ra_data_attend.txt", package="hBayesDM")

path_to_regulate_data = system.file("extdata/ra_data_reappraisal.txt", package="hBayesDM")

ra_prospect (Sokol-Hessner et al., 2009)

Inverse temperature (0 < τ < ∞)

ra_noLA (no loss aversion [LA] parameter for tasks that involve only gains)

Inverse temperature (0 < τ < ∞)

ra_noRA (no risk aversion [RA] parameter; see, e.g., Tom et al., 2007)

Inverse temperature (0 < τ < ∞)

RA: Parameter descriptions

The ra_prospect model includes a loss aversion parameter (0 < λ < 5), a risk aversion parameter (0 < ρ < 2), and an inverse-temperature parameter (0 < τ < ∞). See DDT: Parameter descriptions for inverse temperature. The risk aversion and loss aversion parameters in the RA models are similar to those in the IGT models. However, in RA models they control the valuations of the possible choices under consideration, as opposed to the evaluation of outcomes after they are experienced (Rangel et al., 2008).

The ra_noLA and ra_noRA models are nested within the ra_prospect model, with either loss aversion ( ra_noLA ) or risk aversion ( ra_noRA ) set to 1.
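The ra_prospect valuation can be sketched as follows, assuming a 50/50 gamble and the power-utility form with loss aversion described above (function names are ours); note that fixing λ = 1 or ρ = 1 here corresponds to the nested ra_noLA and ra_noRA models:

```python
import math

def prospect_utility(amount, rho, lam):
    """Power utility with loss aversion: gains -> x**rho,
    losses -> -lam * |x|**rho."""
    if amount >= 0:
        return amount ** rho
    return -lam * (abs(amount) ** rho)

def p_accept_gamble(gain, loss, cert, rho, lam, tau):
    """Probability of accepting a 50/50 gamble over a sure amount `cert`,
    via a softmax on the utility difference with inverse temperature tau."""
    u_gamble = 0.5 * prospect_utility(gain, rho, lam) \
             + 0.5 * prospect_utility(loss, rho, lam)
    u_cert = prospect_utility(cert, rho, lam)
    return 1.0 / (1.0 + math.exp(-tau * (u_gamble - u_cert)))
```

For a symmetric gamble (+10/−10 vs. a sure 0), a loss-neutral agent (λ = 1, ρ = 1) is indifferent (p = 0.5), while a loss-averse agent (λ > 1) rejects it more often than not.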

Two-Armed Bandit Task

Multi-armed bandit tasks or problems typically refer to situations in which gamblers decide which gamble or slot machine to play in order to maximize long-term gain. Many reinforcement-learning tasks and experience-based (Hertwig et al., 2004) tasks can be classified as bandit problems. In a typical two-armed bandit task, subjects are presented with two options (stimuli) on each trial. Feedback is given after a stimulus is chosen. Subjects are asked to maximize positive feedback as they make choices, and they are expected to learn stimulus–outcome contingencies from trial-by-trial experience. The hBayesDM package currently contains a simple model for a two-armed bandit task:

bandit2arm_delta (Hertwig et al., 2004)

Inverse temperature (0 < τ < 1)

Two-armed bandit: Parameter descriptions

The bandit2arm_delta model uses the Rescorla–Wagner rule (see IGT: Parameter descriptions) for updating the expected value of the chosen option, along with the softmax choice rule with an inverse temperature (see DDT: Parameter descriptions).
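The two mechanisms just named (Rescorla–Wagner updating and softmax choice) can be combined into a small generative simulation of the task. This is an illustrative sketch, not code from the package; the function name and settings are ours:

```python
import numpy as np

def simulate_bandit2arm_delta(rewards, A, tau, seed=0):
    """Simulate choices on a two-armed bandit with Rescorla-Wagner updating
    (learning rate A) and softmax choice (inverse temperature tau).
    rewards[t, k] is the payoff of arm k on trial t."""
    rng = np.random.default_rng(seed)
    ev = np.zeros(2)                      # expected value of each arm
    choices = []
    for t in range(rewards.shape[0]):
        z = np.exp(tau * ev)
        p = z / z.sum()                   # softmax choice probabilities
        c = rng.choice(2, p=p)
        ev[c] += A * (rewards[t, c] - ev[c])  # delta update of chosen arm
        choices.append(c)
    return np.array(choices), ev
```

When one arm always pays and the other never does, the learned expected value of the paying arm ends up higher, and choices gravitate toward it.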

The Ultimatum Game (Norm-Training)

The abilities to understand the social norms of an environment and to adaptively cope with those norms are critical for normal social functioning (Gu et al., 2015; Montague & Lohrenz, 2007). The ultimatum game (UG) is a widely used social decision-making task that examines how individuals respond to deviations from social norms and adapt to norms in a changing environment.

The UG involves two players: a proposer and a responder. On each trial, the proposer is given some amount of money to divide between the two players. After deciding how to divide the money, an offer is made to the responder. The responder can either accept the offer (and the money is split as offered) or reject it (both players receive nothing). Previous studies have shown that the most common offer is approximately 50% of the total amount, and that “unfair” offers (<∼20% of the total amount) are often rejected, even though it is optimal to accept any offer (Güth, Schmittberger, & Schwarze, 1982; Sanfey, 2003; Thaler, 1988). A recent study examined the computational substrates of norm adjustment by using a norm-training UG in which subjects played the role of responder in a norm-changing environment (Xiang et al., 2013).

The UG has been used to investigate the social decision-making of individuals with ventromedial prefrontal (Gu et al., 2015; Koenigs et al., 2007) and insular cortex (Gu et al., 2015) lesions, as well as of patients with schizophrenia (Agay, Kron, Carmel, Mendlovic, & Levkovitz, 2008; Csukly, Polgár, Tombor, Réthelyi, & Kéri, 2011). The hBayesDM package currently contains two models for the UG (or norm-training UG) in which subjects play the role of responder:

ug_bayes (Xiang et al., 2013)

Envy (0 < α < 20)

Guilt (0 < β < 10)

Inverse temperature (0 < τ < 10)

ug_delta (Gu et al., 2015)

Envy (0 < α < 20)

Inverse temperature (0 < τ < 10)

Norm adaptation rate (0 < ϵ < 1)

UG: Parameter descriptions

The ug_bayes model assumes that the subject (responder) behaves like a Bayesian ideal observer (Knill & Pouget, 2004), so that the expected offer made by the proposer is updated in a Bayesian fashion. This is in contrast to the ug_delta model, which assumes that the subject (again the responder) updates the expected offer using a Rescorla–Wagner (delta) updating rule. Both the ug_bayes and ug_delta models contain envy (0 < α < 20) and inverse-temperature (0 < τ < 10; refer to DDT: Parameter descriptions for details) parameters. The envy parameter reflects sensitivity to norm prediction error (see below for the ug_bayes model), where higher or lower values indicate greater or lesser sensitivity, respectively. In the UG, the prediction error reflects the difference between the expected and received offers.

In the ug_bayes model, the utility of an offer is adjusted by two norm prediction errors: (1) negative prediction errors, multiplied by an envy parameter (0 < α < 20), and (2) positive prediction errors, multiplied by a guilt parameter (0 < β < 10). Higher and lower values for envy (α) and guilt (β) reflect greater and lesser sensitivity to negative and positive norm prediction errors, respectively. The ug_delta model includes only the envy parameter (Gu et al., 2015).
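As an illustration of the ug_delta logic, a single responder trial can be sketched as follows (a simplified form assuming the envy penalty applies only when the offer falls short of the internal norm; the function name is ours):

```python
def ug_delta_utility(offer, norm, alpha, epsilon):
    """One ug_delta-style step: utility is the offer minus an envy-weighted
    penalty for a negative norm prediction error, and the internal norm is
    then updated toward the offer with a Rescorla-Wagner (delta) rule at
    the norm adaptation rate epsilon."""
    utility = offer - alpha * max(norm - offer, 0.0)
    new_norm = norm + epsilon * (offer - norm)
    return utility, new_norm
```

For example, with a norm of 5 and envy α = 1, an offer of 2 has utility 2 − 1·(5 − 2) = −1, and with ϵ = 0.2 the norm drops to 4.4 for the next trial; a fair offer of 5 carries no envy penalty.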


Results

Applicability of feedback to self-views

Applicability ratings were affected by a valence by group interaction [χ 2 (4) = 106.19, p < 0.001], see online Supplementary Tables S4 and S5 for model comparisons and parameters. Consistent with our hypothesis, BPD patients rated the intermediate (b = −0.40, s.e. = 0.16, t = −2.50) and especially negative feedback (b = −0.53, s.e. = 0.16, t = −3.36) as more applicable compared with HC, see Fig. 1a. Positive feedback was rated as less applicable by BPD compared with HC (b = 1.07, s.e. = 0.16, t = 6.74). Compared with LSE, BPD also rated negative feedback as more applicable (b = −0.43, s.e. = 0.17, t = −2.43) and positive feedback as less applicable (b = 0.63, s.e. = 0.18, t = 3.61) but did not differ in applicability of intermediate feedback (b = −0.15, s.e. = 0.18, t = −0.83). Moreover, using the valence ratings (i.e. degree of negativity or positivity), we found that all three groups rated the valence of the words in a similar way [χ 2 (2) = 2.4, p = 0.307], with negative and positive words being more emotional than intermediate words, see online Supplementary Tables S2 and S3. However, there was a trend for an interaction effect between valence and group [χ 2 (4) = 8.42, p = 0.077], which could indicate that negative feedback was rated slightly less negative by BPD than HC (b = −0.43, s.e. = 0.16, t = −2.69), see also online Supplementary Table S3 for model parameters.

Fig. 1. (a) Mean applicability ratings by group after negative, intermediate and positive feedback (error bars indicate 95% confidence intervals). (b) Illustration of mood ratings by group after negative, intermediate and positive feedback at the mean level of applicability of feedback. (c) Illustration of mean mood ratings by group after negative, intermediate and positive feedback for not to very applicable feedback. Applicability has a greater impact on mood during negative and intermediate feedback than positive feedback. Applicability has a greater impact on the mood of HC compared with BPD. Mood rating is rescaled to scores 1–4 for display purposes.

Affective responses

Mood was affected by group [χ 2 (2) = 11.4, p = 0.003] with BPD reporting a worse mood than HC overall (b = 0.81, s.e. = 0.19, t = 4.28), see Table 2 and online Supplementary Table S6. Valence moderated the group effect [χ 2 (4) = 39.89, p < 0.001]. BPD reported a worse mood after negative (b = −0.14, s.e. = 0.15, t = −0.95) and intermediate feedback (b = −0.81, s.e. = 0.19, t = 4.28) and similar mood after positive feedback (b = −0.49, s.e. = 0.13, t = −3.70) compared with HC, see Fig. 1b. Compared with LSE, BPD reported equal mood after intermediate (b = 0.19, s.e. = 0.21, t = 0.91) and positive feedback (b = 0.11, s.e. = 0.15, t = 0.75) but a better mood after negative feedback (b = −0.50, s.e. = 0.16, t = −3.10).

Table 2. Effect parameters of model predicting mood ratings by valence category (intermediate = reference), group (BPD = reference), and applicability of feedback and two-way interactions

Significance level (***<0.001, **<0.01, *<0.05, ^<0.10) based on χ 2 test of model comparisons, see online Supplementary Table S6.

Applicability moderated the group effect as well [χ 2 (4) = 14.8, p = 0.005]. BPD mood ratings were less affected by applicability compared with HC (b = 0.07, s.e. = 0.03, t = 2.27), but did not differ in this respect from LSE (b = 0.01, s.e. = 0.03, t = 0.23), see Fig. 1c. There was no three-way interaction of valence by applicability by group [χ 2 (4) = 8.0, p = 0.090].

Neural responses

Groups differed in neural correlates of feedback valence, see Table 3 for clusters and peak voxels. In response to negative feedback compared with positive feedback, HC showed stronger left precuneus activation, whereas BPD showed relatively low and equal precuneus activation for negative and positive feedback, see Fig. 2. In this precuneus cluster, LSE showed relatively high and equal activation for negative and positive feedback, albeit not significantly different from BPD, see Fig. 2. In response to positive compared with negative feedback, HC showed stronger right anterior TPJ activation, whereas BPD showed the reverse pattern, with stronger TPJ activation for negative than for positive feedback. Compared with LSE, BPD showed stronger left precuneus activation during negative compared with positive feedback, see Table 3 and Fig. 2. However, this cluster in the left precuneus did not overlap with the cluster found in the comparison with HC. Groups did not differ in neural correlates of applicability. The three-way interaction of applicability by negative valence for BPD compared with HC in the motor cortex, superior parietal lobule and inferior parietal lobule is probably attributable to button-press movements (Mars et al., 2011).

Fig. 2. Left: Clusters of neural activation indicating HC > BPD (blue) and BPD > LSE (orange). Right: Mean contrast values for the HC > BPD clusters (blue clusters) by group and contrast.

Table 3. Selected neural correlates for group comparisons on contrasts of valence and applicability of feedback a , cluster corrected Z = 2.3, cluster p < 0.05

a Contrasts without any above threshold clusters are not reported in this table.

Exploratory findings

For exploratory purposes, we checked whether LSE differed in self-views from HC by rerunning the model with applicability ratings as the outcome but with HC, instead of BPD, set as the reference group. Despite their lower self-esteem, LSE did not rate negative feedback (b = 0.11, s.e. = 0.17, t = 0.65) or intermediate feedback (b = 0.26, s.e. = 0.17, t = 1.52) as more applicable to them. However, they did rate positive feedback as less applicable (b = −0.44, s.e. = 0.17, t = −2.64).

Confounds

To control for potential confounding, we repeated the affective and neural analyses taking into account whether the participant believed the SF paradigm (yes/no), medication status (on/off), and current depression comorbidity. These confounds had no effect on the affective results.

Handedness was also taken into account in the neural analyses. The stronger precuneus activation in HC compared with BPD after negative versus positive feedback did not survive the significance threshold once current depression or handedness was taken into account.


Introduction

The advent of fMRI revolutionized psychology, as it allowed, for the first time, the noninvasive mapping of human cognition. Despite this progress, traditional fMRI analyses are limited in that they can, for the most part, only ascertain the involvement of an area in a task, but not its precise role in that task. Recently, model-based fMRI methods have been developed to overcome this limitation by using computational models of behavior to shed light on latent variables of the models (such as prediction errors) and their mapping to neural structures. This approach has led to important insights into the algorithms employed by the brain and has been particularly successful in understanding the neural basis of reinforcement learning (e.g., [1]).

In a typical model-based fMRI analysis, one first specifies a model that describes the hypothesized cognitive processes underlying the behavior in question. Typically these models have one or more free parameters (e.g. learning rate in a model of trial-and-error learning). These parameters must be set to fully specify the model, which is commonly done by fitting them to the observed behavior [14]. For instance, given the model, one can find subject-specific learning rates that best explain the subject’s behavioral choices. The fully specified model is then used to generate trial-by-trial measures of latent variables in the model (e.g. action values and prediction errors) that can be regressed against neural data in order to find areas whose activity correlates with these variables in the brain.
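These steps can be sketched with a toy example (synthetic data and a simple Rescorla-Wagner learner; the names and constants are ours, and in a real analysis the parametric regressor would additionally be convolved with a hemodynamic response function before fitting):

```python
import numpy as np

rng = np.random.default_rng(0)

def rw_prediction_errors(rewards, alpha):
    """Trial-by-trial Rescorla-Wagner values and prediction errors."""
    v, values, pes = 0.0, [], []
    for r in rewards:
        pe = r - v            # prediction error on this trial
        values.append(v)
        pes.append(pe)
        v += alpha * pe       # value update with learning rate alpha
    return np.array(values), np.array(pes)

# Simulate a session and build a parametric regressor from the model
rewards = rng.binomial(1, 0.7, size=200).astype(float)
_, pe_regressor = rw_prediction_errors(rewards, alpha=0.3)

# Toy "neural" signal that encodes the prediction error plus noise
signal = 0.8 * pe_regressor + rng.normal(0.0, 0.5, size=200)

# Regress the signal on the model-derived regressor (plus an intercept)
design = np.c_[pe_regressor, np.ones(200)]
beta = np.linalg.lstsq(design, signal, rcond=None)[0][0]
```

A large positive beta for the prediction-error regressor would be taken as evidence that the signal encodes that latent variable.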

One potential weakness of this approach is the requirement for model fitting. In many cases, the data are insufficient to precisely identify the parameter values. This can be due to a limited number of trials, interactions between parameters that make them hard to disentangle [14], or a lack of behavior that can be used for the fitting process (e.g., in some Pavlovian conditioning experiments). Thus a key question is: How important is the model fitting step? In other words, to what extent is model-based fMRI sensitive to errors in parameter estimation? The answer to this question will determine how hard we should work to obtain the best possible parameter fits, and will affect not only how we analyze data, but also how we design experiments in the first place.

Here we show how this question can be addressed, by analyzing the sensitivity of model-based fMRI to the learning rate parameter in simple reinforcement learning tasks. We provide analytical bounds on the sensitivity of the model-based analysis to errors in estimating the learning rate, and show through simulation how value and prediction error signals generated with one learning rate would be interpreted by a model-based analysis that used the wrong learning rate. Amazingly, we find that the results of model-based fMRI are remarkably robust to settings of the learning rate to the extent that, in some situations, setting the parameters of the model as far as possible from their true value barely affects the results. This theoretical prediction of robustness is borne out by analysis of fMRI data from two recent experiments.
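The intuition behind this robustness can be conveyed by a toy simulation of our own (not the analyses in the text): prediction-error series generated under widely different learning rates remain strongly correlated, because both are dominated by the shared outcome term:

```python
import numpy as np

rng = np.random.default_rng(1)

def pe_series(rewards, alpha):
    """Rescorla-Wagner prediction errors for a given learning rate."""
    v, pes = 0.0, []
    for r in rewards:
        pes.append(r - v)
        v += alpha * (r - v)
    return np.array(pes)

# Same reward sequence, analyzed with two very different learning rates
rewards = rng.binomial(1, 0.5, size=500).astype(float)
corr = np.corrcoef(pe_series(rewards, 0.1), pe_series(rewards, 0.9))[0, 1]
```

Even with learning rates of 0.1 and 0.9, the two regressors are highly correlated, so a regression against either would identify largely the same voxels.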

Our findings are both good and bad news for model-based fMRI. The good news is that it is robust, thus errors in the learning rate will not dramatically change the results of studies seeking to localize a particular signal. The bad news, however, is that model-based fMRI is insensitive to differences in parameters, which means that one should use extreme caution when attempting to determine the computational role of a neural area (e.g., when asking whether a brain area corresponds to an outcome signal or a prediction error signal). In the Discussion we consider the extent to which this result generalizes to other parameters and other models and offer suggestions to diagnose parameter sensitivity in other models.


Observational Learning (Modeling)

By the end of this section, you will be able to:

  • Define observational learning
  • Discuss the steps in the modeling process
  • Explain the prosocial and antisocial effects of observational learning

Previous sections of this chapter focused on classical and operant conditioning, which are forms of associative learning. In observational learning, we learn by watching others and then imitating, or modeling, what they do or say. The individuals performing the imitated behavior are called models. Research suggests that this imitative learning involves a specific type of neuron, called a mirror neuron (Hickock, 2010; Rizzolatti, Fadiga, Fogassi, & Gallese, 2002; Rizzolatti, Fogassi, & Gallese, 2006).

Humans and other animals are capable of observational learning. As you will see, the phrase “monkey see, monkey do” really is accurate. The same could be said about other animals. For example, in a study of social learning in chimpanzees, researchers gave juice boxes with straws to two groups of captive chimpanzees. The first group dipped the straw into the juice box, and then sucked on the small amount of juice at the end of the straw. The second group sucked through the straw directly, getting much more juice. When the first group, the “dippers,” observed the second group, “the suckers,” what do you think happened? All of the “dippers” in the first group switched to sucking through the straws directly. By simply observing the other chimps and modeling their behavior, they learned that this was a more efficient method of getting juice (Yamamoto, Humle, & Tanaka, 2013).

This spider monkey learned to drink water from a plastic bottle by seeing the behavior modeled by a human. (credit: U.S. Air Force, Senior Airman Kasey Close)

Imitation is much more obvious in humans, but is imitation really the sincerest form of flattery? Consider Claire’s experience with observational learning. Claire’s nine-year-old son, Jay, was getting into trouble at school and was defiant at home. Claire feared that Jay would end up like her brothers, two of whom were in prison. One day, after yet another bad day at school and another negative note from the teacher, Claire, at her wit’s end, beat her son with a belt to get him to behave. Later that night, as she put her children to bed, Claire witnessed her four-year-old daughter, Anna, take a belt to her teddy bear and whip it. Claire was horrified, realizing that Anna was imitating her mother. It was then that Claire knew she wanted to discipline her children in a different manner.

Like Tolman, whose experiments with rats suggested a cognitive component to learning, psychologist Albert Bandura’s ideas about learning were different from those of strict behaviorists. Bandura and other researchers proposed a brand of behaviorism called social learning theory, which took cognitive processes into account. According to Bandura, pure behaviorism could not explain why learning can take place in the absence of external reinforcement. He felt that internal mental states must also have a role in learning and that observational learning involves much more than imitation. In imitation, a person simply copies what the model does. Observational learning is much more complex. According to Lefrançois (2012), there are several ways that observational learning can occur:
  • You learn a new response. After watching your coworker get chewed out by your boss for coming in late, you start leaving home 10 minutes earlier so that you won’t be late.
  • You choose whether or not to imitate the model depending on what you saw happen to the model. Remember Julian and his father? When learning to surf, Julian might watch how his father pops up successfully on his surfboard and then attempt to do the same thing. On the other hand, Julian might learn not to touch a hot stove after watching his father get burned on a stove.
  • You learn a general rule that you can apply to other situations.

Bandura identified three kinds of models: live, verbal, and symbolic. A live model demonstrates a behavior in person, as when Ben stood up on his surfboard so that Julian could see how he did it. A verbal instructional model does not perform the behavior, but instead explains or describes the behavior, as when a soccer coach tells his young players to kick the ball with the side of the foot, not with the toe. A symbolic model can be fictional characters or real people who demonstrate behaviors in books, movies, television shows, video games, or Internet sources.

(a) Yoga students learn by observation as their yoga instructor demonstrates the correct stance and movement for her students (live model). (b) Models don’t have to be present for learning to occur: through symbolic modeling, this child can learn a behavior by watching someone demonstrate it on television. (credit a: modification of work by Tony Cecala; credit b: modification of work by Andrew Hyde)

Link to Learning

Latent learning and modeling are used all the time in the world of marketing and advertising. In this commercial, which played for months across the New York, New Jersey, and Connecticut areas, Derek Jeter, an award-winning baseball player for the New York Yankees, advertises a Ford. The commercial aired in a part of the country where Jeter is an incredibly well-known athlete. He is wealthy, and considered very loyal and good looking. What message are the advertisers sending by having him featured in the ad? How effective do you think it is?


Limitations

Psychopy_ext debuted publicly in November 2013 and thus has not yet been adopted and extensively tested by the community. It is therefore difficult to predict the learning curve of the underlying psychopy_ext philosophy and the extent to which it resonates with the needs of the community. For example, many researchers are used to linear experimental and analysis scripts, while psychopy_ext relies on object-oriented programming concepts such as classes and modular functions in order to provide inheritance and flexibility. However, the object-oriented approach also means that whenever the necessary functions are not available directly from psychopy_ext or do not meet the user's needs, they can be overridden or used directly from the packages that are extended, often (but not always) without affecting the rest of the workflow.
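As a generic illustration of this object-oriented pattern (the class and method names below are hypothetical, not actual psychopy_ext API):

```python
class BaseExperiment:
    """A linear workflow decomposed into overridable steps."""

    def run(self):
        # The overall workflow stays fixed; individual steps can be swapped.
        return [self.setup(), self.show_trials(), self.save_data()]

    def setup(self):
        return "default setup"

    def show_trials(self):
        return "default trials"

    def save_data(self):
        return "default save"


class MyExperiment(BaseExperiment):
    """Override only the step that needs customizing; the rest is inherited."""

    def show_trials(self):
        return "custom trial loop"
```

Calling MyExperiment().run() executes the inherited setup and save steps around the customized trial loop, which is the kind of selective overriding the text describes.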

Furthermore, psychopy_ext was designed to improve the workflow of a typical PsychoPy user. Researchers who use other stimulus generation packages or even different programming languages (such as R for data analyses) will not be able to benefit from psychopy_ext as easily. This limitation is partially a design choice to provide workflows that depend on as few tools as possible. Python has a large number of powerful packages, and psychopy_ext is committed to promoting them in favor of equivalent solutions in other languages. Nonetheless, when an alternative does not exist, users can easily interact with their R (via rpy2), C/C++ (via Python's own ctypes), MATLAB (via pymatlab or mlab), and a number of other kinds of scripts.


Introduction

Experimental psychology strives to explain human behavior. This implies being able to explain underlying causal mechanisms of behavior as well as to predict future behavior (Kaplan, 1973; Shmueli, 2010; Yarkoni & Westfall, 2016). In practice, however, traditional methods in experimental psychology have mainly focused on testing causal explanations. It is only in recent years that research in psychology has come to emphasize prediction (Forster, 2002; Shmueli & Koppius, 2011). Within this predictive turn, machine learning-based predictive methods have rapidly emerged as viable means to predict future observations as accurately as possible, i.e., to minimize prediction error (Breiman, 2001b; Song, Mitnitski, Cox, & Rockwood, 2004).

The multivariate nature of these methods and their focus on prediction error (rather than “goodness of fit”) gives them greater sensitivity and higher future predictive power compared with traditional methods. In experimental psychology, they are successfully used for predicting a variable of interest (e.g., experimental condition A vs. experimental condition B) from the behavioral patterns of an individual engaged in a task or activity by minimizing prediction error. Current applications range from recognition of facial actions from facial micro-expressions to classification of intention from differences in movement kinematics (e.g., Ansuini et al., 2015; Cavallo, Koul, Ansuini, Capozzi, & Becchio, 2016; Haynes et al., 2007; Srinivasan, Golomb, & Martinez, 2016). For example, they have been used to decode the intention in grasping an object (to pour vs. to drink) from subtle differences in patterns of hand movements (Cavallo et al., 2016). What is more, machine learning-based predictive models can be employed not only for group prediction (patients vs. controls), but also for individual prediction. Consequently, these models lend themselves as a potential diagnostic tool in clinical settings (Anzulewicz, Sobota, & Delafield-Butt, 2016; Hahn, Nierenberg, & Whitfield-Gabrieli, 2017; Huys, Maia, & Frank, 2016).
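As a minimal sketch of such a predictive analysis (synthetic "kinematic" features and a simple nearest-centroid classifier standing in for the richer models discussed here; all names and numbers are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical multivariate features for two experimental conditions
cond_a = rng.normal(0.0, 1.0, size=(100, 3))
cond_b = rng.normal(1.0, 1.0, size=(100, 3))
X = np.vstack([cond_a, cond_b])
y = np.array([0] * 100 + [1] * 100)

# Held-out split: the model is judged by its prediction error on
# unseen trials, not by goodness of fit to the training trials.
idx = rng.permutation(200)
train, test = idx[:150], idx[150:]

# Nearest-centroid classifier: assign each test trial to the closer
# class mean estimated from the training trials.
centroids = np.array([X[train][y[train] == k].mean(axis=0) for k in (0, 1)])
dists = np.linalg.norm(X[test][:, None, :] - centroids[None, :, :], axis=2)
accuracy = np.mean(dists.argmin(axis=1) == y[test])
```

Out-of-sample accuracy well above chance (0.5) is what licenses the claim that condition can be predicted from the behavioral pattern.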

However, while the assets of predictive approaches are becoming well known, machine learning-based predictive methods still lack an established and easy-to-use software framework. Many existing implementations provide no or only limited guidelines and consist of small code snippets or sets of packages. In addition, the use of existing packages often requires advanced programming expertise. To overcome these shortcomings, the main objective of the current paper was to build a user-friendly toolbox, “PredPsych”, endowed with multiple functionalities for multivariate analyses of quantitative behavioral data based on machine-learning models.

In the following, we present the framework of PredPsych via the analysis of a recently published multiple-subject motion capture dataset (Ansuini et al., 2015). First, we provide a brief description of the dataset and describe how to install and run PredPsych. Next, we discuss five research questions that can be addressed with the machine learning framework implemented in PredPsych. We provide guided illustrations on how to address these research questions using PredPsych, along with guidelines for the best techniques to use (for an overview see Fig. 1) and reasons for caution. Because the assets of predictive approaches have been discussed elsewhere (Breiman, 2001b; Shmueli, 2010), we only briefly deal with them here.

Overview of PredPsych functions. An overview of the research questions that can be addressed using PredPsych and the corresponding techniques



TASKS AND COMPUTATIONAL MODELS IMPLEMENTED IN hBayesDM

Table 1 shows the list of tasks and computational models currently implemented in the hBayesDM package (as of version 0.3.0). Note that some tasks have multiple computational models and that users can compare model performance within the hBayesDM framework (see Step-by-Step Tutorials for the hBayesDM Package). To fit models to a task, first the user must prepare trial-by-trial data as a text file (*.txt) in which each row (observation) contains the columns required for the given task (see Table 1). Users can also use each task’s sample dataset as a template.

Below, we describe each task and its computational model(s), briefly review its applications to healthy and clinical populations, and describe the model parameters. For brevity, we refer readers to the original articles for the full details of the experimental designs and computational models, and to the package help files for example code that details how to estimate and extract the parameters from each model. The package help files can be found by issuing the following command within the R console (with the package loaded):

?hBayesDM

The command above will open the main help page, from which one can then navigate to the corresponding task/model. Users can also directly look up the help file for each task/model by calling it in the form ?function_name (e.g., ?dd_cs; see Table 1 for a list of these functions). Each help file provides working code to run a concrete real-data example from start to finish.

The Delay-Discounting Task

The delay-discounting task (DDT; Rachlin, Raineri, & Cross, 1991) is designed to estimate how much an individual discounts temporally delayed larger outcomes in comparison to smaller–sooner ones. On each trial of the DDT, two options are presented: a sooner and smaller reward (e.g., $5 now) and a later and larger reward (e.g., $20 next week). Subjects are asked to choose which option they prefer on each trial.

The DDT has been widely studied in healthy populations (e.g., Green & Myerson, 2004; Kable & Glimcher, 2007), and delay discounting has been associated with cognitive abilities such as intelligence (Shamosh et al., 2008) and working memory (Hinson, Jameson, & Whitney, 2003). Steeper delay discounting is a strong behavioral marker for addictive behaviors (Ahn, Ramesh, Moeller, & Vassileva, 2016; Ahn & Vassileva, 2016; Bickel, 2015; Green & Myerson, 2004; MacKillop, 2013) and has also been associated with other psychiatric conditions, including schizophrenia (Ahn, Rass, et al., 2011; Heerey, Matveeva, & Gold, 2011; Heerey, Robinson, McMahon, & Gold, 2007) and bipolar disorder (Ahn, Rass, et al., 2011). The hBayesDM package currently contains three different models for the DDT:

dd_cs (constant-sensitivity model; Ebert & Prelec, 2007)

Exponential discounting rate (0 < r < 1)

Time sensitivity (0 < s < 10)

Inverse temperature (0 < β < 5)

dd_exp (exponential model; Samuelson, 1937)

Exponential discounting rate (0 < r < 1)

Inverse temperature (0 < β < 5)

dd_hyperbolic (hyperbolic model; Mazur, 1987)

Discounting rate (0 < r < 1)

Inverse temperature (0 < β < 5)

DDT: Parameter descriptions

In the exponential and hyperbolic models, temporal discounting of future (i.e., delayed) rewards is described by a single parameter, the discounting rate (0 < r < 1), which indicates how much future rewards are discounted. High and low discounting rates reflect greater and lesser discounting of future rewards, respectively. In the exponential and hyperbolic models, the value of a delayed reward is discounted in an exponential and hyperbolic form, respectively. The constant-sensitivity (CS) model has an additional parameter, called time sensitivity (0 < s < 10). When s is equal to 1, the CS model reduces to the exponential model. Values of s near 0 lead to a simple “present–future dichotomy” in which all future rewards are steeply discounted to a certain subjective value, irrespective of delays. Values of s greater than 1 result in an “extended-present” heuristic, in which rewards during the extended present are valued nearly equally, and future rewards outside the extended present have zero value.
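For illustration, the three discount functions can be written as follows. These are standard functional forms consistent with the description above; the exact parameterization in hBayesDM may differ:

```python
import math

def sv_exponential(amount, delay, r):
    """Exponential discounting: SV = A * exp(-r * D)."""
    return amount * math.exp(-r * delay)

def sv_hyperbolic(amount, delay, r):
    """Hyperbolic discounting: SV = A / (1 + r * D)."""
    return amount / (1.0 + r * delay)

def sv_constant_sensitivity(amount, delay, r, s):
    """Constant-sensitivity discounting: SV = A * exp(-(r * D) ** s).
    s = 1 recovers the exponential model; s near 0 approaches a
    present-future dichotomy; s > 1 yields an extended present."""
    return amount * math.exp(-((r * delay) ** s))
```

Plugging s = 1 into the CS function reproduces the exponential model exactly, matching the reduction described above.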

All models use the softmax choice rule with an inverse-temperature parameter (Kaelbling, Littman, & Moore, 1996; Luce, 1959), which reflects how deterministically individuals’ choices are made with respect to the strength (subjective value) of the alternative choices. High and low inverse temperatures represent more deterministic and more random choices, respectively.
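For a binary choice, the softmax rule reduces to a logistic function of the difference in subjective values; a minimal sketch (names are ours):

```python
import math

def p_choose_later(sv_sooner, sv_later, beta):
    """Softmax (logistic) probability of choosing the later-larger option.
    beta is the inverse temperature: higher beta makes the choice of the
    higher-valued option more deterministic; beta near 0 approaches
    random (50/50) choice."""
    return 1.0 / (1.0 + math.exp(-beta * (sv_later - sv_sooner)))
```

When the two subjective values are equal, the choice probability is exactly 0.5 regardless of beta; as beta grows, the same value difference is translated into an increasingly deterministic choice.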

The Iowa Gambling Task

The Iowa Gambling Task (IGT; Bechara, Damasio, Damasio, & Anderson, 1994) was originally developed to assess decision-making deficits of patients with ventromedial prefrontal cortex lesions. On each trial, subjects are presented with four decks of cards. Two decks are advantageous (good) and the other two disadvantageous (bad) in terms of long-term gains. Subjects are instructed to choose decks that maximize long-term gains, which they are expected to learn by trial and error. From a statistical perspective, the IGT is a four-armed bandit problem.

The IGT has been used extensively to study decision-making in several psychiatric populations (Ahn et al., 2014; Bechara & Martin, 2004; Bechara et al., 2001; Bolla et al., 2003; Grant, Contoreggi, & London, 2000; Vassileva, Gonzalez, Bechara, & Martin, 2007). The hBayesDM package currently contains three different models for the IGT:

igt_pvl_decay (Ahn et al., 2014; Ahn, Krawitz, Kim, Busemeyer, & Brown, 2011)

igt_pvl_delta (Ahn, Busemeyer, Wagenmakers, & Stout, 2008)

igt_vpp (Worthy, Pang, & Byrne, 2013)

Perseverance gain impact (−∞ < ϵp < ∞)

Perseverance loss impact (−∞ < ϵn < ∞)

Perseverance decay rate (0 < k < 1)

Reinforcement-learning weight (0 < ω < 1)

IGT: Parameter descriptions

The Prospect Valence Learning (PVL) model with delta rule (PVL-delta) uses a Rescorla–Wagner updating equation (Rescorla & Wagner, 1972) to update the expected value of the selected deck on each trial. The expected value is updated with a learning rate parameter (0 < A < 1) and a prediction error term, where A close to 1 places more weight on recent outcomes and A close to 0 places more weight on past outcomes; the prediction error is the difference between the predicted and experienced outcomes. The shape (0 < α < 2) and loss aversion (0 < λ < 10) parameters control the shape of the utility (power) function and the effect of losses relative to gains, respectively. Values of α greater than 1 indicate that the utility of an outcome is convex, and values less than 1 indicate that the utility is concave. Values of λ greater than or less than 1 indicate greater or reduced sensitivity, respectively, to losses relative to gains. The consistency parameter (0 < c < 1) is an inverse-temperature parameter (refer to The Delay-Discounting Task for details).
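A sketch of the PVL-delta utility and updating steps described above (illustrative only; outcome scaling and other details in the package implementation may differ):

```python
def pvl_utility(outcome, alpha, lam):
    """Prospect-theory utility: u = x**alpha for gains,
    u = -lam * |x|**alpha for losses."""
    if outcome >= 0:
        return outcome ** alpha
    return -lam * (abs(outcome) ** alpha)

def pvl_delta_update(ev, outcome, a, alpha, lam):
    """Rescorla-Wagner update of the chosen deck's expected value,
    with learning rate a applied to the utility prediction error."""
    u = pvl_utility(outcome, alpha, lam)
    pe = u - ev                 # prediction error on the utility scale
    return ev + a * pe
```

For example, with a = 0.5, alpha = 0.5, and lam = 1.5, a $100 win has utility 10 and moves an expected value of 0 halfway toward it, to 5.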

The PVL model with decay rule (PVL-decay) uses the same shape, loss aversion, and consistency parameters as the PVL-delta, but a recency parameter (0 < A < 1) is used for value updating. The recency parameter indicates how much the expected values of all decks are discounted on each trial.

The PVL-delta model is nested within the Value-Plus-Perseverance (VPP) model, which is a hybrid model of PVL-delta and a heuristic strategy of perseverance. The perseverance decay rate (0 < k < 1) decays the perseverance strengths of all choices on each trial, akin to how PVL-decay’s recency parameter affects the expected value. The parameters for the impacts of gain (−∞ < ϵp < ∞) and loss (−∞ < ϵn < ∞) on perseverance reflect how the perseverance value changes after wins and losses, respectively; positive values reflect a tendency to make the same choice, and negative values a tendency to switch choices. The reinforcement-learning weight (0 < ω < 1) is a mixing parameter that controls how much decision weight is given to the reinforcement-learning versus the perseverance term. High versus low values reflect more versus less reliance on the reinforcement-learning term, respectively.
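The VPP updating scheme can be sketched as follows (one possible reading of the decay step; the exact hBayesDM implementation may differ):

```python
def vpp_step(ev, pers, outcome, a, alpha, lam, k, eps_p, eps_n, omega):
    """One VPP update for the chosen option (sketch).
    The reinforcement-learning term follows PVL-delta; the perseverance
    term decays (here modeled as multiplying by 1 - k) and then shifts
    by eps_p after a win or eps_n after a loss; the choice value mixes
    the two terms with weight omega on the RL term."""
    u = (outcome ** alpha) if outcome >= 0 else -lam * (abs(outcome) ** alpha)
    ev = ev + a * (u - ev)                  # reinforcement-learning update
    pers = pers * (1.0 - k)                 # perseverance decay
    pers += eps_p if outcome >= 0 else eps_n
    value = omega * ev + (1.0 - omega) * pers
    return ev, pers, value
```

With omega near 1 the model behaves like PVL-delta; with omega near 0 choices are driven almost entirely by the perseverance heuristic.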

The Orthogonalized Go/No-Go Task

Animals use Pavlovian and instrumental controllers when taking action. The Pavlovian controller selects approaching/engaging actions with predictors of appetitive outcomes or avoiding/inhibiting actions with predictors of aversive outcomes. The instrumental controller, on the other hand, selects actions on the basis of the action–outcome contingencies of the environment. The two controllers typically cooperate, but sometimes they compete with each other (e.g., Dayan, Niv, Seymour, & Daw, 2006). The orthogonalized go/no-go (GNG) task (Guitart-Masip et al., 2012) is designed to examine the interaction between the two controllers by orthogonalizing the action requirement (go vs. no go) versus the valence of the outcome (winning vs. avoiding losing money).

Each trial of the orthogonalized GNG task has three events in the following sequence: cue presentation, target detection, and outcome presentation. First, one of four cues is presented (“Go to win,” “Go to avoid (losing),” “NoGo to win,” or “NoGo to avoid”). After some delay, a target (“circle”) is presented on the screen, and subjects need to respond with either a go (press a button) or a no go (withhold the button press). Then subjects receive a probabilistic (e.g., 80%) outcome. See Guitart-Masip et al. (2012) for more details of the experimental design.

gng_m1 (M1 in Guitart-Masip et al., 2012)

Effective size of a reinforcement (0 < ρ < ∞)

gng_m2 (M2 in Guitart-Masip et al., 2012)

Effective size of a reinforcement (0 < ρ < ∞)

gng_m3 (M3 in Guitart-Masip et al., 2012)

Effective size of a reinforcement (0 < ρ < ∞)

gng_m4 (M5 in Cavanagh et al., 2013)

Effective size of reward reinforcement (0 < ρrew < ∞)

Effective size of punishment reinforcement (0 < ρpun < ∞)

GNG: Parameter descriptions

All models for the GNG task include a lapse rate parameter (0 < ξ < 1), a learning rate parameter (0 < ϵ < 1; refer to IGT: Parameter descriptions for details), and a parameter for the effective size of reinforcement (0 < ρ < ∞). The lapse rate parameter captures the proportion of random choices made, regardless of the strength of their action probabilities. The ρ parameter determines the effective size of a reinforcement. The gng_m4 model has separate effective size parameters for reward (0 < ρrew < ∞) and punishment (0 < ρpun < ∞), allowing rewards and punishments to be evaluated differently.

Three GNG models ( gng_m2 , gng_m3 , and gng_m4 ) include a go bias parameter (−∞ < b < ∞). Go bias reflects a tendency to respond (go), regardless of the action–outcome associations; high or low values for b reflect a high or a low tendency to make a go (motor) response, respectively.

Two GNG models ( gng_m3 and gng_m4 ) include a Pavlovian bias parameter (−∞ < π < ∞). Pavlovian bias reflects a tendency to make responses that are Pavlovian congruent: that is, to promote go if the expected value of the stimulus is positive (appetitive) or to inhibit go if it is negative (aversive).
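Putting these parameters together, the choice rule of the fuller GNG models can be sketched as below. This is our own reading of Guitart-Masip et al. (2012) in illustrative Python, not the hBayesDM code; `sv` denotes the Pavlovian stimulus value and all names are ours:

```python
import math

# Illustrative sketch of a gng_m3-style choice rule and value update
# (our own reading of Guitart-Masip et al., 2012, not hBayesDM's code).

def gng_go_prob(q_go, q_nogo, sv, b, pi, xi):
    """The action weight for go adds a go bias b and a Pavlovian bias pi
    scaled by the stimulus value sv; xi (0 < xi < 1) is the lapse rate."""
    w_go = q_go + b + pi * sv
    w_nogo = q_nogo
    p_go = math.exp(w_go) / (math.exp(w_go) + math.exp(w_nogo))
    return (1 - xi) * p_go + xi / 2   # mix in random lapses

def gng_q_update(q, feedback, eps, rho):
    """Rescorla-Wagner update of the instrumental value, with the
    feedback scaled by the effective reinforcement size rho."""
    return q + eps * (rho * feedback - q)
```

With b = pi = 0 and equal instrumental values the model is indifferent (p(go) = 0.5); a positive go bias or a positive Pavlovian value pushes the probability toward go.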

Probabilistic Reversal-Learning Task

Environments often have higher-order structures, such as interdependencies between the stimuli, actions, and outcomes. In such environments, subjects need to infer and make use of the structures in order to make optimal decisions. In the probabilistic reversal-learning (PRL) task, higher-order structure exists such that the reward distributions of two stimuli are anticorrelated (e.g., if one option has a reward rate of 80%, the other option has a reward rate of [100 – 80]%, which is 20%). Subjects need to learn the higher-order structure and take it into account to optimize their decision-making and to maximize earnings.

In a typical PRL task, two stimuli are presented to a subject. The choice of a “correct” or good stimulus will usually lead to a monetary gain (e.g., 70%), whereas the choice of an “incorrect” or bad stimulus will usually lead to a monetary loss. The reward contingencies will reverse at fixed points (e.g., Murphy, Michael, Robbins, & Sahakian, 2003) or will be triggered by consecutive correct choices (Cools, Clark, Owen, & Robbins, 2002; Hampton et al., 2006).

The PRL task has been widely used to study reversal learning in healthy individuals (Cools et al., 2002; den Ouden et al., 2013; Gläscher et al., 2009). The PRL has also been used to study decision-making deficits associated with prefrontal cortex lesions (e.g., Fellows & Farah, 2003; Rolls, Hornak, Wade, & McGrath, 1994), as well as with Parkinson’s disease (e.g., Cools, Lewis, Clark, Barker, & Robbins, 2007; Swainson et al., 2000), schizophrenia (e.g., Waltz & Gold, 2007), and cocaine dependence (Ersche, Roiser, Robbins, & Sahakian, 2008). The hBayesDM package currently contains three models for PRL tasks:

Inverse temperature (0 < β < 1)

prl_fictitious (Gläscher et al., 2009)

Inverse temperature (0 < β < 1)

Inverse temperature (0 < β < 1)

PRL: Parameter descriptions

All PRL models above contain learning rate parameters (refer to IGT: Parameter descriptions for details). The prl_rp model has separate learning rates for rewards (0 < Arew < 1) and punishments (0 < Apun < 1). In the prl_ewa model (Camerer & Ho, 1999), low and high values of φ reflect more weight on recent and on past outcomes, respectively. All PRL models also contain an inverse-temperature parameter (refer to DDT: Parameter descriptions for details).

The prl_ewa model proposed in den Ouden et al. (2013) contains a decay rate parameter (0 < ρ < 1). The experienced weight of the chosen option is decayed in proportion to ρ, and 1 is added to the weight on each trial. Thus, a higher value of ρ indicates slower decaying or updating of the experienced weight.
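The experience-weight update described above can be sketched as follows. This is an illustrative Python rendering in our own notation (`n` is the experience weight, `ev` the expected value, `phi` the outcome-weighting parameter discussed earlier), not the hBayesDM implementation:

```python
# Illustrative sketch of the EWA experience-weight update described in
# the text (our own notation, not hBayesDM's code).

def ewa_update(n, ev, feedback, rho, phi):
    """n: experience weight of the chosen option; ev: its expected value;
    0 < rho < 1 is the decay rate; phi weights past versus recent outcomes."""
    n_new = n * rho + 1.0                       # decay the weight, then add 1
    ev_new = (phi * n * ev + feedback) / n_new  # experience-weighted value
    return n_new, ev_new
```

Because the old value is multiplied by φ·n and divided by the updated weight, a larger ρ (slower decay) makes each new outcome count for less, consistent with the interpretation above.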

The prl_fictitious model contains an indecision point parameter (0 < α < 1), which reflects a subject’s bias or preference toward an option; high or low values for α indicate a greater or a lesser preference for one option over the other.
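A fictitious (counterfactual) update exploits the anticorrelated reward structure of the PRL task by also updating the unchosen option. The sketch below is one common formulation for a two-option task with outcomes coded +1/−1 (our own illustration, not necessarily the exact hBayesDM equations):

```python
# Sketch of a fictitious (counterfactual) value update for a two-option
# anticorrelated task with outcomes coded +1/-1 (illustrative only).

def fictitious_update(ev, chosen, outcome, eta):
    """Update the chosen option with the real outcome and the unchosen
    option with the sign-flipped outcome, with learning rate eta."""
    other = 1 - chosen
    ev = list(ev)
    ev[chosen] += eta * (outcome - ev[chosen])   # real prediction error
    ev[other] += eta * (-outcome - ev[other])    # fictive prediction error
    return ev
```

After a reward for one option, the model simultaneously devalues the other, which speeds adaptation when the contingencies reverse.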

Risk Aversion Task

The risk aversion (RA; Sokol-Hessner, Camerer, & Phelps, 2013; Sokol-Hessner et al., 2009) task is a description-based task (Hertwig, Barron, Weber, & Erev, 2004) in which the possible outcomes of all options and their probabilities are provided to subjects on each trial. In the RA task, subjects choose either a sure option with a guaranteed amount or a risky option (i.e., a gamble) with possible gain and/or loss amounts. Subjects are asked to choose which option they prefer (or whether they want to accept the gamble) on each trial. Subjects perform two cognitive regulation conditions (attend and regulate) in a within-subjects design: in the attend condition, subjects are asked to focus on each choice in isolation, whereas in the regulate condition, subjects are asked to emphasize choices in their greater context (see Sokol-Hessner et al., 2009, for the details). The data published in Sokol-Hessner et al. (2009) can be found using the following paths (these paths are also available in the RA model help files):

path_to_attend_data = system.file("extdata/ra_data_attend.txt", package="hBayesDM")

path_to_regulate_data = system.file("extdata/ra_data_reappraisal.txt", package="hBayesDM")

ra_prospect (Sokol-Hessner et al., 2009)

Inverse temperature (0 < τ < ∞)

ra_noLA (no loss aversion [LA] parameter for tasks that involve only gains)

Inverse temperature (0 < τ < ∞)

ra_noRA (no risk aversion [RA] parameter; see, e.g., Tom et al., 2007)

Inverse temperature (0 < τ < ∞)

RA: Parameter descriptions

The ra_prospect model includes a loss aversion parameter (0 < λ < 5), a risk aversion parameter (0 < ρ < 2), and an inverse-temperature parameter (0 < τ < ∞; see DDT: Parameter descriptions for inverse temperature). The risk aversion and loss aversion parameters in the RA models are similar to those in the IGT models. However, in the RA models they control the valuation of the possible choices under consideration, as opposed to the evaluation of outcomes after they are experienced (Rangel et al., 2008).
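As a rough illustration, a prospect-theoretic valuation of a 50/50 gamble and a softmax choice rule can be sketched as below. This is our own reading of Sokol-Hessner et al. (2009) in Python, not the hBayesDM implementation, and the variable names are ours:

```python
import math

# Illustrative sketch of an ra_prospect-style valuation and choice rule
# (our own reading of Sokol-Hessner et al., 2009, not hBayesDM's code).

def prospect_utility(gain, loss, rho, lam):
    """Utility of a 50/50 gamble with a possible gain and a possible loss
    (loss given as a positive magnitude); rho: risk aversion (curvature),
    lam: loss aversion."""
    return 0.5 * gain ** rho - 0.5 * lam * loss ** rho

def p_gamble(u_gamble, u_sure, tau):
    """Logistic (softmax) probability of accepting the gamble, with
    inverse temperature tau."""
    return 1.0 / (1.0 + math.exp(-tau * (u_gamble - u_sure)))
```

Setting λ = 1 recovers the ra_noLA model and ρ = 1 the ra_noRA model, matching the nesting described below.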

The ra_noLA and ra_noRA models are nested within the ra_prospect model, with either loss aversion ( ra_noLA ) or risk aversion ( ra_noRA ) set to 1.

Two-Armed Bandit Task

Multi-armed bandit tasks or problems typically refer to situations in which gamblers decide which gamble or slot machine to play in order to maximize long-term gain. Many reinforcement-learning tasks and experience-based (Hertwig et al., 2004) tasks can be classified as bandit problems. In a typical two-armed bandit task, subjects are presented with two options (stimuli) on each trial. Feedback is given after a stimulus is chosen. Subjects are asked to maximize positive feedback as they make choices, and they are expected to learn stimulus–outcome contingencies from trial-by-trial experience. The hBayesDM package currently contains a simple model for a two-armed bandit task:

bandit2arm_delta (Hertwig et al., 2004)

Inverse temperature (0 < τ < 1)

Two-armed bandit: Parameter descriptions

The bandit2arm_delta model uses the Rescorla–Wagner rule (see IGT: Parameter descriptions) for updating the expected value of the chosen option, along with the softmax choice rule with an inverse temperature (see DDT: Parameter descriptions).
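The two pieces of bandit2arm_delta can be sketched compactly. This is an illustrative Python sketch with our own names, not the hBayesDM implementation:

```python
import math

# Minimal sketch of the Rescorla-Wagner (delta) rule and softmax choice
# rule used in bandit2arm_delta (illustrative, not hBayesDM's code).

def delta_update(ev, chosen, reward, alpha):
    """Move the chosen option's expected value toward the obtained reward
    by a fraction alpha (the learning rate)."""
    ev = list(ev)
    ev[chosen] += alpha * (reward - ev[chosen])
    return ev

def softmax_p(ev, tau):
    """Choice probabilities over options, with inverse temperature tau."""
    exps = [math.exp(tau * v) for v in ev]
    total = sum(exps)
    return [e / total for e in exps]
```

A higher τ makes choices more deterministic with respect to the learned values, while τ near 0 yields near-random choice.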

The Ultimatum Game (Norm-Training)

The abilities to understand the social norms of an environment and to adaptively cope with those norms are critical for normal social functioning (Gu et al., 2015; Montague & Lohrenz, 2007). The ultimatum game (UG) is a widely used social decision-making task that examines how individuals respond to deviations from social norms and adapt to norms in a changing environment.

The UG involves two players: a proposer and a responder. On each trial, the proposer is given some amount of money to divide between the two players. After deciding how to divide the money, an offer is made to the responder. The responder can either accept the offer (and the money is split as offered) or reject it (both players receive nothing). Previous studies have shown that the most common offer is approximately 50% of the total amount, and that “unfair” offers (<∼20% of the total amount) are often rejected, even though it is optimal to accept any offer (Güth, Schmittberger, & Schwarze, 1982; Sanfey, 2003; Thaler, 1988). A recent study examined the computational substrates of norm adjustment by using a norm-training UG in which subjects played the role of responder in a norm-changing environment (Xiang et al., 2013).

The UG has been used to investigate the social decision-making of individuals with ventromedial prefrontal (Gu et al., 2015; Koenigs et al., 2007) and insular cortex (Gu et al., 2015) lesions, as well as of patients with schizophrenia (Agay, Kron, Carmel, Mendlovic, & Levkovitz, 2008; Csukly, Polgár, Tombor, Réthelyi, & Kéri, 2011). The hBayesDM package currently contains two models for the UG (or norm-training UG) in which subjects play the role of responder:

Inverse temperature (0 < τ < 10)

Inverse temperature (0 < τ < 10)

Norm adaptation rate (0 < ϵ < 1)

UG: Parameter descriptions

The ug_bayes model assumes that the subject (responder) behaves like a Bayesian ideal observer (Knill & Pouget, 2004), so that the expected offer made by the proposer is updated in a Bayesian fashion. This is in contrast to the ug_delta model, which assumes that the subject (again the responder) updates the expected offer using a Rescorla–Wagner (delta) updating rule. Both the ug_bayes and ug_delta models contain envy (0 < α < 20) and inverse-temperature (0 < τ < 10; refer to DDT: Parameter descriptions for details) parameters. The envy parameter reflects sensitivity to norm prediction error (see below for the ug_bayes model), where higher or lower values indicate greater or lesser sensitivity, respectively. In the UG, the prediction error reflects the difference between the expected and received offers.

In the ug_bayes model, the utility of an offer is adjusted by two norm prediction errors: (1) negative prediction errors, multiplied by an envy parameter (0 < α < 20), and (2) positive prediction errors, multiplied by a guilt parameter (0 < β < 10). Higher and lower values for envy (α) and guilt (β) reflect greater and lesser sensitivity to negative and positive norm prediction errors, respectively. The ug_delta model includes only the envy parameter (Gu et al., 2015).
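The norm-based utility and a delta-rule norm update can be sketched as follows. This is our own summary of the ideas above in illustrative Python, not the exact hBayesDM equations (in ug_delta, β would be fixed at 0, and ug_bayes updates the norm in a Bayesian rather than delta fashion):

```python
# Sketch of a norm-based offer utility with envy and guilt terms, plus a
# delta-rule norm update (illustrative, not hBayesDM's exact equations).

def ug_utility(offer, norm, alpha, beta):
    """Offer utility penalized by envy (alpha) for offers below the
    expected norm and by guilt (beta) for offers above it."""
    return (offer
            - alpha * max(norm - offer, 0.0)   # negative norm prediction error
            - beta * max(offer - norm, 0.0))   # positive norm prediction error

def ug_delta_norm_update(norm, offer, eps):
    """Rescorla-Wagner style update of the expected offer (the norm),
    with norm adaptation rate eps (0 < eps < 1)."""
    return norm + eps * (offer - norm)
```

A responder with high envy assigns low (even negative) utility to offers far below the current norm, making rejection more likely under a softmax choice rule.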


Limitations

Psychopy_ext debuted publicly in November 2013 and thus has not yet been widely adopted and extensively tested by the community. It is therefore difficult to predict the learning curve of the underlying psychopy_ext philosophy and the extent to which it resonates with the needs of the community. For example, many researchers are used to linear experimental and analysis scripts, whereas psychopy_ext relies on object-oriented programming concepts such as classes and modular functions in order to provide inheritance and flexibility. However, the object-oriented approach also means that whenever the necessary functions are not available directly from psychopy_ext, or do not meet a user's needs, they can be overridden or used directly from the packages that are extended, often (but not always) without affecting the rest of the workflow.

Furthermore, psychopy_ext was designed to improve the workflow of a typical PsychoPy user. Researchers who use other stimulus-generation packages, or even different programming languages (such as R for data analysis), will not be able to benefit from psychopy_ext as easily. This limitation is partly a design choice, made to provide workflows that depend on as few tools as possible. Python has a large number of powerful packages, and psychopy_ext is committed to promoting them in favor of equivalent solutions in other languages. Nonetheless, when a Python alternative does not exist, users can easily interact with their R (via rpy2), C/C++ (via Python's own ctypes), MATLAB (via pymatlab or mlab), and a number of other kinds of scripts.


FMRI in Healthy Aging

From the behavioral point of view, it is known that some adults are able to maintain their cognitive capabilities at high levels, in contrast with others who show clear cognitive decline with advancing age. It has been hypothesized that this variability depends on neurofunctional resources. However, the exact mechanisms that lead to such wide differences are still unclear (Park and Reuter-Lorenz, 2009).

The use of task-fMRI in aging has revealed a complex pattern of brain activity changes, characterized by both decreases and increases in old compared with young subjects (Grady, 2012). The diversity of findings depends in part on many variables, such as the cognitive tests used and their level of difficulty (Grady et al., 2006). Nonetheless, there is relative consensus that there is an age-related increase of brain activity in the prefrontal cortex (PFC; Turner and Spreng, 2012), whereas findings of reduced activation are localized more heterogeneously across the brain.

In this section, we review the main theories that have been proposed to explain the trajectories of brain changes and their relationship with cognition. It is important to note that whereas earlier, “more classical” views aimed to provide meaningful interpretations of isolated phenomena, such as increased or decreased regional brain activity in old compared with young subjects, more recent theories aim to provide a global, integrative interpretation of brain changes.

Classical Theories Derived from Task-fMRI Studies

In general, regional hyperactivation has been interpreted as compensation (or an attempt to compensate), whereas a failure to activate or reduced activation has been typically related with cognitive deficits associated with aging. Two main hypotheses were proposed to explain the nature of these age-related activity changes: the dedifferentiation hypothesis and the compensation hypothesis.

On the one hand, the term dedifferentiation describes the loss of functional specificity in the brain regions that are engaged during the performance of a task (Park et al., 2004; Rajah and D'Esposito, 2005). In neurobiological terms, it has been suggested that this pattern of changes is caused by a chain of processes that starts with a decline in dopaminergic neuromodulation, which increases neural noise and leads to less distinctive cortical representations (Li et al., 2001).

On the other hand, the compensation hypothesis in aging states that older adults are able to recruit higher levels of activity than young subjects in some brain areas to compensate for functional deficits located elsewhere in the brain. This increased activity is often seen in frontal regions (Park and Reuter-Lorenz, 2009; Turner and Spreng, 2012). The first studies suggesting compensatory mechanisms appeared early in the literature and used PET during the performance of visuospatial (Grady et al., 1994) or episodic memory (Cabeza et al., 1997; Madden et al., 1999) tasks. Later on, these findings were replicated with fMRI (Cabeza et al., 2002).

Furthermore, the different patterns of spatial localization of these compensation-related mechanisms led to the formulation of three main cognitive models:

(1) The Hemispheric Asymmetry Reduction in Older Adults (HAROLD) model (Cabeza, 2002) states that older adults use a less lateralized pattern of activity than young subjects during the performance of a task, which is interpreted as compensatory. This reduced lateralization has mainly been observed in frontal areas during the performance of episodic memory and working memory tasks (Cabeza et al., 2002; Cabeza, 2004).

(2) The Compensation-Related Utilization of Neural Circuits Hypothesis (CRUNCH; Reuter-Lorenz and Cappell, 2008; Schneider-Garces et al., 2010) holds that older adults show higher neural recruitment at levels of task demand that typically elicit lower brain activity in younger subjects. This effect has been observed in the PFC and also in the parietal cortex, specifically in the precuneus and posterior cingulate, in both episodic memory tasks (Spaniol and Grady, 2012) and working memory tasks (Mattay et al., 2006; Reuter-Lorenz and Cappell, 2008).

(3) The Posterior-Anterior Shift with Aging (PASA) was experimentally demonstrated by Davis et al. (2008), who used two different tasks, visuoperceptive and episodic retrieval, and found that older subjects showed reduced activation of regions in the posterior midline cortex accompanied by increased activity in the medial frontal cortex.

Global, Integrative Theories of Cognitive Function and the Aging Brain

With only the information provided by fMRI activity, and with the classification described above, which presents the models as mutually exclusive, it is difficult to discern which of the proposed models best explains the age-related changes in cognition.

More recently, an important contribution to the interpretation of these models has come from multimodal studies that integrate structural and functional brain measures. For example, some studies have reported that reduced activity in task-related regions correlated positively with brain atrophy in the same regions (Brassen et al., 2009; Rajah et al., 2011), whereas other studies have reported correlations between increased functional activity in the PFC and preserved structural integrity of the entorhinal cortex and other medial temporal lobe (MTL) structures (Rosen et al., 2005; Braskie et al., 2009). Given this, some authors have theorized that while increased activity in the PFC may be triggered by the atrophy of frontal GM, a commonly reported feature in aging, the compensatory role of this increased activity may depend on the preserved structural integrity of distal regions, mainly in the MTL (Maillet and Rajah, 2013).

Therefore, and mainly thanks to new advances in neuroimaging techniques, it has been suggested that cognitive function in aging results from a sum of processes, including structural and functional brain measures as well as external factors. In this regard, the scaffolding theory of aging and cognition (STAC) states that there is a process in the aging brain, called compensatory scaffolding, that entails the engagement of additional neural resources (in terms of network reorganization), providing support to preserve cognitive function in the face of structural and functional decline (Park and Reuter-Lorenz, 2009). This theory has recently been revised to incorporate newer findings in the field, obtained mainly from longitudinal and interventional studies. As a result, STAC-r is a conceptual model that extends STAC by incorporating life-course influences that enhance, preserve, or compromise brain status, compensatory potential, and cognitive function over time (Reuter-Lorenz and Park, 2014).

In a similar sense, Walhovd et al. (2014) proposed a system-vulnerability view of cognition in aging. According to this view, age-associated cognitive decline would be the result of a life-long accumulation of impacts that alter brain function and structure in a multidimensional way, affecting a wide range of neuroimaging markers such as structural integrity, functional activity and connectivity, glucose metabolism, and amyloid deposition. On this account, particular brain systems such as the hippocampus and posteromedial regions would be especially vulnerable to aging effects, owing to their central role in subtending lifetime brain plasticity (Fjell et al., 2014).

Finally, a complementary hypothesis, which also emerged from the results of longitudinal studies, is “brain maintenance,” which states that a lack of change in structural and functional brain markers allows some people to show little or no age-related cognitive decline. The conceptual idea of brain maintenance was motivated by the fact that increased functional activity in healthy aging does not necessarily imply up-regulation of functional networks over time. Therefore, according to the maintenance view, the best predictor of successful performance in aging would be the minimization of chemical, structural, and functional changes over time (Nyberg et al., 2012).



By the end of this section, you will be able to:

  • Define observational learning
  • Discuss the steps in the modeling process
  • Explain the prosocial and antisocial effects of observational learning

Previous sections of this chapter focused on classical and operant conditioning, which are forms of associative learning. In observational learning , we learn by watching others and then imitating, or modeling, what they do or say. The individuals performing the imitated behavior are called models . Research suggests that this imitative learning involves a specific type of neuron, called a mirror neuron (Hickok, 2010; Rizzolatti, Fadiga, Fogassi, & Gallese, 2002; Rizzolatti, Fogassi, & Gallese, 2006).

Humans and other animals are capable of observational learning. As you will see, the phrase “monkey see, monkey do” really is accurate ([link]). The same could be said about other animals. For example, in a study of social learning in chimpanzees, researchers gave juice boxes with straws to two groups of captive chimpanzees. The first group dipped the straw into the juice box, and then sucked on the small amount of juice at the end of the straw. The second group sucked through the straw directly, getting much more juice. When the first group, the “dippers,” observed the second group, “the suckers,” what do you think happened? All of the “dippers” in the first group switched to sucking through the straws directly. By simply observing the other chimps and modeling their behavior, they learned that this was a more efficient method of getting juice (Yamamoto, Humle, and Tanaka, 2013).

This spider monkey learned to drink water from a plastic bottle by seeing the behavior modeled by a human. (credit: U.S. Air Force, Senior Airman Kasey Close)

Imitation is much more obvious in humans, but is imitation really the sincerest form of flattery? Consider Claire’s experience with observational learning. Claire’s nine-year-old son, Jay, was getting into trouble at school and was defiant at home. Claire feared that Jay would end up like her brothers, two of whom were in prison. One day, after yet another bad day at school and another negative note from the teacher, Claire, at her wit’s end, beat her son with a belt to get him to behave. Later that night, as she put her children to bed, Claire witnessed her four-year-old daughter, Anna, take a belt to her teddy bear and whip it. Claire was horrified, realizing that Anna was imitating her mother. It was then that Claire knew she wanted to discipline her children in a different manner.

Like Tolman, whose experiments with rats suggested a cognitive component to learning, psychologist Albert Bandura’s ideas about learning were different from those of strict behaviorists. Bandura and other researchers proposed a brand of behaviorism called social learning theory, which took cognitive processes into account. According to Bandura , pure behaviorism could not explain why learning can take place in the absence of external reinforcement. He felt that internal mental states must also have a role in learning and that observational learning involves much more than imitation. In imitation, a person simply copies what the model does. Observational learning is much more complex. According to Lefrançois (2012) there are several ways that observational learning can occur:

  • You learn a new response. After watching your coworker get chewed out by your boss for coming in late, you start leaving home 10 minutes earlier so that you won’t be late.
  • You choose whether or not to imitate the model, depending on what you saw happen to the model. Remember Julian and his father? When learning to surf, Julian might watch how his father pops up successfully on his surfboard and then attempt to do the same thing. On the other hand, Julian might learn not to touch a hot stove after watching his father get burned on a stove.
  • You learn a general rule that you can apply to other situations.

Bandura identified three kinds of models: live, verbal, and symbolic. A live model demonstrates a behavior in person, as when Ben stood up on his surfboard so that Julian could see how he did it. A verbal instructional model does not perform the behavior, but instead explains or describes the behavior, as when a soccer coach tells his young players to kick the ball with the side of the foot, not with the toe. A symbolic model can be fictional characters or real people who demonstrate behaviors in books, movies, television shows, video games, or Internet sources ([link]).

(a) Yoga students learn by observation as their yoga instructor demonstrates the correct stance and movement for her students (live model). (b) Models don’t have to be present for learning to occur: through symbolic modeling, this child can learn a behavior by watching someone demonstrate it on television. (credit a: modification of work by Tony Cecala credit b: modification of work by Andrew Hyde)

Link to Learning

Latent learning and modeling are used all the time in the world of marketing and advertising. In one commercial that played for months across the New York, New Jersey, and Connecticut areas, Derek Jeter, an award-winning baseball player for the New York Yankees, advertises a Ford. The commercial aired in a part of the country where Jeter is an incredibly well-known athlete. He is wealthy and considered very loyal and good looking. What message are the advertisers sending by having him featured in the ad? How effective do you think it is?


Results

Applicability of feedback to self-views

Applicability ratings were affected by a valence by group interaction [χ²(4) = 106.19, p < 0.001]; see online Supplementary Tables S4 and S5 for model comparisons and parameters. Consistent with our hypothesis, BPD patients rated the intermediate (b = −0.40, s.e. = 0.16, t = −2.50) and especially the negative feedback (b = −0.53, s.e. = 0.16, t = −3.36) as more applicable compared with HC, see Fig. 1a. Positive feedback was rated as less applicable by BPD compared with HC (b = 1.07, s.e. = 0.16, t = 6.74). Compared with LSE, BPD also rated negative feedback as more applicable (b = −0.43, s.e. = 0.17, t = −2.43) and positive feedback as less applicable (b = 0.63, s.e. = 0.18, t = 3.61), but did not differ in the applicability of intermediate feedback (b = −0.15, s.e. = 0.18, t = −0.83). Moreover, using the valence ratings (i.e. degree of negativity or positivity), we found that all three groups rated the valence of the words in a similar way [χ²(2) = 2.4, p = 0.307], with negative and positive words being more emotional than intermediate words, see online Supplementary Tables S2 and S3. However, there was a trend for an interaction effect between valence and group [χ²(4) = 8.42, p = 0.077], which could indicate that negative feedback was rated slightly less negatively by BPD than by HC (b = −0.43, s.e. = 0.16, t = −2.69); see also online Supplementary Table S3 for model parameters.

Fig. 1. (a) Mean applicability ratings by group after negative, intermediate and positive feedback (error bars indicate 95% confidence intervals). (b) Illustration of mood ratings by group after negative, intermediate and positive feedback at the mean level of applicability of feedback. (c) Illustration of mean mood ratings by group after negative, intermediate and positive feedback for not to very applicable feedback. Applicability has a greater impact on mood during negative and intermediate feedback than positive feedback. Applicability has a greater impact on the mood of HC compared with BPD. Mood rating is rescaled to scores 1–4 for display purposes.

Affective responses

Mood was affected by group [χ²(2) = 11.4, p = 0.003], with BPD reporting a worse mood than HC overall (b = 0.81, s.e. = 0.19, t = 4.28), see Table 2 and online Supplementary Table S6. Valence moderated the group effect [χ²(4) = 39.89, p < 0.001]. BPD reported a worse mood after negative (b = −0.14, s.e. = 0.15, t = −0.95) and intermediate feedback (b = −0.81, s.e. = 0.19, t = 4.28) and a similar mood after positive feedback (b = −0.49, s.e. = 0.13, t = −3.70) compared with HC, see Fig. 1b. Compared with LSE, BPD reported an equal mood after intermediate (b = 0.19, s.e. = 0.21, t = 0.91) and positive feedback (b = 0.11, s.e. = 0.15, t = 0.75) but a better mood after negative feedback (b = −0.50, s.e. = 0.16, t = −3.10).

Table 2. Effect parameters of model predicting mood ratings by valence category (intermediate = reference), group (BPD = reference), and applicability of feedback and two-way interactions

Significance level (*** < 0.001, ** < 0.01, * < 0.05, ^ < 0.10) based on χ² tests of model comparisons; see online Supplementary Table S6.

Applicability moderated the group effect as well [χ²(4) = 14.8, p = 0.005]. BPD mood ratings were less affected by applicability compared with HC (b = 0.07, s.e. = 0.03, t = 2.27), but did not differ in this respect from LSE (b = 0.01, s.e. = 0.03, t = 0.23), see Fig. 1c. There was no three-way interaction of valence by applicability by group [χ²(4) = 8.0, p = 0.090].

Neural responses

Groups differed in the neural correlates of feedback valence; see Table 3 for clusters and peak voxels. In response to negative feedback compared with positive feedback, HC showed stronger left precuneus activation, whereas BPD showed relatively low and equal precuneus activation for negative and positive feedback, see Fig. 2. In this precuneus cluster, LSE showed relatively high and equal activation for negative and positive feedback, albeit not significantly different from BPD, see Fig. 2. In response to positive compared with negative feedback, HC showed stronger right anterior TPJ activation, whereas BPD showed the reverse pattern, with stronger TPJ activation for negative than for positive feedback. Compared with LSE, BPD showed stronger left precuneus activation during negative compared with positive feedback, see Table 3 and Fig. 2. However, this cluster in the left precuneus did not overlap with the cluster found in the comparison with HC. Groups did not differ in the neural correlates of applicability. The three-way interaction of applicability by negative valence for BPD compared with HC in the motor cortex, superior parietal lobule, and inferior parietal lobule is probably attributable to button-press movements (Mars et al., 2011).

Fig. 2. Left: Clusters of neural activation indicating HC > BPD (blue) and BPD > LSE (orange). Right: Mean contrast values for the HC > BPD clusters (blue clusters) by group and contrast.

Table 3. Selected neural correlates for group comparisons on contrasts of valence and applicability of feedback^a, cluster corrected Z = 2.3, cluster p < 0.05

^a Contrasts without any above-threshold clusters are not reported in this table.

Exploratory findings

For exploratory purposes, we checked whether LSE differed in self-views from HC by rerunning the model with applicability ratings as the outcome, but with HC rather than BPD as the reference group. Despite their lower self-esteem, LSE did not report that negative feedback was more applicable to them (b = 0.11, s.e. = 0.17, t = 0.65), nor that intermediate feedback was (b = 0.26, s.e. = 0.17, t = 1.52). However, they did report that positive feedback was less applicable to them (b = −0.44, s.e. = 0.17, t = −2.64).

Confounds

To control for potential effects of whether the participant believed the SF paradigm (yes/no), medication status (on/off), and current comorbid depression, we included these variables as covariates in additional affective and neural analyses. These confounds had no effect on the affective results.

Handedness was also taken into account in the neural analyses. The stronger precuneus activation in HC compared with BPD after negative versus positive feedback did not survive the significance threshold once current depression or handedness was taken into account.


Introduction

The advent of fMRI revolutionized psychology as it allowed, for the first time, the noninvasive mapping of human cognition. Despite this progress, traditional fMRI analyses are limited in that they can, for the most part, only ascertain the involvement of an area in a task but not its precise role in that task. Recently, model-based fMRI methods have been developed to overcome this limitation by using computational models of behavior to shed light on latent variables of the models (such as prediction errors) and their mapping to neural structures. This approach has led to important insights into the algorithms employed by the brain and has been particularly successful in understanding the neural basis of reinforcement learning (e.g. [1]).

In a typical model-based fMRI analysis, one first specifies a model that describes the hypothesized cognitive processes underlying the behavior in question. Typically, these models have one or more free parameters (e.g. learning rate in a model of trial-and-error learning). These parameters must be set to fully specify the model, which is commonly done by fitting them to the observed behavior [14]. For instance, given the model, one can find subject-specific learning rates that best explain the subject’s behavioral choices. The fully specified model is then used to generate trial-by-trial measures of latent variables in the model (e.g. action values and prediction errors) that can be regressed against neural data in order to find brain areas whose activity correlates with these variables.
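As a concrete illustration, the latent-variable-generation step can be sketched as follows. This is a minimal Rescorla-Wagner/Q-learning example with synthetic choice and reward data; the function name, the learning-rate value, and the data are illustrative, not taken from any specific study.

```python
import numpy as np

def rescorla_wagner_regressors(choices, rewards, n_actions=2, alpha=0.3):
    """Generate trial-by-trial latent variables (chosen value, prediction
    error) from a simple Rescorla-Wagner / Q-learning model.

    alpha (the learning rate) is a free parameter; in practice it would
    first be fitted to each subject's observed choices.
    """
    q = np.zeros(n_actions)                # action values, initialized to 0
    values, prediction_errors = [], []
    for a, r in zip(choices, rewards):
        values.append(q[a])                # value of chosen action (pre-update)
        delta = r - q[a]                   # reward prediction error
        prediction_errors.append(delta)
        q[a] += alpha * delta              # incremental value update
    return np.array(values), np.array(prediction_errors)

# Hypothetical behavioral data: 100 binary choices and binary rewards.
rng = np.random.default_rng(0)
choices = rng.integers(0, 2, size=100)
rewards = rng.integers(0, 2, size=100).astype(float)

v, pe = rescorla_wagner_regressors(choices, rewards, alpha=0.3)
# In a full analysis, v and pe would next be convolved with a hemodynamic
# response function and entered as parametric regressors in the GLM.
```

The two output series play the role of the model-derived regressors described above; everything downstream (HRF convolution, GLM estimation) follows the standard univariate pipeline.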

One potential weakness of this approach is the requirement for model fitting. In many cases, the data are insufficient to precisely identify the parameter values. This can be due to a limited number of trials, interactions between parameters that make them hard to disentangle [14], or a lack of behavior that can be used for the fitting process (e.g., in some Pavlovian conditioning experiments). Thus, a key question is: How important is the model fitting step? In other words, to what extent is model-based fMRI sensitive to errors in parameter estimation? The answer to this question will determine how hard we should work to obtain the best possible parameter fits, and will affect not only how we analyze data, but also how we design experiments in the first place.

Here we show how this question can be addressed, by analyzing the sensitivity of model-based fMRI to the learning rate parameter in simple reinforcement learning tasks. We provide analytical bounds on the sensitivity of the model-based analysis to errors in estimating the learning rate, and show through simulation how value and prediction error signals generated with one learning rate would be interpreted by a model-based analysis that used the wrong learning rate. Amazingly, we find that the results of model-based fMRI are remarkably robust to settings of the learning rate to the extent that, in some situations, setting the parameters of the model as far as possible from their true value barely affects the results. This theoretical prediction of robustness is borne out by analysis of fMRI data from two recent experiments.
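The intuition behind this robustness can be checked with a quick simulation (this is an illustrative sketch, not the paper's analytical bounds): generate prediction-error series for the same reward sequence under a "true" and a badly misestimated learning rate, and compare them. The task structure and parameter values below are assumptions for demonstration.

```python
import numpy as np

def prediction_errors(rewards, alpha):
    """Prediction errors for a single-option (Pavlovian-style) task in
    which one value estimate is updated on every trial."""
    v, pes = 0.0, []
    for r in rewards:
        delta = r - v          # reward prediction error
        pes.append(delta)
        v += alpha * delta     # value update with learning rate alpha
    return np.array(pes)

# Synthetic reward sequence: 500 trials, 70% reward probability.
rng = np.random.default_rng(1)
rewards = (rng.random(500) < 0.7).astype(float)

pe_true = prediction_errors(rewards, alpha=0.1)   # "true" learning rate
pe_wrong = prediction_errors(rewards, alpha=0.5)  # misestimated rate

r = np.corrcoef(pe_true, pe_wrong)[0, 1]
print(f"correlation between PE regressors: {r:.2f}")
```

Because both series are dominated by the common reward term, the two regressors remain highly correlated despite the fivefold difference in learning rate, which is the kind of robustness (and insensitivity) discussed here.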

Our findings are both good and bad news for model-based fMRI. The good news is that it is robust, thus errors in the learning rate will not dramatically change the results of studies seeking to localize a particular signal. The bad news, however, is that model-based fMRI is insensitive to differences in parameters, which means that one should use extreme caution when attempting to determine the computational role of a neural area (e.g., when asking whether a brain area corresponds to an outcome signal or a prediction error signal). In the Discussion we consider the extent to which this result generalizes to other parameters and other models and offer suggestions to diagnose parameter sensitivity in other models.


Introduction

Experimental psychology strives to explain human behavior. This implies being able to explain underlying causal mechanisms of behavior as well as to predict future behavior (Kaplan, 1973; Shmueli, 2010; Yarkoni & Westfall, 2016). In practice, however, traditional methods in experimental psychology have mainly focused on testing causal explanations. It is only in recent years that research in psychology has come to emphasize prediction (Forster, 2002; Shmueli & Koppius, 2011). Within this predictive turn, machine learning-based predictive methods have rapidly emerged as viable means to predict future observations as accurately as possible, i.e., to minimize prediction error (Breiman, 2001b; Song, Mitnitski, Cox, & Rockwood, 2004).

Their multivariate nature and focus on prediction error (rather than “goodness of fit”) confer on these methods greater sensitivity and higher future predictive power compared to traditional methods. In experimental psychology, they are successfully used for predicting a variable of interest (e.g., experimental condition A vs. experimental condition B) from behavioral patterns of an individual engaged in a task or activity by minimizing prediction error. Current applications range from recognition of facial actions from facial micro-expressions to classification of intention from differences in movement kinematics (e.g., Ansuini et al., 2015; Cavallo, Koul, Ansuini, Capozzi, & Becchio, 2016; Haynes et al., 2007; Srinivasan, Golomb, & Martinez, 2016). For example, they have been used to decode the intention behind grasping an object (to pour vs. to drink) from subtle differences in patterns of hand movements (Cavallo et al., 2016). What is more, machine learning-based predictive models can be employed not only for group prediction (patients vs. controls), but also for individual prediction. Consequently, these models lend themselves as a potential diagnostic tool in clinical settings (Anzulewicz, Sobota, & Delafield-Butt, 2016; Hahn, Nierenberg, & Whitfield-Gabrieli, 2017; Huys, Maia, & Frank, 2016).
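The core of such a predictive analysis, cross-validated decoding of a condition label from multivariate behavioral features, can be sketched in a few lines. This is a deliberately minimal stand-in (a leave-one-out nearest-centroid decoder on synthetic "kinematic" features) for the more powerful classifiers typically used; the data, labels, and effect size are invented for illustration.

```python
import numpy as np

def loo_nearest_centroid(X, y):
    """Leave-one-out decoding accuracy with a nearest-centroid classifier.

    Each trial is held out in turn, class centroids are computed from the
    remaining trials, and the held-out trial is assigned to the nearer
    centroid; accuracy is the fraction of correct assignments.
    """
    correct = 0
    for i in range(len(y)):
        mask = np.ones(len(y), dtype=bool)
        mask[i] = False                         # hold out trial i
        c0 = X[mask & (y == 0)].mean(axis=0)    # centroid of class 0
        c1 = X[mask & (y == 1)].mean(axis=0)    # centroid of class 1
        pred = int(np.linalg.norm(X[i] - c1) < np.linalg.norm(X[i] - c0))
        correct += int(pred == y[i])
    return correct / len(y)

# Hypothetical data: 60 grasp trials x 10 kinematic features
# (e.g., grip aperture, wrist velocity); labels 0 = "to pour", 1 = "to drink".
rng = np.random.default_rng(42)
X = rng.normal(size=(60, 10))
y = np.repeat([0, 1], 30)
X[y == 1, 0] += 1.5    # make one feature weakly informative

acc = loo_nearest_centroid(X, y)
print(f"decoding accuracy: {acc:.2f}")
```

Above-chance cross-validated accuracy, rather than within-sample goodness of fit, is the criterion that distinguishes this predictive framework from traditional analyses.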

However, while the assets of predictive approaches are becoming well known, machine learning-based predictive methods still lack an established and easy-to-use software framework. Many existing implementations provide little or no guidance, consisting only of small code snippets or loose sets of packages. In addition, the use of existing packages often requires advanced programming expertise. To overcome these shortcomings, the main objective of the current paper was to build a user-friendly toolbox, “PredPsych”, endowed with multiple functionalities for multivariate analyses of quantitative behavioral data based on machine-learning models.

In the following, we present the framework of PredPsych via the analysis of a recently published multiple-subject motion capture dataset (Ansuini et al., 2015). First, we provide a brief description of the dataset and describe how to install and run PredPsych. Next, we discuss five research questions that can be addressed with the machine learning framework implemented in PredPsych. We provide guided illustrations on how to address these research questions using PredPsych, along with guidelines for the best techniques to use (for an overview, see Fig. 1) and reasons for caution. Because the assets of predictive approaches have been recently discussed elsewhere (Breiman, 2001b; Shmueli, 2010), we only briefly deal with them here.

Fig. 1. Overview of PredPsych functions: the research questions that can be addressed using PredPsych and the corresponding techniques.

