Prioritization and Interference of Emotional Information in Briefly Presented Scenes: Selection Advantage for Positive Emotional Scenes

Previous studies have repeatedly demonstrated the attentional prioritization of emotional information over neutral information. However, the parsing of interference from negative and positive stimuli has not received the same attention. In the study reported here, we examined the effect of real-world visual scenes of neutral, positive, and negative valence, and of both high and low arousal (categorized on the basis of their arousal and valence ratings), on scene gist identification. In a partial-report paradigm, participants were asked to report the gist of a post-cued scene from a briefly-presented array of four scenes. Scene gist identification performance was significantly higher for positive scenes, regardless of arousal, than for negative scenes. All emotional scenes, regardless of valence and arousal, interfered with reporting the gist of neutral scenes. The findings support the hypothesis that emotional scenes more often interfere with processing of neutral scenes and are selectively attended to during briefly-presented scene arrays. Moreover, the results suggest that the identification and the interference of positive, high-arousal scenes are prioritized in visual information processing.


Introduction
Selection of behaviorally relevant information from the plethora of stimuli we are constantly exposed to has a clear evolutionary advantage. One extremely pertinent category of information that has been shown to have an effect on visual orientation and processing is emotional content (Anderson & Phelps, 2001; Anderson, 2005; Fernandes, Koji, Dixon, & Aquino, 2011; Fredrickson & Branigan, 2005; Kuhbandner, Spitzer, & Pekrun, 2011; Nummenmaa, Hyönä, & Calvo, 2009; Öhman, Lundqvist, & Esteves, 2001; Most, Chun, Widders, & Zald, 2005; Rowe, Hirsch, & Anderson, 2007; Vuilleumier, 2005). Emotional content is found to interfere with performance in both cognitive and perceptual tasks (Schimmack, 2005) as well as detection of neutral information when emotional information closely precedes the former's presentation (Choisdealbha, Piech, Fuller, & Zald, 2017; Most et al., 2005; Most, Smith, Cooter, Levy, & Zald, 2007; Phelps, Ling, & Carrasco, 2006). Even when offered monetary reward for resisting distractors, participants were still unable to avoid selectively attending to and processing emotional stimuli (Most et al., 2007; Piech, Pastorino, & Zald, 2010). Not only does emotional content affect information processing, but it is also shown to be processed preferentially in various cognitive domains (Anderson, 2005; Cesarei & Loftus, 2011; Keil & Ihssen, 2004). For example, a recent study using simple sketches revealed that threatening stimuli are equally likely to be selected in briefly-presented scenes as positive stimuli (Kuhbandner et al., 2011). Here we investigate potential interactions of a visual scene's arousal and valence on participants' ability to perceive and report scene gist from briefly-presented scene arrays.
We first expand on the studies mentioned above by utilizing real-world visual scenes of different valence and arousal (e.g., a scene depicting a romantic embrace and a scene with a car crash) rather than using simple schematic objects (for instance, outlined drawings of animals of various valence). Visual scenes are a special class of stimuli; the visual system is exquisitely sensitive to scenes and can pick up the gist of such scenes with minimal attention (Li, VanRullen, Koch, & Perona, 2002) and minimal presentation times (Potter, Wyble, Hagmann, & McCourt, 2014). Disruptions of the "grammar" of a scene (e.g., objects appearing in semantically inconsistent locations) lead to an increase in reaction times and an impairment in object identification (Biederman, Mezzanotte, & Rabinowitz, 1982; Davenport & Potter, 2004). Evidence shows that the processing of visual scenes differs from that of more simplistic synthetic displays (Braun, 2003). For example, scene gist can be picked up in the near absence of visual attention, while detecting color changes in simple geometric stimuli cannot (Li et al., 2002). A study by Calvo, Nummenmaa, and Hyönä (2008) revealed that, with a plethora of semantic and affective information available for further processing, even vague impressions of emotional information tend to orient attention. Such findings on the impact of real-world scenes on visual information processing have led many researchers to advocate understanding the visual system primarily through utilization of naturalistic stimuli (Henderson & Hollingworth, 1999; Kayser, Körding, & König, 2004; Simoncelli, 2003).
We used a partial-report paradigm (Sperling, 1960; Averbach & Coriell, 1961) involving the presentation of an array made of four visual scenes (Clarke & Mack, 2014), one of which was immediately post-cued at the offset of the scene array. Prioritized selection of emotional information was assessed through reports of scene gist when an emotional scene was cued. Interference of emotional information was assessed through reports of the gist of a post-cued neutral scene from the scene array while an emotional scene was present.
In the control condition, participants reported the gist of one post-cued neutral scene from an array of four neutral visual scenes. Each participant was exposed to emotional scenes from only one of the four possible combinations of arousal and valence. Based on previous literature, we predicted that emotional information of either positive or negative valence, particularly of higher arousal, would capture attention and cause interference in trials where neutral scenes are to be reported.

Participants
Eighty-one university students participated in the experiment for course credit. All participants had normal or corrected-to-normal vision. Each participant was randomly assigned to one emotional scene condition: positive high arousal (PHA) (n = 21), positive low arousal (PLA) (n = 20), negative high arousal (NHA) (n = 20), and negative low arousal (NLA) (n = 20).

Stimuli and Conditions
Stimuli were visual scenes selected from the International Affective Picture System (IAPS; Lang, Bradley, & Cuthbert, 2008). Each was chosen based on its rating of arousal (the level of "arousability" or physiological reactivity ranging from low to high) and valence (an index of its pleasantness or its hedonistic value varying from negative through neutral to positive) as indicated in the IAPS Manual. We selected stimuli from IAPS because it provides validated and reliable ratings for valence and arousal on a scale from 1 to 9 (valence: 1 = very negative, 9 = very positive; arousal: 1 = low arousal, 9 = high arousal). The visual scenes in all of the experiments were the same, as we had to control for variables such as the presence or absence of agents in the scene, scene complexity, or chromatic patterns of each scene. Since the goal of our experiment was to examine the influence of emotional visual scenes on the identification of neutral visual scenes, we decided to divide the experiment into the following within-subject and between-subject conditions.
Target Category Condition (Neutral x Emotional; Within-Subject):
1. Emotional Target Category (Selection of Emotional Visual Scenes). In this condition, participants were presented with a four-scene array that consisted of one emotional visual scene and three neutral visual scenes, with the emotional scene being post-cued. This condition aimed to test the hypothesis that emotional scenes were preferentially selected from an array of otherwise neutral scenes.
2. Neutral Target/Emotional Distractor Category (Interference of Emotional Visual Scenes). In this condition, the four-scene array consisted of one emotional visual scene and three neutral visual scenes, with one of the neutral scenes being post-cued. This condition examined the interference of emotional scenes on the perception and identification of neutral scenes.
3. Neutral Target Category (Control Condition). In this condition, the four-scene array consisted of four neutral visual scenes, with one of the four visual scenes being post-cued. This was our control condition. Neutral images varied from 4.76 to 5.38 in valence and from 1.72 to 4.97 in arousal.

Emotional Category Condition (Valence x Arousal; Between-Subject):
Each participant was randomly assigned to one of four conditions:
1. Positive Valence and High Arousal (PHA). The arousal ratings of images presented in this condition ranged from 5.41 to 7.35, while valence ratings varied between 6.82 and 8.34. Examples of images included in this condition were pleasurable situations (two naked bodies in a sexual context), adventurous sports (a person skiing, a person skydiving), and images depicting victory (a person winning a competition).
2. Positive Valence and Low Arousal (PLA). The arousal ratings of images presented in this condition ranged from 2.51 to 3.94, while valence ratings varied between 6.54 and 8.05. Examples in this condition were affection-evoking images (a smiling girl) and positive nature images (a meadow full of flowers).
3. Negative Valence and High Arousal (NHA). The arousal ratings of images presented in this condition ranged from 5.17 to 6.99, while valence ratings varied between 1.67 and 3.95. Some examples are visual scenes containing injured bodies (a man with blood on his face), scenes containing threatening animals (a biting dog, a snake, a spider), and accident scenes (airplane crash, fire scenes).
4. Negative Valence and Low Arousal (NLA). The arousal ratings of images presented in this condition ranged from 3.52 to 4.96, while valence ratings varied between 1.95 and 3.92. Visual scenes selected for this category included those depicting sad persons (a child hiding in a corner) or scenes depicting unfortunate situations (funeral or cemetery).

Procedure
At the beginning of the experiment, each participant read and signed an informed consent form. Following this, they were told that their task was to identify the gist of a post-cued scene as accurately and in as much detail as possible. The participant then completed a training session of 20 trials containing neutral scenes only. None of the visual scenes presented during the training session were presented in the experimental trials.

Partial-Report Paradigm
Each trial started with the presentation of a fixation cross at the center of the screen for 1500 msec, followed by a four-scene memory array displayed for 500 msec. Immediately upon the disappearance of the scene array, a post-cue indicating which visual scene to report (a red line placed beneath the location of one of the lower two scenes or above the location of one of the upper two scenes) was presented for 100 msec. The cue was followed by a text box in which participants were to report the gist of the cued scene (see Figure 1 below).
Each visual scene subtended 6 x 5 degrees of visual angle at a viewing distance of 56 cm. The scene array consisted of 4 scenes centered around fixation. The center of each scene was 4 visual degrees away from fixation, while the nearest corner of the scene was 2 degrees from fixation. Depending on the experimental condition, the 4-scene displays either contained or did not contain an emotionally charged scene. The experiment included 36 trials (12 per target category condition: emotional, distractor, neutral), presented in a random order.
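The stimulus dimensions above are given in degrees of visual angle. As a minimal sketch of how such values map onto physical size on the display, the standard relation size = 2 · d · tan(θ/2) can be applied to the reported 6 x 5 degree scenes at the 56 cm viewing distance. The function name is illustrative, and reading "6 x 5" as width by height is an assumption; the text does not state the order.

```python
import math


def visual_angle_to_cm(angle_deg: float, distance_cm: float) -> float:
    """Physical extent (cm) that subtends angle_deg at distance_cm.

    Assumes the extent is centred on the line of sight.
    """
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)


VIEWING_DISTANCE_CM = 56.0  # viewing distance reported in the text

# Assumption: "6 x 5 degrees" is read as width x height.
width_cm = visual_angle_to_cm(6.0, VIEWING_DISTANCE_CM)
height_cm = visual_angle_to_cm(5.0, VIEWING_DISTANCE_CM)

print(f"{width_cm:.2f} x {height_cm:.2f} cm")
```

Under these assumptions each scene would measure roughly 5.9 by 4.9 cm on the screen.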

Data Preparation and Scoring
Each response was coded by three independent raters (Fleiss' κ = .96). The raters were trained to code gist performance based on the richness of detail provided by the participant. Responses were rated as follows. Misidentifications and non-identifications were scored as zero. Basic-level category descriptions of the visual scene (e.g., if the scene included a spider, and the participant said 'animal') were scored as one. Subordinate-level category descriptions of the visual scene (i.e., responses giving more specific information) were scored as two. This scoring system reflected the scorers' observation that participants could in some instances identify the scene in great detail, while in other instances they could provide only the basic-level category; we deemed this distinction crucial. As such, perfect performance in each target category condition corresponded to a maximum of 24 points (2 points for each of the 12 scenes).
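The scoring rubric above can be sketched in a few lines. This is an illustration only: the names are hypothetical, the example codes are invented, and the text does not say how the three raters' codes were combined into a single score per response, so the sketch starts from one agreed code per trial.

```python
from enum import IntEnum


class GistScore(IntEnum):
    MISS = 0         # misidentification or no identification
    BASIC = 1        # basic-level category description
    SUBORDINATE = 2  # more specific, subordinate-level description


TRIALS_PER_CATEGORY = 12  # trials per target category condition, from the text
MAX_TOTAL = TRIALS_PER_CATEGORY * GistScore.SUBORDINATE  # 12 * 2 = 24


def category_total(trial_codes):
    """Sum the per-trial codes for one target category condition."""
    if len(trial_codes) != TRIALS_PER_CATEGORY:
        raise ValueError("expected one code per trial")
    # GistScore(code) raises ValueError for anything other than 0, 1 or 2.
    return sum(int(GistScore(code)) for code in trial_codes)


# Invented codes for one participant in one target category condition.
example_codes = [2, 2, 1, 0, 2, 1, 1, 0, 2, 2, 1, 0]
print(category_total(example_codes), "out of", MAX_TOTAL)
```

The per-category totals produced this way are the dependent variables entered into the analyses reported in the results.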

Overall Effects
All statistical analyses were computed using RStudio. To examine the effect of arousal and valence of briefly-presented emotional information on reporting visual scene gist, as well as the interference of emotional information in reporting neutral scenes, we computed a 4 x 3 mixed-design ANOVA with target category (3: emotional target, neutral target/emotional distractors, neutral target) as a within-subject factor and emotional category (4: PHA, PLA, NHA, NLA) as a between-subject factor. Mauchly's test of sphericity was not significant (χ²(2) = 2.78, p = .25), indicating that the sphericity assumption was met. In addition, Levene's test for equality of variances was not significant across the emotional groups for any of the three dependent variables (p = .09, p = .58, and p = .64 for the emotional, distractor, and neutral target categories, respectively).
Figure 1. Schematic illustration of the experimental design.
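As a consistency check on the design just described, the degrees of freedom of the F tests in a mixed-design ANOVA follow directly from the number of participants, groups, and within-subject levels. The sketch below uses the group sizes from the Participants section; the function name is illustrative and the code is not the authors' analysis script.

```python
def mixed_anova_dfs(n_participants: int, n_groups: int, n_levels: int) -> dict:
    """Degrees of freedom for a one-between, one-within mixed-design ANOVA."""
    df_between = n_groups - 1
    df_between_error = n_participants - n_groups
    df_within = n_levels - 1
    df_interaction = df_between * df_within
    df_within_error = df_between_error * df_within
    return {
        "between": (df_between, df_between_error),
        "within": (df_within, df_within_error),
        "interaction": (df_interaction, df_within_error),
    }


# Group sizes as reported in the Participants section.
group_sizes = {"PHA": 21, "PLA": 20, "NHA": 20, "NLA": 20}

dfs = mixed_anova_dfs(
    n_participants=sum(group_sizes.values()),  # 81
    n_groups=len(group_sizes),                 # 4 emotional categories
    n_levels=3,                                # 3 target categories
)
print(dfs)
```

With 81 participants, 4 groups, and 3 target categories, the within-subject effect is tested on (2, 154) and the interaction on (6, 154) degrees of freedom.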
Main effect of target category. A significant main effect was obtained for the within-subject factor, target category, F(2, 154) = 152.16, p < .001, η² = 0.7. Post hoc analyses using the Tukey HSD test indicated (Figure 2) that emotional targets were identified to a greater extent (M = 12.20, SD = 4.66, SEM = .39) than neutral targets in general (M = 8.33, SD = 3.01, SEM = .33), p < .001, as well as neutral targets in the presence of an emotional distractor (M = 6.17, SD = 2.92, SEM = .29), p < .001. The difference in the identification of neutral targets and neutral targets with an emotional distractor was also significant, p < .0001, indicating the interfering nature of emotional scenes (see Footnote 3). We also found a statistically significant interaction effect between emotional category and target category, F(6, 154) = 20.11, p < .001, η² = 0.4.
Footnote 3. The main effect of the emotional category did not make sense to report, as the mean for each condition (NHA, NLA, PHA, PLA) reflected a combination of selection of the emotional scene (higher number = greater selection), its interference (lower number = greater interference), and neutral category (= number of neutral scenes reported regardless of the emotional category).
Figure 2. Gist identification scores for individual target categories (within-subjects factor), regardless of the emotional category (between-subjects factor). Note. *** p < .001.
More specifically, in both positive emotional category groups (PHA and PLA), emotional targets were identified more frequently than neutral targets, t(40) = 11.44, p < .001 (positive emotional targets: M = 15.29, SD = 3.60; neutral targets: M = 8.29, SD = 3.10), demonstrating prioritized selection of positive emotional information regardless of its arousal. However, in the negative emotional category groups (NHA and NLA), identification of emotional targets did not significantly differ from identification of neutral targets (negative emotional targets: M = 9.10, SD = 3.36; neutral targets: M = 8.37, SD = 2.96). Figure 3 illustrates the gist identification performance for individual emotional categories for different combinations of valence and arousal. In regard to the attentional selection of positive scenes, there was no difference in the gist identification score for PHA and PLA scenes (p = .43).
In terms of interference of positive scenes, the gist identification score for neutral scenes with a PHA distractor was lower (M = 4.50, SD = 2.12) than the gist identification score for neutral scenes with a PLA distractor (M = 8.33, SD = 2.85), t(39) = 4.87, p < .0001. This finding illustrates the pronounced attentional capture by, and difficulty disengaging attention from, PHA scenes in particular.
In terms of the difference in the prioritization or interference of negative scenes, there was no difference in the gist identification between NHA and NLA scenes (p = .31), and, interestingly enough, there was also no difference between NHA and NLA scenes in their interfering effect on the gist identification of neutral scenes (p = .44).
We were also interested in the interference of emotional distractors on the perception and reporting of neutral visual scenes. Regardless of the emotional category (PHA, PLA, NHA, NLA), the presence of any emotional visual scene distractor significantly impacted the ability to report the gist of neutral scenes. For instance, we observed a decrease of 1.83 in the gist identification score when there was a positive emotional distractor; t(40) = -3.73,  Table 1 and Figure 4.
While we believe our results support a selection advantage for positive emotional visual scenes, it was important to rule out that the observed differences between the between-subject emotional categories stemmed from differences in the overall performance of the participants assigned to them. To equalize participants' overall performance, we treated gist identification performance for neutral scenes as a baseline. For each participant, we subtracted the neutral target gist identification score from the emotional target score to obtain an index of the "emotional scene prioritization effect." Analogously, each participant's neutral target score was subtracted from their neutral target/emotional distractor score to obtain an index of the "emotional scene interference effect." Repeating our previous analysis, a mixed-design ANOVA was run with target category (emotional target, neutral target/emotional distractor, neutral target) as a within-subject factor and emotional category (NHA, NLA, PHA, PLA) as a between-subject factor. Due to our subtraction method, all participants' neutral target performance was considered baseline and thus equal to zero. Unsurprisingly, the analysis yielded very similar results. As can be seen in Figure 3, a significant main effect was observed for the target category, F(2, 154) = 151.75, p < .001, η² = 0.7. The gist identification score for emotional scenes (M = 3.84, SEM = .39) was significantly higher than for neutral scenes (baseline = 0). Additionally, the gist identification score when there was an emotional distractor was significantly lower (M = -2.22, SEM = .33). The difference between emotional scene prioritization and emotional scene interference was significant (t(80) = 12.61, two-tailed, p < .001), illustrating the detrimental effect an emotional distractor may have on the reportability of neutral scenes.
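The baseline subtraction described above amounts to two difference scores per participant. The following sketch shows the computation on invented scores; the class and field names are hypothetical and are not taken from the study's materials.

```python
from dataclasses import dataclass


@dataclass
class ParticipantScores:
    emotional_target: int         # emotional scene cued (0-24)
    neutral_with_distractor: int  # neutral scene cued, emotional scene present (0-24)
    neutral_target: int           # neutral scene cued, all-neutral array (0-24)


def baseline_corrected(scores: ParticipantScores) -> tuple:
    """Return (prioritization, interference), each relative to the neutral baseline."""
    prioritization = scores.emotional_target - scores.neutral_target
    interference = scores.neutral_with_distractor - scores.neutral_target
    return prioritization, interference


# Invented scores for a single participant, for illustration only.
hypothetical = ParticipantScores(
    emotional_target=15, neutral_with_distractor=6, neutral_target=9
)
print(baseline_corrected(hypothetical))
```

A positive prioritization index means the emotional scene was reported better than the neutral baseline, and a negative interference index means the neutral scene was reported worse when an emotional scene shared the array.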
Additionally, an interaction between emotional category (PHA, PLA, NHA, NLA) and target category (emotional target, neutral target/emotional distractor, neutral target) was also found to be significant. Altogether, these results suggest that the gist of positive visual scenes of any level of arousal is identified to a greater extent than that of neutral visual scenes. However, this gist identification advantage was not observed for negative visual scenes. In terms of interference, all emotional scenes showed an interference effect when identifying neutral visual scenes, regardless of valence and arousal. These findings show that, despite the failure to identify the gist of negative scenes, the "emotionality" of the scenes must be detected, as shown by their interference.

Discussion
From an evolutionary perspective, the relationship between emotion and visual attention is undoubtedly fundamental: the motivated prioritization and perhaps even automatic selection of emotional information has robust survival value, orienting our attention to biologically relevant stimuli (Lang, Bradley, & Cuthbert, 1997; Öhman, Flykt, & Esteves, 2001; Piech et al., 2010). To explore this relationship in the context of briefly-presented natural visual scenes, we employed a partial-report task in which participants had to report the gist of a scene that was post-cued immediately at the offset of a briefly-presented four-scene display. While other studies have attempted to address and answer the question of the prioritization of emotional information (Anticevic, Barch, & Repovs, 2010; Calvo et al., 2008; Dolcos & McCarthy, 2006; Kalanthroff, Cohen, & Henik, 2013; Kuhbandner et al., 2011; Padmala, Bauer, & Pessoa, 2011), this is, to our knowledge, the first comprehensive study that systematically used scenes of different combinations of valence and arousal to examine the phenomenon of attentional selection and interference from emotional visual scenes.
Overall, we found that positive visual scenes were much more likely to be identified than both neutral and negative visual scenes. In line with the evolutionary perspective on the relationship between emotion and visual attention, one can argue that prioritization of positive, highly arousing (i.e., sexual) scenes has a reproductive value. However, this preferential selection was observed for positive scenes regardless of their arousal and biological relevance (e.g., visual scenes depicting exciting sporting activities and others depicting meadows with flowers). Previous studies have reported that negative stimuli narrow the visual field (Fredrickson & Branigan, 2005; Masuda, 2015), and these findings support the suggestion that the gist of positive scenes, when presented simultaneously with neutral scenes, is preferentially processed and selectively attended to, contrary to current understandings of the prioritized processing of threatening emotional information (Mogg & Bradley, 2002; Mogg, Millar, & Bradley, 2000). This finding corroborates results of research studies demonstrating that positive information reduces the attentional blink (Anderson, 2005; Keil & Ihssen, 2004; Most et al., 2007; Oca, Villa, Cervantes, & Welbourne, 2012) and produces automatic orienting of attention toward its location, especially if the stimuli are biologically relevant (Brosch et al., 2008; Fernández-Martín & Calvo, 2016; Sennwald et al., 2015; Williams, Moss, Bradshaw, & Mattingley, 2005). Additionally, erotic images as distractors seem to impair attention allocation to stimuli at early temporal stages (Arnell et al., 2007; Ciesielski et al., 2010; Most et al., 2007). One can therefore assume that the selection advantage of positive scenes might lie in guiding visual attention to peripheral details to promote and facilitate global information processing (Bendall, Mohamen, & Thompson, 2018; Wadlinger & Isaacowitz, 2006).
Interestingly, throughout all emotional distractor conditions, emotional visual scenes interfered significantly with identification of the neutral visual scenes, suggesting attentional capture by the emotional scenes and subsequent difficulty disengaging attention from them, perhaps leaving fewer attentional resources available for deployment to the neutral scenes in the array. This finding is consistent with studies showing that interference by emotional distractors occurs in both spatial (when presented in close proximity) and temporal (when presented in close succession) dimensions (Bocanegra & Zeelenberg, 2011; Most & Junge, 2008; Most et al., 2005). In other words, the previous literature shows that emotional distractors largely impair processing of neutral information at short temporal distances, while this effect disappears at long temporal distances (Bocanegra & Zeelenberg, 2009; Ciesielski et al., 2010). This interference effect is unlikely to reflect low-level physical properties of emotional scenes, as it has been shown that merely scrambling emotional images eliminates their impact on visual attention (Most & Junge, 2008).
One might argue that participants in our study were strategically and intentionally deploying their attention to the emotional images. However, emotional scenes were cued on only 33% of trials, and they always occurred in a random, unpredictable spatial location. Furthermore, if this were the case, we would have observed prioritization for negative scenes as well as positive scenes, which we did not see in our data. Therefore, we consider this explanation unlikely.
The lack of evidence for preferential identification of negative visual scenes vis-à-vis neutral visual scenes in and of itself was rather surprising to us. This impairment in identifying the gist of negative scenes seems to have been due to a perceptual disruption, rather than a failure of memory, as the post-cue occurred immediately after the scene array disappearance. However, one can infer that the emotional significance of negative scenes must still be accessed, as negative scenes did severely interfere with the processing of neutral information similarly to positive scenes. As hypothesized, this effect did not occur when distractor scenes were neutral. Perhaps prior to attentional selection, the negative valence of the visual scene is processed but not to the level of gist identification. In other words, participants might detect that the scene is generally emotional, but it may take a longer time to process and subsequently consciously identify the gist of negative scenes when presented in an array. Again, the interference of the negative scenes in the processing of neutral scenes is sufficient evidence that their emotional significance is, on some level, attended to and processed. Another possible explanation is that the valence of negative scenes might be processed without attention, which, however, clearly does not guarantee gist identification. Future studies should address this possibility to resolve whether the processing of the affective aspect of negative scenes requires attentional resources. Furthermore, from an evolutionary point of view, the identification of the negative valence of the scene alone (without gist identification) might be sufficient to spring an individual to action without compromising their survival. 
In comparison to positive information, negative information is evaluated more extremely (Ito, Larsen, Smith, & Cacioppo, 1998), seems to reflexively draw attention, particularly in face recognition (Hansen & Hansen, 1988), and creates greater interference in tasks such as the emotional Stroop (Pratto & John, 1991), attentional blink (Choisdealbha et al., 2017), working memory tasks (Sakaki, Gorlick, & Mather, 2011), and letter identification tasks (Masuda, 2015). Finally, it is possible that negative scenes can be identified at longer latencies after being selectively attended to. Future studies should investigate this effect.
While our study reveals much about the selection of emotional content in early vision, it specifically aids in our understanding of the processes involved in gist perception and identification when emotional information is present. For there to be a significant difference in interference between reports of negative, positive, and neutral scenes (and more so when broken down into high- and low-arousal conditions, even when visual scenes are post-cued), it must be that the emotional content of all items in the array is identified when the 4-scene array is present. Henderson and Hollingworth (1999) posit that when scenes are briefly presented, participants will most often retain low-frequency, semantic information that would ultimately comprise the gist. Alternatively, confounding variables that may have caused the observed effects could include negative stimuli that were perhaps not as threatening or arousing as anticipated. Previous literature has also suggested that, even though negative and positive stimuli may produce similar effects, there may be considerably different visual processes occurring, accompanied by a tradeoff between visual channels that "boost" some visual features and impede others (Bocanegra & Zeelenberg, 2011; Gupta, Hur, & Lavie, 2016), which would account for the differences found in the current study.

Conclusion
In general, emotional visual scenes, regardless of their valence and arousal, are known to capture attention at early stages of visual information processing at the expense of identification of neutral visual scenes. We believe that our study reveals how selective attention prioritizes positive emotional information over neutral information in a visual scene array containing both kinds of information. The results of our study also show that emotional scenes (regardless of valence) can interfere with identifying the gist of neutral scenes. In conclusion, all emotional scenes caused interference relative to neutral scenes, while positively valenced scenes produced the greatest effect and were identified most often. Future studies are needed to tease apart these valence-related differences in emotional interference for briefly-presented visual scenes.