Frijda’s cognitive theory of the emotions (Frijda, , 2007) is the starting point for further explanation of emotional experiences in response to film. The theory posits that the emotion system evolved first and foremost for adaptive action. For example, the sight of a monster will spawn a strong urge to flee because a basic concern for safety is jeopardised. Of course, film audiences do not run out of the auditorium. According to the cognitive theory of emotion, action responses are not fixed responses to emotional stimuli, but the result of appraisals of what those stimuli mean for a person’s concerns in light of the situational context. Playful simulation provides the contextual frame for the complex appraisal of the apparent realism of film events. The appraisal has three stages: perceptual, imagination-based and self-involved. Footnote 59

1. Many popular film stimuli provoke immediate and automated appraisals of concern relevance and ensuing emotional responses, for instance because they would act as unconditioned stimuli in the real world. A snake popping out from the bush would be an example. Emotional appraisals in the cinema can be, and often are, empathetic. That is, they include the perspectives that film characters take on events. Film technology in mainstream movies is used to emphasise emotional triggers; editing could strengthen the suddenness of the snake’s appearance, and photography could render fear releasers, such as the typical movements of the snake, more salient. Footnote 60 But popular films also present us with emotional stimuli that are immediately perceived as fake, for example a rubber prop snake. Due to the playful simulation frame, further cognitive processing of perceptions takes place. In the first case, film viewers realise that the events just perceived are not real but must be held true for the sake of a playful simulation. In the second, they realise that the fake stimulus is only a prompt, and comply with its invitation to hold the stimulus true and allow it to appeal to their concerns, also for the sake of playful simulation.

2. Once imagination takes over from perception, the reality status of stimuli is traded for believability. As part of the imagination, fictional events are matched with higher-order, genre-specific narrative schemas, and then dealt with as possibilities in a particular world. As Frijda () argued when he discussed the apparent reality of fiction: 'Seeing a fake snake approach a real person is not scary. But watching an imaginary snake approach an imaginary Jane is. The first is seen as unreal in a real world, and the second as real in an imaginary world. And this is how we appraise events in fiction. The fun of art is in the play with the duality' (p. 1546). Playing with the possibility of events in the imagined world and entertaining as-if emotions can suffice for genuine emotion to arise. As I argued elsewhere (Tan, ), the appraisal of the possibility of events in a particular fictional world can and usually does lead to genuine emotion, because humans have been equipped with a capacity to have emotions in response to mental representations of counterfactual and imaginary events. Footnote 61

The study of film viewers' attention has delivered a firm account of the role that the ubiquitous Hollywood continuity style plays in the experience of smoothly flowing film scenes and stories that audiences all over the world typically have. (For a review, see Smith, Levin & Cutting, ).

Experimental psychology has always aspired to basic explanations of perceptual responses, preferably through transparent mechanistic associations with physically observable stimulus conditions. The role of high-level, narrative schema-based attention in smooth film experiences, discussed in the previous section, is subject to debates in which experimental data support arguments pro and con. To begin with, AToCC emphasises the role of leading expectations in following cuts. However, being more akin to the Gibsonian approach to visual perception than to Hochberg’s schema position, it stresses just as much, or even more, that lower-level features direct attention bottom-up. One lower level is given by film-stylistic devices, for instance the use of sound that can orient viewers to direct their gaze to the portion of the screen where the sound’s origin will be shown in the next shot. Another consists of lower-level stimulus features in a narrower and technical sense, such as bright lights and movements with sudden onset that automatically attract attention due to the make-up of the senses and the brain. Smith showed movement in particular to be an extraordinary low-level attentional cue. The power of low-level feature control of attentional shifts has inspired Loschky et al. () to speak of the 'tyranny of film'. They start from research findings suggesting that the use of low-level stylistic features can result in attentional synchrony across film audiences, that is, individual viewers of a scene gaze at exactly the same portions of the screen at exactly the same time. Footnote 35 Remarkable degrees of inter-viewer synchronisation of visual attention have also been established in studies localising brain activity in film viewers (e.g., Hasson et al., ).
However, Stephen Hinde’s research has recently shown that the distraction effect of inserted low-level attention triggers is quite limited (Hinde et al., ). In line with this notion of top-down attention control overriding bottom-up attention triggers, Magliano and Zacks () demonstrated that the perception of cuts is suppressed by higher-order processes related to the construction of complex events.
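
The notion of attentional synchrony mentioned above can be made concrete with a small sketch. It assumes, purely for illustration, that synchrony is indexed by the dispersion of viewers' gaze points around their common centroid in each frame, with lower dispersion meaning higher synchrony. This is one simple index among several possible ones and is not claimed to be the measure used in the studies cited here; the function names and the gaze coordinates are hypothetical.

```python
from math import hypot


def gaze_dispersion(points):
    """Mean Euclidean distance of gaze points (x, y) from their centroid.

    `points` holds one gaze position per viewer for a single film frame.
    A small value means viewers look at nearly the same screen location.
    """
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    return sum(hypot(x - cx, y - cy) for x, y in points) / n


def dispersion_per_frame(gaze_by_frame):
    """Apply gaze_dispersion to every frame of a scene."""
    return [gaze_dispersion(frame) for frame in gaze_by_frame]


# Hypothetical data: four viewers, two frames, screen coordinates in pixels.
tight_frame = [(100, 100), (102, 100), (100, 102), (102, 102)]
scattered_frame = [(0, 0), (200, 0), (0, 200), (200, 200)]

for dispersion in dispersion_per_frame([tight_frame, scattered_frame]):
    print(round(dispersion, 2))
```

In this made-up example the first frame yields a much lower dispersion than the second, which is the pattern one would expect where a salient low-level cue draws all viewers to the same portion of the screen.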

Gibson’s idea of invariants in optical arrays can now be made concrete, enabling the prediction of bottom-up controlled attention and perception from objectively identified features. Developments in computer vision and in image and sound analysis have paved the way for automated extraction of features and patterns in visual and auditory stimuli along multiple dimensions. For example, machine extraction of saliency as a feature predictive of bottom-up attention has been developed and applied in numerous computer vision applications. A much-cited article by Itti and Koch () illustrates the idea for static images. Specialised neural network algorithms detect features such as colour, intensity and orientation in parallel over the entire visual field. Each feature is represented in a feature map, in which neurons compete for saliency. Feature maps are combined into a saliency map. A last network sequentially scans the saliency map, moving from the most salient location to the next most salient one, and so on. Footnote 36 An excellent explanation of how to obtain saliency maps is given on a Matlab page. Footnote 37
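
The sequence of steps just described (feature maps, their combination into a saliency map, and a sequential scan from the most salient location downwards) can be illustrated with a deliberately simplified sketch. It is not the Itti and Koch implementation: it assumes a single spatial scale, uses a plain centre-surround difference as a stand-in for the competition within each feature map, averages the normalised maps, and suppresses each visited location so that the scan moves on. All function names and the toy 5 x 5 image are hypothetical.

```python
def center_surround(channel, radius=1):
    """Contrast map: |pixel - mean of its neighbours| (assumed simplification)."""
    height, width = len(channel), len(channel[0])
    contrast = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            surround = [
                channel[j][i]
                for j in range(max(0, y - radius), min(height, y + radius + 1))
                for i in range(max(0, x - radius), min(width, x + radius + 1))
                if (j, i) != (y, x)
            ]
            if surround:
                contrast[y][x] = abs(channel[y][x] - sum(surround) / len(surround))
    return contrast


def normalise(feature_map):
    """Scale a feature map so that its peak equals 1 (unchanged if all zero)."""
    peak = max(max(row) for row in feature_map)
    if peak == 0:
        return [row[:] for row in feature_map]
    return [[value / peak for value in row] for row in feature_map]


def saliency_map(channels):
    """Average the normalised contrast maps of all feature channels."""
    maps = [normalise(center_surround(channel)) for channel in channels]
    height, width = len(maps[0]), len(maps[0][0])
    return [
        [sum(m[y][x] for m in maps) / len(maps) for x in range(width)]
        for y in range(height)
    ]


def scan_saliency(saliency, n_fixations, inhibition_radius=1):
    """Visit locations in order of decreasing saliency.

    After each visit the surrounding window is suppressed, so the next
    iteration selects the most salient location that is still available.
    """
    remaining = [row[:] for row in saliency]
    height, width = len(remaining), len(remaining[0])
    fixations = []
    for _ in range(n_fixations):
        y, x = max(
            ((j, i) for j in range(height) for i in range(width)),
            key=lambda pos: remaining[pos[0]][pos[1]],
        )
        fixations.append((y, x))
        for j in range(max(0, y - inhibition_radius), min(height, y + inhibition_radius + 1)):
            for i in range(max(0, x - inhibition_radius), min(width, x + inhibition_radius + 1)):
                remaining[j][i] = float("-inf")
    return fixations


# Toy image: a bright, strongly coloured spot at (1, 1) and a weaker,
# colour-only spot at (3, 3); coordinates are (row, column).
intensity = [[0.0] * 5 for _ in range(5)]
red_green = [[0.0] * 5 for _ in range(5)]
intensity[1][1] = 1.0
red_green[1][1] = 1.0
red_green[3][3] = 0.5

saliency = saliency_map([intensity, red_green])
print(scan_saliency(saliency, n_fixations=2))  # prints [(1, 1), (3, 3)]
```

The spot that stands out on both feature dimensions is visited first and the colour-only spot second, mirroring the idea that the scan proceeds from the most salient location to the next. A fuller model would add orientation channels and multiple spatial scales, and would replace the simple averaging with the normalisation scheme of the original article.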

