Primate Pragmatics, Expressive Behavior and the Evolution of Language

Cheney and Seyfarth’s groundbreaking studies on vervet monkey alarm calls paved the way for a serious investigation of what animal signals might mean and their relevance to the evolution of language. Although the question of what drives call production remains largely unanswered, and parallels with language cannot be discerned in this domain, there appear to be some similarities to language in the way primates, and other animals, derive information from utterances by pragmatically interpreting their significance using available contextual cues. We describe some of the advances that Cheney and Seyfarth’s work spurred and illustrate our current understanding using the alarm calling system of putty-nosed monkeys as an example. We also briefly indicate some of the obstacles to adopting either a purely ‘Carnapian’ or purely ‘Gricean’ pragmatic approach to the evolution of language. We conclude by briefly sketching an intermediate pragmatic framework. This framework takes account of the expressive character of a subset of communicative signals that are biologically designed to openly reveal psychological states, thereby allowing mutually beneficial interactions among, specifically, signalers and receivers that live in social groups.

Over the last 40 years, the work of Dorothy Cheney and Robert Seyfarth, who pioneered the synthesis of experimental psychology and field ethology under the tutelage of Robert Hinde (Hinde, 1970), has produced a large number of immensely valuable contributions and remarkable insights into 'How monkeys see the world' (Cheney & Seyfarth, 1990). Arguably, the most exciting and hotly debated have been those that reinvigorated the study of animal communication with a focus on the evolution of language, and which have undoubtedly inspired the authors contributing to this volume and many others. As postdoctoral fellows of Peter Marler, Cheney and Seyfarth deployed novel field techniques that extended Tom Struhsaker's observations that vervet monkeys (Chlorocebus pygerythrus) have a repertoire of three acoustically distinct alarm calls, each produced in response to the detection of one of their three main predators: leopards, martial eagles, and pythons (Struhsaker, 1970). Using experimental playback studies, they established that these monkeys have different alarm calls that appeared to label these different classes of predators. Moreover, the broadcast of these calls elicited different adaptive, predator-specific responses in listeners (Seyfarth et al., 1980). Together, these findings suggested that the calls might not be mere emotional outbursts, but rather might be akin to 'primitive words' with relatively fixed symbolic meanings that could be understood by others.
Unsurprisingly, this sparked enormous interest among scholars working in the fields of animal behavior and comparative psychology, as well as anthropologists, philosophers (see the commentaries on Cheney & Seyfarth, 1992) and, eventually, linguists. Despite their superficial resemblance to human words, understanding the meaning of such calls continues to pose a largely impenetrable problem. It is extremely difficult to distinguish, through either observation or even carefully conducted field experiments, between alternatives such as whether a leopard alarm call might mean leopard, as opposed to indicating that the caller wants listeners to escape into a tree (Evans & Marler, 1995; Macedonia & Evans, 1993), or is experiencing a state of arousal narrowly associated with leopards. Consequently, the use of the term 'functionally referential' was adopted to describe the semantic character of such calls; 'functionally' was designed to signal that all that could be established was that the vervet monkeys behaved as if their vocalizations encode information about (and thus 'refer' to) objects or events in the external environment (Evans, 1997).
In the decades since these early studies, similar phenomena have been described in a number of other primate species, e.g., saddleback and moustached tamarins (Saguinus fuscicollis and Saguinus mystax; Kirchhof & Hammerschmidt, 2006), ringtailed lemurs (Lemur catta; Macedonia, 1990), Diana monkeys (Cercopithecus diana; Zuberbühler, 2000), Campbell's monkeys (Cercopithecus campbelli; Zuberbühler, 2001), and other types of mammals and birds, mostly in the contexts of the detection of predators and food (e.g., Gill & Bierema, 2013; Manser et al., 2002; Slocombe & Zuberbühler, 2005). There are two sides to the question of what such calls mean: what a call might mean for the producer of the signal and what it might mean to the receiver (e.g., Seyfarth & Cheney, 2003a). In the case of the producer, the call is given in response to something, whether external (an object or event in the environment), internal (the caller's state), or both, and so it is safe to say that the call carries information about an external or internal state of affairs, or a combination of both. But how the resulting cognitive-affective state is linked to the production of the call, never mind the degree to which conceptual-semantic representations could be involved, is largely unknown. And even in the case of human language, where linguistic conventions supposedly fix meaning, it may not be possible to determine precisely what a speaker means when using any particular word (e.g., Putnam, 1975; Quine, 1960; Wittgenstein, 1953).

'Functional Reference' and 'Meaning'
A more tractable problem than what callers mean by their calls is what the calls might mean to the receiver or, at least, what information a listener can extract from a call. Most researchers agree that animal calls carry 'natural' meaning as opposed to 'non-natural' meaning (Grice, 1957). 'Natural' meaning relies on detectable correlations of the sort that exist in nature between, e.g., the appearance of smoke and the presence of fire, or dark clouds and rain, or red spots and measles, where the former is reliably associated with, or indicates, the latter. 'Non-natural' meaning, by contrast, refers to the sort of arbitrary, symbolic relations that exist between a word such as 'leopard' and leopards. According to Grice, non-natural meaning depends on communicative intentions of speakers and their decipherment by hearers. Returning to alarm calls, the observations that led to describing them as 'functionally referential' fell short of establishing that they possess non-natural meaning. However, assigning such calls natural meaning requires establishing that, e.g., leopard alarm calls are produced when, and only when, a caller detects a leopard, since only then can the call be said to carry very reliable information about the presence of leopards (like a leopard growl). Indeed, as Seyfarth and Cheney (2003a) were careful to point out, the utility of 'functionally referential' calls for picking out different classes of predators depends on (i) their production specificity, i.e., whether they are elicited by a narrow or broad class of objects or events in the environment; (ii) their response specificity, i.e., whether they elicit predictable responses in listeners; and (iii) their informative value, i.e., how reliably they are produced when the class of objects to which they putatively refer is present, but not when that class of objects is absent.
When it comes to production specificity, in order to consider a call 'functionally referential', it is crucial to document the full range of contexts in which it is produced. However, most studies that have reported the use of functionally referential alarm calls relied on experiments that involved the presentation of predator stimuli (acoustic or visual), or playbacks of alarm calls, alone. If, for example, leopard stimuli reliably elicited one call type and eagle stimuli reliably elicited another, then it was concluded that the calls were referential. However, a number of studies reported that calls associated with terrestrial predators were also observed to be produced in at least one other non-experimental context, usually during aggressive social interactions (Cebus capucinus: Digweed et al., 2005; Eulemur fulvus rufus, Propithecus verreauxi verreauxi: Fichtel & Kappeler, 2002; Propithecus verreauxi: Fichtel & van Schaik, 2006; Saguinus fuscicollis: Kirchhof & Hammerschmidt, 2006), while others found that alarm calls were not acoustically distinct but graded, with low context specificity (Papio cynocephalus ursinus: Fischer et al., 2001; Cercocebus torquatus atys: Range & Fischer, 2004). These studies indicated that functionally referential calls were not ubiquitous among primates and that alarm calling systems might be shaped by ecological factors such as the presence or absence of certain predator classes and the utility of adopting different modes of escape (Arnold & Zuberbühler, 2006a; Macedonia & Evans, 1993). More recently, however, a more extensive analysis of Cheney and Seyfarth's original recordings of vervet monkeys, which were first thought to establish the existence of functionally referential alarm calls, also found a significant degree of overlap between calls given in predator contexts and those given during intergroup aggression (Price et al., 2015).
Females sometimes gave calls in eagle or snake contexts that were similar to those produced during intragroup aggression, and males gave calls to leopards that were also similar to those given during intergroup aggression, which suggests the possibility that calls given in these different contexts might be reflective of similar motivational states (Price et al., 2015).
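The criteria of production specificity and informative value discussed above can be given a rough quantitative form. The following Python sketch is purely illustrative and is not a measure used in any of the cited studies: the function name `summarize_call`, the context labels, and the toy counts are all hypothetical. It tallies the contexts in which a call type was recorded and reports the number of distinct eliciting contexts (a crude proxy for production specificity) and the proportion of occurrences that coincide with the putative referent (a crude proxy for informative value).

```python
from collections import Counter


def summarize_call(observations, call_type, referent):
    """Summarize how specific and how predictive a call type is.

    observations: iterable of (call_type, context) pairs, one per recorded call.
    Returns the number of distinct contexts in which the call occurred and the
    share of its occurrences in which the putative referent was present.
    """
    contexts = [ctx for call, ctx in observations if call == call_type]
    if not contexts:
        raise ValueError(f"no observations of call type {call_type!r}")
    counts = Counter(contexts)  # missing contexts count as 0
    return {
        "n_contexts": len(counts),
        "p_referent_given_call": counts[referent] / len(contexts),
    }


# Hypothetical toy counts, invented for illustration only (not field data).
toy_observations = (
    [("leopard_alarm", "leopard")] * 6
    + [("leopard_alarm", "intergroup_aggression")] * 4
    + [("eagle_alarm", "eagle")] * 5
)

leopard_summary = summarize_call(toy_observations, "leopard_alarm", "leopard")
eagle_summary = summarize_call(toy_observations, "eagle_alarm", "eagle")
print(leopard_summary)
print(eagle_summary)
```

On these invented counts, the hypothetical leopard alarm occurs in two contexts and coincides with a leopard in only six of ten cases, whereas the hypothetical eagle alarm occurs in a single context. A call of the first kind would fall short of the 'when, and only when' standard for very reliable natural meaning.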

Alarm Calls vs. Words: The Shift to Pragmatics
Words possessing symbolic meanings are said to have their meanings in a relatively context-independent way. Importantly, a word like 'leopard', even if produced in the absence of leopards, is still taken to refer to leopards and only to leopards. Given that, it became important to take note of studies that reported that calls given to particular predators were also given in non-predatory contexts, or that compared calls given in non-predator contexts in their analyses of alarm calls. More generally, if alarm calls are to be compared to words, potentially shedding some light on the emergence of language from animal communication systems, it becomes important to study calls that are given in multiple contexts and yet carry information that listeners respond to as if they attribute specific meaning to them. Relevant studies have led in recent years to a significant shift from focusing on alarm calls as analogous to words to focusing on their pragmatic interpretation (see Wheeler & Fischer, 2012).
Calls that carry only ambiguous information require contextual disambiguation in order to have specific significance. Such calls are common, and even calls that were initially interpreted as functionally referential have been found to be less context-specific than originally thought. It is fair to say that, in the vast majority of observed cases, call meanings must be derived from a combination of information contained in the call together with relevant contextual cues. Pragmatics is the field of linguistics that considers the role of context in deriving meaning from utterances. A pragmatic approach to the study of animal communication was already championed by Smith (1977), a contemporary of Cheney and Seyfarth's, although the excitement generated by the apparent discovery of word-like animal calls resulted in his early work on the subject being overlooked. However, a more pragmatics-oriented approach was adopted after a series of studies on the alarm calling system of putty-nosed monkeys (Cercopithecus nictitans martini) showed that, contrary to reports concerning closely related monkey species (e.g., Zuberbühler, 2000, 2001), their alarm calls did not fit the criteria necessary to regard them as exemplifying functionally referential communication.
In the next section, we describe the putty-nosed monkey alarm call system in some detail in order to illustrate how the ambiguity of the information the alarm calls convey was determined. Importantly, experimental playback studies of this system produced similar results to those concerning their apparently 'functionally referential' relatives. However, as we describe, these previous studies did not take into account what relevant contextual information is available to arboreal monkeys living in dense forest, or elucidate the ways that listeners may integrate these cues in order to gain useful information about what the calls are likely to be about. We think that the putty-nosed monkey system holds a lesson for a pragmatic approach to animal communication systems and their relevance to the evolution of language. We will conclude by offering a novel perspective on potential connections between monkeys' use of call systems and humans' use of language.

The Alarm-Calling System of Putty-nosed Monkeys
Putty-nosed monkeys are a fairly generic species of guenon and are widespread across Central and West Africa. They live in large groups of up to thirty individuals comprising just one adult male together with females and their offspring (Gautier-Hion & Gautier, 1974; K. Arnold, personal observation). Males leave their natal group on becoming sexually mature and live alone, or in bachelor groups, before attempting to compete with existing resident group males for their position and an opportunity to reproduce. Aside from reproduction, an important role for resident males is to defend their group from predators, such as crowned eagles, which they attempt to chase away aggressively (Shultz, 2001; S. Shultz, personal communication, May 23, 2008; K. Arnold, personal observation), and leopards, which are primarily ambush hunters and can be deterred by raising alerts to their presence and location (Zuberbühler et al., 1999).
Males have a repertoire of three 'loud' call types that can carry over long distances: booms, pyows, and hacks. Booms are very rarely heard and occur in a wide range of contexts (K. Arnold, personal observation), which renders their interpretation difficult. However, pyows and hacks are produced frequently and were initially understood to function primarily as calls used for intragroup cohesion and the maintenance of intergroup spacing (Gautier & Gautier-Hion, 1977). Early reports indicated that pyows and hacks are also used in a variety of contexts that could be characterized as disturbing (e.g., falling trees, thunderclaps, aerial predators, the approach of humans: Struhsaker, 1970). In an early playback study, Eckardt and Zuberbühler (2004) reported that putty-nosed monkeys in the Ivory Coast use these loud calls as predator-specific alarm calls (pyows as leopard alarm calls and hacks as eagle alarm calls) and concluded that they were functionally referential. However, a later series of studies by Arnold and colleagues demonstrated that this was not the case. These studies showed that putty-nosed monkeys generally produced a series of hacks (or a 'transitional series', which begins with hacks followed by pyows) in response to playbacks of eagle shrieks and a life-size model of an eagle, and a series of pyows to similar leopard stimuli (Arnold & Zuberbühler, 2006a). But these apparently predator-specific responses were recorded at least equally often in a variety of other non-predatory contexts as well, including in response to natural disturbances such as tree falls, fights among baboons, and the calls of neighboring males (Arnold et al., 2011). Playbacks of eagle shrieks from distances of more than 100 m could also elicit pyow series rather than the more characteristic hack series, and whether hacks or pyows were produced in response to playbacks of the sound of tree falls was also somewhat distance-dependent (Arnold, 2020).
Most importantly, these calls were also given in situations where there was no apparent external cause at all. Thus, the fairly stereotypical 'pyow-hack sequence' is produced to elicit whole group movement from one location to another (Arnold & Zuberbühler, 2006b) and, most frequently, males produced pyow series while relaxed and engaged in day-to-day activities such as feeding, which fits the earliest proposed function of maintaining intragroup cohesion and intergroup spacing. While hacks are generally given in response to eagle stimuli, and pyows are generally given in response to leopard stimuli, both call types are frequently given in the absence of the putative referent and are certainly not tightly predictive of them. It is true that the observed association between hacks and eagles, and pyows and leopards, in experimental and natural contexts, points to the conclusion that these two call types function as alarm calls. And they do allow listeners to form at least probabilistic expectations that a predator of a certain type may have been detected by the caller. Nevertheless, the putty-nosed calls do not meet the criteria for referential specificity or informativity necessary to qualify as 'functionally referential' signals, since they are produced in a wide range of contexts, many of which are non-predatory (Figure 1).

Figure 1
The natural contexts in which hack, transitional and pyow call series were recorded over a 213-day period. H = hack, P = pyow. The proportions of recorded call series of each type given in each context are indicated by: dashed line < 10%; solid line 11-20%; bold line > 30%.
The idea that alarm calls have a referential dimension caught on among researchers not only because of the suggested possibility that animal vocalizations have some word-like properties, but also because it made evolutionary sense. Different predator types often employ very different hunting strategies that require different anti-predatory responses. Leopards are ambush hunters that attack from the ground and cannot climb as proficiently as monkeys, while crowned eagles (these monkeys' primary avian predator) are specially adapted for maneuverability in the dense forest canopy and can attack at any height, including from the ground. Although monkeys can simply flee from a leopard, the most common strategy is to approach en masse and mob it, since stealth is not effective when the prey has the predator in its sights, and so leopards generally abandon a hunting attempt once they have been detected (Curio, 1976; Robinson, 1980). On the other hand, for females and juvenile monkeys, the best defense against crowned eagles is to hide in dense foliage since this restricts access to smaller individuals while the larger male is relatively invulnerable to attack and is active in driving eagles away (K. Arnold, personal observation). Gaining information that allows listeners to choose between different responses, therefore, has crucial survival value, since employing an inappropriate one could prove fatal.
However, as noted above, although pyows but not hacks are given to leopards, and hacks but not pyows are given to nearby eagles, neither call type is sufficiently predictive of the presence of these predators since these calls are often produced in non-predatory contexts as well. So now the question arises: how do putty-nosed listeners know when to scramble and when to conserve their energy? Further observations, later backed by a playback study, revealed a simple solution that relied on the integration of contextual information from a variety of sources. As previously noted, aside from eagles, hacks are also elicited by non-predatory disturbances such as falling trees and baboon fights or the calls of other monkeys in the area. All of these phenomena have loud and distinctive acoustic features that allow listeners to form associations between their occurrence and the hacks given by males in response. Hacks that are heard following a tree fall can, therefore, be recognized as a consequence of the tree fall rather than indicating the presence of an eagle. The only situation that elicits hacks that is not accompanied by sound is where eagles, which remain quiet while hunting (S. Shultz, personal communication, May 23, 2008; K. Arnold, personal observation), have been detected. Therefore, hearing hacks that are not preceded by other types of acoustic information allows listeners to infer that the caller may have spotted an eagle and to look upwards in order to attempt to detect it or to hide in dense foliage. Indeed, female subjects spent more time looking toward the sky after hearing recordings of hacks alone than when hacks were preceded by contextual acoustic cues.
The same principles apply to understanding the cause of pyows given in response to noisy, non-predatory disturbances. But in the case of these calls, there are two very different contexts in which pyows are produced in the absence of additional acoustic information. One is leopard detection, which requires immediate action by listeners, and the other is where males call spontaneously, drawing attention to their own presence and location, thereby facilitating group cohesion and intergroup spacing. Surprisingly perhaps, analysis of this call type did not detect differences between pyows produced as alarm calls and those given in non-alarm contexts, either in terms of their acoustic structure or the rate at which they are produced (Arnold & Zuberbühler, 2006b). So, listeners are unlikely to be able to discriminate between contexts on the basis of call characteristics alone. Series of pyows constitute the most frequently used calling pattern by far (approximately 85% of naturally produced call series), and their proposed function as an attention-getter is well suited to the role they play, both to alert group members to leopard presence and to deter the leopard itself. On detecting a leopard, the male approaches as closely as possible and, from the safety of a branch above, pyows continuously in full view, always keeping the leopard in sight. Since leopards are ambush hunters, advertising the fact that the leopard is being observed, so that it cannot launch a surprise attack, is an effective deterrent and results in the leopard giving up and moving away (Zuberbühler et al., 1999). This anti-predator strategy is employed by the whole group, and females and their young also approach closely to keep track of the leopard and collectively mob it.
Again, how do group members know when pyows signify leopard presence, as opposed to merely advertising the male's presence in a non-predatory context? In an experimental playback study designed to simulate natural situations in which the male calls spontaneously or in response to a disturbance, subjects were observed to spend significantly more time looking in the direction of the caller after hearing pyows alone (an ambiguous situation) than when they were preceded by the sound of leopard growls or tree fall. Looking toward the caller is most likely an attempt to gain information about the male's behavior, as males behave quite differently when producing pyows in response to threats as opposed to spontaneously. When calling spontaneously, the male's attention is not directed to any particular location, nor is he especially vigilant. In contrast, when males call in response to a potential threat, they cease other activities, orient their body toward the threat in order to monitor it, and are extremely attentive. This combination of the male's vocal behavior and body posture allows nearby group members within sight of him to distinguish between predatory and non-predatory contexts very rapidly. The combination of his body posture and gaze direction also allows them to ascertain the location of the predator. If he is calling because he has spotted a predator, females then approach the male so that they too can monitor the threat and begin high-pitched chirping (which is their single alarm call type; Arnold & Zuberbühler, 2006a) and mobbing. However, during day-to-day activities, the group can be spread over a distance of a hundred meters or more and, in a low visibility environment typical of rainforest, many group members will not have direct visual access to the male and cannot take advantage of information about his body posture that affords differentiation between calling contexts.
In such situations, whether they hear female alarm calls emanating from his location provides them with the information that they need. If they hear female chirp calls in combination with the male's pyows, they know that they should approach and begin calling themselves. Calls then spread throughout the group alerting all members to the threat.
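The way listeners are described as combining call type with acoustic, visual, and social cues can be summarized as a simple decision procedure. The Python sketch below is only a schematic restatement of the verbal account above, not a model proposed or tested in the cited studies: the function `interpret_call`, its parameters, and the cue labels are hypothetical names introduced for illustration, and the real process is presumably probabilistic rather than rule-based.

```python
# Loud, distinctive non-predatory events that can themselves elicit male calls.
NOISY_DISTURBANCES = {"tree_fall", "baboon_fight", "neighbor_calls"}


def interpret_call(call, preceding_sound=None, caller_visible=False,
                   caller_vigilant=False, female_chirps=False):
    """Return a listener's likely reading of a male's call series.

    call: "hack" or "pyow".
    preceding_sound: acoustic cue heard just before the call, or None.
    caller_visible / caller_vigilant: whether the listener can see the male
        and, if so, whether he is oriented toward and monitoring something.
    female_chirps: whether female alarm chirps come from the male's location.
    """
    if call not in ("hack", "pyow"):
        raise ValueError(f"unknown call type: {call!r}")
    # A noisy disturbance just before the call explains the call.
    if preceding_sound in NOISY_DISTURBANCES:
        return "non-predatory disturbance"
    if preceding_sound == "leopard_growl":
        return "possible leopard"
    if call == "hack":
        # Hunting eagles are silent, so unexplained hacks suggest an eagle.
        return "possible eagle"
    # Pyows with no acoustic cue are ambiguous: use social and visual cues.
    if female_chirps or (caller_visible and caller_vigilant):
        return "possible leopard"
    return "spontaneous calling"


print(interpret_call("hack"))
print(interpret_call("hack", preceding_sound="tree_fall"))
print(interpret_call("pyow", female_chirps=True))
```

The ordering of the checks mirrors the account in the text: acoustic context is consulted first, and the male's posture or the females' chirps are needed only for pyows heard without any accompanying sound.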

Two Notions of Pragmatics
Recent discussions of the relevance of alarm calls to the evolution of language have advocated a pragmatic approach that emphasizes the crucial role of contextual information in enabling listeners to derive meaning from ambiguous vocal signals (Wheeler & Fischer, 2012). In similar spirit, our description of the way the meaning of alarm calls of putty-nosed monkeys can be disambiguated highlights the integration by listeners of external visual and auditory cues, and others have highlighted the use of social knowledge concerning the caller's identity, dominance rank, kinship affiliations, and recent interactions, in transforming a signal type that carries only vague information into a token that has a very specific meaning in context (Seyfarth & Cheney, 2017). However, we should distinguish two different notions of pragmatics at work in the literature on animal communication (Bar-On & Moore, 2018). The first notion is due to Carnap (1942).
Carnapian pragmatics: the study of the variation and derivation of the significance of signal types with the context of production.
Carnapian pragmatics covers a very wide range of phenomena indeed. It covers the various ways in which the same sentence type might be interpreted differently in different contexts; for example, "It's snowing" will convey different propositions depending on when and where it is uttered. It also covers the ways in which monkey calls might convey different information in different circumstances, and the way they may be interpreted differently at different times or locations. But it covers much more: not only the interpretation of vocalizations of birds, prairie dogs, suricates, and other animals (e.g., Slobodchikoff et al., 2009; Townsend et al., 2012), but also, it seems, bee dances, firefly mating flashes, octopus color changes, and so on (Barron & Plath, 2017; Scheel et al., 2016; Stegmann, 2009; and see Fitch, 2010, who is willing to credit receivers in all these species with "sophisticated pragmatic inferences"). This seems to risk rendering Carnapian pragmatic phenomena in animal communication too ubiquitous to be useful for understanding how language could have emerged in the primate lineage. Importantly, if 'pragmatics' is understood in the Carnapian way, this places a heavy burden on the proponents of the pragmatic approach to animal communication: to explain the specific ways in which animals' interpretation of alarm and other calls could shed light on human linguistic communication. If the only sense in which such communication "constitutes a rich pragmatic system" (Seyfarth & Cheney, 2017, p. 340) is that it exhibits receivers' context-sensitive interpretation of vocal signals, then it is really not clear why we should think that the study of primate calls can shed more light on the evolution of language than, say, the study of bee dances. The reason is this. The more we take calls to have relatively fixed meanings that are produced more-or-less inflexibly, on the model of, e.g., various insect signals, the less their production resembles the use of words.
But now suppose that we understand 'contextual interpretation' by receivers along Carnapian lines, as a form of decoding signals with fixed meanings by learning associatively (inductively) to assign different meanings to calls depending on the presence or absence of various contextual cues in the environment. Then it becomes less plausible to suppose that the acquisition of information by call receivers in given situations depends on psychological mechanisms that resemble, or in some way foreshadow, those that underlie human linguistic interpretation. In short, granted that primate call interpretation involves psychologically complex integration of multiple sources of information, requiring flexible, learned responses (Wheeler & Fischer, 2012), its relevance to the evolution of language still requires establishing that it is different from the kind of (Carnapian) interpretation shared by many animal signal receivers.
A second, much more restrictive notion of pragmatics derives from the work of Paul Grice (1957). On the Gricean notion, pragmatic phenomena essentially involve the production of utterances with audience-directed communicative intentions and the attribution of these intentions to producers by their interpreters.
Gricean pragmatics: the study of the production of utterances with communicative intentions and their mindreading interpretation by interlocutors.
This more restrictive notion of pragmatics has been adopted by many recent discussions of language evolution (e.g., Anderson, 2004; Burling, 2005; Fitch, 2010; Hurford, 2007; Origgi & Sperber, 2000; Scott-Phillips, 2015; Tomasello, 2008). Gricean pragmatics covers only those phenomena that involve what Grice described as speaker meaning, where speaker meaning is understood to depend on "a serious degree of recursive mindreading" (Origgi & Sperber, 2000, p. 20). On the Gricean approach, the fact that animal receivers extract rich information from signalers' calls is simply insufficient to show the relevance of the calls to understanding the emergence of language. For the calls to have such relevance, what would need to be established is that signalers produce calls with certain kinds of communicative intentions, and that receivers make inferences about those intentions when interpreting the calls. Clearly, from the fact that receivers extract rich information from the signals they receive, it does not follow that their doing so depends on their employment of Gricean 'mindreading' capacities. After all, many creatures extract rich information about their physical environment, which does not involve attributing mental states to anyone.
It is important to keep these two notions of pragmatics, the Carnapian and the Gricean, separate, since their application to animal communication can have very different implications for the relevance of behaviors such as alarm calls to the study of language evolution. A Carnapian 'pragmatics-first' approach appears to set the bar too low for potential relevance of primate calls to the evolution of language, since it is applicable indiscriminately to both calls and insect signals. On the other hand, a Gricean 'pragmatics-first' approach sets the bar too high, because it implies that our ancestors would have had to be capable of producing and interpreting utterances with speaker meaning before becoming capable of engaging in linguistic communication. That approach presents us with a puzzle concerning the evolutionary emergence of the sophisticated psychological capacities needed for such production and interpretation, a puzzle that is of a piece with the puzzle of language evolution itself (Bar-On, 2013, 2018).

'Expression Pragmatics'
We think that those who advocate adopting a pragmatic perspective on animal calls in order to establish them as relevant to the evolution of language should seek an intermediary pragmatic understanding of the significance and function of calls. Elsewhere, we have begun to develop such an understanding (Bar-On, 2020; Bar-On & Arnold, 2020). On the view we favor, animal calls constitute a subset of expressive behaviors, the significance of which is not captured by either Carnapian or Gricean pragmatics. On the one hand, expressive communication cannot be fully understood simply in terms of contextual determination of signal significance based on multiple sources of information. On the other hand, expressive communication is not Gricean, as it does not require possession or attribution of communicative intentions.
What we refer to as 'expressive communication' is rather familiar in both the human and the nonhuman domain. In his seminal work, The Expression of the Emotions in Man and Animals (1872), Darwin identifies expressive behavior as representing an important common ground between 'man and animals'. He had in mind various facial and postural expressions, such as those associated with anger, fear, pain, etc., aggressive and affiliative vocalizations and gestures (which we take to include distress, alarm, and food calls, as well as play bows and food-begging gestures), and so on. Darwin portrays expressive behaviors of these sorts as having complex physiological and behavioral profiles that serve to reveal animals' psychological states (Darwin, 1872, especially chapters IV and V). But whereas Darwin himself regarded expressive vocalizations, specifically, as having had an important role to play in the early stages of the evolution of language (Darwin, 1871, Ch. 2), contemporary researchers have been more dismissive. For example, Fitch (2010) cites the species-specific, innate character of animal calls and the relatively tight connection of expressive vocalizations more generally to animals' affective or motivational states as important reasons for rejecting expressive theories of language evolution. See also Tomasello (2008, p. 14) who describes expressive behaviors as mere 'communicative displays', which he characterizes as "prototypically physical characteristics that in some way affect the behavior of others," comparing them to purely informative displays such as "large horns which deter competitors or bright colors which attract mates." By contrast, and more in line with Darwin's view, several other contemporary researchers have suggested a more nuanced view of expressive behavior. 
For example, Marler (2004) suggests that a bird's alarm call is not best understood simply as a purely instinctive or reflexive reaction that is merely reliably correlated with the presence of a certain type of predator. He remarks that "communication by [affective] displays can be very complex … if a bird couples a call with some kind of indexing behavior, such as head-pointing or gaze direction, a certain object or point in space or particular group member can be precisely specified: the combination adds significantly to the communicative potential of emotion-based signals" (p. 176, emphasis added). The suggestion here is that a bird's alarm call can (and often does) fulfill its communicative role by showing the bird's fear at the same time as it reveals the fear's intentional content ('intentional' is here used in Brentano's sense; Brentano, 1874). Similarly, Snowdon (2008) has argued that chickens' food calls can both be referential and communicate an affective state, perhaps of social invitation (see also Marler et al., 1992; Seyfarth & Seyfarth, 2003b, 2018). On Marler and Snowdon's way of understanding them, birds' alarm calls, though unlearned, can still be regarded as prefiguring at least certain aspects of linguistic communication. An alarm call is directed at a predator of a particular type, in virtue of expressing several aspects of the animal's psychological state. The call shows (and its designated audience can recognize) a more or less intense agitation at, or fear of, a predator of that type. Coupled with a head tilt or directed gaze, the call can point to a specific predator of the relevant type (Bar-On, 2013).
Along these lines, we tentatively propose to identify a category of signals (we will call them 'expressive signals') that are used in animal communicative interactions that involve expressive behaviors (on the part of producers) and their uptake (on the part of receivers). Properly understood, we suggest, expressive signals, and the kind of communication they afford animals that use them, possess a number of features that have potential relevance to the evolution of language (for further discussion, see Bar-On, 2013, 2018; Bar-On & Arnold, in prep.).
(i) Expressive signals are naturally designed to show various aspects of the psychological states they express (both affective and cognitive): the type of state, its intensity or degree, and the state's intentional objects (i.e., what they are directed at or are about). They are also designed to show signalers' impending action and to elicit appropriate responses (both behavioral and psychological) in relevant others. Individual producers do not harbor intentions to reveal their psychological states or to affect the psychological states of receivers, and receivers do not attribute such intentions to producers (which would require possession of sophisticated theory of mind, something we have no reason to believe our last common ancestor possessed).
(ii) Despite not being designed by intention to affect the audience's states of mind, the natural function of expressive signals is to reflect and affect producers' and recipients' current psychological states. In this way, expressive signals can potentially constitute a psychological starting point for understanding a form of animal communication that foreshadows human communication.
(iii) Being naturally designed to suit the social-biological purposes of co-habiting groups of animals, expressive signals, as vehicles, or signal types, enjoy relatively stable significance and specific function that prefigure the conventional stability of linguistic signs. In a sense, they embody shared natural conventions (but see later).
(iv) Expressive performances or acts (that is, token uses of expressive signals) can be brought under considerable voluntary control. Unlike the signal repertoires they utilize, the performances are not entirely fixed, and they form intricate patterns of active, dynamic intersubjective engagements (Fitch, 2010, especially Ch. 4). In this respect, expressive signals are different from what Tomasello describes as 'informative displays' (Tomasello, 2008). Even if expressive signals form relatively fixed repertoires, the use and uptake of such signals can manifest various sorts of flexibility. At the very least, producers of even unlearned expressive signals can suppress, modulate, and modify their use; and receivers' understanding of such signals can be shaped by their perception of the environment, memory of prior interactions, their own psychological state, as well as their uptake of producers' present behavior and psychological state (Crockford et al., 2012; Schel et al., 2013; Seyfarth & Cheney, 2017).
(v) Expressive communication is often triadic, relying on mechanisms of shared attention (as opposed to joint attention) that allow signalers and receivers to attend together to objects or events of mutual concern.
It is, of course, an empirical question which among the communicative behaviors of animals possess these features. Nevertheless, our tentative proposal is that the above features can be used to characterize a distinct sub-category of animal signals. Thus, recall the widely used definition of signals due to Maynard Smith and Harper (2003, p. 3), according to which "[s]ignals are traits or behaviours that (1) alter the behaviour of other organisms, (2) evolved because of that effect on receivers, and (3) are effective because the receiver's response has also evolved [in relevant ways]."
Our proposal can perhaps be best understood as saying that animal expressive signals constitute a special subset of animal communicative signals. These signals have at least as part of their distal evolutionary function effecting mutually beneficial changes in behaviors of recipients; they are designed to motivate receivers in the relevant group to take suitable actions (run from danger, come to get food, back off, and so on). The benefits of these effects explain why the production of these signals persists in a species. However, our tentative theoretical hypothesis (which would, of course, need to be tested empirically) is that, within the broad category of animal signals, there are signals that have evolved to accomplish their distal function by fulfilling a more proximal function. The proximal function is that of openly revealing (as opposed to concealing), specifically, psychological states, thereby also bringing about changes in the psychological states of recipients. Such changes often, but not invariably, result in immediate behavioral changes. It is through accomplishing the more proximal function that expressive signals accomplish the more distal function (see Bar-On, 2013, 2018; see also Smith, 1977, 1997). Expressive communication systems consist of relatively fixed repertoires of signals that are, however, dynamically and relatively flexibly deployed by both producers and receivers. Our conjecture is that such systems have evolved in social groups to facilitate intersubjective, world-directed interactions by relying, specifically, on an evolved capacity for the behavioral display and uptake of psychological states (Bar-On, 2018, 2020). The communicative work of expressive signals is done through the spontaneous production of behaviors that are designed to manifest or openly reveal (rather than conceal) states of mind of producers, the recognition of which by receivers could benefit them in various ways.
For example, recognizing a producer as being very scared of a particular threat present can allow a suitably endowed recipient to be properly alarmed, be in a position to identify the source of the threat (perhaps by following the producer's gaze or bodily orientation), and thereby be motivated to take the relevant action to avoid the threat. But it is worth re-emphasizing that the producers of expressive signals need not harbor Gricean intentions, and their receivers do not engage in Gricean interpretation of those signals. So expressive communicators are not Gricean communicators. Nevertheless, we maintain, expressive communication is not purely Carnapian. At least as it is manifested in primates, it appears to rely on the capacities of social communicators to adjust their behaviors and responses on the basis of their present perceptions of each other's psychological states, as well as their knowledge of past intersubjective interactions, and other psychological factors.
Focus on these social-psychological features of the expressive character of animal calls, we think, can motivate articulating an intermediary notion of pragmatics: expression pragmatics.
Expression pragmatics: the study of dynamic social-communicative exchanges that rely on the production and uptake of expressive signals, that is, signals designed to show psychological states of individuals to designated recipients in specific situations.
The phenomena to which expression pragmatics is applicable are not ubiquitous. While various forms of signaling are widespread in the nonhuman animal world, not all animals that signal engage in expressive communication. If we are right, expressive communication has evolved in social species specifically to show psychological states openly, possibly in order to strengthen affiliative bonds, facilitate cooperation, elicit other mutually beneficial actions (for example, via contagion and other non-Gricean psychological mechanisms), and so on. Given this function, it stands to reason that producers and interpreters of expressive signals would monitor each other's attention, as well as attend to other signals that reveal their current states of mind, and flexibly and dynamically modify their expressive behaviors in response to others' reactions (for some examples and relevant discussion see Smith, 1997). A fruitful research question would then be to what extent the use of calls by individual callers in given situations, both in primates and perhaps also in other social species, exhibits these characteristics, regardless of whether the calls, understood as types of vehicles (or signals), belong to an unlearned, or innate, repertoire.
To conclude, we would like to illustrate the approach we have outlined here by returning to the putty-nosed monkey case. As described above, the proposed interpretation of male putty-nosed monkey calls may require only the Carnapian notion of pragmatics, insofar as it describes the monkeys' sensitivity to the combination of call type and other available forms of contextual information in the environment as sufficient to reveal the cause of calls and to select an appropriate response (see also Price & Fischer, 2014). Even so, the interpretation of pyows appears to present a more interesting case. While it is possible that pyows given in contexts in which they function as alarm calls do, in fact, differ acoustically from those that merely draw attention to the calling male, in ways not captured by earlier analyses, the male's accompanying behavior renders distinguishing such subtle differences unnecessary. Although the putty-nosed call repertoire is fixed, we think the significance of the calls as used in different contexts exhibits the type of flexibility characteristic of the use of expressive signals. As we noted earlier, while calling, the male's body posture reveals aspects of his psychological state (whether he is focused on something in particular) and, in the case of predator detection, the object of his attention and his likely future behavior. Moreover, on hearing his call, other group members within sight were observed to actively seek further information about his behavior in order to establish what the male was calling about, instead of responding in a 'scripted' way by initiating a specific kind of anti-predator behavior. And other group members lacking visual access to that information appeared to be alerted to the threat on hearing these females' chirp calls in concert with the male's calls, and only then approached the threat and began calling and mobbing.
This dynamic pattern of putty-nosed monkey intragroup calling and response behaviors suggests that the communicative work of the putty-nosed alarm calls relies not only on integration of environmental cues, but also on the identification and interpretation of multiple psychological aspects of the calling situation, and can be distributed across different members of the group. The pattern described, we think, is not adequately understood using a purely Carnapian pragmatic framework, although it does not justify applying a Gricean framework, either. If this is so, then our theoretical understanding of call systems and their relevance to the emergence of linguistic communication could benefit from adopting the perspective of expression pragmatics.