Can Nonhuman Primate Signals be Arbitrarily Meaningful like Human Words? An Affective Approach

Whether one can label nonhuman primate signals as ‘meaningful’ hinges on what one takes to be the central features of meaning. If one targets a notion of meaning closely related and comparable to meaning in human words, two features must be identified: firstly, a concrete ascribable meaning to the signal and, secondly, an element of convention or arbitrariness of the signal’s meaning. In their seminal paper published in 1980, Seyfarth, Cheney and Marler demonstrated that vervet monkey alarm calls have concrete, discrete, ascribable meaning. But what about their arbitrariness? Here we will suggest a potential way into the investigation of this second feature: Human individuals are capable of comprehending arbitrary word meaning through learning and teaching processes. The current theory suggests in particular that imitation learning and natural pedagogy-like teaching behavior are necessary. For nonhuman primate signals, there is considerable doubt that learning processes are involved in the acquisition of novel signals, for instance, during ontogeny, and even greater doubt about the involvement of natural pedagogy. We will tackle the question of why complex imitation learning and natural pedagogy are not necessary for animal signals to be arbitrarily meaningful. We will also argue that the framework of ASL – Affective Social Learning – can help us determine whether simple forms of learning and passive forms of (indirect) teaching hinging on affective states of the teacher are involved, allowing for an arbitrary character of nonhuman signals.


Introduction – The Origins of the Study of Meaning in Nonhuman Primates
In their classic work, Seyfarth et al. (1980a, b) described what appeared as a meaningful triad of alarm calls. These descriptions were based on Struhsaker's (1967) suggestions of acoustically different vocalisations in vervet monkeys (Cercopithecus aethiops). The alarm calls appeared to refer to different predators, each eliciting an adaptive response to the predator in question (Seyfarth et al., 1980b). Following from that, Seyfarth and colleagues considered the vervet alarm calls to be semantic in the sense of carrying concrete information about the potential predator, as opposed to being signals carrying information about only the emotional state of the caller, e.g., fear. At the time of the publication, their aim of comparing these seemingly meaningful alarm calls to human semantics, i.e., meaningful human words, was particularly novel and ground-breaking. Seyfarth and colleagues introduced the idea that, to compare primate signals' semantic features to human words, arbitrariness may be an important feature to look for. Human words are meaningful in an arbitrary sense, i.e., their acoustic features do not presuppose in any way what they mean. Their meaning is determined by convention; i.e., language users agree upon a word's meaning. Seyfarth and colleagues proposed that the vervet alarm calls were at least arbitrary in the sense of them not resembling "in physical contours what they denote" (Seyfarth et al., 1980b, p. 1091). Their acknowledgement of the importance of arbitrariness as a feature of meaning comparable to human word meaning did not, however, trigger many further investigations along these lines (for a review focusing on research on potential arbitrariness see Liebal & Oña, 2018). In particular, the question of what arbitrariness in primate signals concretely implies at the theoretical level (i.e., what we actually would like to find in order to label a signal arbitrary) has been mostly ignored.
Here, we discuss how this question can be targeted. In general, we will take the stance that there is evidence for arbitrary signals in nonhuman primates, although this arbitrariness may be less fully exhibited than in human language. Systematic empirical research is nonetheless required for a final verdict.

Meaning in Human Words – The Characteristics of Conventional Meaning
Human words are semantically meaningful (Hurford, 2007) because they refer to or stand for one particular entity or group of entities context-independently, no matter in what context the speaker produces them (e.g., Bach, 2006). This is despite the fact that context can have an influence and change their meaning, e.g., when a word is used in a novel way (Sievers et al., 2017). Words have this property of being semantically (context-independently) meaningful because we, as part of a language community, (indirectly) agree to use the word in the same way (e.g., Lewis, 1969). That is, we agree on a convention of the word's use, which amounts to its semantic meaning. For example, the term 'grizzly bear', if used according to convention, refers to a representative of the particular animal species. To understand the sentence 'grizzly bears are scary', the hearer only needs to perceive the utterance, and is not required to take into account further contextual cues, as 'grizzly bear' according to the convention is always used to refer to an individual of the particular species of animal. But what does a convention about a word's use, and with that its meaning, actually imply?
Minimally understood, for a signal to be conventionally meaningful, a proliferation history of its arbitrary use in a particular way is necessary. Pointing out a grizzly bear in stable ways over generations generated the term's context-independent meaning. Because of this proliferation history, arbitrary words can mean anything (hence being arbitrary in their meaning), as stable use determines the meaning, not the signal's acoustic structure. According to Millikan's stance on the occurrence of word conventions (1984, 2005), the word has a proliferation history because it served a purpose or function and, with that, had a survival value. For example, for the signal 'grizzly bear' to be meaningful, over generations, whenever a signaler saw a grizzly bear, they uttered "Grizzly bear!" in the presence of others and, with that, warned them about the presence of a bear of that species.
The genesis of a new convention may start by one speaker intentionally using a word in a novel way (Grice, 1957; Millikan, 2005). Listeners over time may infer how the word is used in this novel way, and if the new use serves a purpose, may go on using the word in the same way. This is the starting point for the occurrence of a convention. To investigate whether arbitrariness is a feature of nonhuman animal signals, one has to focus on the occurrence of such novel uses, and with that, whether the necessary learning and teaching mechanisms on the signaler's and recipient's side are comparable to the ones in human interlocutors. A second option is to investigate the learning of potential convention-like uses of signals during ontogeny. For human children, according to established research, complex learning and teaching processes are involved in arbitrary meaning acquisition of words (see below). In the following sections, we will discuss whether nonhuman primates can learn arbitrary signal meaning, given what we know about their cognitive capacities. We will focus on ontogeny, even though any occurrence and spread of novel uses in animal signaling are relevant for the investigation.

The Importance of Imitation Learning and Teaching for the Comprehension of Arbitrarily Meaningful Signals in Humans
For young humans to establish knowledge about the proper use of newly acquired words, cognitively complex learning on the child's side and active teaching on the caregiver's side are currently deemed necessary. Concretely, discussions about language acquisition and word learning often focus on the role of, firstly, imitation learning and, secondly, natural pedagogy-like teaching. It is claimed that human children display imitation learning (e.g., Whitehurst & Vasta, 1975), defined as the "reproduction of both behavior and its intended result" (Boesch & Tomasello, 1998, p. 599), in order to learn words and their uses. Imitators copy and reproduce both the mental state and the behavior of the demonstrator. For instance, for an observer to imitate a knower's way of using the phrase 'Grizzly bear!' to warn individuals about the presence of a bear, she has to be in the same goal state as the knower, wanting to warn others of that danger, and copy her exact action, e.g., using the phrase in close proximity to that particular species of bear.
However, imitation alone is not considered sufficient to learn arbitrary words' uses. A language consists of numerous arbitrary signals, which can themselves be combined in many different ways to create new, compositional signals (Hurford, 2007). Learning such an arbitrary system, which exclusively fulfils the function to communicate contents, to request things, and to share information with others (Kirby, 2017), seems quite challenging, particularly as new learners must take part in a previously agreed system for which the rules may not be obvious. For instance, an observer would have a hard time differentiating between relevant and non-relevant behavior of a knower. Adult humans are aware of the conventions, but how did they get to know them? As knowledge about language conventions is cognitively opaque knowledge, i.e., not fully comprehensible by a learner through observation, natural pedagogy-like teaching is also required, or so goes one position (Tomasello, 2003, 2008). The concept of natural pedagogy is defined by Gergely and Csibra as referring "to instances of ostensive communication that promotes the learning of generic knowledge by the addressee" (Gergely & Csibra, 2013, p. 127). Parents communicate ostensively (i.e., they openly show that they intend to provide information) and intentionally with their offspring, enhancing and teaching correct language use (Csibra & Gergely, 2009). A combination of imitation and teaching then has the characteristics required for acquiring a complex signal system such as human language: the naïve individual starts to imitate in order to learn more about the goals involved and the concrete use of a signal, and a teacher actively provides cues on how to use the exact signal with the correct goal. The teacher, on the other hand, points out information to the naïve individual through ostensive communication. The naïve individual then has to be capable of grasping ostensive cues at the very least.
Ostensive signals here do not have to be linguistic, but they do have to be referential (Sievers et al., 2017). Pointing gestures, gaze direction, etc. are ways to point out important information to learners. Young children innately seem to grasp ostensive signals' referential qualities, as they naturally follow the gaze of a caregiver to the referential target (Csibra, 2010); but what about nonhumans? Important for the investigation of arbitrary signals in nonhumans is that both imitation learning and natural pedagogy-like teaching are problematic notions for nonhuman primate research: for teaching, at best, a few anecdotal descriptions can be found (see review in Gruber, 2016), and, for the presence of imitation learning, the jury is still out. Researchers mainly disagree over what imitation learning amounts to on a cognitive and behavioral level, with Tomasello and colleagues assuming imitation to involve inferring the knower's precise goals and displaying the precise behavior of the knower (e.g., Tennie et al., 2009). However, from such a conception of imitation it follows that nonhuman primates, including great apes, do not imitate in this sense. Tennie and colleagues (2012), for instance, tested 15 chimpanzees in their copying of the actions of a conspecific, and only one chimpanzee was found to do so successfully. When the behavior was replaced by an unknown action that was not part of the chimpanzee behavioral repertoire, even this individual did not succeed in imitating the behavior. The most optimistic conclusion is that chimpanzees, and possibly other nonhuman primates, may be capable of imitating only in cases concerning behavior they already know (see also Hobaiter & Byrne, 2010). As Tennie et al. (2009) argue, chimpanzees might focus on the outcome rather than on what exact behavior the other is displaying.
Others disagree, though, and adopt the position that, whereas chimpanzees may not be as precise as humans in copying exact step-by-step behavior and goals, imitation may not be out of their reach, albeit with lower fidelity (see review in Gruber, 2016; Whiten et al., 2009). It follows that chimpanzees may be poor learners of arbitrary signals, such as words, where the only way to understand the use of a signal is through imitating its use exactly to understand more about the involved goals. It is important to note here that, even though evidence suggests that chimpanzees are not talented imitators, the literature focuses mainly on tool-use based imitation (see Gruber, 2016), rather than communication.

Phylogenetically Pre-determined Signal Meaning
There is no way to deny that animal signals have features of the non-arbitrary kind. The paradigmatic vervet monkey alarm calls are linked to danger due to their acoustic structure. Rendall and Owren (2002) argued that these high-pitched calls point out high arousal in the producer, and therefore, may easily be interpreted as "natural expressions of affective states" (p. 307). Alarm calls are short and have an abrupt onset in order to grab the audience's attention immediately (Owren & Rendall, 2001). Producing this specific acoustic structure in situations of predation will serve the group's survival (and therefore, the signaller's own survival), because all members of the group will immediately be alert ('startle effect', Rendall & Owren, 2002, p. 307). With such statements, researchers dismiss the possibility of ascribing particular animal vocalizations any form of content or meaning besides their function of influencing the audience. Signallers merely voice their fear, be it voluntarily or not (Ducheminsky et al., 2014; Price et al., 2015).
But genetic transmission and emotional triggering of the vocalisations are not all there is to a number of nonhuman primate vocal signals, including chimpanzee pant-grunts (Laporte & Zuberbühler, 2011) and vervet monkey alarm calls (Seyfarth et al., 1980a): young individuals adapt their calling from undirected call production to production in specific contexts during ontogeny. For instance, young vervet infants produce the eagle alarm call in response to just about everything coming from above (e.g., leaves) and leopard alarm calls in response to entities approaching at ground level, for instance warthogs, which in fact pose no danger to them (Seyfarth et al., 1980a). Therefore, the difference in classification between a predator coming from above and a predator being on the ground seems to be innately and phylogenetically connected to particular calls. Yet, the connection between a given call and a specific predator type appears to be subject to a learning process in ontogeny, with youngsters gradually refining their calling to single out particular predators, particularly in the aerial domain (Seyfarth & Cheney, 1986). Based on these results, it is parsimonious to hypothesize the presence of some forms of social learning in the infants. Indeed, the cost of individual learning (i.e., predation) would be too high. The infant may learn how to apply the call correctly, i.e., it refines its 'knowledge' about the call's use through reinforcement via other individuals' repetition of the correct call. The result of the process of refining the use and meaning of the call appears a promising situation in which to investigate the potentially arbitrary character of animal signals, given that simple social learning is involved.

Primate Communication is Multimodal
Over the last decade, much progress has been made in nonhuman, particularly great ape, research in showing that nonhuman primate communication is multimodal (see reviews in Fröhlich et al., 2019; Liebal et al., 2014). Importantly, compared to the human signalling system, where vocal language grew overwhelmingly dominant over other modalities to convey signals, it is more controversial to select the correct carrying modality in animal signals (Fröhlich et al., 2019). This suggests that, to investigate arbitrariness in nonhumans, researchers must also consider non-vocal systems, particularly their gesturing system. A prime example of possible arbitrariness may lie in the leaf-clipping behavior of wild chimpanzees. Leaf-clipping is described as follows: "[a] chimpanzee picks one to five stiff leaves, grasps the petiole between the thumb and the index finger, repeatedly pulls it from side to side while removing the leaf blade with the incisors, and thus bites the leaf to pieces. In removing the leaf blades, a ripping sound is conspicuously and distinctly produced. When only the midrib with tiny pieces of the leaf blade remains, it is dropped and another sequence of ripping a new leaf is often repeated" (Nishida, 1987, p. 466). None of the leaves used are eaten. Given this description, leaf-clipping appears arbitrarily linked to the context of flirting. Furthermore, inter-group specific differences in the context of use itself are described in the literature. In some communities, leaf-clipping appears to mean "I am frustrated." For instance, in the Tai forest (Ivory Coast) younger males use leaf-clipping as an introduction to a pant-hoot and drumming sequence (Boesch, 1995). Though these sequences are often used to stay in contact with individuals far away, pant-hoots and drumming also serve as a relief of tension and frustration.
The Tai chimpanzees (Pan troglodytes verus) appear to use leaf-clipping at the beginning of such a sequence in the latter context, therefore changing the context of use of leaf-clipping from a sexual to a possibly aggressive one. Furthermore, before chimpanzees were fully habituated in Bossou (Guinea), they used to leaf-clip when researchers were present, perhaps indicating dissatisfaction with the situation (Boesch, 1995). Leaf-clipping in Bossou may thus have had another meaning altogether. These different uses in different communities indicate that the signal itself may be arbitrarily linked to its context of use, with novel uses potentially occurring and spreading through learning (and perhaps non-active teaching) processes. A claim along these lines is supported by Boesch and colleagues' observation of a change of use of the signal in the Tai chimpanzee community in the 1990s (Boesch, 1995): chimpanzees started to leaf-clip while resting on the ground, interrupting the napping period, after having already had a long tradition of using leaf-clipping for flirting. Because, at that time, the community had already been habituated for several decades, it is likely that this constitutes a novel use of the behavior, rather than an event missed by researchers.
As such, leaf-clipping may present first evidence for arbitrary signals in nonhuman primates. For a final verdict on whether leaf-clipping presents an arbitrary signal, one must tackle the question of how the signal may be acquired in the first place, considering the possible limitations in cognitive processing outlined above. This implies that both a more systematic empirical investigation and a theoretical investigation are still missing. For the theoretical part, we aim to describe in what follows how leaf-clipping and other potentially arbitrary signals may be acquired.

Is Imitation Learning the Only Learning Mechanism for Acquiring Arbitrary Signals?
Given that imitation learning à la Tomasello appears to be an almost uniquely human quality, one may wonder if imitation learning is not just one way amongst many to learn using signals in new ways. The list of learning mechanisms is long, with another social learning mechanism, emulation, offering an alternative. Emulation learning is present in chimpanzees and other nonhuman primates, and is defined as "the process whereby an individual observes and learns some dynamic affordances of the inanimate world as a result of the behavior of other animals and then uses what it has learned to devise its own behavioral strategies" (Boesch & Tomasello, 1998, p. 598).
As this definition does not require emulators to rely on mental states of the demonstrator, it may appear of limited interest for learning new communicative signals, as per the arguments developed above. Yet, Whiten et al. (2004) argue that an interest in the mental states of the demonstrator can very well be present in emulation learning, in terms of goal emulation. According to them, both emulation learning and imitation learning are observational learning types that are characterised by learners observing knowers engaging with, for instance, an object or producing a signal. Following from that, both types imply that learners have some interest in the mental states of the knower. Given the evidence for emulation learning in nonhuman great apes, one could argue that arbitrary signals could be acquired through emulation learning as well.
Fridland and Moore (2014) argue, though, that emulation is not sufficient for learning uses of words because emulation learning does not include the exact copying of the behavior of the demonstrator, which complicates the acquisition of language conventions. If one does not exactly copy the context in which others use the word 'grizzly bear,' one might not be able to learn more about the goals those others pursue in producing the word. Language conventions require stable proliferation (Millikan, 2005), and not copying exact uses may undermine this stable proliferation. It follows that a learner that does not imitate is less likely to be able to use the word flexibly in new contexts, ironic contexts and, generally speaking, correct contexts. Therefore, chimpanzees that recognise others' goals and are inspired by the behavior of a conspecific aiming to achieve the same goal may still not be good enough at acquiring novel uses of arbitrary signals and, worse, would probably not be good at keeping them in circulation.
Yet, we argue that emulation learning could still be a way to acquire new or more specific uses of signals in a signal system that is less complex with regard to combinatorial possibilities, and sheer number of signals available, compared to human language. Furthermore, nonhuman signal systems are, by nature, to a certain extent phylogenetically determined with regard to the signals' meanings and lack features such as irony and the metaphoric uses of signals that make human language use impossible to grasp just through observation.
The lack of "full-blown" arbitrariness then also implies that, as opposed to human words, uses of animal signals may be easier to grasp through observation by learners: a naïve individual learning through emulation about one overarching goal involved in using the signal "grizzly bear," namely to warn others about the presence of a grizzly bear, will still be able to successfully use the signal for that purpose. That is, learners may be imprecise in copying the exact fine-grained intention involved on the knower's side, and, with that, may show only limited flexibility in the use of the signal. A less fine-grained intention though, such as warning others of the presence of a particular kind of bear, is straightforwardly observable and, through emulation learning, the goal of warning can be connected to the observed behavior of the knower. As such, the learned use of the signal involving the goal of warning, can be kept in circulation in the given community.

Is Natural Pedagogy Necessary for Acquiring Arbitrary Signals?
For teaching mechanisms, a similar argument can be made: While it seems that, for complex opaque cultural knowledge such as language conventions, natural pedagogy is an efficient tool to transmit knowledge, one may ask whether it is necessary for signal systems that are less complex and only partially learned.
The rarity or absence of active teaching in, for instance, signal acquisition during ontogeny in nonhumans is striking. However, this rarity might be due in part to the different kinds of knowledge to be transmitted. Nonhuman primates may not have full-fledged opaque knowledge with regard to signal uses to be learned, as signals are never fully arbitrary. Therefore, for learners to grasp the use of a particular signal, knowers may not be required to display natural pedagogy, but only less active forms of teaching. The question is then, if nonhumans do not actively teach their conspecifics how to use a signal, how do they enforce a learning process of the correct use of a signal with arbitrary features? A continuum of mechanisms that can foster learning in the naïve individual would help us to evaluate what other animals are concretely lacking and how close their teaching skills are to those of humans. Based on the outcome, we may then reconsider whether the teaching and learning abilities in particular nonhuman species might be sufficiently complex to allow for the occurrence of potentially arbitrarily meaningful signals. In the following section, we will discuss a potential continuum within the so-called affective social learning (ASL) framework.

ASL as a Way of Investigating Elements of Teaching and Learning in Animal Signal Use
If we assume that nonhuman primates have no motivations (and perhaps not the capacities, such as the possibility to produce ostensive signals and to display mindreading) to actively teach naïve individuals, what other options can allow knowers to influence naïve individuals in focusing on important information? Here, one must propose a framework that does not rely solely on complex intentions by teachers and learners, such as full-blown imitation or natural pedagogy. One promising proposal, known as Affective Social Learning (ASL), builds largely on the notion of social appraisal, and has the practical advantage of suggesting an evolutionary continuity with natural pedagogy. Social appraisal refers to the phenomenon whereby people evaluate situations based on others' emotions (Manstead & Fischer, 2001). Other individuals' affective states then ascribe value to, for example, a certain object if they display certain emotions toward it. What we believe to be of value to focus on is thus strongly influenced by others' testimony, via their emotional reactions. This is the case independently of others providing such clues intentionally. For instance, in a study where participants were seated in a room and smoke appeared, they were less likely to report the danger if other individuals just shrugged their shoulders (Manstead & Fischer, 2001). The central idea of ASL is that knowers as potential teachers provide testimony about a given object (physical or not) through affective states, and, with that, influence the affective state of the learner, therefore channelling the learner's focus towards the behavior to be learned or the knowledge to be gained. A mother engaging with an object, for instance, being immersed in the interaction with it, provides testimony about the object being important in the eyes of the learner, and the learner will focus on the object.
The mother's affective state of interest thus creates the setting for the child to learn about the object or about how to engage with it independently of how actively the mother actually engages with the potential learner. Following from that, the notion of ascribing value to an object through affective states may allow for teaching in very passive ways to be effective, with knowers merely being interested in the object and not at all or only in part in the learner's behavior. A famous example of social appraisal is the crossing of a 'visual cliff' by children (Gibson & Walk, 1960; Klinnert et al., 1983). In this scenario, young children approached a transparent board, their mothers placing a toy on the other side of the 'cliff.' Young children moved or stopped moving on the cliff depending on the positive or negative affective state displayed by their mothers. It seems that the mothers provided them with affective information on whether it was dangerous to cross the cliff.
This kind of information provision by potential knowers through display of affective states is at the core of ASL. Within the ASL framework, Clément and Dukes introduce a four-stage continuum of mechanisms leading knowers to provide information for learners. The four stages are: affective contagion, affective observation, social referencing and natural pedagogy. While the framework was originally defined for developmental research, we have adapted it to primate social learning. Each stage is cognitively more challenging and demands more active engagement with the learner by the knower, with the last stage being natural pedagogy, requiring joint attention between teacher and learner, and ostensive communication.

Affective Contagion
Clément and Dukes (2019) define emotional contagion as "the process by which one person's emotion or mood can be directly influenced by someone else's" (p. 11). In other words, simply sensing, even without visual contact, that other nearby individuals are in a certain emotional state can influence one's own emotional state. For example, a little girl hearing her father in the other room screaming with fear because he hurt himself in the kitchen will herself become afraid. Only minimal social interactions are thus needed, and in particular, the learner and the knower are not engaged together. While at the verge of social learning, emotional contagion is particularly interesting from a comparative point of view, because it has often been proposed as an explanatory mechanism for vervet monkey alarm calling: vervets 'respond' because they have no control over their calls, which are under strong genetic control (see above). Yet, it is also possible to analyze emotional contagion through ASL. When a vervet infant uses the alarm call correctly (that is, for the correct eagle species), it sometimes hears its mother or other adults repeating the same call (Seyfarth & Cheney, 1986). The second caller likely independently engaged with the same object (the eagle) and displayed a specific affective state (fear), namely through behavioral cues and the production of an 'eagle' alarm call. Yet, such repetition reinforces the young vervet's own calling pattern, refining its 'knowledge' about the use of the call through passive reinforcement via other individuals' repetitions of the correct call. Of course, the adult callers are completely oblivious to the fact that they act as 'teachers' in this case. Similarly, it may not matter which model replied to the young vervet, because it was focused on the object, not on the teacher.

Affective Observation
The vervet call example is particularly interesting because it can alternatively be described as an example of either emotional contagion or affective observation, underlining a continuum between the two mechanisms. Indeed, while the youngster may call only as an automatic response to others calling or to some particular stimuli, it soon becomes aware of others calling and is given the possibility to compute the relationship between the stimulus (e.g., the predator or the call) and the reaction (i.e., the predator-specific behavioral reaction by its model). Once again, there is no active teaching from the knowledgeable individual, and no need for it to be aware that it is being observed. Clément and Dukes (2019) give a similar example with a child observing his father greeting a stranger wearing a uniform. The peaceful way his father engaged with the stranger is enough for the child to acquire information that the latter is not a threat. Such an example can, in fact, be fully translated to the nonhuman primate case with the social understanding that human researchers are no threat to incoming individuals. All wild primates emigrating into a habituated community feel initially threatened by the presence of researchers, but soon accommodate to their presence, a process that is facilitated when already habituated individuals are present (Samuni et al., 2014), and a prime example of affective observation.

Social Referencing
The major difference between affective observation and social referencing lies in the fact that, in the former case, the knower serves as a model without even needing to be aware that they are being watched, while in the latter case, the knower directly communicates information to the observer (the presence of ostension on the knower's side remains controversial). As discussed above, a paradigmatic example is the visual cliff experiment. Elsewhere, we have argued that chimpanzee road crossing constitutes an equivalent in nonhuman primates, where more dominant individuals provide the model and the incentive for younger individuals to cross the road. Other contexts in which social referencing may occur include tool-use learning, as well as social play and greetings, since these contexts require disambiguation between partners in a possibly tense situation.

Natural Pedagogy
See above for definition and use in the literature. As with its use in social learning, we did not find uncontroversial examples in the primate literature that would suggest the presence of natural pedagogy in nonhuman primates.
Overall, we find much suggestive evidence for various stages of ASL in primates, in particular in terms of social appraisal, both in the form of affective observation and of social referencing. In our final section, we conclude by discussing how such a framework assists with defining and studying arbitrariness in primate communication.

Arbitrariness as a Feature of Meaningful Signals can be Perceived as a Continuum -But What Does that Actually Imply?
We argued above that the meaning refinement of vervet monkey alarm calls in ontogeny could be explained through simpler forms of social learning and through affective observation or social referencing, with the mother providing valuable information about the signal's use through the display of her affective state. Whether this qualifies the alarm calls as arbitrary remains to be investigated by determining the learning and teaching process(es) involved. We claimed that the presence of emulation learning can be sufficient to allow the learning of a simple arbitrary signal. Generally speaking, for all potentially learned animal signals, given the partly phylogenetically pre-determined meaning of the signal systems and the comparatively less complex uses and combinatorial options of the signals within these systems, cognitively less complex learning and teaching mechanisms appear to be necessary but also sufficient for signals to be arbitrary.
Arbitrariness, in this sense, implies the possibility of a continuum with fully arbitrary human words at one end and non-arbitrary, innate/hardwired signals at the other. In the remaining part of this article, we discuss how such a continuum could be spelled out when looking at nonhuman signals, not just within the vocal but also within the gestural modality. Just as for animal vocal signals, it can be argued that arbitrariness is not present in gestures, as gestures are abbreviated versions of species-specific behaviors produced in the same context in which the gesture is ritualised to be produced (Tomasello, 2008). In that sense, the gesture is not arbitrarily meaningful; rather, its meaning is strongly linked to the behavior it is an abbreviation of. If we take vervet monkey alarm calls and leaf-clipping behavior in chimpanzees, both communicative behaviors fit somewhere on a continuum from fully arbitrary signals to non-arbitrary signals.
Vervet monkey alarm calls, for one, appear to be strongly linked to the emotion of fear (Rendall & Owren, 2002), which may place them close to fully non-arbitrary signals. However, they appear to have a referential quality and to undergo meaning refinement in ontogeny, involving simple learning in young vervets and rather passive teaching from adults. This implies that the calls may have an arbitrary element.
Leaf-clipping appears closer to fully arbitrary signals. First, providing evolutionary reasons why leaf-clipping is linked to a flirting context is more difficult than providing evolutionary reasons for vervet monkey alarm calls being linked to predator situations (the 'startle effect'). Vervet monkey alarm calls may just be strongly linked to affective states, with their uses hardwired to a certain degree. Second, following from that, novel uses (i.e., meanings) occur more flexibly in leaf-clipping than in vervet monkey alarm calls. This, in turn, may hint at more complex learning mechanisms being necessary for such novel uses of leaf-clipping to occur.

Conclusion
In this article, we have argued, first, that for the study of meaning in animal signals, a comparison to meaningful human words can be successfully conducted only if one accepts the documented differences between the human and nonhuman signal systems, particularly with regard to combinatorial complexity and phylogenetically determined signal structures. We developed our arguments for primates but believe they can be extended to other taxa.
Secondly, by assuming partial arbitrariness in nonhuman primate signals, we have to abandon the idea of a clear-cut line between arbitrary and non-arbitrary signals and instead adopt the stance of a continuum from arbitrary to non-arbitrary signals. Thirdly, abandoning the idea of a clear-cut line emphasizes that human language, too, may contain non-arbitrary elements, especially in contexts that very likely involve strong selection pressures on the signals and states of high arousal (such as interjections for voicing pain or fear). Finally, one has to acknowledge that, as of yet, nonhuman primate signals can only be in part arbitrary. However, this also means that the learning and teaching mechanisms necessarily involved for a signal to be arbitrary do not have to be of the same complexity for humans and nonhumans. For humans, imitation learning and natural pedagogy appear to be features that enable learners to grasp the meaning of essentially arbitrary signals. For nonhumans, we have argued that emulation learning and processes along the line of affective contagion and social referencing are sufficient. Such a stance also opens the theoretical possibility of integrating the larger social learning literature on nonhumans into the discussion on arbitrary signals; for example, the question of conformity (Whiten & van de Waal, 2016). Whether nonhuman signals should ultimately be called arbitrary is, however, not just a theoretical question but essentially an empirical one. We argue that systematic research is needed to investigate the mechanisms in place, particularly for promising signals such as leaf-clipping.