Monkeys Choose, But Do Not Learn, Through Exclusion

Human children will select a novel object from among a group of known objects when presented with a novel object name. This disambiguation by exclusion may facilitate new name-object mappings and may play a role in the rapid word learning shown by young children. Animals including dogs, apes, monkeys, and birds make similar exclusion choices. However, evidence regarding whether children and nonhuman animals learn new associations through choice by exclusion is mixed. In the present study, we dissociate choice by exclusion from learning by exclusion in rhesus monkeys using a paired-associate task. In Experiment 1, monkeys demonstrated choice by exclusion by choosing a novel comparison image from among known comparison images when presented with a novel sample image. In Experiment 2, monkeys showed little if any benefit from choice by exclusion in learning new sets of paired associates. Monkeys were trained with new sets of four paired associates by trial and error alone or by a combination of exclusion and trial and error. Despite choosing correctly by exclusion on almost 100% of opportunities, monkeys did not learn any faster by exclusion than by trial and error alone. These results indicate that monkeys choose, but do not learn, through exclusion, highlighting the importance of separately evaluating choice and learning in studies of the role of exclusion in word learning.

During the first years of a child's life, vocabulary grows rapidly. Fifteen-month-old children learn an average of 1.2 new words per day (Fenson et al., 1994). Young children encounter many new words daily, and new word-referent mappings are readily learned through these natural encounters without explicit teaching (Bloom & Markson, 1998). Children may "fast map" new words, learning their meaning after a small number of exposures (Bloom & Markson, 1998; Carey & Bartlett, 1978). One candidate mechanism for this rapid naturalistic word learning is disambiguation of word-name mappings through choice by exclusion. Choice by exclusion entails choosing an unknown referent after excluding known referents. The application of this process to word learning is most often studied in the context of selection of a novel item from among familiar items when presented with a novel name. For example, when asked to locate an item with a novel name (e.g., "Where is the zorch?"), children will select the unfamiliar object (shiny metal sponge) from among three familiar objects (ball, teddy bear, book; Golinkoff, Hirsh-Pasek, Bailey, & Wenger, 1992). If the word-item pairing that results from such choice by exclusion is retained, the child has learned a new word.
Many studies have shown that young children can choose by exclusion (Bloom & Markson, 1998; Evey & Merriman, 1998; Ferrari, Derose, & McIlvane, 1993; Golinkoff et al., 1992; Halberda, 2003; Heibeck & Markman, 1987; Markson & Bloom, 1997; Mervis & Bertrand, 1994), and well-controlled studies have shown that these choices are not attributable to a general preference for novel items (Evey & Merriman, 1998; Golinkoff et al., 1992; Halberda, 2003). The claim that exclusion contributes to early word learning therefore seems plausible (Kaminski, Call, & Fischer, 2004). However, for exclusion to be a useful learning mechanism, children must both choose an item by exclusion and remember the word-item mapping selected through that choice (Bion, Borovsky, & Fernald, 2013). The child in the example above who chose the "zorch" from among the ball, bear, book, and shiny metal sponge exhibited choice by exclusion. To test whether this child learned the new word-item pair from this choice, retention tests must be introduced after the choice trial. On a retention test, the child is asked to select, after some delay, the item with the novel name (i.e., "zorch") from among other equally unfamiliar items. If the child has learned the word-item mapping, she will select the correct item without the familiar distracter items present from which to exclude it.
Although children between 2 and 4 years of age reliably choose novel items by exclusion (Bloom & Markson, 1998; Evey & Merriman, 1998; Ferrari et al., 1993; Golinkoff et al., 1992; Halberda, 2003; Heibeck & Markman, 1987; Markson & Bloom, 1997; Mervis & Bertrand, 1994), results regarding whether they learn from this choice are mixed. Some studies find retention of one item over short and long delays (2.5-year-old children, short delay: Golinkoff et al., 1992; 2-4-year-old children, short delay: Heibeck & Markman, 1987; 3- and 4-year-old children, short and long delays: Markson & Bloom, 1997), whereas others find poor retention with multiple items over even short delays (18-30-month-old children: Bion et al., 2013; 2.5-year-old children: Golinkoff et al., 1992; 2-year-old children: Horst & Samuelson, 2008). Therefore, although there is evidence that long-term word learning can occur through exclusion, it is still unclear whether this learning is common and robust enough to be a major contributing mechanism to the natural rapid word learning seen in children (Bion et al., 2013). The way attention is deployed during exclusion may not be optimal for new learning. This is because the very cognitive processes that allow choice by exclusion (a focus on rejecting distractors and selecting the target by default) are ones that may make learning unlikely, as they diminish processing of the target.
Evidence from humans suggests that choice by exclusion is not a language-specific process, but is instead based on general mechanisms. Performance on exclusion tasks does not decrease with age, as does performance on other language acquisition tasks; adults perform as well as children at choosing by exclusion (Golinkoff et al., 1992; Markson & Bloom, 1997). Adults learn from exclusion choices as well as, if not better than, children (Golinkoff et al., 1992; Markson & Bloom, 1997). Exclusion in children and adults is not limited to words, but may also be relevant to facts about objects (Markson & Bloom, 1997; but see Waxman & Booth, 2000). Finally, proficiency at choice by exclusion is not consistently related to vocabulary size in children (Byers-Heinlein & Werker, 2009), raising questions about the extent to which exclusion is a major contributor to word learning. If choice by exclusion is not language specific but rather is a general cognitive mechanism, it may be shared by other closely related species. Consistent with this hypothesis, nonhuman animals such as chimpanzees, dogs, monkeys, birds, and sea lions can make choices by exclusion, selecting an unknown item from among known incorrect items (Beran, 2010; Kaminski et al., 2004; Kastak & Schusterman, 2002; Marsh, Vining, Levendoski, & Judge, 2015; Pilley & Reid, 2011; Schloegl, Dierks, et al., 2009; Tomonaga, 1993). However, few studies have tested whether nonhuman animals learn from these choices.
Studies that have tested for retention following choice by exclusion in nonhumans have trained animals on a known set of item-label pairings that can be used as the to-be-excluded stimuli. For example, trained dogs have learned verbal labels for large sets of toys (Kaminski et al., 2004; Pilley & Reid, 2011), and language-trained chimpanzees have a vocabulary of well-known lexigram images associated with specific items such as food, toys, and familiar people (Beran, 2010; Beran & Washburn, 2002). In choice by exclusion tests, a novel word (dogs) or lexigram (chimpanzees) is followed by presentation of a novel target item among familiar distracter items. Both the dogs and the chimpanzees correctly select the novel items when presented with a novel label, and do not select the novel item when presented with a familiar label, indicating choice governed by exclusion rather than novelty alone (Beran, 2010; Beran & Washburn, 2002; Kaminski et al., 2004; Pilley & Reid, 2011). However, only one dog showed retention of a small number of mappings chosen through exclusion over even a short delay (correct selection of 4/6 new mappings; Kaminski et al., 2004); the other dog and the chimpanzees showed no retention (Beran, 2010; Beran & Washburn, 2002; Pilley & Reid, 2011). This suggests that, as in human children, proficiency in choice by exclusion in nonhumans does not necessarily result in learning.
To further evaluate the extent to which choice by exclusion leads to learning in nonhumans, we studied these processes in rhesus monkeys, a primate species that diverged from humans approximately 30 million years ago (Steiper & Young, 2006). In Experiment 1, we determined the proficiency of monkeys in choice by exclusion. Monkeys initially learned a set of four image-image associations. On exclusion trials, they were presented with a novel sample image and four choice images: one novel and three from the known paired associates. Choosing the novel image when presented with a novel sample would provide further evidence that nonhumans can choose by exclusion, and that this is a general process that is not specific to language learning. In Experiment 2, we tested for learning of the novel paired associates chosen on exclusion trials by comparing learning rates under exclusion plus trial and error to those under trial and error alone. If monkeys acquired the discriminations more rapidly with exclusion trials than without exclusion trials, this would indicate that they learn novel image associations through choice by exclusion.

Experiment 1: Choice by Exclusion
In order to choose by exclusion, monkeys need a familiar "vocabulary" of stimuli to exclude from, similar to a child's existing vocabulary at the start of an experiment. Subjects were taught four visual paired associates, such that each of four sample images was associated with one of four comparison images. On exclusion trials, the sample and the correct comparison image were both novel and the three incorrect comparison images were from this known associate set. Monkeys who can choose by exclusion would exclude the known comparison images as incorrect and select the novel comparison image on these trials.
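The logic of an exclusion trial can be made concrete with a small sketch. The code below is only an idealized illustration of an exclusion rule, not a model of how the monkeys actually solved the task; the function name `choose_by_exclusion` and the image labels (`s1`, `c1`, `novel_s`, and so on) are hypothetical placeholders that do not come from the study.

```python
import random

def choose_by_exclusion(sample, comparisons, known_pairs):
    """Pick a comparison for `sample` with an idealized exclusion rule.

    known_pairs maps each trained sample to its trained comparison.
    """
    # Familiar sample: choose its trained associate if it is on the screen.
    if sample in known_pairs and known_pairs[sample] in comparisons:
        return known_pairs[sample]
    # Novel sample: reject every comparison already tied to a known sample.
    known_comparisons = set(known_pairs.values())
    candidates = [c for c in comparisons if c not in known_comparisons]
    # Guess among what remains (or among all options if nothing is left).
    return random.choice(candidates or list(comparisons))

# Hypothetical trained "vocabulary" of four paired associates.
known_pairs = {"s1": "c1", "s2": "c2", "s3": "c3", "s4": "c4"}

# Exclusion trial: novel sample, one novel comparison, three known distracters.
exclusion_choice = choose_by_exclusion(
    "novel_s", ["c2", "novel_c", "c4", "c1"], known_pairs
)

# Known-sample trial that also shows a novel image: a pure novelty
# preference would pick the novel image here, the exclusion rule does not.
familiar_choice = choose_by_exclusion(
    "s3", ["c3", "other_novel_c", "c1", "c4"], known_pairs
)
```

Note that the rule never needs to inspect the properties of the novel comparison: it is selected only because it is the one item not known to be incorrect.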

Methods
Subjects and apparatus. Subjects were six 6-year-old male rhesus monkeys (Macaca mulatta) raised by their biological mothers in a large social group until the age of approximately 2.5 years. Monkeys were pair-housed and kept on a 12:12 light:dark cycle with light onset at 7:00 am. Animals received a full ration of food daily and water was available ad libitum.
Testing occurred in each monkey's home cage. Computerized touch-screen test systems, consisting of a 15-in. LCD color monitor (3M, St. Paul, MN) running at a resolution of 1024 × 768 pixels, generic stereo speakers, two automated food dispensers (Med Associates Inc., St. Albans, VT), and two food cups below the screen, were attached to the front of each cage. Correct responses were rewarded 85% of the time with nutritionally balanced banana-flavored pellets (Bio-Serv, Frenchtown, NJ) and the remaining 15% of the time with miniature chocolate candies. Test sessions were conducted daily between 10:00 am and 5:00 pm, six days per week.
Procedure. During testing, each pair of monkeys was separated by an opaque plastic divider with holes that allowed visual, auditory, and tactile contact but prevented touches to the computer screen in the adjacent cage. Computer screens were locked to the front of each cage and the door was raised, giving subjects full visual and tactile access to the screen during testing. After a 3-s inter-trial interval (ITI), a green box appeared at the bottom of the screen and remained until the monkey touched it (fixed ratio 2) to start a trial (Figure 1).
Known associate training. Monkeys learned four paired associates. On each trial, one of the four possible sample images appeared in the center of the screen. When it was touched (fixed ratio 2), the four possible comparison images appeared in the four corners of the screen, with their locations randomized across trials. Selection of the correct comparison was followed by a positive auditory stimulus and food, while selection of an incorrect comparison was followed by a negative auditory stimulus and a 5-s timeout during which the screen was black. Correction trials followed incorrect choices: after the first error, the trial repeated exactly. If the monkey erred again, the same sample image appeared followed by only the correct comparison image at test. Only performance on the first iteration of each trial was used for data analysis. Training sessions consisted of 400 trials, 100 of each paired associate. Monkeys began Experiment 1 after reaching 85% correct in a single training session.
Experimental sessions. The same general procedure used in known associate training was used for all experimental trials. However, two new trial types were presented intermixed with the known associate trials from training. Known associate trials were identical to training trials, and consisted of a trained paired associate sample and the four trained paired associate comparisons. Choice of the correct comparison was rewarded. These trials were presented to maintain and evaluate knowledge of the paired associates, which were used as the distracter images for exclusion trials. Exclusion trials tested whether monkeys would choose the novel comparison image when presented with a novel sample. The sample on these trials was a trial-unique novel image, and the four comparison images consisted of three known associate comparison distracter images and one trial-unique novel image. Choice of the novel comparison image was rewarded. Baseline trials were designed to test whether the monkeys' performance on exclusion trials was due to choice by exclusion or to a general preference for novel comparison images. The sample on these trials was a known associate sample, and the comparisons consisted of the correct known associate comparison image, two known associate distracter images, and one incorrect trial-unique novel image. Choice of the correct paired associate was rewarded. If monkeys selected the incorrect novel image above chance on these trials, it would suggest that they had developed a general preference for selecting a novel comparison image whenever present. Sessions consisted of 400 trials (Figure 2): 200 known associate trials, 100 exclusion trials, and 100 baseline trials. Monkeys were tested until they reached over 75% correct (75/100 trials) on both exclusion and baseline trials in the same session.
Data analyses. Because monkeys required different numbers of sessions to reach criterion, analysis of improvement over sessions focused on the first five sessions, which all monkeys received, and each subject's final session. Repeated measures ANOVA tested for improvements in performance across these six sessions, and one-sample t tests compared performance to chance (25%) in individual sessions. All proportions were arcsine transformed prior to analyses to better approximate normality (Aron & Aron, 1999), and all tests were conducted using an alpha level of .05. Ninety-five percent confidence intervals are presented in terms of the mean difference between observed proportion choice of the correct item and chance.
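As a rough illustration of the analysis described above, the sketch below arcsine transforms a set of proportions and computes a one-sample t statistic against chance. Several details are assumptions rather than facts from the study: the proportions are invented, the transform is taken to be the common arcsine-square-root form, chance is transformed in the same way before the comparison, and the helper names `arcsine_transform` and `one_sample_t` are hypothetical.

```python
import math
from statistics import mean, stdev

CHANCE = 0.25  # one correct image among four comparison images

def arcsine_transform(p):
    """Arcsine-square-root transform of a proportion (assumed variant)."""
    return math.asin(math.sqrt(p))

def one_sample_t(values, mu):
    """Return the one-sample t statistic and its degrees of freedom."""
    n = len(values)
    standard_error = stdev(values) / math.sqrt(n)
    return (mean(values) - mu) / standard_error, n - 1

# Hypothetical proportions correct for six subjects in one session.
proportions = [0.48, 0.55, 0.41, 0.62, 0.50, 0.44]

transformed = [arcsine_transform(p) for p in proportions]
t_stat, df = one_sample_t(transformed, arcsine_transform(CHANCE))
print(f"t({df}) = {t_stat:.2f}")
```

With six subjects the test has 5 degrees of freedom, which matches the t(5) values reported in the Results; the p value would then be read from a t distribution with those degrees of freedom.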

Results and Discussion
Monkeys learned to choose by exclusion, requiring an average of 11.33 ± 5.33 sessions to reach the criterion of greater than 75% correct choices on both baseline and exclusion trials in a single session. Performance on the four known associates remained above criterion and did not change across sessions (mean 85%; one-sample t test: t(5) = 29.20, p < .001, 95% CI [31.46, 42.58]; repeated measures ANOVA over test sessions: F(5, 25) = 1.39, p = .26). Monkeys learned to select the novel comparison when presented with a novel sample on exclusion trials (F(5, 25) = 11.36, p < .001, ηp² = .69), performing above chance by the second session of testing (t(5) = 9.61, p < .001, 95% CI [11.69, 32.26]; Figure 3). Above-chance selection of the incorrect novel image on baseline trials would indicate that exclusion trial performance was due to a general preference for selecting novel items. Monkeys did show an increasing preference for the novel image on baseline trials over the first four sessions, but this preference decreased by the final test session (F(5, 25) = 10.99, p < .001, ηp² = .69). Preference for the novel image on baseline trials remained significantly below choice of the novel image on exclusion trials for all sessions (RM ANOVA main effect of trial type: F(1, 5) = 166.90, p < .001, ηp² = .97). Importantly, preference for the novel image was never above chance level (Figure 3). Therefore, accurate choice on exclusion trials was not due to a general preference for novelty, as monkeys selected the novel comparison only when the sample was also novel.
Monkeys chose a novel comparison from among familiar comparisons when presented with a novel sample. These results are consistent with findings from sea lions, chimpanzees, dogs, and birds (Schloegl, Bugnyar, & Aust, 2009) and indicate that monkeys can make choices by exclusion. Experiment 2 tested whether monkeys learn new paired associates as a result of this choice by exclusion.

Experiment 2: Exclusion Learning
To test whether learning resulted from choice on exclusion trials, monkeys were presented with two new sets of four paired associates. The basic set was learned solely by trial and error, like the known associates trained for Experiment 1. The exclusion set was learned through an equal number of trial-and-error trials, plus additional exposure to the sample-comparison pairs as the "novel" sample and choice stimuli in exclusion trials. If monkeys learned through choice by exclusion, they should learn the exclusion associate set more rapidly than the basic associate set. More rapid learning would be indicated by a steeper slope of the learning curve in the case of the exclusion set.

Methods
Procedure. Subjects were the same six rhesus monkeys that participated in Experiment 1. Monkeys received fourteen 400-trial sessions made up of trials from three types of stimulus sets: known associates, basic, and exclusion.
Known associates. Known associates were the same four paired associates used in Experiment 1.
Basic stimulus set. The basic stimulus set consisted of four new paired associates. Trials with this set were identical to known associate trials except that the images were four new sample and four new comparison images. After touching a sample image, monkeys chose between four comparison images. Selection of the correct comparison image was rewarded with a positive auditory stimulus and food, whereas selection of an incorrect comparison image was followed by a negative auditory stimulus, a 5-s time-out period, and correction trials as described in Experiment 1.
Exclusion stimulus set. The exclusion stimulus set consisted of another new set of four paired associates. However, this image set was presented in two trial formats: trial and error, and exclusion. The trial-and-error trials were the same type as in the basic image set: after touching a sample image from the exclusion set, monkeys chose between the four comparison images from this same set. As in the basic condition, the correction procedure followed errors. On exclusion trials, a sample image from the exclusion set was presented, followed by four comparison images: three distractor images from the known associate set and the correct comparison image from the exclusion set. Based on their performance on exclusion trials in Experiment 1, monkeys should select the correct comparison image by exclusion on these trials even before they could have learned the paired associates.
Each 400-trial session consisted of 100 known associate trials, 100 basic set trials (all trial and error), and 200 exclusion set trials (100 trial and error and 100 exclusion). The exclusion and basic stimulus sets were thus each presented on the same number of trial-and-error trials per session, but paired associates in the exclusion set were presented twice as often overall as those in the basic set. If monkeys retain anything from their choices on exclusion trials, they should learn the exclusion associate set more quickly than the basic associate set, indicated by a significant interaction between image set and session number.
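The session composition just described can be checked with a short sketch. The trial counts come from the text; everything else is assumed for illustration, including the function name `build_session`, the tuple labels, and the random intermixing of trial types within a session, which the text does not spell out for Experiment 2.

```python
import random
from collections import Counter

def build_session(seed=None):
    """Build one 400-trial session as (stimulus_set, trial_format) pairs."""
    rng = random.Random(seed)
    trials = (
        [("known", "trial_and_error")] * 100
        + [("basic", "trial_and_error")] * 100
        + [("exclusion", "trial_and_error")] * 100
        + [("exclusion", "exclusion")] * 100
    )
    rng.shuffle(trials)  # assumed: trial types are randomly intermixed
    return trials

session = build_session(seed=0)

# Exposure per stimulus set: the exclusion set appears twice as often.
by_set = Counter(stimulus_set for stimulus_set, _ in session)

# Trials per (set, format): trial-and-error exposure is matched across sets.
by_format = Counter(session)

print(by_set["exclusion"], by_set["basic"])
print(by_format[("exclusion", "trial_and_error")],
      by_format[("basic", "trial_and_error")])
```

The design therefore equates trial-and-error experience across the two new sets, so any extra learning in the exclusion set would have to come from the 100 additional exclusion trials.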
The images used in the two test sets were counterbalanced across subjects. To assure the replicability of these findings, monkeys participated in four full iterations of this experiment with new counterbalanced image sets each time. Each iteration followed the same procedure, and known associates were always the same as those in Experiment 1. Iterations were conducted sequentially such that after completing the first, monkeys moved on to the second.
Data analysis. Performance on the four iterations of this experiment was compared using two-way repeated measures ANOVAs (repetition × session) for each of the trial types (known associates, basic set trial and error, exclusion set trial and error, and exclusion set exclusion). To assess whether monkeys learned the exclusion set at a faster rate than the basic set, proportion correct on trial-and-error trials for each set was analyzed using a two-way repeated measures ANOVA (session number × set [basic vs. exclusion]). An interaction between session number and set would indicate a difference in learning rate between the two sets. One-sample t tests compared performance to chance (25%).

Results and Discussion
Monkeys continued to perform above chance on known associate trials (mean across all sessions: 96%, t(5) = 38.89, p < .001, 95% CI [50.46, 61.67]). Additionally, performance on exclusion trials for the exclusion stimulus set was above chance and remained close to 100% throughout testing (mean across all sessions: 98%, t(5) = 84.86, p < .001, 95% CI [69.37, 74.88]). This high level of performance indicates that selection of the correct paired associate by exclusion was regularly reinforced. If monkeys learned from their choices on exclusion trials, they should learn the exclusion set more rapidly than the basic set.
On trial-and-error trials that did not involve exclusion, monkeys were slightly more accurate on the exclusion set than on the basic set (Figure 4; RM ANOVA main effect of image set: F(1, 5) = 7.67, p = .04, ηp² = .61). Monkeys were more accurate than expected by chance with the exclusion set in the first session, but only exceeded chance with the basic set by the third training session. However, there was no interaction between condition and session number, indicating that the rate of learning between the two sets did not differ (F(13, 65) = 0.63, p = .82, ηp² = .11). It may be that the exclusion set gained some small advantage from the added exclusion trials on the first testing session, when all the sets were new. This rapid early advantage would result in the observed slightly higher level of performance on session 1 in the exclusion set that continued throughout the rest of testing. However, if choice by exclusion results in learning, it should occur throughout training and contribute to the rate of learning across all sessions. Such an ongoing contribution to learning would be evident in a steeper learning curve in the case of sets learned with the benefit of exclusion. Such a difference was not observed.
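The distinction drawn here, between a constant advantage and a faster learning rate, can be illustrated with a toy calculation. This is not the repeated measures ANOVA reported above; it is a simplified stand-in that fits a straight line to each learning curve by ordinary least squares. The accuracy values are invented for illustration, and the helper name `fit_line` is hypothetical.

```python
def fit_line(ys):
    """Least-squares line through (1, ys[0]), (2, ys[1]), ...

    Returns (slope, intercept).
    """
    n = len(ys)
    xs = range(1, n + 1)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

sessions = range(1, 15)  # fourteen sessions, as in Experiment 2

# Invented curves: both sets improve at the same rate, but the
# exclusion set starts with a small constant head start.
basic = [0.25 + 0.04 * s for s in sessions]
exclusion = [y + 0.05 for y in basic]

basic_slope, basic_intercept = fit_line(basic)
exclusion_slope, exclusion_intercept = fit_line(exclusion)

print(f"slope difference: {exclusion_slope - basic_slope:.3f}")
print(f"intercept difference: {exclusion_intercept - basic_intercept:.3f}")
```

In this toy case the head start appears only as a difference in intercepts, the analogue of a main effect of image set, while the slopes are identical, the analogue of no set by session interaction.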
Failure to see an enhancement of learning as a result of exclusion trials was clearly not a consequence of lack of reinforced choices. Monkeys were reinforced for selecting the correct exclusion set paired associates on 98% of the 100 exclusion trials per session from the first session. By contrast, monkeys correctly selected the paired associates in the basic set and were reinforced on fewer than 25% of trials in the first few sessions. The lack of a difference in the slope of the basic and exclusion set learning curves indicates that the majority of learning in this experiment was due to trial and error, rather than exclusion. Whereas monkeys are excellent at making choices by exclusion, they do not appear to readily learn from these choices. The processes that account for the modest constant difference in accuracy between the basic and exclusion sets should be the focus of future work.

Choice by Exclusion
In Experiment 1, monkeys rapidly learned to choose a novel comparison when presented with a novel sample, demonstrating choice by exclusion. Monkeys continued to accurately choose by exclusion in Experiment 2. These choices were not controlled by a general novelty preference, because subjects in Experiment 1 did not select the novel comparison when the sample was a familiar known associate. Additionally, performance on exclusion trials in Experiment 2 remained high even after monkeys had seen the to-be-excluded pair thousands of times such that they were no longer truly novel. These results are consistent with findings of robust choice by exclusion in humans, chimpanzees, dogs, and sea lions (Beran, 2010; Beran & Washburn, 2002; Kaminski et al., 2004; Kastak & Schusterman, 2002; Markson & Bloom, 1997; Pilley & Reid, 2011; Tomonaga, 1993).

Learning by Exclusion
Despite accurate choice by exclusion, monkeys failed to learn the exclusion paired associate set more rapidly than the basic set in Experiment 2. This pattern suggests that choice by exclusion does not necessarily result in learning. Whereas subjects may exclude known comparisons on trials with novel sample images, they may pay little or no attention to the properties of the correct comparison image when it is selected. Instead, they may focus on the properties of the known incorrect comparisons. Indeed, execution of a true exclusion rule would require considerable processing of known incorrect comparisons: they have to be identified as such in order to be rejected. If choice of the novel item on exclusion trials results from rejection of the known distracters, little processing of the selected image would be required, as it is selected by virtue of being the only item not known to be incorrect. If subjects fail to process the properties of the image selected by exclusion, it is not surprising that these choices do not contribute to new learning.
These results are consistent with previous findings in animals and humans that learning does not always follow choice by exclusion. A chimpanzee selected a novel lexigram from among known lexigrams when presented with a picture or verbal label for a novel to-be-named item, but showed no retention of pairs selected on these trials (Beran, 2010; Beran & Washburn, 2002). Chaser, a border collie, correctly selected novel toys when presented with novel verbal labels, but showed no retention of these word-item pairs after 24 hrs (Pilley & Reid, 2011). Two-year-old children failed to retain a word-object pairing selected by exclusion after only a 5-minute delay (Horst & Samuelson, 2008). Whereas 2-4-year-old children have shown retention over short and long delays when there was only one word to be remembered (Golinkoff et al., 1992; Markson & Bloom, 1997), this performance decreased when the children were asked to remember a second item (Golinkoff et al., 1992).

Conclusion
The present study provides further evidence that choice by exclusion is not specific to language or to humans, but may be a function of more generalized cognitive mechanisms. Additionally, it suggests that excellent performance on choice by exclusion does not necessarily result in enhanced learning of new associates. Although there is some evidence to suggest that learning can occur through choice by exclusion, previous findings and the results presented here suggest that this learning may not happen readily and robustly, or may occur only under specific conditions. Our findings highlight the importance of separately evaluating choice and retention to better understand the contributions of exclusion to behavior and learning.

Figure 1. Stimuli (left) and trial progression (right) for Experiment 1. Left. Pre-trained known paired associates used in Experiment 1. Right. Example of an exclusion trial from Experiment 1. Monkeys initiated trials by touching the green start box. A sample image appeared in the center of the screen. After this image was touched, four comparison images appeared randomly in the four corners of the screen. On exclusion trials, selection of the novel comparison image (lower left corner in the figure) from among the three known images resulted in a reward, whereas choice of one of the incorrect known comparison images resulted in a time-out period and no reward.

Figure 2. The three trial types presented in Experiment 1. The center sample image ("S") appeared first. Once it was touched, the four comparison images ("C") appeared in the four corners of the screen. Arrows indicate the correct response. Dark boxes indicate novel images. Light boxes indicate trained known associate sample and comparison images.

Figure 3. Proportion choice of the novel comparison image on exclusion (black) and baseline (solid grey) trials on the first five and the final session of testing. Dashed line indicates chance level. Filled data markers indicate above-chance performance; open data markers indicate chance performance. Error bars indicate standard errors.

Figure 4. Average performance across the four repetitions of Experiment 2. Solid lines indicate accuracy on trial-and-error trials that do not involve exclusion for the exclusion image set (black) and basic image set (grey). The dotted black line indicates accuracy on exclusion trials. The dashed grey line indicates chance level; error bars indicate standard errors. Filled data markers indicate above-chance performance; empty data markers indicate performance that does not differ from chance.