#### contact: vanrij.jacolien "a" gmail "dot" com

This page provides the data, graphs, and analysis scripts (R code) for the paper:
Using context to resolve object pronouns, Jacolien van Rij, Bart Hollebrandse, and Petra Hendriks

Submitted for publication in:
Empirical perspectives on anaphora resolution: Information structural evidence in the race for salience, edited by Anke Holler, Christine Goeb and Katja Suckow.

Note: The code for the eye-tracking GAMM analyses is outdated due to updates in the packages mgcv and itsadug. See Porretta, Kyröläinen, van Rij, & Järvikivi (2017) for a more recent description of a GAMM analysis of VWP gaze data.

Response data analysis $$\rightarrow$$

Gaze data analysis $$\rightarrow$$

## 1. Experiment

41 children (4-6 years old) and 36 adults participated in the experiment. Behavioral responses and gaze data were recorded.

Our study investigates whether and how children who do not have adult-like object pronoun interpretation make use of context to resolve Dutch object pronouns. We used a Picture-Verification Task, in which participants were asked to indicate whether the sentence they heard was a correct description of the picture presented on the screen. We contrasted a single-referent context with two-referent contexts and the order of referent introduction within the two-referent contexts. An overview of the different conditions is listed in the table below.

## 2. Design

We tested a 2 x 3 x 2 design, defined by the predictors Image Type (self-oriented, other-oriented), Context (P, PA, AP) and Sentence Type (Reflexive, Pronoun).

| Image type | Context sentence | Test sentence | Congruency |
|---|---|---|---|
| self-oriented | P: Here you see a rabbit. | The squirrel points at himself | congruent |
| | PA: Here you see a rabbit and a squirrel. | | |
| | AP: Here you see a squirrel and a rabbit. | | |
| self-oriented | P: Here you see a rabbit. | The squirrel points at him | incongruent |
| | PA: Here you see a rabbit and a squirrel. | | |
| | AP: Here you see a squirrel and a rabbit. | | |
| other-oriented | P: Here you see a rabbit. | The squirrel points at himself | incongruent |
| | PA: Here you see a rabbit and a squirrel. | | |
| | AP: Here you see a squirrel and a rabbit. | | |
| other-oriented | P: Here you see a rabbit. | The squirrel points at him | congruent |
| | PA: Here you see a rabbit and a squirrel. | | |
| | AP: Here you see a squirrel and a rabbit. | | |

Examples of the presented visual stimuli:

Self-oriented image: Other-oriented image:

Number of test items per participant (32 in total):

| | Pronoun | Reflexive |
|---|---|---|
| Context P | 8 | 8 |
| Context PA | 4 | 4 |
| Context AP | 4 | 4 |
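
As a rough illustration, the design cells and item counts above can be written down in R. This is only a sketch: the column names (`ImageType`, `Context`, `SentenceType`, `Congruent`, `nItems`) are hypothetical and need not match the names used in the data files.

```r
# Illustrative reconstruction of the design; column names are hypothetical.
design <- expand.grid(
  ImageType    = c("self-oriented", "other-oriented"),
  Context      = c("P", "PA", "AP"),
  SentenceType = c("Reflexive", "Pronoun"),
  stringsAsFactors = FALSE
)

# A cell is congruent when a reflexive accompanies a self-oriented image
# or a pronoun accompanies an other-oriented image.
design$Congruent <- with(design,
  (ImageType == "self-oriented"  & SentenceType == "Reflexive") |
  (ImageType == "other-oriented" & SentenceType == "Pronoun")
)

# Test items per participant, by context and sentence type (from the table).
items <- data.frame(
  Context      = rep(c("P", "PA", "AP"), times = 2),
  SentenceType = rep(c("Pronoun", "Reflexive"), each = 3),
  nItems       = c(8, 4, 4, 8, 4, 4)
)

nrow(design)       # 12 design cells (2 x 3 x 2)
sum(items$nItems)  # 32 test items in total
```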

## 3. R info

All analyses were performed in R, using the packages mgcv and itsadug.

Installing the packages from CRAN:

```r
install.packages('itsadug', repos="http://cran.us.r-project.org")
# and the same for other packages...
```

Load the packages for use, and check versions.

```r
R.version.string
## [1] "R version 3.2.1 (2015-06-18)"
# For GAMMs:
library(mgcv)
## Loading required package: nlme
## This is mgcv 1.8-7. For overview type 'help("mgcv-package")'.
# For GAMM interpretation and visualization:
library(itsadug)
## Loaded package itsadug 1.0.1 (see 'help("itsadug")' ).
# for generating this R Markdown report the
# info messages are put on:
infoMessages('on')
# load package MASS for calculating inverse later:
library(MASS)
# load package plyr for calculating averages:
library(plyr)
# For printable plot colors:
library(sp)
```

## 4. Response analysis with SDT

The answers of the child and adult participants were converted into two measures based on Signal Detection Theory (SDT; Macmillan and Creelman 2004; Stanislaw and Todorov 1999):

• The sensitivity $$d'$$ reflects how well participants can distinguish between congruent and incongruent trials, with a higher (positive) value of $$d'$$ indicating more correct “yes” responses on congruent trials and fewer incorrect “yes” responses on incongruent trials (cf. Başkent et al. 2013).

• The response bias $$C$$ reflects the difference between the participant’s bias and an ideal observer bias. In other words, $$C$$ reflects the participants’ answering strategy:

• a value around zero indicates that participants are equally likely to give incorrect responses on congruent and on incongruent items (no bias),

• a positive value indicates that participants are more likely to give incorrect responses on congruent items than on incongruent items (“no” bias),

• and a negative value indicates that participants are more likely to give incorrect responses on incongruent items than on congruent items (“yes” bias).

In our experiment, we treated the congruent items as the ‘signal’ and the incongruent items as the ‘noise’. In other words, in the SDT analysis we are interested in whether participants responded differently to congruent items, i.e., a match between picture and referring expression (e.g., an other-oriented action in the picture presented with a pronoun), and incongruent items, i.e., a mismatch between picture and referring expression. The responses are relabeled following the SDT classification, as illustrated in the table below.

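| Response $$\downarrow$$ | Pronoun with other-oriented picture | Reflexive with other-oriented picture | Pronoun with self-oriented picture | Reflexive with self-oriented picture |
|---|---|---|---|---|
| “yes” | HIT | FALSE ALARM | FALSE ALARM | HIT |
| “no” | MISS | CORRECT REJ. | CORRECT REJ. | MISS |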
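
The relabeling in the table can be sketched as a small R helper. The function name `classify_sdt`, its argument names, and the example data frame are hypothetical and only illustrate the classification; they are not taken from the analysis scripts.

```r
# Hypothetical helper: maps image type, sentence type, and response
# onto the four SDT response categories from the table above.
classify_sdt <- function(image_type, sentence_type, response) {
  # Congruent (signal) trials: pronoun with an other-oriented picture,
  # or reflexive with a self-oriented picture.
  congruent <- (image_type == "other-oriented" & sentence_type == "pronoun") |
               (image_type == "self-oriented"  & sentence_type == "reflexive")
  said_yes <- response == "yes"
  ifelse(congruent,
         ifelse(said_yes, "hit", "miss"),
         ifelse(said_yes, "false alarm", "correct rejection"))
}

# Four made-up trials, one per column of the table.
trials <- data.frame(
  image_type    = c("other-oriented", "other-oriented",
                    "self-oriented", "self-oriented"),
  sentence_type = c("pronoun", "reflexive", "pronoun", "reflexive"),
  response      = c("yes", "yes", "no", "no"),
  stringsAsFactors = FALSE
)
trials$sdt <- with(trials, classify_sdt(image_type, sentence_type, response))
trials
```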

### Relation between SDT and accuracy

The SDT differentiates four different response types (hit, miss, false alarm and correct rejection), rather than two (correct and incorrect). As a result, it can disentangle the participant’s sensitivity to the stimuli from potential response biases.

• The sensitivity $$d'$$ reflects how well participants can distinguish between congruent and incongruent trials, with a higher (positive) value of $$d'$$ indicating more correct “yes” responses on congruent trials and fewer incorrect “yes” responses on incongruent trials.

• The response bias $$C$$ reflects the difference between the participant’s bias and an ideal observer bias. In other words, $$C$$ reflects the participants’ response strategy:

• a value around zero indicates that participants are as likely to say “no” to congruent items as to say “yes” to incongruent items,

• a positive value indicates that participants are more likely to say “no” to congruent items than to say “yes” to incongruent items,

• and a negative value indicates that participants are more likely to say “yes” to incongruent items than to say “no” to congruent items.

The interactive plot below shows how the SDT measures $$d'$$ and $$C$$ relate to the accuracy on “signal” items (i.e., congruent items) and “noise” items (i.e., incongruent items). Note: The plot assumes an equal number of congruent and incongruent items (see footnote 1).

The SDT measures are calculated from the count of responses in each of the response categories. Note that the number of items or responses per condition changes the granularity of these measures (i.e., the potential values these measures can take). Try calculating the SDT measures in the interactive plot based on 10 items (the default), 8 items, or 5 items (type the number in the field Adjust number of items: and press ENTER). Both the accuracy values and the $$d'$$ and $$C$$ measures may change.

Our experiment included only 4 items in each of the two-referent contexts, but 8 items in the single-referent context. To avoid differences due to granularity, we split the single-referent condition into two groups and calculated the averages over these two groups. Additionally, we performed an analysis that collapsed the two-referent context conditions to compare the single-referent context with a two-referent context, ignoring the order of referent introduction. The results of the two analyses are highly similar.
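
The effect of item count on granularity can be illustrated with a short R sketch that lists every $$d'$$ value obtainable from a given number of items. It assumes equal numbers of congruent and incongruent items and, so that rates of 0 and 1 remain finite, a log-linear correction (0.5 added to each count, 1 to each total). That correction and the function name `possible_dprime` are assumptions made for this illustration, not necessarily what the analysis scripts do.

```r
# All d' values obtainable with n congruent and n incongruent items.
# Assumption: log-linear correction, so rates of 0 and 1 stay finite.
possible_dprime <- function(n) {
  counts <- 0:n                        # possible numbers of "yes" responses
  z <- qnorm((counts + 0.5) / (n + 1)) # z-scores of the corrected rates
  # d' = z(hit rate) - z(false-alarm rate) for every combination of counts
  sort(unique(round(as.vector(outer(z, z, "-")), 6)))
}

d4 <- possible_dprime(4)
d8 <- possible_dprime(8)

# More items give a finer and wider set of attainable values.
length(d4)
length(d8)
range(d4)
range(d8)
```

With fewer items the attainable values are sparser and the extremes are less far apart, which is why conditions with different item counts are not directly comparable.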

The “yes”-bias observed in language acquisition studies (e.g., Chien and Wexler 1990; van Rij, van Rijn, and Hendriks 2010) is characterized by a high accuracy on congruent items, but low accuracy on incongruent items (i.e., saying “yes” all the time). This translates to a negative $$C$$ measure.

### How to calculate the SDT measures

1. Relabel the responses. Depending on the visual context (i.e., the image on the screen), pronouns and reflexives are treated as signal or as noise. See the response classification table above.

2. Count the number of responses in each of these categories per participant per condition.

3. Calculate hit rate and false-alarm rate and convert these to z-scores using the normal quantile function $$\Phi^{-1}$$.

4. Calculate the SDT measures $$d'$$ and $$C$$.
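
Steps 3 and 4 can be sketched in R using the standard definitions given by Stanislaw and Todorov (1999), $$d' = \Phi^{-1}(H) - \Phi^{-1}(FA)$$ and $$C = -\frac{1}{2}\left[\Phi^{-1}(H) + \Phi^{-1}(FA)\right]$$, where $$H$$ is the hit rate and $$FA$$ the false-alarm rate. The function name `sdt_measures` is hypothetical, and the log-linear correction for rates of 0 or 1 is an assumption made here; the analysis scripts may handle extreme rates differently.

```r
# Minimal sketch: SDT measures from response counts.
# Assumption: log-linear correction (add 0.5 to counts, 1 to totals)
# to avoid infinite z-scores when a rate is 0 or 1.
sdt_measures <- function(hits, misses, false_alarms, correct_rejections) {
  hit_rate <- (hits + 0.5) / (hits + misses + 1)
  fa_rate  <- (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
  z_hit <- qnorm(hit_rate)  # normal quantile function, Phi^{-1}
  z_fa  <- qnorm(fa_rate)
  data.frame(
    hit_rate = hit_rate,
    fa_rate  = fa_rate,
    dprime   = z_hit - z_fa,
    C        = -(z_hit + z_fa) / 2
  )
}

# A "yes"-biased response pattern: "yes" on all 4 congruent items
# and on 3 of the 4 incongruent items.
yes_biased <- sdt_measures(hits = 4, misses = 0,
                           false_alarms = 3, correct_rejections = 1)

# An unbiased pattern: one error on each item type.
unbiased <- sdt_measures(hits = 3, misses = 1,
                         false_alarms = 1, correct_rejections = 3)

yes_biased
unbiased
```

In this sketch the “yes”-biased pattern yields a negative $$C$$, while the unbiased pattern yields a $$C$$ of zero, in line with the interpretation of $$C$$ given above.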

## 5. Nonlinear regression

All analyses were performed with Generalized Additive Mixed-effects Models (GAMMs; Lin and Zhang 1999) as implemented in the R package mgcv (Wood 2006; Wood 2011). In contrast to linear models, such as linear regression or linear mixed-effects models, GAMMs do not assume that the relation between the predictors and the dependent variable (the measure, e.g., $$d'$$ or accuracy) is linear. Thus, GAMMs are a nonlinear regression method. This is useful because many psycholinguistic data sets, including the data analyzed in this paper, show nonlinear patterns. For example, we have no reason to assume that the sensitivity $$d'$$ changes linearly with age, and we also have no reason to assume that gaze position changes linearly over the course of a trial.
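
To illustrate the difference between a linear term and a smooth term, the following sketch fits both to simulated data with `mgcv::gam`. The data, the variable names (`Age`, `dprime`), and the shape of the simulated effect are all made up for this example, and random effects are left out for brevity; the actual models are given in the analysis scripts.

```r
library(mgcv)

set.seed(1)
# Simulated, hypothetical data: a measure that rises nonlinearly with age
# (in months) and then levels off.
n <- 200
sim <- data.frame(Age = runif(n, 48, 84))
sim$dprime <- 2 * plogis((sim$Age - 60) / 4) + rnorm(n, sd = 0.4)

# Linear effect of age versus a smooth (potentially nonlinear) effect.
linear_fit <- gam(dprime ~ Age,    data = sim, method = "ML")
smooth_fit <- gam(dprime ~ s(Age), data = sim, method = "ML")

# Estimated degrees of freedom of the smooth: values clearly above 1
# indicate that a straight line does not describe the pattern well.
summary(smooth_fit)$edf

# Compare the two fits.
AIC(linear_fit, smooth_fit)
```

In a mixed-effects setting, random effects can be added as smooth terms as well, for example with `s(Subject, bs = "re")` for random intercepts, where `Subject` would be a factor identifying participants.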

The next sections provide the R scripts for the response data analysis (with SDT measures), and for the gaze data analysis.

[Next $$\rightarrow$$]

## References

Başkent, Deniz, Jacolien van Rij, Zheng Yen Ng, Rolien Free, and Petra Hendriks. 2013. “Perception of Spectrally Degraded Reflexives and Pronouns by Children.” The Journal of the Acoustical Society of America 134 (5): 3844–52.

Lin, X., and D. Zhang. 1999. “Inference in Generalized Additive Mixed Models by Using Smoothing Splines.” Journal of the Royal Statistical Society: Series B (Statistical Methodology) 61: 381–400.

Macmillan, Neil A., and C. Douglas Creelman. 2004. Detection Theory: A User’s Guide. Psychology Press.

Stanislaw, Harold, and Natasha Todorov. 1999. “Calculation of Signal Detection Theory Measures.” Behavior Research Methods, Instruments, & Computers 31 (1): 137–49.

Wood, Simon N. 2006. Generalized Additive Models: An Introduction with R. Chapman & Hall/CRC.

———. 2011. “Fast Stable Restricted Maximum Likelihood and Marginal Likelihood Estimation of Semiparametric Generalized Linear Models.” Journal of the Royal Statistical Society (B) 73 (1): 3–36.

1. Refresh if plot is inactive. In case the interactive plot is not loaded, please inspect the interactive plot at http://jacolienvanrij.shinyapps.io/SDTmeasures.