Bodong Chen

Crisscross Landscapes

Notes: Choudhury - 2004 - Characterizing social interactions using the sociometer



Citekey: @Choudhury2004

Choudhury, T., & Pentland, A. (2004). Characterizing social interactions using the sociometer. Proceedings of NAACOS, 1–4. Retrieved from


A much earlier version of sociometers when they only measure voices for conversation and turn-taking.



This paper describes our work in developing a computational framework to model face-to-face interactions within a community. We have integrated methods from speech processing and machine learning to demonstrate that it is possible to extract information about people’s patterns of communication, without imposing any restriction on the user’s interactions or environment. Furthermore, we analyze some of the conversational dynamics and present results that demonstrate distinctive and consistent turntaking styles for individuals during conversations. Finally, we present results that show strong correlation between a person’s turn-taking style during one-on-one conversations and the person’s role within the network. (p. 1)

The sociometer is an adaptation of the hoarder board, a wearable data acquisition board, designed by the electronic publishing and the wearable computing groups at the Media lab [6]. The sociometer stores the following information (i) identity of people wearing the sociometer (IR sensor 17Hz) and (ii) speech information (microphone-8KHz). (p. 1)

Figure 1: The Sociometer (p. 1)


Once we detect the pair-wise conversations we can measure all the communication that occurs within the community and map the links between individuals. (p. 2)

Detecting Conversations (p. 2)

To detect conversations, we need to reliably segment speech regions from the raw audio. As the first step, we extract spectral features proposed in [2] that discriminate well between speech and non-speech regions. An HMM is trained to detect voiced/unvoiced regions using the features. This method works very reliable even in noisy environment with less than 2% error at 10dB SNR. However, the downside of this is that all speech and not just the user’s speech are detected. (p. 2)

we can use the energy of the speech signal to segment the user’s speech from the rest. When two people are nearby and are talking, although it is highly likely that they are talking to each other, we cannot say this with certainty. Results presented in [2] demonstrate that we can detect whether two people are in a conversation by relying on the fact that the speech of two people in a conversation is tightly synchronized. (p. 2)

We can reliably detect when two people are talking to each other by calculating the mutual information of the two voicing streams, which peaks sharply when they are in a conversation as opposed to talking to someone else. This measures works very well for conversations that are at least one minute in duration. (p. 2)


We primarily focus on the turn-taking patterns of individuals and how they differ from each other. We use these individual dynamics to later estimate how much an individual’s overall pattern changes during her interaction with specific individuals. (p. 3)

We start by defining a “turn”. For each unit of time we estimate how much time each of the participants speaks, the participants who has the highest fraction of speaking time is considered to hold the “turn” for that time unit. (p. 3)

We use the speaker segmentation output within conversations to estimate the turn-taking transition probability. Because most of the conversations in the dataset are between pairs, we transition between two states: speaker A’s turn and speaker B’s turn. (p. 3)

Once we have estimated the turn-taking transition probabilities for the individuals we can measure how similar or dissimilar they are from each other. Figure 6 shows the output of multidimensional scaling of the transition probabilities using a Euclidean distance metric, which shows that individuals have distinctive turn-taking styles and that these turn-taking patterns are not just a noisy variation of the same average style. (p. 4)

Estimating Influences from Turn-taking Dynamics (p. 4)

Now by learning the influence parameters we can measure how much one person affects another’s turn-taking behavior. We discovered an interesting and statistically significant correlation between a person’s influence score and their centrality, the correlation value was 0.90 (p-value < 0.0004, rank correlation 0.92). It appears that a person’s interaction style is indicative of her role within the community based on centrality measure.