read

References

Citekey: @Oliveira2014

Oliveira, M., Guerreiro, A., & Gama, J. (2014). Dynamic communities in evolving customer networks: an analysis using landmark and sliding windows. Social Network Analysis and Mining, 4(1), 208. doi:10.1007/s13278-014-0208-2

Notes

This article contrasts two kinds of dynamic network analysis—landmark window and sliding window—by applying both approaches to analysis of a customer network. Besides comparisons of well-known network metrics (e.g., centrality, density), it also tracks development of sub-communities in the network. In particular, it applies an interesting algorithm named MEC (never explained in the paper) to track community evolution, i.e., birth, survival and death. I found this part most interesting for this paper.

Overall, a nice intro to two different approaches of dynamic network analysis which could be applied to other contexts.

Note: This study uses Gephi for data analysis. Additional resources I found online using R to conduct dynamic network analysis:

  • http://kateto.net/network-visualization
  • http://estebanmoro.org/2012/11/temporal-networks-with-igraph-and-r-with-20-lines-of-code/

Highlights

sliding windows focus on the most recent past thus allowing to capture current events. The application of the proposed methodology on a real-world customer network suggests that both window models provide complementary information. Nevertheless, the sliding window model is able to capture better the recent changes of the network. (p. 1)

Social Network Analysis techniques can play a key role in business analytics. By modelling the implicit relationships among customers as a social network, it is possible to understand how patterns in these relationships translate into competitive advantages for the company. Additionally, the incorporation of the temporal dimension in such analysis can help detect market trends and changes in customers’ preferences. In this paper, we introduce a methodology to examine the dynamics of customer communities, which relies on two different time window models: a landmark and a sliding window. Landmark windows keep all the historical data and treat all nodes and links equally, even if they only appear at the early stages of the network life. Such approach is appropriate for the longterm analysis of networks, but may fail to provide a realistic picture of the current evolution. On the other hand, (p. 1)

We define customer network as a finite set of customers who are linked to each other if they bought at least one similar product during a given timeframe. Modelling customer-related data using this type of customer networks has the advantage of revealing implicit relationships among customers based on their purchasing behaviour over a given time period. (p. 2)

The most related work to ours is the one by Saganowski et al. (2012). In this work, the authors carry out an empirical study of the influence of several time window models (e.g. sliding window with no overlap, overlapping sliding window, landmark window) on the number of detected community events (e.g. forming, shrinking, merging, splitting). Based on their experiments, they conclude that the choice of the granularity at which the time varying network is snapshotted impacts the results of the method used to extract community dynamics. While landmark windows prove to be useful to detect stable communities, the sliding window model with overlap is more suitable for extracting community evolution in rapidly changing social networks. (p. 2)

one of the key features of many networks is that their structure evolves over time, so approaches focusing on the analysis of a fixed snapshot of the network may fail to capture the dynamics of the evolving network. (p. 2)

Another relevant research is the one by Falkowski et al. (2006), who developed a twopronged methodology to analyse the evolution of two types of dynamic communities in social networks: communities with rather stable membership structure and communities with high fluctuation of members. (p. 2)

methodologies relying on accumulated windows (aka landmark windows) can only discover small changes in communities in consecutive timeframes and any drastic change in short time may potentially remain undetected. Unlike landmark windows, sliding windows focus on the most recent state of the dynamic network when analysing their evolution, thus being able to capture more up-to-date events, especially when dealing with volatile networks. (p. 2)

Our contributions are threefold: (a) application of dynamic community mining in a real-world customer network for the purpose of identifying different evolutionary profiles of customers; (b) use of a sliding window model in the study of community evolution as a complement to the conventional landmark window model and; (c) extension of a previously proposed event-based (p. 2)

the significant body of literature addressing the problem of dynamic network analysis (see Berger et al. 2010, for an overview of the topic) (p. 2)

framework for monitoring clusters dynamics, dubbed MEC (Oliveira and Gama 2012), to tackle the problem of community evolution. (p. 3)

2 Background (p. 3)

2.3 Dynamic community mining (p. 3)

Communities are unstable patterns that can evolve in both membership and content. (p. 3)

Unlike landmark windows, the sliding window model (Datar et al. 2002) incorporates a time-based forgetting mechanism, keeping only the latest information inside the window and disregarding all the data falling outside the window. (p. 4)

Past research on community mining discarded the temporal information by modelling the dynamic network as a static graph. (p. 4)

As a drawback, it might be hard to determine the right parameter settings. It is important to note that there is a trade-off between the window length and the ability to capture changes. Small windows will capture rapid changes, but lose information (memory) about network stability. (p. 4)

Recent work proposes to characterize the evolution of a given community by describing its life-cycle, i.e. a series of critical events undergone by communities over time (Palla et al. 2007; Lin et al. 2009; Asur et al. 2009; Greene et al. 2010; Takaffoli et al. 2011; Bro ́ka et al. 2013). Typically, these approaches rely on a landmark window model for the process of extracting the snapshots of the evolving network, since they take into consideration all the historical network data collected up to the current time point. Alternative models, such as the sliding window, are embedded with forgetting mechanisms that drop older data from the analysis thus allowing to uncover the most recent changes occurring in the network. (p. 4)

3 Methodology (p. 4)

2.4 Window models (p. 4)

Landmark windows (Gehrke et al. 2001) encompass all the data from a specific point in time up to the current moment. (p. 4)

This model is initialized by first selecting a fixed time point (the so-called landmark), which marks the beginning of the time window, and then it grows the window by considering all the data seen so far after the landmark. (p. 4)

The main steps are: (p. 4)

  1. Analysis and description of the dynamic network, at both the network-level and node-level, using wellknown SNA metrics; (p. 4)

  2. Application of our extended community evolution framework, dubbed MECnet, to each window setting; (p. 4)

  3. Interpretation of the dynamics of customers’ communities (or profiles). (p. 5)

At the node-level, we analyse the following metrics for undirected networks: eigenvector centrality and betweenness. At the network-level, we focus on density and modularity (Newman and Girvan 2004). (p. 5)

MEC is a framework proposed by Oliveira and Gama (2012) to monitor the evolution of clusters. MEC traces evolution through the detection and categorization of clusters transitions, such as births, splits and merges. (p. 5)

Based on this information, MECnet then models the communities and their transitions as nodes, respectively weighted edges, in an evolution graph (i.e. consecutive sets of weighted bipartite graphs). (p. 5)

The manipulation, visualization and analysis of the network is performed on Gephi (Bastian et al. 2009) by making use of its dynamic network analysis features. (p. 7)

4 Case study (p. 7)

For the landmark model, we start with a window of 3 months and then we cumulatively grow the window by adding one month at each step. For the overlapping sliding model, we set the window length to 3 months (w 1⁄4 3) and its step width to one month. In both cases, the total number of timesteps is 10. (p. 7)

SNA techniques to perform an exploratory analysis of their customers purchasing behaviour, so as to identify differentiable customers’ profiles (or customers’ communities) and their evolution over a given year. (p. 7)

4.1 Network data (p. 7)

We detect the communities at each Wk using the Louvain method (Blondel et al. 2008) since it produces disjoint partitions of good quality, in a very fast way. (p. 7)

Finally, we evaluate both approaches based on a double perspective. First, we compute a quantitative measure, namely the Survival Ratio proposed by Spiliopoulou et al. (2006), to measure network volatility and the frequency of community transitions in both scenarios. This ratio is given in Eq. (2) and it basically computes the portion of communities found at timestep Wk (k 1⁄4 1; :::; 10) that survived in WkþDt . (p. 7)

We compute the following SNA metrics, which were considered to be of higher relevance within the scope of this case study: eigenvector centrality, betweenness, density and modularity. (p. 8)

Then, we qualitatively assess the actionable insights derived from the analysis of each window model, from the business viewpoint. (p. 8)

Concerning the sliding window, we observe an overall instability in the metrics values. This was expected due to the time-based forgetting mechanism incorporated in the model. The fluctuation of the eigenvector centrality suggests a lack of customers’ purchasing activity. This is related to the company’s business nature, which relies on long-term projects and high added value products which, in turn, are typically associated with low-frequency purchases. Once again, the density of the network is close to zero, due to the sparsity of links between customers. (p. 10)

MECnet detects several types of events which mirror the dynamics of communities. In this paper, we focus only on survivals, births and deaths since they were found to be more representative of the dynamics occurring at the community-level. (p. 10)

The total number of survival, birth and death events detected by MECnet (with a survival threshold of s 1⁄4 0:5) for both window approaches, at each timestep interval, is provided in Fig. 7. From the analysis of these figures, we can deduce that the overlapping sliding window model is able to capture the unstable community structure of the customer network, by focusing only on the most recent past. This volatility is reflected on the high number of births and deaths and relatively low number of survivals and is captured by the values of the survival ratio (see Fig. 7). These differences on the number of events show the potential of the sliding window in capturing temporary acute changes occurring in the network, reflecting more closely the changes in customers’ preferences for products. (p. 12)

the analysis of the dynamics of the customer network using the sliding window approach, as a complement of the traditional landmark window, allows us to identify the migration of customers between communities. (p. 13)

5 Conclusions and future work (p. 18)

In this paper, we present the first results of our exploratory analysis on the company’s customer network. This network uncovers the similarities of purchasing behaviour among the company’s customers. The application of dynamic community mining using two different time window models allowed us to identify the evolutionary profile of groups of customers and grasp insights into the customer base. The results suggest that both window models provide complementary information regarding the dynamics of the underlying network. While the landmark window considers all the historical data, the sliding window employs a catastrophic forgetting of older data, focusing only on the most recent past. Given these distinct perspectives, the sliding window approach proves to be more suitable for, e.g. detecting changes in customers’ purchasing behaviour, whereas landmark windows are more appropriate to identify persistent customers’ profiles. (p. 18)

alternative time window approaches (e.g. accumulated time windows with fading links) (p. 18)

Blog Logo

Bodong Chen


Published

Image

Crisscross Landscapes

Bodong Chen, University of Minnesota

Back to Home