Published online by Cambridge University Press: 23 February 2016
Within the online media universe, there are many underlying communities. These may be defined, for example, through politics, location, health, occupation, extracurricular interests or retail habits. Government departments, charities and commercial organisations can benefit greatly from insights about the structure of these communities; the move to customer-centred practices requires knowledge of the customer base. Motivated by this issue, we address the fundamental question of whether a sub-network looks like a collection of individuals who have effectively been picked at random from the whole, or instead forms a distinctive community with a new, discernible structure. In the former case, the best way to spread a message to the intended user base may be traditional broadcast media (TV, billboard), whereas in the latter case a more targeted approach could be more effective. In this work, we therefore formalise a concept of testing for sub-structure and apply it to social interaction data. First, we develop a statistical test to determine whether a given sub-network (induced sub-graph) is likely to have been generated by sampling nodes from the full network uniformly at random. This addresses an inverse version of the more widely studied “forward” problem. We then apply the test to a Twitter reciprocated-mentions network in which a range of brand-name-based sub-networks are created from tweet content. We correlate the computed results with the independent views of 16 digital marketing professionals. We conclude that there is great potential for social-media-based analytics to quantify, compare and interpret online brand allegiances systematically, in real time and at large scale.
Submitted to the European Journal of Applied Mathematics, Special Issue on Networks.
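
To make the null hypothesis in the abstract concrete, the following is a minimal Monte Carlo sketch of one way such a question could be posed. It is not the statistical test developed in this work. It assumes an undirected network stored as a dictionary of neighbour sets, and it uses the number of edges in the induced sub-graph as the test statistic, which is purely an illustrative choice. The function names `build_toy_graph`, `induced_edge_count` and `monte_carlo_p_value`, and the toy network itself, are hypothetical.

```python
import random


def build_toy_graph(n=40, clique_size=6):
    """Hypothetical toy network: a ring of n nodes plus a planted clique."""
    adj = {i: set() for i in range(n)}
    # Sparse background: each node is linked to its ring neighbour.
    for i in range(n):
        j = (i + 1) % n
        adj[i].add(j)
        adj[j].add(i)
    # Planted community: nodes 0..clique_size-1 are fully connected.
    for u in range(clique_size):
        for v in range(u + 1, clique_size):
            adj[u].add(v)
            adj[v].add(u)
    return adj


def induced_edge_count(adj, nodes):
    """Number of edges in the sub-graph induced by `nodes`."""
    node_set = set(nodes)
    # Each undirected edge is counted from both endpoints, so halve the sum.
    return sum(len(adj[u] & node_set) for u in node_set) // 2


def monte_carlo_p_value(adj, sub_nodes, n_samples=2000, seed=0):
    """Estimate how often a uniformly sampled node set of the same size
    induces at least as many edges as the observed sub-network.

    Assumption: edge count is the test statistic (illustrative only).
    """
    rng = random.Random(seed)
    population = sorted(adj)
    k = len(set(sub_nodes))
    observed = induced_edge_count(adj, sub_nodes)
    hits = 0
    for _ in range(n_samples):
        sample = rng.sample(population, k)  # uniform, without replacement
        if induced_edge_count(adj, sample) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (hits + 1) / (n_samples + 1)


if __name__ == "__main__":
    adj = build_toy_graph()
    observed, p = monte_carlo_p_value(adj, range(6))
    print(f"observed edges: {observed}, estimated p-value: {p:.4f}")
```

A small estimated p-value suggests that the sub-network is denser than a uniformly sampled node set of the same size would typically be, that is, it looks like a distinctive community rather than a random selection from the whole. Other statistics could be substituted for the edge count without changing the overall structure of the sketch.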