[statnet_help] ergm inquiry

Carter T. Butts buttsc at uci.edu
Sun Aug 13 13:38:51 PDT 2023


Hi, Julia -

To add to Michal's excellent observations:

  - You seem to have a lot of out-isolates and in-isolates that you are
not catching.  (They may be isolates per se - both out and in - but we
can't tell from these plots.)   Not accounting for that can lead to
edges being too uniformly dispersed through the network.  You may thus
want to consider idegree(0), odegree(0), and/or isolates() terms, if
there is not a covariate that accounts for the structure.
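
As a quick check before adding those terms, something like the following can show how many in-isolates, out-isolates, and full isolates the observed graph has. This is only a sketch: `net` is a random toy network standing in for your data (an assumption, not your network), so substitute your own object.

```r
library(network)
library(sna)
library(ergm)

# Toy directed network standing in for the real data (hypothetical).
set.seed(1)
net <- network(rgraph(30, tprob = 0.05), directed = TRUE)

# Observed in- and outdegree of every vertex.
indeg  <- sna::degree(net, gmode = "digraph", cmode = "indegree")
outdeg <- sna::degree(net, gmode = "digraph", cmode = "outdegree")

# Vertices with no incoming ties, no outgoing ties, and neither.
iso_counts <- c(in_isolates  = sum(indeg == 0),
                out_isolates = sum(outdeg == 0),
                isolates     = sum(indeg == 0 & outdeg == 0))
print(iso_counts)

# The same three counts, computed as ergm statistics.
iso_stats <- summary(net ~ idegree(0) + odegree(0) + isolates)
print(iso_stats)
```

If the in- or out-isolate counts are much larger than the full isolate count, that would suggest the idegree(0) and odegree(0) terms are doing work that isolates() alone cannot.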

  - As Michal observes, you clearly have some very locally dense
clusters, and it is unlikely that you will capture them using a
homogeneous clustering term - while exceptions exist, these are usually
associated with inhomogeneity, generally clustering around a specific
attribute, affiliation, etc.  His suggestions for dealing with that are
very apposite.  But an even more basic suggestion is to do more
exploratory investigation of your network.  Have you tried plotting the
sociogram, and coloring the nodes by their attributes?  If the graph is
too dense to easily see what is going on, the following can help:

     - Increase the translucency of your edges.  This One Weird Trick
is very simple, but very useful - if the graph is dense, don't be afraid
to make the edges so transparent that they are barely visible!  When
combined with varying the shape/color of nodes by attribute, this can be
very revealing of internal structure, and particularly of where your
clustering is coming from.  The edge.col option in plot.network or sna's
gplot function can be used for this quite easily.
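
A minimal sketch of the translucent-edge idea, using gplot from sna. The network, the attribute name "grp", and the palette are all made up for illustration; only edge.col and vertex.col matter here.

```r
library(network)
library(sna)

# Dense toy digraph with extra ties within groups (hypothetical data).
set.seed(2)
n   <- 60
grp <- sample(c("a", "b", "c"), n, replace = TRUE)
p   <- ifelse(outer(grp, grp, "=="), 0.5, 0.1)   # tie probability per dyad
adj <- matrix(rbinom(n * n, 1, p), n, n)
diag(adj) <- 0
net <- network(adj, directed = TRUE)
net %v% "grp" <- grp

# One color per attribute level; edges in a nearly transparent black.
pal <- c(a = "firebrick", b = "steelblue", c = "darkgreen")
gplot(net,
      vertex.col    = pal[net %v% "grp"],
      edge.col      = rgb(0, 0, 0, alpha = 0.05),
      arrowhead.cex = 0.3)
```

The alpha value is something to tune by eye; for a dense graph, values well below 0.1 are often reasonable.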

     - Take the graph apart using degree k-cores.  The kcores() command
in sna will give you the core number for every vertex in the network,
using your favorite centrality measure; the default is Freeman degree,
which is probably a good starting point for your network.  Try
sequentially plotting the induced subgraphs for each core (i.e.,
everything in the 1-core, the 2-core, the 3-core, etc.), again coloring
the nodes by their attributes.  (Page 24, figure 4 of this paper has an
example of this approach:
https://sciendo.com/article/10.21307/joss-2019-027)  What this will do
is strip away the less cohesive parts of the network, revealing highly
cohesive subgraphs that may otherwise be obscured by the surrounding
structure.  If you look at figure 8 in that same paper, you'll see an
example where that is used to reveal the presence of several highly
cohesive, homophilous clusters that are not evident if you look at the
whole graph (which is both very large and adorned with pendant trees). 
Knowing the composition of the "deep" portions of the network will often
give you strong indications of what is shaping it, and of what terms you
will need in your model.
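
The core-by-core plotting described above might look roughly like this. Again a sketch under assumptions: the network is a random toy graph, and "grp" is a hypothetical vertex attribute.

```r
library(network)
library(sna)

# Toy directed network with a made-up vertex attribute.
set.seed(3)
n   <- 40
net <- network(rgraph(n, tprob = 0.08), directed = TRUE)
net %v% "grp" <- sample(c("a", "b"), n, replace = TRUE)
pal <- c(a = "firebrick", b = "steelblue")

# Core number of each vertex, based on Freeman (total) degree.
kc <- kcores(net, mode = "digraph", cmode = "freeman")

# Plot the induced subgraph of each successive k-core.
for (k in sort(unique(kc[kc > 0]))) {
  keep <- which(kc >= k)
  sub  <- get.inducedSubgraph(net, v = keep)
  gplot(sub,
        vertex.col = pal[sub %v% "grp"],
        edge.col   = rgb(0, 0, 0, alpha = 0.3))
  title(main = paste0(k, "-core (", length(keep), " vertices)"))
}
```

Swapping cmode to "indegree" or "outdegree" gives directed variants of the same decomposition, which may be worth a look for a directed network.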

     - Plot distributions of degree and other structural indices by
attribute.  Simply looking at these distributions can be revealing - in
addition to in/outdegree heterogeneity, examining e.g. cycle or clique
membership can be another way to detect heterogeneity in local
structure, all of which can inform your choice of graph statistics. 
(Beyond nodal covariate effects and terms like nodematch/nodemix, note
that inhomogeneous versions of e.g. degree and gwdegree exist, and can
be useful when you have groups that interact in very different ways. 
Even localtriangle can sometimes be helpful, if you have very small
groups whose patterns of cohesion cannot be captured with mixing terms.)
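
For the degree-by-attribute comparison, and for the attribute-specific degree terms, a rough sketch follows. The data and the attribute "grp" are invented for the example; the directed analogues of the terms mentioned above (idegree/odegree with by=, gwidegree/gwodegree with attr=) are used because the network in question is directed.

```r
library(network)
library(sna)
library(ergm)

# Toy directed network with a made-up two-level attribute.
set.seed(4)
n   <- 50
net <- network(rgraph(n, tprob = 0.06), directed = TRUE)
net %v% "grp" <- sample(c("a", "b"), n, replace = TRUE)

grp    <- net %v% "grp"
indeg  <- sna::degree(net, gmode = "digraph", cmode = "indegree")
outdeg <- sna::degree(net, gmode = "digraph", cmode = "outdegree")

# Side-by-side degree distributions for each attribute level.
op <- par(mfrow = c(1, 2))
boxplot(indeg ~ grp,  main = "Indegree by group",  xlab = "grp", ylab = "indegree")
boxplot(outdeg ~ grp, main = "Outdegree by group", xlab = "grp", ylab = "outdegree")
par(op)

# Group-specific statistics: zero-degree counts and GW degree per level of grp.
by_group <- summary(net ~ idegree(0, by = "grp") + odegree(0, by = "grp") +
                      gwidegree(0.5, fixed = TRUE, attr = "grp") +
                      gwodegree(0.5, fixed = TRUE, attr = "grp"))
print(by_group)
```

The decay value of 0.5 is arbitrary here; the point is only that each term yields one statistic per attribute level, which can then be entered in the model formula in place of its homogeneous counterpart.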

Those are just a few examples, but the general suggestion is to do more
exploratory analysis to try to understand the network and its structure
(and build your model around that), rather than trying to rely entirely
on gof plots to infer what is going on. The latter does sometimes work,
especially if the network is very simple, but complex cases usually
require a combination of substantive insight (i.e., thinking about what
the network is, where it came from, how it was measured, etc., and about
how those things would be expected to shape the observed structure) and
exploratory analysis (to reveal major quirks and inhomogeneities,
provide clues about good term candidates, etc.).  Another benefit of
putting more time into these pre-modeling efforts is that they usually
give you a much better sense of what is interesting/important about the
structure, which can greatly improve your model assessment strategy (by
giving you more refined targets) and help you use and interpret the
resulting model more effectively.  I know that, in my own work, I've
often (almost always?) discovered that the things that turned out to be
interesting/important about a network were not the ones that I expected,
and that building effective models has really depended on developing
those insights.  Fortunately, we have a very rich array of tools and
concepts from classical social network analysis that can help us here! 
When one's modeling efforts are stuck, I find it can often be useful to
go back to basics in order to determine what to try next.

Hope that helps,

-Carter

On 8/11/23 7:50 AM, Julia Vassey wrote:

> Thank you Michal! Good to hear from you.
>
> Please, see the plots, attached. I have tried changing the decay
> parameter multiple times, and the model fit gets better when
> increasing the parameter: from 0 to 0.3 (gwesp(0.3, fixed = TRUE)),
> but beyond 0.3 the model starts having issues with convergence.
> p values for all esps are always very low.
>
> Thank you for helping!
>
> Julia
>
>
> On Fri, Aug 11, 2023 at 8:53 AM Michał Bojanowski <michal2992 at gmail.com> wrote:
>> Hi Julia,
>>
>> Can you send the GOF plots attached? In what way does the ESP not fit
>> well? Perhaps it is a matter of changing the decay parameter?
>>
>> Michal
>>
>> On Wed, Aug 9, 2023 at 1:00 PM Julia Vassey <vassey at usc.edu> wrote:
>>>
>>> Dear All,
>>>
>>> This is my first time posting a question in statnet help. If the
>>> question needs to be posted/sent to a different email please let me
>>> know.
>>>
>>> I have a question related to ergm. I am running an ergm model on a
>>> directed unipartite network of ~100 nodes (certain social media
>>> users) and ~700 edges. The model includes geolocation attributes of
>>> the nodes (geographic regions) and themes the nodes post about on
>>> social media. The model also includes terms (mutual, gwesp) provided
>>> in the code below.
>>>
>>> I am struggling with achieving a decent goodness of fit for edgewise
>>> shared partners using gof function. i/o degree, geodesic distance and
>>> model statistics look much better. I tried different things to try to
>>> improve edgewise shared partners, but nothing seems to work. The
>>> configuration below provides the best fit, however, I want to keep
>>> trying to improve the fit for edgewise shared partners. I appreciate
>>> any thoughts and comments on how to achieve this.
>>>
>>> model = ergm::ergm(netg_infl ~ edges +
>>>     nodefactor('region_code', levels = -4) +
>>>     nodematch('region_code', diff = T, levels = -3) +
>>>     nodefactor('topic_sum_binary', levels = -LARGEST) +
>>>     nodefactor('marijuana_recode', levels = -c(1,3)) +
>>>     nodefactor('nature_recode', levels = -c(1,3)) +
>>>     nodefactor('health_life_recode', levels = -c(1,3)) +
>>>     nodefactor('gaming_recode', levels = -c(1,3)) +
>>>     nodefactor('food_recode', levels = -c(1,3)) +
>>>     nodefactor('clothing_recode', levels = -c(1,3)) +
>>>     mutual + gwesp(0.28, fixed = TRUE) + offset(isolates),
>>>   offset.coef = -Inf,
>>>   control = control.ergm(parallel = 2, parallel.type = "PSOCK"))
>>>
>>> Thank you,
>>>
>>>
>>> --
>>> Julia Vassey
>>> Health Behavior Research
>>> Department of Population and Public Health Sciences
>>> Keck School of Medicine
>>> University of Southern California
>>>
>>> _______________________________________________
>>> statnet_help mailing list
>>> statnet_help at u.washington.edu
>>> http://mailman13.u.washington.edu/mailman/listinfo/statnet_help
