[statnet_help] Upgrading from ERGM v3.10 to v4.6

Khanna, Aditya via statnet_help statnet_help at u.washington.edu
Thu Aug 29 12:13:51 PDT 2024


Hi Carter and All,

Thank you so much for the helpful guidance here. I think following your
suggestions has brought us very close to reproducing the target statistics
in the simulated networks, but there are still some gaps.

Our full previous exchange is below, but to summarize: I have an ERGM that
I fit previously with ERGM v3.10.4 on a directed network with 32,000 nodes.
The model consisted of in- and out-degrees in addition to other terms,
including a custom distance term. In trying to reproduce this fit with ergm
v4.6, the model did not initially converge.

Your suggestion to set main.method = "Stochastic-Approximation"
considerably improved the fitting. Setting the convergence detection to
"Hotelling" on top of that brought us close to simulated networks that
reproduce all of the mean statistics. (Following an old discussion thread
<https://github.com/statnet/ergm/issues/346> on the statnet GitHub, I also
tried setting the termination criterion to "Hummel" with
MCMLE.effectiveSize = NULL. In practice, though, Hotelling worked a bit
better for me than Hummel.)

In general, I tried fitting the model with variants of this
<https://github.com/hepcep/net-ergm-v4plus/blob/27736b2728965188ed73821e797b5ac7007b1093/fit-ergms/ergm-estimation-with-meta-data.R#L257-L296>
specification. I got the best results with both the MCMC sample size and
the MCMC interval set to 1e6 (see table below).

MCMC interval | MCMC sample size | Convergence detection | Results/Outcome
1e6           | 1e6              | Hotelling             | Closest agreement between simulated and target statistics

Note: Max. Lik. fit summary and simulation Rout
<https://github.com/hepcep/net-ergm-v4plus/commit/777bae726d29dae969f06e0d17b40ee59a01a7fc>;
violin plots
<https://github.com/hepcep/net-ergm-v4plus/tree/rhel9-setup/fit-ergms/out>
showing the simulated and target statistics for each parameter
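
For readers who want to see the settings in the table in code form, a control list along the following lines could be passed to ergm(). This is only a minimal sketch, not the exact specification in the linked script: it assumes the ergm package (v4.x) is installed, the object `net` and the model terms in the commented call are placeholders, and whether each argument is actually honored under a given main.method should be checked against ?control.ergm.

```r
library(ergm)

# Sketch of the control settings described in this message (assumed, not
# copied from the linked script).
ctrl <- control.ergm(
  main.method       = "Stochastic-Approximation",  # SA instead of the default MCMLE
  MCMLE.termination = "Hotelling",                 # convergence detection discussed in the thread
  MCMC.interval     = 1e6,                         # proposals between sampled networks
  MCMC.samplesize   = 1e6                          # number of sampled networks
)

# Hypothetical usage; `net` and the terms below are placeholders only.
# fit <- ergm(net ~ edges + idegree(0:1) + odegree(0:1), control = ctrl)
```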


However, this was the closest I could get to producing simulated statistics
that matched the target statistics. Increasing or decreasing either the
sample size or the interval from these values did not give a closer result,
i.e., this looked like an optimum in the fit parameter space. Some of those
configurations did not converge, and those that did converge had worse
goodness-of-fit than the fit with the MCMC interval and sample size both
set to 1e6. Based on your experience, is this expected?
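
To put an interval of 1e6 in context with the thinning guidance in Carter's message quoted below (k*N toggles per step for k in roughly the 5-100 range, or O(N^2) when dependence is substantial), the arithmetic for N = 32,000 can be sketched as follows. The variable names are illustrative, and the constant in front of N^2 is taken to be 1 purely for the sake of the comparison.

```r
n_nodes <- 32000        # network size from this thread
k_range <- c(5, 100)    # multiplier range mentioned in the quoted guidance

# Linear regime: k*N toggles per step -> 160000 and 3200000
linear_regime <- k_range * n_nodes

# Quadratic regime: N^2 toggles per step (constant assumed to be 1) -> 1.024e9
quadratic_regime <- n_nodes^2

# Multiplier k implied by MCMC.interval = 1e6 -> 31.25
implied_k <- 1e6 / n_nodes

# How far the 1e6 interval is below N^2 -> 1024
shortfall <- quadratic_regime / 1e6
```

On these numbers, an interval of 1e6 sits inside the linear k*N range (k of about 31) and is roughly three orders of magnitude below N^2.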

For now, my main question is, are there any suggestions on how I can
further tune the fitting parameters to match my targets more closely? I can
provide specific details on the outcomes of those fitting processes if that
would be helpful.

Thanks for your consideration.
Aditya

On Thu, May 16, 2024 at 2:33 PM Carter T. Butts via statnet_help <
statnet_help at u.washington.edu> wrote:


> Hi, Aditya -
>
> I will defer to the mighty Pavel for the exact best formula to reproduce
> 3.x fits with the latest codebase. (You need to switch convergence
> detection to "Hotelling," and there are some other things that must be
> modified.) However, as a general matter, for challenging models where
> Geyer-Thompson-Hummel has a hard time converging (particularly on a large
> node set), you may find it useful to try the stochastic approximation
> method (main="Stochastic" in your control argument will activate it).
> G-T-H can (in principle) have sharper convergence when near the solution,
> but in practice SA fails more gracefully. I would suggest increasing your
> default MCMC thinning interval (MCMC.interval), given your network size;
> depending on density, extent of dependence, and other factors, you may need
> O(N^2) toggles per step. It is sometimes possible to get away with as few
> as k*N (for some k in, say, the 5-100 range), but if your model has
> substantial dependence and is not exceptionally sparse then you will
> probably need to be in the quadratic regime. One notes that it can
> sometimes be helpful when getting things set up to run "pilot" fits with
> the default or otherwise smaller thinning intervals, so that you can
> discover if e.g. you have a data issue or other problem before you spend
> the waiting time on a high-quality model fit.
>
> To put in the obligatory PSA, both G-T-H and SA are simply different
> strategies for computing the same thing (the MLE, in this case), so both
> are fine - they just have different engineering tradeoffs. So use
> whichever proves more effective for your model and data set.
>
> Hope that helps,
>
> -Carter
>

> On 5/16/24 7:52 AM, Khanna, Aditya via statnet_help wrote:
>
> Dear Statnet Dev and User Community:
>
> I have an ERGM that I fit previously with ERGM v3.10.4 on a directed
> network with 32,000 nodes. The model included in- and out-degrees, in
> addition to other terms. The complete Rout from this fit can be seen here
> <https://gist.github.com/khanna7/aefd836baf47463051439c9e72764388>.
> I am now trying to reproduce this fit with ergm v4.6, but the model does
> not converge. (See here
> <https://gist.github.com/khanna7/fbabdde53c79504dfeaebd215bb5ee20>.)
>
> I am looking for ideas on how to troubleshoot this. One suggestion I got
> was to set values for the "tuning parameters" in v4.6 to their defaults
> from v3.11.4. But ERGM v4.6 has a lot more parameters that can be
> specified, and I am not sure which ones make the most sense to consider.
>
> I would be grateful for any suggestions on this or alternate ideas to try.
>
> Many thanks,
> Aditya

>
> --
> Aditya S. Khanna, Ph.D.
> Assistant Professor
> Department of Behavioral and Social Sciences
> Center for Alcohol and Addiction Studies
> Brown University School of Public Health
> Pronouns: he/him/his
> 401-863-6616
> sph.brown.edu
> https://vivo.brown.edu/display/akhann16
>

> _______________________________________________
> statnet_help mailing list
> statnet_help at u.washington.edu
> http://mailman13.u.washington.edu/mailman/listinfo/statnet_help
>
