[statnet_help] Upgrading from ERGM v3.10 to v4.6

Pavel Krivitsky via statnet_help statnet_help at u.washington.edu
Sat Sep 7 05:00:15 PDT 2024


Hi, Aditya,

Apologies for the slow reply. I have a quick question: have you tried running ergm() without overriding any of the control parameters, i.e., just letting the adaptive code try to figure things out? Alternatively, it might be worth overriding only the SAN controls, since those aren't adaptive.
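A minimal sketch of the two options described above, assuming ergm 4.x is installed. The specific SAN values are hypothetical and chosen only for illustration, and the defaults quoted in the comments are assumptions about the installed version; the argument names SAN.maxit and SAN.nsteps.times should be checked against ?control.ergm.

```r
library(ergm)

# Option 1: leave every control parameter at its default, so that ergm's
# adaptive MCMC settings are chosen by the package itself.
ctrl_default <- control.ergm()

# Option 2: keep the adaptive defaults but override only the SAN controls,
# which are used to find a starting network when target.stats is supplied.
ctrl_san_only <- control.ergm(
  SAN.maxit        = 8,   # hypothetical value (assumed default: 4)
  SAN.nsteps.times = 16   # hypothetical value (assumed default: 8)
)

# Either object would then be passed to the fit, for example:
# fit <- ergm(my_formula, target.stats = my_targets, control = ctrl_san_only)
# where my_formula and my_targets are placeholders for the actual model.
```

Comparing a fit under ctrl_default with one under ctrl_san_only would show whether the remaining gap comes from the SAN starting network or from the MCMC settings.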

Best,
Pavel

On Thu, 2024-08-29 at 15:13 -0400, Khanna, Aditya via statnet_help wrote:

Hi Carter and All,

Thank you so much for the helpful guidance here. I think following your suggestions has brought us very close to reproducing the target statistics in the simulated networks, but there are still some gaps.

Our full previous exchange is below, but to summarize: I have an ERGM that I fit previously with ERGM v3.10.4 on a directed network with 32,000 nodes. The model consisted of in- and out-degrees in addition to other terms, including a custom distance term. In trying to reproduce this fit with ergm v4.6, the model did not initially converge.

Your suggestion to try setting main.method = "Stochastic-Approximation" considerably improved the fitting. Setting the convergence detection to "Hotelling" on top of that brought us close to simulated networks that reproduce all the mean target statistics. (Following an old discussion thread <https://github.com/statnet/ergm/issues/346> on the statnet GitHub, I also tried setting the termination criterion to "Hummel" with MCMLE.effectiveSize = NULL. For me, in practice, Hotelling worked a bit better than Hummel, though.)

In general, I tried fitting the model with variants of this specification <https://github.com/hepcep/net-ergm-v4plus/blob/27736b2728965188ed73821e797b5ac7007b1093/fit-ergms/ergm-estimation-with-meta-data.R#L257-L296>. I got the best results with the MCMC sample size and the MCMC interval both set to 1e6 (see table below).

MCMC interval: 1e6
MCMC sample size: 1e6
Convergence detection: Hotelling
Results/Outcome: Closest agreement between simulated and target statistics
Note: Max. lik. fit summary and simulation Rout <https://github.com/hepcep/net-ergm-v4plus/commit/777bae726d29dae969f06e0d17b40ee59a01a7fc>; violin plots <https://github.com/hepcep/net-ergm-v4plus/tree/rhel9-setup/fit-ergms/out> showing the simulated and target statistics for each parameter
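As a rough sketch only (this is not the linked script), the configuration tabulated above might correspond to a control object like the one below. It assumes that "MCMC interval", "MCMC sample size", and "convergence detection" map onto the control.ergm() arguments MCMC.interval, MCMC.samplesize, and MCMLE.termination. How much effect MCMLE.termination has under stochastic approximation is not established in this thread.

```r
library(ergm)

# Assumed mapping of the tabulated settings onto control.ergm() arguments.
ctrl_best <- control.ergm(
  main.method       = "Stochastic-Approximation",  # SA instead of the default MCMLE
  MCMLE.termination = "Hotelling",                 # convergence detection
  MCMC.interval     = 1e6,                         # proposals between retained draws
  MCMC.samplesize   = 1e6                          # retained draws per iteration
)

# The object would be passed via ergm(..., control = ctrl_best).
```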

However, this was the closest I could get to producing simulated statistics that matched the target statistics. Increasing or decreasing either the sample size or the interval further did not give a closer result, i.e., this looked to be some optimum in the fit parameter space. Some of those configurations didn't converge, and for those that did, the goodness-of-fit was worse than what I had with the MCMC interval and sample size both set to 1e6. I can provide further details on the results of those fits. Based on your experiences, I was wondering if this is expected?

For now, my main question is, are there any suggestions on how I can further tune the fitting parameters to match my targets more closely? I can provide specific details on the outcomes of those fitting processes if that would be helpful.

Thanks for your consideration.

Aditya

On Thu, May 16, 2024 at 2:33 PM Carter T. Butts via statnet_help <statnet_help at u.washington.edu> wrote:

Hi, Aditya -

I will defer to the mighty Pavel for the exact best formula to reproduce 3.x fits with the latest codebase. (You need to switch convergence detection to "Hotelling," and there are some other things that must be modified.) However, as a general matter, for challenging models where Geyer-Thompson-Hummel has a hard time converging (particularly on a large node set), you may find it useful to try the stochastic approximation method (main.method="Stochastic-Approximation" in your control argument will activate it). G-T-H can (in principle) have sharper convergence when near the solution, but in practice SA fails more gracefully.

I would suggest increasing your MCMC thinning interval (MCMC.interval) above the default, given your network size; depending on density, extent of dependence, and other factors, you may need O(N^2) toggles per step. It is sometimes possible to get away with as few as k*N (for some k in, say, the 5-100 range), but if your model has substantial dependence and is not exceptionally sparse then you will probably need to be in the quadratic regime. When getting things set up, it can also be helpful to run "pilot" fits with the default or otherwise smaller thinning intervals, so that you can discover whether you have, e.g., a data issue or other problem before you spend the waiting time on a high-quality model fit.
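To put rough numbers on the k*N and O(N^2) regimes for the 32,000-node network in this thread, here is a small base-R sketch. The variable names are illustrative only, and the constant in front of N^2 is taken to be 1 purely for scale.

```r
# Rough thinning-interval budgets for a network with N nodes.
n_nodes <- 32000                      # network size reported in this thread

# Linear regime: k * N toggles per retained draw, for k from 5 to 100.
k_values <- c(5, 100)
linear_interval <- k_values * n_nodes

# Quadratic regime: on the order of N^2 toggles per retained draw.
quadratic_interval <- n_nodes^2

# The interval reported as working best in this thread, expressed as a
# multiple of N for comparison with the k range above.
used_interval <- 1e6
used_k <- used_interval / n_nodes

cat("k*N range:   ", format(linear_interval, big.mark = ",", scientific = FALSE), "\n")
cat("N^2:         ", format(quadratic_interval, big.mark = ",", scientific = FALSE), "\n")
cat("1e6 as k*N: k =", used_k, "\n")
```

For N = 32,000 the linear regime spans 160,000 to 3,200,000 toggles, while N^2 is 1.024e9. An interval of 1e6 corresponds to k = 31.25, i.e., inside the linear range and about three orders of magnitude below the quadratic regime.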

To put in the obligatory PSA, both G-T-H and SA are simply different strategies for computing the same thing (the MLE, in this case), so both are fine - they just have different engineering tradeoffs. So use whichever proves more effective for your model and data set.

Hope that helps,

-Carter

On 5/16/24 7:52 AM, Khanna, Aditya via statnet_help wrote:

Dear Statnet Dev and User Community:

I have an ERGM that I fit previously with ERGM v3.10.4 on a directed network with 32,000 nodes. The model included in- and out-degrees, in addition to other terms. The complete Rout from this fit can be seen here <https://gist.github.com/khanna7/aefd836baf47463051439c9e72764388>. I am now trying to reproduce this fit with ergm v4.6, but the model does not converge. (See here <https://gist.github.com/khanna7/fbabdde53c79504dfeaebd215bb5ee20>.)

I am looking for ideas on how to troubleshoot this. One suggestion I got was to set values for the "tuning parameters" in v4.6 to their defaults from v3.11.4. But ERGM v4.6 has a lot more parameters that can be specified, and I am not sure which ones make the most sense to consider.

I would be grateful for any suggestions on this or alternate ideas to try.

Many thanks,
Aditya

--
Aditya S. Khanna, Ph.D.
Assistant Professor
Department of Behavioral and Social Sciences
Center for Alcohol and Addiction Studies
Brown University School of Public Health
Pronouns: he/him/his
401-863-6616
sph.brown.edu
https://vivo.brown.edu/display/akhann16

_______________________________________________
statnet_help mailing list
statnet_help at u.washington.edu
http://mailman13.u.washington.edu/mailman/listinfo/statnet_help
