From tieliny at andrew.cmu.edu Thu Jan 25 19:01:33 2024 From: tieliny at andrew.cmu.edu (Katy Yu) Date: Mon Mar 25 10:47:53 2024 Subject: [statnet_help] Error in ergm_proposal.NULL when setting up bridge sampling Message-ID: Dear Statnet Community, I am reaching out for assistance regarding an issue we encountered in our research on Cross-National Collaboration in Open-Source Software Development. Our study investigates cultural and religious divisions in online social interactions, inspired by Huntington's theory of post-Cold War divisions structured by culture and religion. Our analysis uses ERGM to quantify country-level homophily within the eight "civilizations" classified by Huntington. The data comprise a directed network in which each node represents a country and each edge a cross-country collaboration on GitHub: 91 nodes and 425 edges. While the basic ERGM model "edges + nodematch('civilizations', diff = TRUE)" estimates successfully, we encounter an error when adding further terms such as 'mutual'. The error occurs even after the model converges. [image: Screenshot 2024-01-25 at 12.19.28 PM.png] We have attempted various constraints and explored online resources to resolve this issue, but to no avail. Could you kindly provide guidance or suggest potential solutions? Thank you for your support and expertise. Kind regards, Katy -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: Screenshot 2024-01-25 at 12.19.28 PM.png Type: image/png Size: 211452 bytes Desc: not available URL: From statnet_help at u.washington.edu Thu Apr 4 01:41:20 2024 From: statnet_help at u.washington.edu (Gilad Ravid via statnet_help) Date: Thu Apr 4 01:41:25 2024 Subject: [statnet_help] ergm levels selection Message-ID: Greetings, I need to build a model where I can control both the baseline level and the excluded levels in a nodefactor term. For example, I want to specify that the baseline will be level X and to exclude levels Y and Z, because those levels have no examples in the network (to avoid infinite coefficient estimates). Many thanks, Gilad -------------- next part -------------- An HTML attachment was scrubbed... URL: From statnet_help at u.washington.edu Wed Apr 17 15:32:19 2024 From: statnet_help at u.washington.edu (MINGHUA ZHANG via statnet_help) Date: Wed Apr 17 15:32:25 2024 Subject: [statnet_help] ergm algorithm from scratch Message-ID: Hi Statnet, My name is Minghua, and I am an undergrad at the University of Wisconsin, Madison. I am using ERGM in my research to model a healthcare referral network that is bipartite (every node in one of the two node sets has out-degree exactly 1) and contains missing edges. After reading some of the literature and exploring the Statnet packages, I believe that Statnet is unable to handle this scenario, so I tried to code up fitting algorithms that can handle this special case. Currently, I am trying to fit the most general graph (undirected, unipartite, no missing info) with one graph statistic, the number of edges. After comparing with Statnet's results, I believe my theta update algorithm might not be accurate. My current theta updating method is the Newton-Raphson method from the book Exponential Random Graph Models for Social Networks by Lusher, Koskinen, and Robins. Details of the exact formulas are attached in the pdf. 
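[Editor's note: for the one-statistic (edges) case Minghua describes, the Newton-Raphson update can be written in closed form, since for a Bernoulli graph the mean and variance of the edge count need no simulation. A minimal base-R sketch, with illustrative function and variable names (not Statnet code):

```r
# Newton-Raphson MLE for an edges-only ERGM on an undirected graph.
# Under this model, edge indicators are i.i.d. Bernoulli with
# p = exp(theta) / (1 + exp(theta)), so E[edges] = n_dyads * p and
# Var[edges] = n_dyads * p * (1 - p) are available in closed form.
fit_edges_mle <- function(s_obs, n_dyads, theta = 0, tol = 1e-10, max_iter = 100) {
  for (i in seq_len(max_iter)) {
    p     <- plogis(theta)          # tie probability at current theta
    score <- s_obs - n_dyads * p    # observed minus expected statistic
    info  <- n_dyads * p * (1 - p)  # Fisher information
    step  <- score / info
    theta <- theta + step
    if (abs(step) < tol) break
  }
  theta
}

# 10 nodes -> 45 dyads; 15 observed edges. The MLE is logit(15/45).
fit_edges_mle(s_obs = 15, n_dyads = 45)  # -> log(0.5), about -0.6931
```

With dependence terms such as mutual, these moments have no closed form and must be estimated by MCMC, which is essentially what ergm's MCMLE and stochastic-approximation fitters do.]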
I am wondering if you can point me to the literature on other updating algorithms, or to code where I can see how Statnet fits the model. I greatly appreciate any help you can provide! Best, Minghua -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Referrals.pdf Type: application/pdf Size: 67227 bytes Desc: Referrals.pdf URL: From statnet_help at u.washington.edu Sun Apr 28 17:23:08 2024 From: statnet_help at u.washington.edu (Abir Khazaal via statnet_help) Date: Sun Apr 28 17:23:17 2024 Subject: [statnet_help] Detecting communities Message-ID: Hi there, I really hope someone can assist me. Isn't there a package within the statnet project that I can use for detecting communities within temporal networks? Thanks, Abir (Abby) Khazaal | PhD candidate Lab Manager | Project Coordinator Biomedical AI Laboratory | (Vafaee Lab) School of Biotechnology and Biomolecular Sciences | (BABS) Level 2, E26 | UNSW SYDNEY NSW 2052 AUSTRALIA If I send you an email beyond your regular work hours, please understand that I do not anticipate you to review, reply, or act on it outside of your typical work schedule. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.png Type: image/png Size: 50943 bytes Desc: image001.png URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: image002.png Type: image/png Size: 73776 bytes Desc: image002.png URL: From statnet_help at u.washington.edu Mon Apr 29 05:07:25 2024 From: statnet_help at u.washington.edu (James Moody via statnet_help) Date: Mon Apr 29 05:07:34 2024 Subject: [statnet_help] Detecting communities In-Reply-To: References: Message-ID: Hi Abir - How dynamic is your data? I.e., are you looking at real-time sorts of dynamics (fine-grained time) or waves of data collected (coarse-grained time)? While there are likely some avenues you could explore using latent communities within the Statnet ecosystem, I think most of the community detection tools are pre-modeling steps you'd take using other tools. Two reasonable approaches if your data is reasonably coarse-grained: 1. Cluster each wave separately. This is the simplest-non-stupid thing to do. Perfectly defensible, since you're optimizing within the wave, and it allows nodes maximal opportunity for change across waves. The cost is that you have multiple communities completely independent across T, so (a) you'll have to do the work to figure out whether a community at time t is the same as at t+1 (and it will never be exactly the same), and (b) there will be noise. 2. Convert your data to a multi-layer network, where each node identity is linked to itself across waves (i.e. stack the edgelists, and add i_t --> i_t+1 to the edgelist). You'll have to do some data munging to get the IDs all sorted and such. Then cluster the multi-layer network. You can (should) adjust weights for clustering within/between layers - which is something of an art. Peter Mucha's team has done a fair amount of work on this. See https://arxiv.org/abs/0911.1824 to get you started.... 
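[Editor's note: the edgelist-stacking step in option 2 can be sketched in base R. The toy data and naming scheme below are illustrative only; real data will need the ID munging Jim mentions:

```r
# Two waves of the same network as edgelists (toy data).
wave1 <- data.frame(from = c(1, 2), to = c(2, 3))
wave2 <- data.frame(from = c(1, 3), to = c(3, 2))
nodes <- 1:3

# Give each node a wave-specific identity ("i_t1", "i_t2"), stack the
# within-wave edges, then add the i_t --> i_t+1 edges coupling each
# node to itself across waves.
tag <- function(el, t) data.frame(from = paste0(el$from, "_t", t),
                                  to   = paste0(el$to,   "_t", t))
coupling <- data.frame(from = paste0(nodes, "_t1"),
                       to   = paste0(nodes, "_t2"))
multilayer <- rbind(tag(wave1, 1), tag(wave2, 2), coupling)
nrow(multilayer)  # 2 + 2 + 3 = 7 edges in the multi-layer edgelist
```

Before clustering, the coupling edges would typically get their own weight (the inter-layer coupling parameter in the Mucha et al. framework), tuned separately from the within-wave edge weights.]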
PTs Jim James Moody Professor of Sociology Director, Duke Network Analysis Center From: statnet_help On Behalf Of Abir Khazaal via statnet_help Sent: Sunday, April 28, 2024 8:23 PM To: statnet_help@u.washington.edu Subject: [statnet_help] Detecting communities Hi there, I really hope someone can assist me. Isn't there a package within the statnet project that I can use for detecting communities within temporal networks? Thanks, Abir (Abby) Khazaal | PhD candidate Lab Manager | Project Coordinator Biomedical AI Laboratory | (Vafaee Lab) School of Biotechnology and Biomolecular Sciences | (BABS) Level 2, E26 | UNSW SYDNEY NSW 2052 AUSTRALIA If I send you an email beyond your regular work hours, please understand that I do not anticipate you to review, reply, or act on it outside of your typical work schedule. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image003.png Type: image/png Size: 7450 bytes Desc: image003.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image005.png Type: image/png Size: 25472 bytes Desc: image005.png URL: From statnet_help at u.washington.edu Thu May 16 07:52:32 2024 From: statnet_help at u.washington.edu (Khanna, Aditya via statnet_help) Date: Thu May 16 07:52:46 2024 Subject: [statnet_help] Upgrading from ERGM v3.10 to v4.6 Message-ID: Dear Statnet Dev and User Community: I have an ERGM that I fit previously with ERGM v3.10.4 on a directed network with 32,000 nodes. The model included in- and out-degrees, in addition to other terms. The complete Rout from this fit can be seen here . I am now trying to reproduce this fit with ergm v4.6, but the model does not converge. (See here .) I am looking for ideas on how to troubleshoot this. 
One suggestion I got was to set values for the "tuning parameters" in the v4.6 to their defaults from v3.11.4. But ERGM v4.6 has a lot more parameters that can be specified, and I am not sure which ones make most sense to consider. I would be grateful for any suggestions on this or alternate ideas to try. Many thanks, Aditya -- Aditya S. Khanna, Ph.D. Assistant Professor Department of Behavioral and Social Sciences Center for Alcohol and Addiction Studies Brown University School of Public Health Pronouns: he/him/his 401-863-6616 sph.brown.edu https://vivo.brown.edu/display/akhann16 -------------- next part -------------- An HTML attachment was scrubbed... URL: From statnet_help at u.washington.edu Thu May 16 11:32:58 2024 From: statnet_help at u.washington.edu (Carter T. Butts via statnet_help) Date: Thu May 16 11:33:05 2024 Subject: [statnet_help] Upgrading from ERGM v3.10 to v4.6 In-Reply-To: References: Message-ID: <3bbc0b12-8e50-4ece-a72e-46904a40d811@uci.edu> Hi, Aditya - I will defer to the mighty Pavel for the exact best formula to reproduce 3.x fits with the latest codebase. (You need to switch convergence detection to "Hotelling," and there are some other things that must be modified.) However, as a general matter, for challenging models where Geyer-Thompson-Hummel has a hard time converging (particularly on a large node set), you may find it useful to try the stochastic approximation method (main="Stochastic" in your control argument will activate it). G-T-H can (in principle) have sharper convergence when near the solution, but in practice SA fails more gracefully. I would suggest increasing your default MCMC thinning interval (MCMC.interval), given your network size; depending on density, extent of dependence, and other factors, you may need O(N^2) toggles per step. 
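[Editor's note: a hedged configuration sketch of where these knobs live. Argument names are as documented in recent ergm releases - check ?control.ergm on your installed version; `my_net` and the model terms are placeholders, and the interval value is purely illustrative:

```r
library(ergm)

# Placeholder network and model; substitute your own.
fit <- ergm(my_net ~ edges + mutual,
            control = control.ergm(
              # Stochastic-approximation fitting, as an alternative to
              # the default MCMLE (Geyer-Thompson-Hummel):
              main.method = "Stochastic-Approximation",
              # Thin more aggressively for a large network; with real
              # dependence this may need to grow toward O(N^2) toggles:
              MCMC.interval = 1e6
            ))
```

Running a quick pilot fit with the default interval first, as suggested below, is a cheap way to catch data problems before committing to a long run.]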
It is sometimes possible to get away with as few as k*N (for some k in, say, the 5-100 range), but if your model has substantial dependence and is not exceptionally sparse then you will probably need to be in the quadratic regime. One notes that it can sometimes be helpful when getting things set up to run "pilot" fits with the default or otherwise smaller thinning intervals, so that you can discover if e.g. you have a data issue or other problem before you spend the waiting time on a high-quality model fit. To put in the obligatory PSA, both G-T-H and SA are simply different strategies for computing the same thing (the MLE, in this case), so both are fine - they just have different engineering tradeoffs. So use whichever proves more effective for your model and data set. Hope that helps, -Carter On 5/16/24 7:52 AM, Khanna, Aditya via statnet_help wrote: > Dear Statnet Dev and User Community: > > I have an ERGM that I fit previously with ERGM v3.10.4 on a directed > network with 32,000 nodes. The model included in- and out-degrees, in > addition to other terms. The complete Rout from this fit can be seen > here > . > I am now trying to reproduce this fit with ergm v4.6, but the model > does not converge. (See here > .) > > I am looking for ideas on how to troubleshoot this. One suggestion I > got was to set values for the "tuning parameters" in the v4.6 to their > defaults from v3.11.4. But ERGM v4.6 has a lot more parameters that > can be specified, and I am not sure which ones make most sense to > consider. > > I would be grateful for any suggestions on this or alternate ideas to try. > > Many thanks, > Aditya > > > > > -- > > > > > > > > > Aditya S. Khanna, Ph.D. 
> > Assistant Professor > > Department of Behavioral and Social Sciences > > Center for Alcohol and Addiction Studies > > Brown University School of Public Health > > Pronouns: he/him/his > > > 401-863-6616 > > sph.brown.edu > > > https://vivo.brown.edu/display/akhann16 > > > > _______________________________________________ > statnet_help mailing list > statnet_help@u.washington.edu > https://urldefense.com/v3/__http://mailman13.u.washington.edu/mailman/listinfo/statnet_help__;!!CzAuKJ42GuquVTTmVmPViYEvSg!KsbhvmLlx8TkLK7y2NKz59hK4-4H7KXVV7dEyUG4vcQzi4Mh7nO-9HupA7_ep2V2p9KkD_i00tcg6nDqczDwObRNh35k$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From statnet_help at u.washington.edu Thu Jul 18 07:45:38 2024 From: statnet_help at u.washington.edu (Federico Salvati via statnet_help) Date: Thu Jul 18 07:45:56 2024 Subject: [statnet_help] TERGM model analysis problem valued edges Message-ID: Dear statnet team, I am a PhD student in Berlin and I mostly do text mining, but I am approaching network analysis for my project. I am having problems applying the TERGM model to my data. I have a dynamic network dataset, but I really would like to have a weighted value for edges and not just a binary value. Can this be done with the package? I am not sure how to approach the problem. Sorry if the question seems silly; I am approaching SNA for the first time and learning on my own, so I am not quite proficient with it yet. Best, Federico Salvati -------------- next part -------------- An HTML attachment was scrubbed... URL: From statnet_help at u.washington.edu Sat Jul 20 13:52:16 2024 From: statnet_help at u.washington.edu (Martina Morris via statnet_help) Date: Sat Jul 20 13:52:41 2024 Subject: [statnet_help] TERGM model analysis problem valued edges In-Reply-To: References: Message-ID: Hi Federico, I think some types of valued tie terms are possible in tergm. 
Tergms are implemented by specifying one or more ergms (how many depends on how many of the operators you're using (cross, change, form, diss/persist)). So the valued terms that are available in ergm may be used in this context. But Pavel is the expert on all things valued, so he should chime in here. best, Martina On Thu, Jul 18, 2024 at 7:46 AM Federico Salvati via statnet_help < statnet_help@u.washington.edu> wrote: > Dear statnet team, > > > I am a PhD student in Berlin and I mostly do text mining but I am > approaching Network analysis for my project. > > I am having problems applying the TERGM model to my data. I have a dynamic > network dataset but I really would like to have a weighted value for edges > and not just a binary value. > > Can this be done with the package? I am not sure how to approach the > problem > > > Sorry if the question might be silly I am approaching SNA for the first > time and I am learning on my own so I am not quite proficient with it yet > > Best > > Federico Salvati > _______________________________________________ > statnet_help mailing list > statnet_help@u.washington.edu > http://mailman13.u.washington.edu/mailman/listinfo/statnet_help > -------------- next part -------------- An HTML attachment was scrubbed... URL: From statnet_help at u.washington.edu Sat Jul 20 14:38:55 2024 From: statnet_help at u.washington.edu (Mark S. Handcock via statnet_help) Date: Sat Jul 20 14:39:01 2024 Subject: [statnet_help] TERGM model analysis problem valued edges In-Reply-To: References: Message-ID: Hi Federico, The separable model uses the binary nature of the network in a fundamental way. If the edges have weights/values then a new separability assumption is needed, and this will generally not be as elegant as that for binary edges. This, however, has been done by Yik Lun Kei, a former student of mine: "A partially separable model for dynamic valued networks" (https://www.sciencedirect.com/science/article/abs/pii/S0167947323001226). 
and he has code on GitHub. Best, Mark ---------------------------------------------------- Mark S. Handcock Distinguished Professor of Statistics Department of Statistics and Data Science University of California Los Angeles, CA 90095-1554. Web: https://faculty.stat.ucla.edu/handcock email: handcock@stat.ucla.edu On 7/20/24 1:52 PM, Martina Morris via statnet_help wrote: > Hi Federico, > > I think some types of valued tie terms are possible in tergm. Tergms > are implemented by specifying one or more ergms (how many depends on > how many of the operators you're using (cross, change, form, > diss/persist)). So the valued terms that are available in ergm may be > used in this context. > > But Pavel is the expert on all things valued, so he should chime in here. > > best, > Martina > > On Thu, Jul 18, 2024 at 7:46 AM Federico Salvati via statnet_help > wrote: > > Dear statnet team, > > > I am a PhD student in Berlin and I mostly do text mining but I am > approaching Network analysis for my project. > > I am having problems applying the TERGM model to my data. I have a > dynamic network dataset but I really would like to have a weighted > value for edges and not just a binary value. > > Can this be done with the package? I am not sure how to > approach the problem > > > Sorry if the question might be silly I am approaching SNA for the > first time and I am learning on my own so I am not quite > proficient with it yet > > Best > > Federico Salvati > _______________________________________________ > statnet_help mailing list > statnet_help@u.washington.edu > http://mailman13.u.washington.edu/mailman/listinfo/statnet_help > > > _______________________________________________ > statnet_help mailing list > statnet_help@u.washington.edu > http://mailman13.u.washington.edu/mailman/listinfo/statnet_help -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From statnet_help at u.washington.edu Tue Jul 23 02:47:58 2024 From: statnet_help at u.washington.edu (Federico Salvati via statnet_help) Date: Tue Jul 23 02:48:15 2024 Subject: [statnet_help] TERGM model analysis problem valued edges In-Reply-To: References: Message-ID: Wow thank you everybody for your help and your swift responses. What an amazing community Federico On Sat, 20 Jul 2024 at 23:38, Mark S. Handcock wrote: > Hi Federico, > > The separable model uses the binary nature of the network in a fundamental > way. If the edges have weights/values then a new separability assumption is > needed, and this will generally not be as elegant as that for binary edges. > > This, however, has been done by Yik Lun Kei, a former student of mine: > > "A partially separable model for dynamic valued networks" > (https://www.sciencedirect.com/science/article/abs/pii/S0167947323001226). > > > and he has code on GitHub. > > Best, > > Mark > > ---------------------------------------------------- > Mark S. Handcock > Distinguished Professor of Statistics > Department of Statistics and Data Science > University of California > Los Angeles, CA 90095-1554. > Web: https://faculty.stat.ucla.edu/handcock > email: handcock@stat.ucla.edu > > > On 7/20/24 1:52 PM, Martina Morris via statnet_help wrote: > > Hi Federico, > > I think some types of valued tie terms are possible in tergm. Tergms are > implemented by specifying one or more ergms (how many depends on how many > of the operators you're using (cross, change, form, diss/persist)). So the > valued terms that are available in ergm may be used in this context. > > But Pavel is the expert on all things valued, so he should chime in here. > > best, > Martina > > On Thu, Jul 18, 2024 at 7:46?AM Federico Salvati via statnet_help < > statnet_help@u.washington.edu> wrote: > >> Dear statnet team, >> >> >> I am a PhD student in Berlin and I mostly do text mining but I am >> approaching Network analysis for my project. 
>> >> I am having problems applying the TERGM model to my data. I have a >> dynamic network dataset but I really would like to have a weighted value >> for edges and not just a binary value. >> >> Can this be done with the package? I am not sure how to approach the >> problem >> >> >> Sorry if the question might be silly I am approaching SNA for the first >> time and I am learning on my own so I am not quite proficient with it yet >> >> Best >> >> Federico Salvati >> _______________________________________________ >> statnet_help mailing list >> statnet_help@u.washington.edu >> http://mailman13.u.washington.edu/mailman/listinfo/statnet_help >> > > _______________________________________________ > statnet_help mailing liststatnet_help@u.washington.eduhttp://mailman13.u.washington.edu/mailman/listinfo/statnet_help > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From statnet_help at u.washington.edu Wed Jul 31 18:48:24 2024 From: statnet_help at u.washington.edu (SJ C via statnet_help) Date: Wed Jul 31 18:48:39 2024 Subject: [statnet_help] Question about including an interaction effect in REM Message-ID: Dear all, May I ask a question about including an interaction effect in REM? As illustrated in the code below, let's say I have two variables, A.cat (categorical) and B.con (continuous). rem.dyad(networkdata, n = 200, effects = c("CovSnd","CovRec", "CovEvent", "NIDRec","NIDSnd", "NODSnd","NODRec", "PSAB-BA", "RRecSnd", "RSndSnd"), covar = list(CovSnd = cbind(A.cat, B.con), CovRec= cbind(A.cat, B.con), CovEvent = abind(same.A.cat, A.cat*B.con, along=0)), ordinal=TRUE, hessian = TRUE) To create their interaction effect, I formed a matrix by inserting senders' A.cat values into all columns except the diagonal. Additionally, I did the same thing with senders' B.con values and mean-centered them. Then, I multiplied these two matrices, which is represented as A.cat*B.con in the code. My questions are: 1. 
Can the interaction term be inserted into the REM in the way described above? 2. Should I also include A.cat and B.con(centered) as components in the CovEvent code, along with A.cat*B.con? Besides these questions, I am also curious whether there are any measures or indices that can compare the statistical significance of coefficients between two REMs with the same parameters? It would be greatly appreciated if I can have any responses. Thank you! Sincerely, Choi -------------- next part -------------- An HTML attachment was scrubbed... URL: From statnet_help at u.washington.edu Thu Aug 1 04:06:31 2024 From: statnet_help at u.washington.edu (F. Benjamin Rosche via statnet_help) Date: Thu Aug 1 04:06:47 2024 Subject: [statnet_help] Degree by attribute in bipartite networks Message-ID: Hi everyone! I have a bipartite network of students selecting classes: g <- network( data.frame(from=c(1,2,3), to=4), # 3 students selecting 1 class matrix.type="edgelist", vertex.attr = data.frame(id=1:4, race=c("B", "W", "W", NA)), # race as a student attribute directed = F, bipartite = T ) gplot( g, gmode="twomode", usearrows = FALSE, label = c("B", "W", "W", NA), displaylabels = T, ) I would like to measure how often, say, Black students are the only Black student in class. In the example, the Black student is the only Black student in the class. I can't operationalize this statistic, however. I tried: summary(g ~ F(~b1factor("race", levels="B"), ~b2degree(1))) # doesn't work because b2degree is not a dyad-independent term. summary(g ~ !b1nodematch("race", levels="B")) # NOT operator doesn't work Is there a way to operationalize this statistic? Thank you very much, Ben -- *Benjamin Rosche, Ph.D.* Office of Population Research Princeton University / benrosche.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From statnet_help at u.washington.edu Sun Aug 4 16:21:43 2024 From: statnet_help at u.washington.edu (Carter T. 
Butts via statnet_help) Date: Sun Aug 4 16:21:49 2024 Subject: [statnet_help] Question about including an interaction effect in REM In-Reply-To: References: Message-ID: Hi, Choi - You can certainly include interaction effects by using products as covariates (just as you would in e.g. a regression context). You do, however, need to think carefully about what you expect these effects to do, and whether you want them to act as predictors for individual sending rates (CovSnd), predictors for individual receipt rates (CovRec), or predictors for pairwise specific events (CovEvent). Also, it is important to distinguish between an effect that says that e.g. the log sending rate for an event from vertex i to vertex j varies with x_i y_i (a CovSnd interaction between x and y), and a model that says that the log sending rate for an event from vertex i to vertex j varies with x_i y_j. Those are very different models, with the latter saying that the product between the sender's x value and the receiver's y value modifies the i->j interaction rate. I'm not sure what you intend here, so I cannot tell whether you are doing what you want to be doing, but all of these are straightforward to implement using appropriate covariate specifications. Vis-à-vis testing hypotheses across models, the usual considerations apply as they would in other maximum likelihood scenarios (i.e., you can do it, depending on what assumptions/approximations you are willing to make, and the details may depend on your scenario). For the most obvious base case, if you have two models A and B on independent data sets of reasonable size, then (coef_A - coef_B)/sqrt(se(coef_A)^2 + se(coef_B)^2) for respective coefficients coef_A from A and coef_B from B should be approximately standard normal (leading to a z-test for equality of coefficients). 
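[Editor's note: the z-test described here, as a small base-R helper. The coefficient and standard-error values in the example call are made up purely for illustration:

```r
# Two-sided z-test for equality of a coefficient across two
# independently fitted models A and B.
coef_z_test <- function(coef_A, se_A, coef_B, se_B) {
  z <- (coef_A - coef_B) / sqrt(se_A^2 + se_B^2)
  c(z = z, p = 2 * pnorm(-abs(z)))  # z statistic and two-sided p-value
}

coef_z_test(coef_A = 1.2, se_A = 0.3, coef_B = 0.5, se_B = 0.4)
# z = 1.4; two-sided p approx. 0.16
```

]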
If you want a Bayesian answer for the probability that coef_A > coef_B, fit both models using the BSIR method and look at the respective fraction of posterior draws (pairing A and B) for which coef_A is greater than coef_B. Hope that helps, -Carter On 7/31/24 6:48 PM, SJ C via statnet_help wrote: > Dear all, > > May I ask a question about including an interaction effect in REM? > As illustrated in the code below, let's say I have two variables, > A.cat (categorical) and B.con (continuous). > > rem.dyad(networkdata, n = 200, effects = c("CovSnd","CovRec", > "CovEvent", > "NIDRec","NIDSnd", > "NODSnd","NODRec", > "PSAB-BA", > "RRecSnd", "RSndSnd"), > covar = list(CovSnd = cbind(A.cat, B.con), > CovRec= cbind(A.cat, B.con), > CovEvent = abind(same.A.cat > , > A.cat*B.con, along=0)), > ordinal=TRUE, hessian = TRUE) > > To create their interaction effect, I formed a matrix by inserting > senders' A.cat values into all columns except the diagonal. > Additionally, I did the same thing with senders' B.con values and > mean-centered them. > Then, I multiplied these two matrices, which is represented as > A.cat*B.con in the code. > > My questions are: > 1. Can the interaction term be inserted into the REM in the way > described above? > 2. Should I also include A.cat and B.con(centered) as components in > the CovEvent code, along with A.cat*B.con? > > Besides these questions, I am also curious whether there are any > measures or indices that can compare the statistical significance of > coefficients between two REMs with the same parameters? > > It would be greatly appreciated if I can have any responses. > Thank you! 
> > Sincerely, > Choi > > > _______________________________________________ > statnet_help mailing list > statnet_help@u.washington.edu > https://urldefense.com/v3/__http://mailman13.u.washington.edu/mailman/listinfo/statnet_help__;!!CzAuKJ42GuquVTTmVmPViYEvSg!O4iVbsPlR-Obu67qMPZFlFsScm-aVJ4_EBM1ZUpI65bhpi_4Bg6N6plaSjMvp66VdzYzmS17AM3_VDJ8EVuWlFfRQDZd$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From statnet_help at u.washington.edu Sun Aug 11 01:25:52 2024 From: statnet_help at u.washington.edu (SJ C via statnet_help) Date: Sun Aug 11 01:26:06 2024 Subject: [statnet_help] Question about including an interaction effect in REM In-Reply-To: References: Message-ID: Dear Carter, I really appreciate your detailed explanation! I am expecting the interaction effect to "act as a predictor for individual sending rates." In addition, my interest is in "the log sending rate for an event from vertex i to vertex j varying with x_i y_i (a CovSnd interaction between x and y)." If this is the case, how can I modify the code below that I previously posted? rem.dyad(networkdata, n = 200, effects = c("CovSnd","CovRec", "CovEvent", "NIDRec","NIDSnd", "NODSnd","NODRec", "PSAB-BA", "RRecSnd", "RSndSnd"), covar = list(CovSnd = cbind(A.cat, B.con), CovRec= cbind(A.cat, B.con), CovEvent = abind(same.A.cat , A.cat*B.con, along=0)), ordinal=TRUE, hessian = TRUE) Thank you for your kind support! Sincerely, Choi On Mon, Aug 5, 2024 at 8:22 PM, Carter T. Butts via statnet_help < statnet_help@u.washington.edu> wrote: > Hi, Choi > > You can certainly include interaction effects by using products as > covariates (just as you would in e.g. a regression context). You do, > however, need to think carefully about what you expect these effects to do, > and whether you want them to act as predictors for individual sending rates > (CovSnd), predictors for individual receipt rates (CovRec), or predictors > for pairwise specific events (CovEvent). 
Also, it is important to > distinguish between an effect that says that e.g. the log sending rate for > an event from vertex i to vertex j varies with x_i y_i (a CovSnd > interaction between x and y), and a model that says that the log sending > rate for an event from vertex i to vertex j varies with x_i y_j. Those > are very different models, with the latter saying that the product between > the sender's x value and the receiver's y value modifies the i->j > interaction rate. I'm not sure what you intend here, so I cannot tell > whether you are doing what you want to be doing, but all of these are > straightforward to implement using appropriate covariate specifications. > > Vis-à-vis testing hypotheses across models, the usual considerations apply > as they would in other maximum likelihood scenarios (i.e., you can do it, > depending on what assumptions/approximations you are willing to make, and > the details may depend on your scenario). For the most obvious base case, > if you have two models A and B on independent data sets of reasonable size, > then (coef_A - coef_B)/sqrt(se(coef_A)^2 + se(coef_B)^2) for respective > coefficients coef_A from A and coef_B from B should be approximately > standard normal (leading to a z-test for equality of coefficients). If > you want a Bayesian answer for the probability that coef_A > coef_B, fit > both models using the BSIR method and look at the respective fraction of > posterior draws (pairing A and B) for which coef_A is greater than coef_B. > > Hope that helps, > > -Carter > On 7/31/24 6:48 PM, SJ C via statnet_help wrote: > > Dear all, > > May I ask a question about including an interaction effect in REM? > As illustrated in the code below, let's say I have two variables, A.cat > (categorical) and B.con (continuous). 
> > rem.dyad(networkdata, n = 200, effects = c("CovSnd","CovRec", "CovEvent", > "NIDRec","NIDSnd", > "NODSnd","NODRec", > "PSAB-BA", > "RRecSnd", "RSndSnd"), > covar = list(CovSnd = cbind(A.cat, B.con), > CovRec= cbind(A.cat, B.con), > CovEvent = abind(same.A.cat > , > A.cat*B.con, along=0)), > ordinal=TRUE, hessian = TRUE) > > To create their interaction effect, I formed a matrix by inserting > senders' A.cat values into all columns except the diagonal. > Additionally, I did the same thing with senders' B.con values and > mean-centered them. > Then, I multiplied these two matrices, which is represented as A.cat*B.con > in the code. > > My questions are: > 1. Can the interaction term be inserted into the REM in the way described > above? > 2. Should I also include A.cat and B.con(centered) as components in the > CovEvent code, along with A.cat*B.con? > > Besides these questions, I am also curious whether there are any measures > or indices that can compare the statistical significance of coefficients > between two REMs with the same parameters? > > It would be greatly appreciated if I can have any responses. > Thank you! > > Sincerely, > Choi > > > _______________________________________________ > statnet_help mailing liststatnet_help@u.washington.eduhttps://urldefense.com/v3/__http://mailman13.u.washington.edu/mailman/listinfo/statnet_help__;!!CzAuKJ42GuquVTTmVmPViYEvSg!O4iVbsPlR-Obu67qMPZFlFsScm-aVJ4_EBM1ZUpI65bhpi_4Bg6N6plaSjMvp66VdzYzmS17AM3_VDJ8EVuWlFfRQDZd$ > > _______________________________________________ > statnet_help mailing list > statnet_help@u.washington.edu > http://mailman13.u.washington.edu/mailman/listinfo/statnet_help > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From statnet_help at u.washington.edu Fri Aug 23 21:07:20 2024 From: statnet_help at u.washington.edu (CHU-DING LING via statnet_help) Date: Fri Aug 23 21:08:01 2024 Subject: [statnet_help] Assistance with Calculating Degree Centrality in Directed and Valued Networks Using the sna Package in R Message-ID: Dear all, I hope this message finds you well. I am currently working on a project that involves social network analysis using the *sna* package in R. I am reaching out to seek your expertise on a particular issue I have encountered regarding the calculation of degree centrality in directed and valued networks. I am working with a directed network where edges have associated weights. My goal is to accurately calculate both the in-degree and out-degree centrality of nodes while considering the edge weights. I attempted to calculate the degree centrality using the degree function in the *sna* package. While this function works well for unweighted networks, I realized that it does not account for edge weights. Could you please advise on the best method or function within the *sna* package to accurately calculate the degree centrality in this context? Though I can make it with *igraph* or other packages, I am particularly interested in whether *sna* could directly handle weighted edges in directed networks. Your guidance would be invaluable, and I would greatly appreciate any suggestions or resources you might be able to provide. Thank you for your time and consideration. I look forward to your insights. Best, Chuding -------------- next part -------------- An HTML attachment was scrubbed... URL: From statnet_help at u.washington.edu Sat Aug 24 14:15:38 2024 From: statnet_help at u.washington.edu (Carter T. 
Butts via statnet_help) Date: Sat Aug 24 14:15:45 2024 Subject: [statnet_help] Assistance with Calculating Degree Centrality in Directed and Valued Networks Using the sna Package in R In-Reply-To: References: Message-ID: <01def657-125d-44bc-9716-a18f8f945838@uci.edu>

Hi, Chuding -

The degree() function already exploits edge values; this is its default behavior. If you wish to /ignore/ edge values, you need to set the "ignore.eval" argument to TRUE.

If you are not getting valued degree calculations from degree() using the defaults, then you are not passing it valued data. This may be due to a preprocessing error (so check your inputs). Another possible failure mode is that you are passing it a network object that has value information stored as an edge attribute, and are expecting degree() to use those edge values. Since a network object can have any number of edge attributes (or none at all), and they can be of any data type (i.e., not necessarily numeric), degree() can't automagically know what is intended in that case, and will therefore treat the data as unvalued. An easy way to use edge attribute information is to wrap your object in a call like as.edgelist.sna(<your network object>, attrname=<edge attribute name>), which will extract from the object the specific valued network that you want to analyze. That's especially handy if you have several different edge values you want to store in the same network object. Of course, you can also use that same trick to make a "working" edgelist at the top of your script that you reuse for multiple calculations. (The same can be done with adjacency matrices rather than edgelists, if one prefers. See e.g. ?as.sociomatrix.sna.)

Hope that helps,

-Carter

On 8/23/24 9:07 PM, CHU-DING LING via statnet_help wrote:

> [...]

-------------- next part --------------
An HTML attachment was scrubbed...
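The valued in- and out-degrees that degree() computes by default (cmode="indegree" and cmode="outdegree" in sna's terms) are just the column and row sums of the weighted adjacency matrix. A minimal pure-Python sketch on a toy 3-node weighted digraph (made-up weights, not Chuding's data):

```python
# Toy weighted, directed adjacency matrix: w[i][j] = weight of the tie i -> j.
w = [[0.0, 2.0, 0.0],
     [1.0, 0.0, 3.0],
     [0.0, 4.0, 0.0]]

n = len(w)
out_degree = [sum(row) for row in w]                              # row sums
in_degree = [sum(w[i][j] for i in range(n)) for j in range(n)]    # column sums
total_degree = [o + i for o, i in zip(out_degree, in_degree)]     # Freeman-style total

print(out_degree)  # [2.0, 4.0, 4.0]
print(in_degree)   # [1.0, 6.0, 3.0]
```

If degree() on a valued matrix gives you plain tie counts instead of these sums, that is the symptom that the values were lost upstream (e.g., in a conversion step), as Carter describes.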
URL: From statnet_help at u.washington.edu Mon Aug 26 05:59:48 2024 From: statnet_help at u.washington.edu (CHU-DING LING via statnet_help) Date: Mon Aug 26 06:00:30 2024 Subject: [statnet_help] Assistance with Calculating Degree Centrality in Directed and Valued Networks Using the sna Package in R In-Reply-To: <01def657-125d-44bc-9716-a18f8f945838@uci.edu> References: <01def657-125d-44bc-9716-a18f8f945838@uci.edu> Message-ID:

Carter,

Thank you for your suggestions! The problem has been resolved.

Initially, I imported a matrix from a CSV file and stored it as a matrix class object. I then converted it into a network class object, since many functions in *sna* require objects to be of the network class. However, I noticed that the edge weights were lost during the conversion from the matrix object to the network object, which caused the results from the degree() function not to account for edge weights.

Actually, the degree() function can directly handle the matrix object. I also used as.sociomatrix.sna() to convert the original matrix object into another matrix object with a different name. Both approaches produced the same degree centrality results for the directed and valued network.

I also experimented with the as.edgelist.sna() function to convert the original matrix object into an edgelist object. However, when I calculated the degree centrality of this object, it produced incorrect results, with a greater number of elements than the number of nodes in my network. I would appreciate any insights you can give on this issue.

Thanks in advance!

Chuding

Carter T. Butts via statnet_help wrote on Sun, Aug 25, 2024 at 05:15:

> [...]

-------------- next part --------------
An HTML attachment was scrubbed...
URL: From statnet_help at u.washington.edu Tue Aug 27 21:41:42 2024 From: statnet_help at u.washington.edu (SJ C via statnet_help) Date: Tue Aug 27 21:41:57 2024 Subject: [statnet_help] Intercept for ordinal data in REM Message-ID:

Dear all,

I have a question about including the intercept in REM.

I've noticed that with 'ordinal' data, the intercept is not included in the model, whereas with 'time-stamp' data, it is included. Is my understanding correct? (By intercept here I mean creating a variable with a value of one for all nodes and including it as a covariate of CovSnd.)

I'd be grateful if anyone could respond to this question. Thank you!

Best,
Choi

-------------- next part --------------
An HTML attachment was scrubbed...
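Choi's observation can be illustrated numerically: with order-only (ordinal) data, the likelihood is built from the probability that each observed event occurs before the competing events, which under exponential hazards is h_k / sum(h). Any common scaling of all hazards (the pacing constant an intercept would set) cancels from that ratio. A minimal sketch with made-up hazard values:

```python
def next_event_probs(hazards):
    """Probability that each candidate event occurs next, given
    competing exponential hazards: h_k / sum(h)."""
    total = sum(hazards)
    return [h / total for h in hazards]

h = [0.5, 1.0, 2.5]              # made-up event hazards
scaled = [10.0 * x for x in h]   # same hazards with a 10x pacing constant

print(next_event_probs(h))       # [0.125, 0.25, 0.625]
print(next_event_probs(scaled))  # identical: the pacing constant cancels
```

Because the order-only likelihood depends only on these ratios, a constant CovSnd covariate contributes nothing identifiable; with exact timestamps, the waiting times do depend on the overall rate, so a base-rate intercept becomes estimable.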
URL: From statnet_help at u.washington.edu Thu Aug 29 12:13:51 2024 From: statnet_help at u.washington.edu (Khanna, Aditya via statnet_help) Date: Thu Aug 29 12:14:07 2024 Subject: [statnet_help] Upgrading from ERGM v3.10 to v4.6 In-Reply-To: <3bbc0b12-8e50-4ece-a72e-46904a40d811@uci.edu> References: <3bbc0b12-8e50-4ece-a72e-46904a40d811@uci.edu> Message-ID:

Hi Carter and All,

Thank you so much for the helpful guidance here. I think following your suggestions has brought us very close to reproducing the target statistics in the simulated networks, but there are still some gaps. Our full previous exchange is below, but to summarize: I have an ERGM that I fit previously with ERGM v3.10.4 on a directed network with 32,000 nodes. The model consisted of in- and out-degrees in addition to other terms, including a custom distance term. In trying to reproduce this fit with ergm v4.6, the model did not initially converge.

Your suggestion to try setting main.method = "Stochastic Approximation" considerably improved the fitting. Specifying the convergence detection as "Hotelling" on top of that brought us almost to simulated networks that capture all the mean statistics. (Following an old discussion thread on the statnet github, I also tried setting the termination criteria to Hummel and MCMLE.effectiveSize = NULL. I think, for me, in practice, Hotelling worked a bit better than Hummel, though.)

In general, I tried fitting the model with variants of this specification. I got the best results with setting both MCMC samplesize = 1e6 and interval = 1e6 (see table below).

  MCMC interval | MCMC sample size | Convergence detection | Results/Outcome                                            | Note
  1e6           | 1e6              | Hotelling             | Closest agreement between simulated and target statistics | Max. lik. fit summary and simulation Rout; violin plots showing the simulated and target statistics for each parameter

But I found that this was the closest I could get to producing simulated statistics that matched the target statistics. In general, any further increasing or decreasing of either the samplesize or interval did not help generate a closer result; i.e., this looked to be some optimum in the fit parameter space. I can provide further details on the results of those fits, some configurations of which didn't converge; and when they did converge, the goodness-of-fit was worse than what I had with the MCMC interval and samplesize set to 1e6. Based on your experiences, I was wondering if this is expected?

For now, my main question is: are there any suggestions on how I can further tune the fitting parameters to match my targets more closely? I can provide specific details on the outcomes of those fitting processes if that would be helpful.

Thanks for your consideration.
Aditya

On Thu, May 16, 2024 at 2:33 PM Carter T. Butts via statnet_help <statnet_help@u.washington.edu> wrote:

> Hi, Aditya -
>
> I will defer to the mighty Pavel for the exact best formula to reproduce 3.x fits with the latest codebase. (You need to switch convergence detection to "Hotelling," and there are some other things that must be modified.) However, as a general matter, for challenging models where Geyer-Thompson-Hummel has a hard time converging (particularly on a large node set), you may find it useful to try the stochastic approximation method (main="Stochastic" in your control argument will activate it). G-T-H can (in principle) have sharper convergence when near the solution, but in practice SA fails more gracefully. I would suggest increasing your default MCMC thinning interval (MCMC.interval), given your network size; depending on density, extent of dependence, and other factors, you may need O(N^2) toggles per step. It is sometimes possible to get away with as few as k*N (for some k in, say, the 5-100 range), but if your model has substantial dependence and is not exceptionally sparse then you will probably need to be in the quadratic regime. One notes that it can sometimes be helpful when getting things set up to run "pilot" fits with the default or otherwise smaller thinning intervals, so that you can discover if e.g. you have a data issue or other problem before you spend the waiting time on a high-quality model fit.
>
> To put in the obligatory PSA, both G-T-H and SA are simply different strategies for computing the same thing (the MLE, in this case), so both are fine - they just have different engineering tradeoffs. So use whichever proves more effective for your model and data set.
>
> Hope that helps,
>
> -Carter
>
> On 5/16/24 7:52 AM, Khanna, Aditya via statnet_help wrote:
>
> > Dear Statnet Dev and User Community:
> >
> > I have an ERGM that I fit previously with ERGM v3.10.4 on a directed network with 32,000 nodes. The model included in- and out-degrees, in addition to other terms. The complete Rout from this fit can be seen here. I am now trying to reproduce this fit with ergm v4.6, but the model does not converge. (See here.)
> >
> > I am looking for ideas on how to troubleshoot this. One suggestion I got was to set values for the "tuning parameters" in v4.6 to their defaults from v3.11.4. But ERGM v4.6 has a lot more parameters that can be specified, and I am not sure which ones make most sense to consider.
> >
> > I would be grateful for any suggestions on this or alternate ideas to try.
> >
> > Many thanks,
> > Aditya
> >
> > --
> > Aditya S. Khanna, Ph.D.
> > Assistant Professor
> > Department of Behavioral and Social Sciences
> > Center for Alcohol and Addiction Studies
> > Brown University School of Public Health
> > Pronouns: he/him/his
> > 401-863-6616
> > sph.brown.edu
> > https://vivo.brown.edu/display/akhann16

-------------- next part --------------
An HTML attachment was scrubbed...
URL: From statnet_help at u.washington.edu Fri Aug 30 00:59:25 2024 From: statnet_help at u.washington.edu (Carter T. Butts via statnet_help) Date: Fri Aug 30 00:59:31 2024 Subject: [statnet_help] Assistance with Calculating Degree Centrality in Directed and Valued Networks Using the sna Package in R In-Reply-To: References: <01def657-125d-44bc-9716-a18f8f945838@uci.edu> Message-ID:

Hi, Chuding -

Glad that fixed it. With respect to edgelists vs. adjacency matrices, they will give you equivalent results. If you seem to be getting different results, check the two objects: somewhere in your code, you've presumably made them non-equivalent....

Best,

-Carter

On 8/26/24 5:59 AM, CHU-DING LING wrote:

> [...]

-------------- next part --------------
An HTML attachment was scrubbed...
URL: From statnet_help at u.washington.edu Fri Aug 30 01:37:41 2024 From: statnet_help at u.washington.edu (Carter T. Butts via statnet_help) Date: Fri Aug 30 01:37:49 2024 Subject: [statnet_help] Upgrading from ERGM v3.10 to v4.6 In-Reply-To: References: <3bbc0b12-8e50-4ece-a72e-46904a40d811@uci.edu> Message-ID: <7f088ef1-8b4f-4613-b7f2-52d81d3ecb3f@uci.edu>

Hi, Aditya -

I'll be interested in Pavel's take on the convergence issues, but just to verify: you are assessing convergence based on a /second/ MCMC run, correct? The MCMC statistics in the ergm object are from the penultimate iteration, and may thus be out of equilibrium (but this does /not/ necessarily mean that the /model/ did not converge). However, if you simulate a new set of draws from the fitted model and the mean stats do not match, /then/ you have an issue. (This is why we now point folks to gof() for that purpose.) It looks like your plots are from the ergm object and not from a gof() run (or other secondary simulation), so I want to verify that first.

I also note that a quick glance at the plots from your more exhaustive simulation case don't seem all that far off, which could indicate either that the model did converge (and per above, we're not looking at a draw from the final model), or that it converged within the tolerances that were set, and you may need to tighten them.
But best to first know if there's a problem in the first place.

Another observation is that, per my earlier email, you may need O(N^2) toggles per draw to get good performance if your model has a nontrivial level of dependence. You are using a thinning interval of 1e6, which is in your case around 30*N. It's possible that you've got too much dependence for that: O(N^2) here would mean some multiple of about 1e9, which is about a thousand times greater than what you're using. Really large, sparse networks sometimes /can/ be modeled well without that much thinning, but it's not a given. Relatedly, your trace plots from the 1e6 run suggest a fair amount of autocorrelation on some statistics, which suggests a lack of efficiency. (Autocorrelation by itself isn't necessarily a problem, but it means that your effective MCMC sample size is smaller than it seems, and this can reduce the effectiveness of the MCMLE procedure. The ones from the 1e6 run aren't bad enough that I would be alarmed, but if I were looking for things to tighten up and knew this could be a problem, they suggest possible room for improvement.) So anyway, I wouldn't crank this up until verifying that it's needed, but you are still operating on the low end of computational effort (whether it seems like it or not!).

Finally, I would note that for the stochastic approximation method, convergence is to some degree (and it's a bit complex) determined by how many subphases are run, and how many iterations are used per subphase. This algorithm is due to Tom in his classic JoSS paper (but without the complement moves), which is still a good place to look for details. It is less fancy than some more modern algorithms of its type, but is extremely hard to beat (I've tried and failed more than once!).
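Carter's thinning arithmetic can be checked directly for Aditya's case (N = 32,000 nodes and MCMC.interval = 1e6, both from the thread above):

```python
N = 32_000          # nodes in Aditya's directed network
interval = 1_000_000  # the MCMC thinning interval Aditya is using

# The current interval expressed as the k in a k*N thinning rule.
k = interval / N
print(k)  # 31.25 -- Carter's "around 30*N"

# A quadratic-regime interval would be some multiple of N^2 toggles.
print(N**2)             # 1024000000, i.e. about 1e9
print(N**2 // interval)  # 1024: roughly a thousand times the current interval
```

So a single multiple of N^2 already implies roughly a thousandfold increase over the 1e6 interval, which is why Carter describes 1e6 as the low end of computational effort for a model with substantial dependence.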
In any event, there are several things that can tighten that algorithm relative to its defaults, including increasing thinning, increasing the iterations per subphase, and increasing the number of subphases. Some of these sharply increase computational cost, because e.g. the number of actual subphase iterations doubles (IIRC) at each subphase - so sometimes one benefits by increasing the phase number but greatly reducing the base number of iterations per phase. The learning rate ("SA.initial.gain") can also matter, although I would probably avoid messing with it if the model is well-behaved (as here). I will say that, except under exotic conditions in which I am performing Unspeakable ERGM Experiments (TM) of which we tell neither children nor grad students, I do not recall ever needing to do much with the base parameters - adjusting thinning, as needs must, has almost always done the trick. Still, if other measures fail, tinkering with these settings can/will certainly affect convergence.

I'd check on those things first, and then see if you still have a problem....

Hope that helps,

-Carter

On 8/29/24 12:13 PM, Khanna, Aditya wrote:

> [...]

-------------- next part --------------
An HTML attachment was scrubbed...
URL: From statnet_help at u.washington.edu Fri Aug 30 01:47:23 2024 From: statnet_help at u.washington.edu (Carter T.
Butts via statnet_help) Date: Fri Aug 30 01:47:29 2024 Subject: [statnet_help] Intercept for ordinal data in REM In-Reply-To: References: Message-ID: <74b17193-9654-4a98-89f7-5a3f36068a8c@uci.edu> Hi, Choi - In the ordinal-time case (where we know the order of events, but not the times at which they occurred), the hazards can only be identified up to a pacing constant; thus, one does not have an intercept (because the model is telling you about relative hazards). When time is exactly specified (up to an arbitrary translation, of course), we do need to specify the base rate, so an intercept is called for. (It has become fashionable in some circles to try to get rid of this yet again by using Cox-like constructions that treat the baseline hazard as some wild, nonparametric thing that lies outside the model. Except when substantively well-motivated (in which case I have no objection), I tend to find this naughty and/or unwise. But that is a matter for another day and another forum, and I will say no more on it here. Those wishing to engage on the matter can do so at the next Sunbelt hospitality suite.) Hope that helps, -Carter On 8/27/24 9:41 PM, SJ C via statnet_help wrote: > Dear all, > > I have a question about including the intercept in REM. > > I've noticed that with 'ordinal' data, the intercept is not included > in the model, whereas with 'time-stamp' data, it is included. Is my > understanding correct? > (By intercept here I mean creating a variable with a value of one for > all nodes and including it as a covariate of CovSnd.) > > I'd be grateful if anyone could respond to this question. > Thank you!
> > Best, > Choi From statnet_help at u.washington.edu Mon Sep 2 07:14:32 2024 From: statnet_help at u.washington.edu (Gonzalez Loyola, M.A. via statnet_help) Date: Mon Sep 2 07:14:48 2024 Subject: [statnet_help] Interpreting quadratic effects in ERGMs Message-ID: Dear colleagues, I am currently performing some ERGM analyses, and in one of them I explored the possibility of a quadratic effect, which was quite expected. In the model I got a significant negative effect (Estimate = -0.04, p < 0.01). I was wondering how to interpret the negative sign. My guess (which would make sense with the data) is that the negative sign reflects negative concavity, meaning that the data resemble an inverted-U shape. Has any of you encountered that? Any information on the interpretation of quadratic effects would be highly appreciated. Have a great week, Melissa González From statnet_help at u.washington.edu Tue Sep 3 21:28:34 2024 From: statnet_help at u.washington.edu (CHU-DING LING via statnet_help) Date: Tue Sep 3 21:29:16 2024 Subject: [statnet_help] Assistance with Calculating Degree Centrality in Directed and Valued Networks Using the sna Package in R In-Reply-To: References: <01def657-125d-44bc-9716-a18f8f945838@uci.edu> Message-ID: Dear Carter, Thanks again for your suggestions. I have some further questions regarding inconsistencies between the Bonacich's power centrality (beta centrality) from *igraph* and *sna* for an undirected and valued network. Enclosed please find the example code and the associated annotations. Despite setting the 'exponent' and 'rescale'
parameters consistently between the two packages, I am noticing some discrepancies in the results. I have a couple of questions which I was hoping you could help with: *1. How can we get the highest beta value before calculating Bonacich's power centrality?* Currently I use another unpublished package to obtain the value. I would like to understand the process or any specific considerations required to obtain the highest beta value with *sna* or *igraph*. *2. How can we achieve consistent results between 'igraph' and 'sna'?* Despite aligning the 'exponent' and 'rescale' parameters across both packages, the results still differ. Are there any additional parameters or factors I should consider to ensure the outputs are consistent? I would greatly appreciate any guidance or insights you could provide on these matters. Thank you for your time and assistance. Best regards, Chuding Carter T. Butts via statnet_help wrote on Fri, Aug 30, 2024 at 16:00: > Hi, Chuding - > > Glad that fixed it. With respect to edgelists vs. adjacency matrices, > they will give you equivalent results. If you seem to be getting different > results, check the two objects: somewhere in your code, you've presumably > made them non-equivalent.... > > Best, > > -Carter > On 8/26/24 5:59 AM, CHU-DING LING wrote: > > Carter, > > Thank you for your suggestions! The problem has been resolved. > > Initially, I imported a matrix from a CSV file and stored it as a matrix > class object. I then converted it into a network class object since many > functions in *sna* require objects to be of the network class. However, I > noticed that the edge weights were lost during the conversion from the > matrix object to the network object, which caused the results from the > degree() function not to account for edge weights. > > Actually, the degree() function can directly handle the matrix object. I > also used as.sociomatrix.sna() to convert the original matrix object into > another matrix object with a different name.
Both approaches produced the > same degree centrality results for the directed and valued network. > > I also experimented with the as.edgelist.sna() function to convert the > original matrix object into an edgelist object. However, when I calculated > the degree centrality of this object, it produced incorrect results, with a > greater number of elements than the number of nodes in my network. I would > appreciate it if you could give some insights on this issue. > > Thanks in advance! > > Chuding > > Carter T. Butts via statnet_help > wrote on Sun, Aug 25, 2024 at 05:15: > >> Hi, Chuding - >> >> The degree() function already exploits edge values; this is its default >> behavior. If you wish to *ignore* edge values, you need to set the >> "ignore.eval" argument to TRUE. >> >> If you are not getting valued degree calculations from degree() using the >> defaults, then you are not passing it valued data. This may be due to a >> preprocessing error (so check your inputs). Another possible failure mode >> is that you are passing it a network object that has value information >> stored as an edge attribute, and are expecting degree() to use those edge >> values. Since a network object can have any number of edge attributes (or >> none at all), and they can be of any data type (i.e., not necessarily >> numeric), degree() can't automagically know what is intended in that case, >> and will therefore treat the data as unvalued. An easy way to use edge >> attribute information is to wrap your object in a call like >> as.edgelist.sna(<network object>, attrname=<attribute name>), which >> will extract from the object the specific valued network that you want to >> analyze. That's especially handy if you have several different edge values >> you want to store in the same network object. Of course, you can also use >> that same trick to make a "working" edgelist at the top of your script that >> you reuse for multiple calculations. (The same can be done with adjacency >> matrices rather than edgelists, if one prefers.
See e.g. >> ?as.sociomatrix.sna.) >> >> Hope that helps, >> >> -Carter >> On 8/23/24 9:07 PM, CHU-DING LING via statnet_help wrote: >> >> Dear all, >> >> >> >> I hope this message finds you well. I am currently working on a project >> that involves social network analysis using the *sna* package in R. I am >> reaching out to seek your expertise on a particular issue I have >> encountered regarding the calculation of degree centrality in directed and >> valued networks. >> >> >> >> I am working with a directed network where edges have associated weights. >> My goal is to accurately calculate both the in-degree and out-degree >> centrality of nodes while considering the edge weights. I attempted to >> calculate the degree centrality using the degree function in the *sna* >> package. While this function works well for unweighted networks, I realized >> that it does not account for edge weights. >> >> >> >> Could you please advise on the best method or function within the *sna* >> package to accurately calculate the degree centrality in this context? >> Though I can make it with *igraph* or other packages, I am particularly >> interested in whether *sna* could directly handle weighted edges in >> directed networks. >> >> >> >> Your guidance would be invaluable, and I would greatly appreciate any >> suggestions or resources you might be able to provide. Thank you for your >> time and consideration. I look forward to your insights. 
>> Best, >> Chuding From statnet_help at u.washington.edu Wed Sep 4 01:34:21 2024 -------------- next part --------------
## Step 1: Create a 5x5 undirected and valued matrix
set.seed(123)  # Set seed for reproducibility
# The upper triangle of a 5x5 matrix has choose(5, 2) = 10 cells, so draw
# 10 random weights (the original 15 triggered a recycling warning).
matrix_values <- sample(1:4, 10, replace = TRUE)
undirected_matrix <- matrix(0, 5, 5)  # Initialize a 5x5 matrix
# Fill the upper triangle with random values
undirected_matrix[upper.tri(undirected_matrix)] <- matrix_values
# Make the matrix symmetric to represent an undirected graph
undirected_matrix <- undirected_matrix + t(undirected_matrix)
undirected_matrix

## Step 2: Convert the matrix to an igraph object
library(igraph)
undirected_matrix_graph <- graph_from_adjacency_matrix(undirected_matrix,
    mode = "undirected", weighted = TRUE, diag = FALSE)

## Step 3: Calculate Bonacich's power centrality (beta centrality)
# 0.1025942 in the specification below is the highest beta value from the
# results of running xBetaCentrality() in the xUCINET package.
# I do not know how to obtain the highest beta value with igraph or sna.
Results_from_igraph <- power_centrality(undirected_matrix_graph,
    nodes = V(undirected_matrix_graph), loops = FALSE,
    exponent = 0.1025942, rescale = FALSE, tol = 1e-07, sparse = TRUE)
Results_from_igraph <- data.frame(Results_from_igraph = Results_from_igraph)

library(sna)
Results_from_sna <- bonpow(undirected_matrix, gmode = "graph", diag = FALSE,
    tmaxdev = FALSE, exponent = 0.1025942, rescale = FALSE, tol = 1e-07)
Results_from_sna <- data.frame(Results_from_sna = Results_from_sna)

## Step 4: Combine the two results and compare them
# The values in 'Results_from_igraph' show no variance, which differs from
# the observation with my own dataset. However, with my own dataset, the
# results are still not consistent with those in 'Results_from_sna'.
CombinedResults <- cbind(Results_from_igraph, Results_from_sna)
CombinedResults
cor(CombinedResults)

From statnet_help at u.washington.edu Wed Sep 4 01:34:21 2024 From: statnet_help at u.washington.edu (Carter T. Butts via statnet_help) Date: Wed Sep 4 01:34:28 2024 Subject: [statnet_help] Assistance with Calculating Degree Centrality in Directed and Valued Networks Using the sna Package in R In-Reply-To: References: <01def657-125d-44bc-9716-a18f8f945838@uci.edu> Message-ID: <9aa69a2a-e742-42b9-9495-1a59776532b8@uci.edu> Hi, Chuding - On 9/3/24 9:28 PM, CHU-DING LING wrote: > Dear Carter, > Thanks again for your suggestions. I have some further questions > regarding inconsistencies between the Bonacich's power centrality > (beta centrality) from *igraph* and *sna* for an undirected and valued > network. Enclosed please find the example code and the associated > annotations. Despite setting the 'exponent' and 'rescale' parameters > consistently between the two packages, I am noticing some > discrepancies in the results. > I'm afraid that I can only answer questions about statnet tools, so I can't help you there. If you have a question about a specific calculation done by sna, I can answer that.
But I can't speak to what some other package may or may not be doing, and thus why it gives different results. > I have a couple of questions which I was hoping you could help with: > > *1. How can we get the highest beta value before calculating > Bonacich's power centrality?* Currently I use another unpublished > package to obtain the value. I would like to understand the process or > any specific considerations required to obtain the highest beta value > with *sna* or *igraph*. > I don't know what you mean by "the highest beta value." If you define the power centrality in terms of a feedback process (as one, I think, probably should in most cases), /any/ scalar can serve as a beta value (so there is no maximum); however, not all lead to convergent results. One thus typically confines interest to |beta| < 1/rho, where rho is the spectral radius of the input matrix, which ensures convergence. (Substantively, non-convergent cases would correspond to runaway power growth, so we are implicitly assuming here that we are seeing the steady state of a feedback process that has converged to a fixed point.) Anyway, you can get that with something like 1/max(Mod(eigen(MyMat)$values)) which, in your example, is about 0.1026. Note that you can't find a unique largest beta value for which convergence is guaranteed, because this is an open interval - but in practice, anything close to the limit will give you approximately the same answer. I note that, as beta -> 1/rho from below, the bonpow scores converge to the principal eigenvector (where one exists), up to a scaling coefficient (which can be negative, though this does not change the substantive interpretation, since the eigenvector is invariant to a sign flip). Pragmatically, this means that bonpow is equivalent to eigenvector centrality for positive beta near the inverse spectral radius.
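A minimal R sketch of this check (using a made-up 5x5 valued matrix as a stand-in for real data; the variable names are illustrative, not from the thread): compute the reciprocal spectral radius, pick a beta just inside the convergent interval, and verify that bonpow there tracks eigenvector centrality.

```r
library(sna)

# Toy symmetric valued matrix (an assumption; substitute your own data)
set.seed(1)
MyMat <- matrix(0, 5, 5)
MyMat[upper.tri(MyMat)] <- sample(1:4, 10, replace = TRUE)
MyMat <- MyMat + t(MyMat)

# Convergence is guaranteed for |beta| < 1/rho, where rho is the
# spectral radius of the input matrix
rho  <- max(Mod(eigen(MyMat)$values))
beta <- 0.999 / rho   # just inside the open interval (1/rho is excluded)

# Near the limit, bonpow scores are (up to sign and scale) the
# principal eigenvector, i.e., eigenvector centrality
bp <- bonpow(MyMat, gmode = "graph", exponent = beta)
ev <- evcent(MyMat, gmode = "graph")
abs(cor(bp, ev))   # approaches 1 as beta approaches 1/rho
```

The absolute value handles the arbitrary sign of the eigenvector; correlation near 1 confirms the equivalence Carter describes.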
This leads to a number of wonderful insights (to which I subject the students in my theory class), but a not-so-exciting one is that there's not really much /practical/ value in using bonpow with a large positive beta - it is cleaner to just use eigenvector centrality (because that's what you are approaching as you increase beta), and then not have to worry about whether you're within the spectral radius. Bonpow only starts to get interesting when beta is small (leading to a much flatter spectral weighting, which allows smaller core-periphery structures to start to matter) or negative (which puts relatively more weight on bipartitions). But that's another topic.... > *2. How can we achieve consistent results between 'igraph' and > 'sna'?* Despite aligning the 'exponent' and 'rescale' parameters across > both packages, the results still differ. Are there any additional > parameters or factors I should consider to ensure the outputs are > consistent? > Again, I can't speak to the behavior of other, non-statnet software. I note that your notes refer to that package as returning constant values on your test case, which might suggest that you didn't tell it to use your edge values (your network returns unit results if one dichotomizes it). But you'll want to look to their documentation and code to see what is being computed. Hope that helps, -Carter > I would greatly appreciate any guidance or insights you could provide > on these matters. > > Thank you for your time and assistance. > > Best regards, > > Chuding From statnet_help at u.washington.edu Thu Sep 5 06:54:19 2024 From: statnet_help at u.washington.edu (CHU-DING LING via statnet_help) Date: Thu Sep 5 06:55:04 2024 Subject: [statnet_help] Assistance with Calculating Degree Centrality in Directed and Valued Networks Using the sna Package in R In-Reply-To: <9aa69a2a-e742-42b9-9495-1a59776532b8@uci.edu> References: <01def657-125d-44bc-9716-a18f8f945838@uci.edu> <9aa69a2a-e742-42b9-9495-1a59776532b8@uci.edu> Message-ID: Dear Carter, Thank you very much for your detailed explanation and for your continued support! I apologize for asking questions that were not directly related to the sna package. I appreciate your guidance on using the codes to determine the appropriate beta value.
I have tested this with the example matrix and my own data, and it works perfectly. I am also grateful for your insights on setting the beta value to explore different scenarios. Your explanation has been incredibly helpful in understanding how to approach this parameter when investigating various research questions. Thank you again for your time and expertise! Best regards, Chuding From statnet_help at u.washington.edu Thu Sep 5 12:57:20 2024 From: statnet_help at u.washington.edu (Khanna, Aditya via statnet_help) Date: Thu Sep 5 12:57:37 2024 Subject: [statnet_help] Upgrading from ERGM v3.10 to v4.6 In-Reply-To: <7f088ef1-8b4f-4613-b7f2-52d81d3ecb3f@uci.edu> References: <3bbc0b12-8e50-4ece-a72e-46904a40d811@uci.edu> <7f088ef1-8b4f-4613-b7f2-52d81d3ecb3f@uci.edu> Message-ID: Hi Carter, Thank you so much for your helpful response as always. I have organized my report in terms of the various things you suggest. Verifying MCMC, GOF and the "second" MCMC: Yes, the ERGM for the model described below does converge, but, despite having converged, the simulated networks don't seem to statistically capture the targets. I did make the GOF plots as well. Most of the terms look good, though some have peaks that are off from zero.
In general, however, I have come to rely more on actually simulating networks from the fitted ERGM object (what I think you mean by "second MCMC run") in addition to the GOF plots. Usually I consider my goal fulfilled if the simulated network objects capture the targets, even if the GOF plots don't look perfect.

Model convergence and tightening the MCMC tolerances: In terms of tightening the MCMC tolerances, I did increase the MCMC interval to 1e9, of the order of O(N^2). But this particular specification timed out after 120 hours, and I didn't try to run it for longer than that.

Alternate parameters to tighten the MCMC: I have experimented with the MCMC sample size and interval parameters, but have not been able to improve the quality of the simulated network. I am not as familiar with what options are available within the bounds of some reasonable computational cost.

In summary, the problem remains that despite the ERGM convergence, the quality of the simulated networks suggests room for improvement, since the specified targets are not captured within the distribution of the simulated networks.

Aditya

On Fri, Aug 30, 2024 at 4:37 AM Carter T. Butts via statnet_help <statnet_help@u.washington.edu> wrote:

> Hi, Aditya -
>
> I'll be interested in Pavel's take on the convergence issues, but just to verify, you are assessing convergence based on a *second* MCMC run, correct? The MCMC statistics in the ergm object are from the penultimate iteration, and may thus be out of equilibrium (but this does *not* necessarily mean that the *model* did not converge). However, if you simulate a new set of draws from the fitted model and the mean stats do not match, *then* you have an issue. (This is why we now point folks to gof() for that purpose.) It looks like your plots are from the ergm object and not from a gof() run (or other secondary simulation), so I want to verify that first.
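
The "second MCMC run" check described here can be sketched as follows (a hedged sketch: `fit` is a placeholder for the fitted ergm object, 100 simulations is an arbitrary choice, and `summary(fit$formula)` assumes the network is still resolvable from the formula's environment):

```r
# Hedged sketch of assessing convergence from a *second* MCMC run:
# simulate from the fitted model and compare simulated mean statistics
# to the observed/target statistics. 'fit' is a placeholder ergm object.
library(ergm)

sims <- simulate(fit, nsim = 100, output = "stats")  # statistics only
cbind(observed = summary(fit$formula), simulated = colMeans(sims))

# Equivalently, the route the developers recommend:
plot(gof(fit))
```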
> > I also note that a quick glance at the plots from your more exhaustive > simulation case don't seem all that far off, which could indicate either > that the model did converge (and per above, we're not looking at a draw > from the final model), or that it converged within the tolerances that were > set, and you may need to tighten them. But best to first know if there's a > problem in the first place. > > Another observation is that, per my earlier email, you may need O(N^2) > toggles per draw to get good performance if your model has a nontrivial > level of dependence. You are using a thinning interval of 1e6, which is in > your case around 30*N. It's possible that you've got too much dependence > for that: O(N^2) here would mean some multiple of about 1e9, which is about > a thousand times greater than what you're using. Really large, sparse > networks sometimes *can* be modeled well without that much thinning, but > it's not a given. Relatedly, your trace plots from the 1e6 run suggest a > fair amount of autocorrelation on some statistics, which suggests a lack of > efficiency. (Autocorrelation by itself isn't necessarily a problem, but it > means that your effective MCMC sample size is smaller than it seems, and > this can reduce the effectiveness of the MCMCMLE procedure. The ones from > the 1e6 run aren't bad enough that I would be alarmed, but if I were > looking for things to tighten up and knew this could be a problem, they > suggest possible room for improvement.) So anyway, I wouldn't crank this > up until verifying that it's needed, but you are still operating on the low > end of computational effort (whether it seems like it or not!). > > Finally, I would note that for the stochastic approximation method, > convergence is to some degree (and it's a bit complex) determined by how > many subphases are run, and how many iterations are used per subphase. 
> This algorithm is due to Tom in his classic JoSS paper (but without the > complement moves), which is still a good place to look for details. It is > less fancy than some more modern algorithms of its type, but is extremely > hard to beat (I've tried and failed more than once!). In any event, there > are several things that can tighten that algorithm relative to its > defaults, including increasing thinning, increasing the iterations per > subphase, and increasing the number of subphases. Some of these sharply > increase computational cost, because e.g. the number of actual subphase > iterations doubles (IIRC) at each subphase - so sometimes one benefits by > increasing the phase number but greatly reducing the base number of > iterations per phase. The learning rate ("SA.initial.gain") can also > matter, although I would probably avoid messing with it if the model is > well-behaved (as here). I will say that, except under exotic conditions in > which I am performing Unspeakable ERGM Experiments (TM) of which we tell > neither children nor grad students, I do not recall ever needing to do much > with the base parameters - adjusting thinning, as needs must, has almost > always done the trick. Still, if other measures fail, tinkering with these > settings can/will certainly affect convergence. > > I'd check on those things first, and then see if you still have a > problem.... > > Hope that helps, > > -Carter > On 8/29/24 12:13 PM, Khanna, Aditya wrote: > > Hi Carter and All, > > Thank you so much for the helpful guidance here. I think following your > suggestions has brought us very close to reproducing the target statistics > in the simulated networks, but there are still some gaps. > > Our full previous exchange is below, but to summarize: I have an ERGM > that I fit previously with ERGM v3.10.4 on a directed network with 32,000 > nodes. The model consisted of in- and out-degrees in addition to other > terms, including a custom distance term. 
In trying to reproduce this fit with ergm v4.6, the model did not initially converge.
>
> Your suggestion to try setting main.method = "Stochastic Approximation" considerably improved the fitting. Specifying the convergence detection as "Hotelling" on top of that brought us almost to simulated networks that capture all the mean statistics. (Following an old discussion thread on the statnet GitHub, I also tried setting the termination criterion to Hummel and MCMLE.effectiveSize = NULL. In practice, though, Hotelling worked a bit better than Hummel for me.)
>
> In general, I tried fitting the model with variants of this specification. I got the best results with setting both MCMC samplesize = 1e6 and interval = 1e6 (see table below).
>
> MCMC interval | MCMC sample size | Convergence detection | Results/Outcome | Note
> 1e6           | 1e6              | Hotelling             | Closest agreement between simulated and target statistics | Max. Lik. fit summary and simulation Rout; violin plots showing the simulated and target statistics for each parameter
>
> But I found that this was the closest I could get to producing simulated statistics that matched the target statistics. In general, any further increase or decrease of either the samplesize or the interval did not generate a closer result; i.e., this looked to be some optimum in the fit-parameter space. I can provide further details on the results of those fits: some configurations didn't converge, and when they did converge, the goodness-of-fit was worse than with the MCMC interval and samplesize set to 1e6. Based on your experience, I was wondering if this is expected?
>
> For now, my main question is: are there any suggestions on how I can further tune the fitting parameters to match my targets more closely? I can provide specific details on the outcomes of those fitting processes if that would be helpful.
> Thanks for your consideration.
> Aditya
>
> On Thu, May 16, 2024 at 2:33 PM Carter T. Butts via statnet_help <statnet_help@u.washington.edu> wrote:
>
>> Hi, Aditya -
>>
>> I will defer to the mighty Pavel for the exact best formula to reproduce 3.x fits with the latest codebase. (You need to switch convergence detection to "Hotelling," and there are some other things that must be modified.) However, as a general matter, for challenging models where Geyer-Thompson-Hummel has a hard time converging (particularly on a large node set), you may find it useful to try the stochastic approximation method (main="Stochastic" in your control argument will activate it). G-T-H can (in principle) have sharper convergence when near the solution, but in practice SA fails more gracefully. I would suggest increasing your default MCMC thinning interval (MCMC.interval), given your network size; depending on density, extent of dependence, and other factors, you may need O(N^2) toggles per step. It is sometimes possible to get away with as few as k*N (for some k in, say, the 5-100 range), but if your model has substantial dependence and is not exceptionally sparse then you will probably need to be in the quadratic regime. One notes that it can sometimes be helpful when getting things set up to run "pilot" fits with the default or otherwise smaller thinning intervals, so that you can discover if e.g. you have a data issue or other problem before you spend the waiting time on a high-quality model fit.
>>
>> To put in the obligatory PSA, both G-T-H and SA are simply different strategies for computing the same thing (the MLE, in this case), so both are fine - they just have different engineering tradeoffs. So use whichever proves more effective for your model and data set.
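
The controls named in this exchange might be combined along these lines (a hedged sketch for ergm 4.x: the model terms and network `net` are placeholders, `"Stochastic-Approximation"` is the full spelling behind the `main="Stochastic"` shorthand, and the interval value is illustrative, not a recommendation):

```r
# Hedged sketch of the control settings discussed above (ergm >= 4.x).
# 'net' and the model terms are placeholders, not the poster's model.
library(ergm)

fit <- ergm(net ~ edges + mutual,
            control = control.ergm(
              main.method   = "Stochastic-Approximation",  # SA instead of MCMLE
              MCMC.interval = 1e6))                        # heavier thinning
```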
>> Hope that helps,
>>
>> -Carter
>>
>> On 5/16/24 7:52 AM, Khanna, Aditya via statnet_help wrote:
>>
>> Dear Statnet Dev and User Community:
>>
>> I have an ERGM that I fit previously with ERGM v3.10.4 on a directed network with 32,000 nodes. The model included in- and out-degrees, in addition to other terms. The complete Rout from this fit can be seen here. I am now trying to reproduce this fit with ergm v4.6, but the model does not converge. (See here.)
>>
>> I am looking for ideas on how to troubleshoot this. One suggestion I got was to set values for the "tuning parameters" in v4.6 to their defaults from v3.11.4. But ERGM v4.6 has a lot more parameters that can be specified, and I am not sure which ones make most sense to consider.
>>
>> I would be grateful for any suggestions on this or alternate ideas to try.
>>
>> Many thanks,
>> Aditya
>>
>> --
>> Aditya S. Khanna, Ph.D.
>> Assistant Professor
>> Department of Behavioral and Social Sciences
>> Center for Alcohol and Addiction Studies
>> Brown University School of Public Health
>> Pronouns: he/him/his
>> 401-863-6616
>> sph.brown.edu
>> https://vivo.brown.edu/display/akhann16

From statnet_help at u.washington.edu Sat Sep 7 05:00:15 2024
From: statnet_help at u.washington.edu (Pavel Krivitsky via statnet_help)
Date: Sat Sep 7 05:00:27 2024
Subject: [statnet_help] Upgrading from ERGM v3.10 to v4.6
In-Reply-To:
References: <3bbc0b12-8e50-4ece-a72e-46904a40d811@uci.edu>
Message-ID: <9b382b54e14e59a0ae5e270a4bb8e29adee468f8.camel@unsw.edu.au>

Hi, Aditya,

Apologies for the slow reply. I have a quick question: have you tried running ergm() without overriding any of the control parameters? I.e., just let the adaptive code try to figure things out? Or, it might be worth overriding just the SAN controls, since those aren't adaptive.

Best,
Pavel

From statnet_help at u.washington.edu Sat Sep 7 17:28:45 2024
From: statnet_help at u.washington.edu (Carter T. Butts via statnet_help)
Date: Sat Sep 7 17:28:52 2024
Subject: [statnet_help] Upgrading from ERGM v3.10 to v4.6
In-Reply-To:
References: <3bbc0b12-8e50-4ece-a72e-46904a40d811@uci.edu> <7f088ef1-8b4f-4613-b7f2-52d81d3ecb3f@uci.edu>
Message-ID:

Hi, Aditya -

On 9/5/24 12:57 PM, Khanna, Aditya wrote:
>
> Hi Carter,
>
> Thank you so much for your helpful response as always. I have organized my report in terms of the various things you suggest.
>
> Verifying MCMC, GOF and the "second"
MCMC: Yes, the ERGM for the model described below does converge, but, despite having converged, the simulated networks don't seem to statistically capture the targets. I did make the GOF plots as well. Most of the terms look good, though some have peaks that are off from zero. In general, however, I have come to rely more on actually simulating networks from the fitted ERGM object (what I think you mean by "second MCMC run") in addition to the GOF plots. Usually I consider my goal fulfilled if the simulated network objects capture the targets, even if the GOF plots don't look perfect.
>
There seems to be some confusion here: setting aside exotica, if your mean in-model statistics don't match your observed in-model statistics, then your model has not converged. The matching here that matters is from an MCMC run from the fitted model, which is what the gof() function does (but this is /not/ what you get from MCMC diagnostics on the fitted ergm() object, which should show the penultimate run - those are handy for diagnosing general MCMC issues, but need not show convergence even when the final coefficients yield a convergent model). The plots you linked to under "GOF" above seem to be MCMC diagnostics from an ergm() object, not gof() output. I'll come back to "exotica" below, but first it is important to be sure that we are discussing the same thing.

> Model convergence and tightening the MCMC tolerances: In terms of tightening the MCMC tolerances, I did increase the MCMC interval to 1e9, of the order of O(N^2). But this particular specification timed out after 120 hours, and I didn't try to run it for longer than that.
>
Unfortunately, there's no "that's longer than I want it to be" card that convinces mixing to happen more quickly: if your model is really going to take 1e9 or more updates per step to converge, and if that takes longer than 120 hours on your hardware, then that's how long it's going to take.
If there were magic sauce that could guarantee great convergence with minimal computational effort every time, it would already be in the code. That said, I don't know that all other options have yet been ruled out; and, when folks encounter expensive models on large networks, there are various approximate solutions that they may be willing to live with. But if one is working with a dependence model on >=1e4 nodes, one must accept that one may be in a regime in which gold-standard MCMC-MLE is very expensive. Just sayin'.

> Alternate parameters to tighten the MCMC: I have experimented with the MCMC sample size and interval parameters, but have not been able to improve the quality of the simulated network. I am not as familiar with what options are available within the bounds of some reasonable computational cost.
>
> In summary, the problem remains that despite the ERGM convergence, the quality of the simulated networks suggests room for improvement, since the specified targets are not captured within the distribution of the simulated networks.
>
OK, let me see if I can offer some further advice, based on your email and also something that came up in your exchange with Pavel:

1. We should be clear that, assuming no exotica, you should be assessing convergence from an MCMC run on the fitted model (as produced by gof() or done manually). So far, the plots I've seen appear not to be runs from the fitted model, so I have not actually seen evidence of the alleged phenomenon. Also, to be clear, (absent exotica) if your simulated mean stats don't match the observed stats (up to numerical and sometimes statistical noise), your model hasn't converged. A model that isn't converging is not the same as a model that has converged but that is inadequate, and the fixes are very different.

2.
The exchange with Pavel led me to dig into your code a bit more, and I realized that you are not fitting to an observed network, but to target stats presumably based on design estimation. This could put you into the "exotica" box, because it is likely that - due to errors in your estimated targets - there exists no ERGM in your specified family whose expected statistics exactly match the target statistics. So long as they aren't too far off, you still ought to be able to get close, but hypothetically one could have a situation where someone gets an unusually bad estimate for one or a small number of targets, and their matches are persistently off; in this case, the issue is that the MLE no longer satisfies the first moment condition (expected statistics do not match the target statistics), so this is no longer a valid criterion for assessing convergence. If one is willing/able to make some distributional statements about one's target estimates, there are some natural relaxations of the usual convergence criteria, and almost surely Pavel has written them down, so I defer to him. :-) But anyway, /if/ your model really seems not to be converging (by the criteria of (1)), and /if/ you are using estimated target stats, then I would certainly want to investigate the possibility that your model has actually converged (and that you're just seeing measurement error in your target stats) before going much further. To write reckless words (that you should read recklessly), one naive heuristic that could perhaps be worth trying would be to look at the Z-scores (t_o - t_s)/(s_o^2 + s_s^2)^0.5, where t_o is the observed (estimated) target, t_s is the simulated mean statistic, s_o is the standard error of your target estimator, and s_s is the standard error of the simulation mean. (If you are e.g. using Horvitz-Thompson, you can approximate s_o using standard results, and you can likewise use autocorrelation-corrected approximations to s_s.)
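
The naive Z-score heuristic described here amounts to a couple of lines (a hedged sketch: `t_obs`, `t_sim`, `se_obs`, and `se_sim` are placeholder vectors, and the cutoff of 2 is arbitrary):

```r
# Hedged sketch of the naive Z-score heuristic described above.
# t_obs: estimated targets; t_sim: simulated mean statistics;
# se_obs/se_sim: their standard errors (all placeholder names;
# correlations among statistics and support constraints are
# ignored, as noted).
z <- (t_obs - t_sim) / sqrt(se_obs^2 + se_sim^2)
flagged <- names(z)[abs(z) > 2]  # gaps larger than the noise would suggest
```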
If these are not large, then this suggests that the discrepancies between the targets and the mean stats are not very large compared to what you would expect from the variation in your simulation outcomes and in your measurement process. This does not take into account e.g. correlations among statistics, nor support constraints, but it seems like a plausible starting point. (Pavel and Martina have been working with these kinds of problems a lot of late, so doubtless can suggest better heuristics.)

3. Pavel's comments pointed to SAN, which also led me to observe that you are starting by fitting to an empty graph. I recommend against that. In principle, the annealer should get you to a not-too-bad starting point, but in my own informal simulation tests I have observed that this doesn't always work well if the network is very large; in particular, if SAN dumps you out with a starting point that is far from equilibrium, you are wasting a lot of MCMC steps wandering towards the high-density region of the graph space, and this can sometimes lead to poor results (especially if you can't afford to run some (large k)*N^2 burn-in - and recall that the default MCMC algorithm tends to preserve density, so if the seed is poor in that regard, it can take a lot of iterations to fix). My suggestion is to use rgraph() to get a Bernoulli graph draw from a model whose mixing characteristics (and, above all, density) approximate the target, and start with that. An easy way to set the parameters is to fit a pilot ERGM using only independence terms, use these to construct a tie probability matrix, and pass that to the tprob argument of rgraph(). Your case makes for a very large matrix, but it's still within the range of the feasible.
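
The suggested starting point might look something like this (a hedged sketch: `pilot` is a placeholder for a dyad-independent pilot fit, `n` the network size, and using `predict()` to build the tie-probability matrix is an assumption about convenience, not the only route):

```r
# Hedged sketch of seeding SAN with an exact Bernoulli draw whose tie
# probabilities come from a dyad-independent pilot fit. 'pilot' and 'n'
# are placeholders; predict() as the source of the probability matrix
# is an assumption.
library(sna)

p    <- predict(pilot, output = "matrix")       # n x n tie probabilities
seed <- rgraph(n, tprob = p, mode = "digraph",  # one exact draw, no MCMC
               return.as.edgelist = TRUE)       # avoids a huge return matrix
```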
(rgraph() does not use adjacency matrices internally, and so long as you set the return value to be an edgelist it is not constrained by the sizes of feasible adjacency matrices, but if you want to pass an explicit tie probability matrix then obviously that puts you in the adjacency matrix regime.) Anyway, it's better to use rgraph() for this than a simulate() call, because it will be both faster and an exact simulation (no MCMC). A poorer approach is not to bother with mixing structure, and just to draw an initial state with the right density (which at least reduces the risk that SAN exits with a graph that is too sparse)... but you might as well put your starting point as close to the right neighborhood as you can. The goal here is to help the annealer get you to a high-potential graph, rather than expecting it to carry you there from a remote location. It is possible that this turns out not to be a problem in your particular case, but it seems worth ruling out.

Hope that helps,

-Carter
> > I also note that a quick glance at the plots from your more > exhaustive simulation case don't seem all that far off, which > could indicate either that the model did converge (and per above, > we're not looking at a draw from the final model), or that it > converged within the tolerances that were set, and you may need to > tighten them.? But best to first know if there's a problem in the > first place. > > Another observation is that, per my earlier email, you may need > O(N^2) toggles per draw to get good performance if your model has > a nontrivial level of dependence.? You are using a thinning > interval of 1e6, which is in your case around 30*N.? It's possible > that you've got too much dependence for that: O(N^2) here would > mean some multiple of about 1e9, which is about a thousand times > greater than what you're using.? Really large, sparse networks > sometimes /can/ be modeled well without that much thinning, but > it's not a given.? Relatedly, your trace plots from the 1e6 run > suggest a fair amount of autocorrelation on some statistics, which > suggests a lack of efficiency.? (Autocorrelation by itself isn't > necessarily a problem, but it means that your effective MCMC > sample size is smaller than it seems, and this can reduce the > effectiveness of the MCMCMLE procedure.?? The ones from the 1e6 > run aren't bad enough that I would be alarmed, but if I were > looking for things to tighten up and knew this could be a problem, > they suggest possible room for improvement.)? So anyway, I > wouldn't crank this up until verifying that it's needed, but you > are still operating on the low end of computational effort > (whether it seems like it or not!). > > Finally, I would note that for the stochastic approximation > method, convergence is to some degree (and it's a bit complex) > determined by how many subphases are run, and how many iterations > are used per subphase.? 
This algorithm is due to Tom in his > classic JoSS paper (but without the complement moves), which is > still a good place to look for details.? It is less fancy than > some more modern algorithms of its type, but is extremely hard to > beat (I've tried and failed more than once!).? In any event, there > are several things that can tighten that algorithm relative to its > defaults, including increasing thinning, increasing the iterations > per subphase, and increasing the number of subphases.? Some of > these sharply increase computational cost, because e.g. the number > of actual subphase iterations doubles (IIRC) at each subphase - so > sometimes one benefits by increasing the phase number but greatly > reducing the base number of iterations per phase.? The learning > rate ("SA.initial.gain") can also matter, although I would > probably avoid messing with it if the model is well-behaved (as > here).? I will say that, except under exotic conditions in which I > am performing Unspeakable ERGM Experiments (TM) of which we tell > neither children nor grad students, I do not recall ever needing > to do much with the base parameters - adjusting thinning, as needs > must, has almost always done the trick.? Still, if other measures > fail, tinkering with these settings can/will certainly affect > convergence. > > I'd check on those things first, and then see if you still have a > problem.... > > Hope that helps, > > -Carter > > On 8/29/24 12:13 PM, Khanna, Aditya wrote: >> >> Hi Carter and All, >> >> >> Thank you so much for the helpful guidance here. I think >> following your suggestions has brought us very close to >> reproducing the target statistics in the simulated networks, but >> there are still some gaps. >> >> >> Our full previous exchange is below, but to summarize:? I have an >> ERGM that I fit previously with ERGM v3.10.4 on a directed >> network with 32,000 nodes. 
>> The model consisted of in- and out-degrees in addition to other terms, including a custom distance term. In trying to reproduce this fit with ergm v4.6, the model did not initially converge.
>>
>> Your suggestion to try setting main.method = "Stochastic Approximation" considerably improved the fitting. Specifying the convergence detection as "Hotelling" on top of that brought us almost to simulated networks that capture all the mean statistics. (Following an old discussion thread on the statnet GitHub, I also tried setting the termination criterion to Hummel and MCMLE.effectiveSize = NULL. In practice, though, Hotelling worked a bit better for me than Hummel.)
>>
>> In general, I tried fitting the model with variants of this specification. I got the best results with setting both MCMC samplesize = 1e6 and interval = 1e6 (see table below).
>>
>> MCMC interval | MCMC sample size | Convergence detection | Results/Outcome | Note
>> 1e6           | 1e6              | Hotelling             | Closest agreement between simulated and target statistics | Max. lik. fit summary and simulation Rout; violin plots showing the simulated and target statistics for each parameter
>>
>> But I found that this was the closest I could get to producing simulated statistics that matched the target statistics. In general, further increasing or decreasing either the samplesize or the interval did not generate a closer result; i.e., this looked to be some optimum in the fit parameter space. I can provide further details on the results of those fits: some configurations didn't converge, and when they did converge, the goodness-of-fit was worse than what I had with the MCMC interval and samplesize set to 1e6. Based on your experiences, I was wondering if this is expected?
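The target-vs-simulation check described in the exchange above can be sketched as follows. This is an untested sketch: `fit` stands for the fitted ergm object and `targets` for the vector of target statistics (both placeholders); argument names follow simulate.ergm() and control.simulate.ergm() in ergm 4.x.

```r
library(ergm)

## Draw fresh statistics from the fitted model (a "second" MCMC run),
## then compare their means and spread to the target statistics.
sims <- simulate(fit, nsim = 500, output = "stats",
                 control = control.simulate.ergm(MCMC.interval = 1e6))

round(colMeans(sims) - targets, 2)          # mean discrepancy per statistic
apply(sims, 2, quantile, c(0.025, 0.975))   # do the targets fall in these bands?
```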
>> For now, my main question is, are there any suggestions on how I can further tune the fitting parameters to match my targets more closely? I can provide specific details on the outcomes of those fitting processes if that would be helpful.
>>
>> Thanks for your consideration.
>>
>> Aditya
>>
>> On Thu, May 16, 2024 at 2:33 PM Carter T. Butts via statnet_help wrote:
>>
>> Hi, Aditya -
>>
>> I will defer to the mighty Pavel for the exact best formula to reproduce 3.x fits with the latest codebase. (You need to switch convergence detection to "Hotelling," and there are some other things that must be modified.) However, as a general matter, for challenging models where Geyer-Thompson-Hummel has a hard time converging (particularly on a large node set), you may find it useful to try the stochastic approximation method (main="Stochastic" in your control argument will activate it). G-T-H can (in principle) have sharper convergence when near the solution, but in practice SA fails more gracefully. I would suggest increasing your default MCMC thinning interval (MCMC.interval), given your network size; depending on density, extent of dependence, and other factors, you may need O(N^2) toggles per step. It is sometimes possible to get away with as few as k*N (for some k in, say, the 5-100 range), but if your model has substantial dependence and is not exceptionally sparse then you will probably need to be in the quadratic regime. One notes that it can sometimes be helpful when getting things set up to run "pilot" fits with the default or otherwise smaller thinning intervals, so that you can discover if e.g. you have a data issue or other problem before you spend the waiting time on a high-quality model fit.
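In R, the advice above might look like the following untested sketch; `net` and the model terms are placeholders, and "Stochastic-Approximation" is assumed to be the full spelling that control.ergm() in ergm 4.x uses for the main="Stochastic" shorthand mentioned above.

```r
library(ergm)

## Cheap pilot fit with default thinning first, to surface data issues
## before committing to the expensive run:
pilot <- ergm(net ~ edges + mutual)

## Higher-quality fit: stochastic approximation with much heavier thinning
## (models with substantial dependence may need O(N^2) toggles per step).
fit <- ergm(net ~ edges + mutual + gwidegree(0.5, fixed = TRUE),
            control = control.ergm(
              main.method     = "Stochastic-Approximation",
              MCMC.interval   = 1e6,
              MCMC.samplesize = 1e4))
```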
>> To put in the obligatory PSA, both G-T-H and SA are simply different strategies for computing the same thing (the MLE, in this case), so both are fine - they just have different engineering tradeoffs. So use whichever proves more effective for your model and data set.
>>
>> Hope that helps,
>>
>> -Carter
>>
>> On 5/16/24 7:52 AM, Khanna, Aditya via statnet_help wrote:
>>> Dear Statnet Dev and User Community:
>>>
>>> I have an ERGM that I fit previously with ERGM v3.10.4 on a directed network with 32,000 nodes. The model included in- and out-degrees, in addition to other terms. The complete Rout from this fit can be seen here.
>>> I am now trying to reproduce this fit with ergm v4.6, but the model does not converge. (See here.)
>>>
>>> I am looking for ideas on how to troubleshoot this. One suggestion I got was to set the values of the "tuning parameters" in v4.6 to their defaults from v3.11.4. But ERGM v4.6 has a lot more parameters that can be specified, and I am not sure which ones make the most sense to consider.
>>>
>>> I would be grateful for any suggestions on this or alternate ideas to try.
>>>
>>> Many thanks,
>>> Aditya
>>>
>>> --
>>>
>>> Aditya S. Khanna, Ph.D.
>>> >>> Assistant Professor >>> >>> Department of Behavioral and Social Sciences >>> >>> Center for Alcohol and Addiction Studies >>> >>> Brown University School of Public Health >>> >>> Pronouns: he/him/his >>> >>> >>> 401-863-6616 >>> >>> sph.brown.edu >>> >>> >>> https://vivo.brown.edu/display/akhann16 >>> >>> >>> >>> _______________________________________________ >>> statnet_help mailing list >>> statnet_help@u.washington.edu >>> https://urldefense.com/v3/__http://mailman13.u.washington.edu/mailman/listinfo/statnet_help__;!!CzAuKJ42GuquVTTmVmPViYEvSg!KsbhvmLlx8TkLK7y2NKz59hK4-4H7KXVV7dEyUG4vcQzi4Mh7nO-9HupA7_ep2V2p9KkD_i00tcg6nDqczDwObRNh35k$ >> _______________________________________________ >> statnet_help mailing list >> statnet_help@u.washington.edu >> http://mailman13.u.washington.edu/mailman/listinfo/statnet_help >> >> > _______________________________________________ > statnet_help mailing list > statnet_help@u.washington.edu > http://mailman13.u.washington.edu/mailman/listinfo/statnet_help > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From statnet_help at u.washington.edu Sat Sep 7 21:32:38 2024 From: statnet_help at u.washington.edu (CHU-DING LING via statnet_help) Date: Sat Sep 7 21:33:18 2024 Subject: [statnet_help] Inquiry about eigenvector centrality normalization in sna package Message-ID: Dear Carter and all, I hope this email finds you well. I'm writing to discuss the normalization of eigenvector centrality scores in the *sna* package. After reviewing the documentation, I saw that when ?rescale=TRUE? is used, the 1-norm is applied for normalization. However, for the default setting of ?rescale=FALSE?, the documentation does not specify the normalization method. Since the choice of normalization method can significantly affect centralization calculations based on eigenvector centrality, I believe it would be helpful to clarify this in the documentation. 
Is the Euclidean norm used when "rescale=FALSE", or could this be a side effect of the eigenvector computation method?

For reference, there is an ongoing discussion on this topic in the following thread: [ https://igraph.discourse.group/t/clarification-on-eigenvector-centralization-calculation-in-igraph/1867/2]. Your insights would be greatly appreciated and could help improve consistency and clarity in network analysis tools across the R ecosystem.

Thank you for your time and consideration.

Best regards,

Chuding

-------------- next part -------------- An HTML attachment was scrubbed... URL: From statnet_help at u.washington.edu Sun Sep 8 02:12:37 2024 From: statnet_help at u.washington.edu (Carter T. Butts via statnet_help) Date: Sun Sep 8 02:12:43 2024 Subject: [statnet_help] Inquiry about eigenvector centrality normalization in sna package In-Reply-To: References: Message-ID: <1218e612-b0f8-4f79-8ec5-2ee6d667ff23@uci.edu>

Hi, Chuding -

Eigenvectors (and hence eigenvector centrality) are defined only up to a nonzero scalar multiple, so their lengths are inherently arbitrary; evcent() follows the behavior of eigen(), which in turn follows the very common convention of taking eigenvectors to be of unit length. If rescale=TRUE, then the scores are rescaled so that the sum of scores is 1 (which does not, in general, yield a unit-length vector). However, this is again cosmetic, and you can make them sum to 5, to π, or to your birthdate if you like, without changing the meaning of the index. (You can also flip their signs, if you like. And, indeed, eigen()-based calculation can give you negative-sign solutions. An all-negative solution is equivalent to an all-positive solution, because only the products of values matter.)

Hope that helps,

-Carter

On 9/7/24 9:32 PM, CHU-DING LING via statnet_help wrote:
>
> Dear Carter and all,
>
> I hope this email finds you well. I'm writing to discuss the normalization of eigenvector centrality scores in the *sna* package.
> > After reviewing the documentation, I saw that when ?rescale=TRUE? is > used, the 1-norm is applied for normalization. However, for the > default setting of ?rescale=FALSE?, the documentation does not specify > the normalization method. > > Since the choice of normalization method can significantly affect > centralization calculations based on eigenvector centrality, I believe > it would be helpful to clarify this in the documentation. Is the > Euclidean norm used when ?rescale=FALSE?, or could this be a side > effect of the eigenvector computation method? > > For reference, there is an ongoing discussion on this topic in the > following thread: > [https://igraph.discourse.group/t/clarification-on-eigenvector-centralization-calculation-in-igraph/1867/2 > ]. > Your insights would be greatly appreciated and could help improve > consistency and clarity in network analysis tools across the R ecosystem. > > Thank you for your time and consideration. > > Best regards, > > Chuding > > > _______________________________________________ > statnet_help mailing list > statnet_help@u.washington.edu > https://urldefense.com/v3/__http://mailman13.u.washington.edu/mailman/listinfo/statnet_help__;!!CzAuKJ42GuquVTTmVmPViYEvSg!Mag52IdlUKL5O9HWcQYJP4oMfX-Js9neFXT7Jdz8RIxMCKyL76a1SpKv8BVmUr3yOn4Dltz1uEgzqsU2_RrnVPOhdh3A$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From statnet_help at u.washington.edu Thu Sep 12 08:56:11 2024 From: statnet_help at u.washington.edu (Allan Clifton via statnet_help) Date: Thu Sep 12 08:56:51 2024 Subject: [statnet_help] TERGM on ego networks Message-ID: Hi all- I wanted to get some thoughts on how to analyze our data. We have about 500 ego participants, with longitudinal data over 6 assessment times. 
At each timepoint, they completed an egocentric network of their 20 closest associates, which included connections between alters and compositional information about each alter, and allowed alters to enter or leave the networks. We want to look at how these networks change over time, and how structural and compositional changes in the network are related to behavioral change in the egos. This seems like a question that TERGM would be appropriate for, but I'm a little daunted about how to do this with 500 ego networks. Is it possible to use ergm.ego with longitudinal networks? Is there a better way to approach this? Any suggestions of approaches to try, or resources to get a better foundation on this type of data, would be helpful. Thank you! Allan -- Allan Clifton, PhD Associate Professor Department of Psychological Science Director, Independent Program Vassar College (845) 437-7381 alclifton@vassar.edu www.personalitystudies.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From statnet_help at u.washington.edu Fri Sep 13 06:05:16 2024 From: statnet_help at u.washington.edu (CHU-DING LING via statnet_help) Date: Fri Sep 13 06:05:58 2024 Subject: [statnet_help] Inquiry about eigenvector centrality normalization in sna package In-Reply-To: <1218e612-b0f8-4f79-8ec5-2ee6d667ff23@uci.edu> References: <1218e612-b0f8-4f79-8ec5-2ee6d667ff23@uci.edu> Message-ID: Dear Carter, Thank you for your detailed explanation regarding the normalization of eigenvector centrality scores in the *sna* package. Your guidance will certainly help me and others in the community achieve greater clarity in using these tools. Thank you again for your valuable input. Best regards, Chuding On Sun, Sep 8, 2024 at 17:12, Carter T. Butts via statnet_help wrote:
> Hi, Chuding - > > Eigenvectors (and hence eigenvector centrality) are defined only up to a > nonzero scalar multiple, so their lengths are inherently arbitrary; > evcent() follows the behavior of eigen(), which in turn follows the very > common convention of taking eigenvectors to be of unit length. If > rescale=TRUE, then the scores are rescaled so that the sum of scores is 1 > (which does not, in general, yield a unit-length vector). However, this is > again cosmetic, and you can make them sum to 5, to ?, or to your birthdate > if you like, without changing the meaning of the index. (You can also flip > their signs, if you like. And, indeed, eigen()-based calculation can give > you negative-sign solutions. An all-negative solution is equivalent to an > all-positive solution, because only the products of values matter.) > > Hope that helps, > > -Carter > On 9/7/24 9:32 PM, CHU-DING LING via statnet_help wrote: > > Dear Carter and all, > > > > I hope this email finds you well. I'm writing to discuss the normalization > of eigenvector centrality scores in the *sna* package. > > > > After reviewing the documentation, I saw that when ?rescale=TRUE? is used, > the 1-norm is applied for normalization. However, for the default setting > of ?rescale=FALSE?, the documentation does not specify the normalization > method. > > > > Since the choice of normalization method can significantly affect > centralization calculations based on eigenvector centrality, I believe it > would be helpful to clarify this in the documentation. Is the Euclidean > norm used when ?rescale=FALSE?, or could this be a side effect of the > eigenvector computation method? > > > > For reference, there is an ongoing discussion on this topic in the > following thread: [ > https://igraph.discourse.group/t/clarification-on-eigenvector-centralization-calculation-in-igraph/1867/2 > ]. 
> Your insights would be greatly appreciated and could help improve > consistency and clarity in network analysis tools across the R ecosystem. > > > > Thank you for your time and consideration. > > > > Best regards, > > Chuding > > _______________________________________________ > statnet_help mailing liststatnet_help@u.washington.eduhttps://urldefense.com/v3/__http://mailman13.u.washington.edu/mailman/listinfo/statnet_help__;!!CzAuKJ42GuquVTTmVmPViYEvSg!Mag52IdlUKL5O9HWcQYJP4oMfX-Js9neFXT7Jdz8RIxMCKyL76a1SpKv8BVmUr3yOn4Dltz1uEgzqsU2_RrnVPOhdh3A$ > > _______________________________________________ > statnet_help mailing list > statnet_help@u.washington.edu > http://mailman13.u.washington.edu/mailman/listinfo/statnet_help > -------------- next part -------------- An HTML attachment was scrubbed... URL: From statnet_help at u.washington.edu Fri Sep 13 09:44:11 2024 From: statnet_help at u.washington.edu (Allan Clifton via statnet_help) Date: Fri Sep 13 09:44:50 2024 Subject: [statnet_help] TERGM on ego networks Message-ID: Hi all- I wanted to get some thoughts on how to analyze a dataset of ego networks. We have about 500 participants, with longitudinal data gathered at 6 assessment times. At each timepoint, they completed an egocentric network of their 20 closest associates, which included connections between alters (including allowing alters to enter or leave the networks) and behavioral information about each alter. We want to look at how these networks change over time, and how structural and compositional changes in the network are related to behavioral change in the egos. This seems like a question that TERGM would be appropriate for, but I'm a little daunted about how to do this with 500 ego networks. Many of the tutorials that I've viewed about ergm.ego use the ego networks as a starting point to simulate larger networks. 
But we're instead interested in examining these 500 networks to find commonalities or patterns among our participants' behavioral change over time. Is it even possible to use ergm.ego with longitudinal networks? Is there a better way to approach this? Any suggestions of approaches to try, or resources to get a better foundation on this type of data would be helpful. Thank you! Allan -- Allan Clifton, PhD Associate Professor Department of Psychological Science Director, Independent Program Vassar College (845) 437-7381 alclifton@vassar.edu www.personalitystudies.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From statnet_help at u.washington.edu Wed Sep 18 14:59:40 2024 From: statnet_help at u.washington.edu (Allan Clifton via statnet_help) Date: Wed Sep 18 15:00:00 2024 Subject: [statnet_help] TERGM on ego networks Message-ID: Hi all- I wanted to get some thoughts on how to analyze a dataset of ego networks. We have about 500 participants, with longitudinal data gathered at 6 assessment times. At each timepoint, they completed an egocentric network of their 20 closest associates, which included connections between alters (including allowing alters to enter or leave the networks) and behavioral information about each alter. We want to look at how these networks change over time, and how structural and compositional changes in the network are related to behavioral change in the egos. This seems like a question that TERGM would be appropriate for, but I'm a little daunted about how to do this with 500 ego networks. Many of the tutorials that I've viewed about ergm.ego use the ego networks as a starting point to simulate larger networks. But we're instead interested in examining these 500 networks to find commonalities or patterns among our participants' behavioral change over time. Is it possible to use ergm.ego with longitudinal networks? Is there a better way to approach this? 
Any suggestions of approaches to try, or resources to get a better foundation on this type of data, would be helpful. Thank you! Allan -- Allan Clifton, PhD Associate Professor Department of Psychological Science Director, Independent Program Vassar College (845) 437-7381 alclifton@vassar.edu www.personalitystudies.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From statnet_help at u.washington.edu Thu Oct 17 18:52:30 2024 From: statnet_help at u.washington.edu (Khanna, Aditya via statnet_help) Date: Thu Oct 17 18:52:46 2024 Subject: [statnet_help] Upgrading from ERGM v3.10 to v4.6 In-Reply-To: References: <3bbc0b12-8e50-4ece-a72e-46904a40d811@uci.edu> <7f088ef1-8b4f-4613-b7f2-52d81d3ecb3f@uci.edu> Message-ID:

Hi Carter and Pavel,

Thank you so much for your helpful suggestions. Following your feedback (and Carter's numbering scheme below), I report the following:

1. The GOF plot is here and the numerical summary is here. Based on these, it seems that the model has not converged. When we simulate from our best fit, even though it is not from a converged model, the simulated networks seem not far off from the mean target statistics.

2. We have looked into the underlying uncertainty in the target statistics themselves, and these uncertainty regions are quite large. My question is: what is the best approach to handling these uncertain ranges in the target statistics? Is there a way to specify not just the mean but a range of target values for each of the specified ERGM parameters? I can conceptualize other ways, for instance sampling values from the uncertainty ranges for each parameter and fitting ERGMs to those configurations. But I am not sure if there is a recommended heuristic to pursue?

3. I also tried your suggestion to start with a non-empty graph (whereas case 1 above is the GOF output on a fit that was generated starting from an empty graph). The GOF plot is here.
I also combined the idea of starting with a non-empty graph with Pavel's suggestion to not specify any arguments in control.ergm and let the algorithm figure it out (or to specify just the SAN settings in control.ergm). That didn't work either (see Rout from the session).

Thank you in advance for any thoughts you are able to share,

Aditya

On Sat, Sep 7, 2024 at 8:28 PM Carter T. Butts wrote:

> Hi, Aditya -
>
> On 9/5/24 12:57 PM, Khanna, Aditya wrote:
>
> Hi Carter,
>
> Thank you so much for your helpful response as always. I have organized my report in terms of the various things you suggest.
>
> Verifying MCMC, GOF and the "second" MCMC: Yes, the ERGM for the model described below does converge, but, despite having converged, the simulated networks don't seem to statistically capture the targets. I did make the GOF plots as well. Most of the terms look good, though some have peaks that are off from zero. In general, however, I have come to rely more on actually simulating networks from the fitted ERGM object (what I think you mean by "second MCMC run") in addition to the GOF plots. Usually I consider my goal fulfilled if the simulated network objects capture the targets, even if the GOF plots don't look perfect.
>
> There seems to be some confusion here: setting aside exotica, if your mean in-model statistics don't match your observed in-model statistics, then your model has not converged. The matching here that matters is from an MCMC run from the fitted model, which is what the gof() function does (but this is *not* what you get from MCMC diagnostics on the fitted ergm() object, which should show the penultimate run - those are handy for diagnosing general MCMC issues, but need not show convergence even when the final coefficients yield a convergent model). The plots you linked to under "GOF" above seem to be MCMC diagnostics from an ergm() object, not gof() output.
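The distinction drawn above can be sketched in a few lines (an untested sketch; `fit` is a placeholder for the fitted ergm object, and GOF = ~model is the gof() formula term for the in-model statistics):

```r
library(ergm)

mcmc.diagnostics(fit)        # penultimate-run MCMC: useful for diagnosing
                             # sampler issues, but not proof of convergence

g <- gof(fit, GOF = ~model)  # fresh simulation from the *final* coefficients
summary(g)                   # simulated vs. observed in-model statistics
plot(g)
```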
> I'll come back to "exotica" below, but first it is important to be sure that we are discussing the same thing.
>
> Model Convergence and tightening the MCMC tolerances: In terms of tightening the MCMC tolerances, I did increase the MCMC interval to 1e9, of the order of O(N^2). But this particular specification timed out after 120 hours, and I didn't try to run it for longer than that.
>
> Unfortunately, there's no "that's longer than I want it to be" card that convinces mixing to happen more quickly: if your model is really going to take 1e9 or more updates per step to converge, and if that takes longer than 120 hours on your hardware, then that's how long it's going to take. If there were magic sauce that could guarantee great convergence with minimal computational effort every time, it would already be in the code. That said, I don't know that all other options have yet been ruled out; and, when folks encounter expensive models on large networks, there are various approximate solutions that they may be willing to live with. But if one is working with a dependence model on >=1e4 nodes, one must accept that one may be in a regime in which gold-standard MCMC-MLE is very expensive. Just sayin'.
>
> Alternate parameters to tighten the MCMC: I have experimented with the MCMC sample size and interval parameters, but have not been able to improve the quality of the simulated network. I am not as familiar with what options are available within the bounds of some reasonable computational cost.
>
> In summary, the problem remains that despite the ERGM convergence, the quality of the simulated networks suggests room for improvement, since the specified targets are not captured within the distribution of the simulated networks.
>
> OK, let me see if I can offer some further advice, based on your email and also something that came up in your exchange with Pavel:
>
> 1.
We should be clear that, assuming no-exotica, you should be assessing > convergence from an MCMC run on the fitted model (as produced by gof() or > done manually). So far, the plots I've seen appear not to be runs from the > fitted model, so I have not actually seen evidence of the alleged > phenomenon. Also, to be clear, (absent exotica) if your simulated mean > stats don't match the observed stats (up to numerical and sometimes > statistical noise), your model hasn't converged. A model that isn't > converging is not the same as a model that has converged but that is > inadequate, and the fixes are very different. > > 2. The exchange with Pavel led me to dig into your code a bit more, and I > realized that you are not fitting to an observed network, but to target > stats presumably based on design estimation. This could put you into the > "exotica" box, because it is likely that - due to errors in your estimated > targets - there exists no ERGM in your specified family whose expected > statistics exactly match the target statistics. So long as they aren't too > far off, you still ought to be able to get close, but hypothetically one > could have a situation where someone gets an unusually bad estimate for one > or a small number of targets, and their matches are persistently off; in > this case, the issue is that the MLE no longer satisfies the first moment > condition (expected statistics do not match the target statistics), so this > is no longer a valid criterion for assessing convergence. If one is > willing/able to make some distributional statements about one's target > estimates, there are some natural relaxations of the usual convergence > criteria, and almost surely Pavel has written them down, so I defer to > him. 
> :-) But anyway, *if* your model really seems not to be converging (by the criteria of (1)), and *if* you are using estimated target stats, then I would certainly want to investigate the possibility that your model has actually converged (and that you're just seeing measurement error in your target stats) before going much further. To write reckless words (that you should read recklessly), one naive heuristic that could perhaps be worth trying would be to look at the Z-scores (t_o - t_s)/(s_o^2 + s_s^2)^0.5, where t_o is the observed (estimated) target, t_s is the simulated mean statistic, s_o is the standard error of your target estimator, and s_s is the standard error of the simulation mean. (If you are e.g. using Horvitz-Thompson, you can approximate s_o using standard results, and you can likewise use autocorrelation-corrected approximations to s_s.) If these are not large, then this suggests that the discrepancies between the targets and the mean stats are not very large compared to what you would expect from the variation in your simulation outcomes and in your measurement process. This does not take into account e.g. correlations among statistics, nor support constraints, but it seems like a plausible starting point. (Pavel and Martina have been working with these kinds of problems a lot of late, so doubtless can suggest better heuristics.)
>
> 3. Pavel's comments pointed to SAN, which also led me to observe that you are starting by fitting to an empty graph. I recommend against that.
> In principle, the annealer should get you to a not-too-bad starting point, but in my own informal simulation tests I have observed that this doesn't always work well if the network is very large; in particular, if SAN dumps you out with a starting point that is far from equilibrium, you are wasting a lot of MCMC steps wandering towards the high-density region of the graph space, and this can sometimes lead to poor results (especially if you can't afford to run some (large k)*N^2 burn-in - and recall that the default MCMC algorithm tends to preserve density, so if the seed is poor in that regard, it can take a lot of iterations to fix). My suggestion is to use rgraph() to get a Bernoulli graph draw from a model whose mixing characteristics (and, above all, density) approximate the target, and start with that. An easy way to set the parameters is to fit a pilot ERGM using only independence terms, use these to construct a tie probability matrix, and pass that to the tp argument of rgraph(). Your case makes for a very large matrix, but it's still within the range of the feasible. (rgraph() does not use adjacency matrices internally, and so long as you set the return value to be an edgelist it is not constrained by the sizes of feasible adjacency matrices, but if you want to pass an explicit tie probability matrix then obviously that puts you in the adjacency matrix regime.) Anyway, it's better to use rgraph() for this than a simulate() call, because it will be both faster and an exact simulation (no MCMC). A poorer approach is not to bother with mixing structure and just to draw an initial state with the right density (which at least reduces the risk that SAN exits with a graph that is too sparse)....but you might as well put your starting point as close to the right neighborhood as you can.
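A minimal sketch of the seeding idea above, under stated assumptions: the pilot model here has only an edges term, so the tie probability reduces to a single logit (with more independence terms one would build a full probability matrix for rgraph()'s tprob argument, the "tp" argument mentioned above). `pilot`, `targets`, `n`, and the model terms are placeholders, and when fitting to target stats the left-hand-side network of ergm() serves as the starting configuration.

```r
library(ergm)
library(sna)
library(network)

n <- 32000                                  # network size (placeholder)
p <- plogis(coef(pilot)[["edges"]])         # tie probability from pilot fit

## Exact Bernoulli draw - no MCMC. For very large n, request an edgelist
## return value from rgraph() instead of an adjacency matrix.
m    <- rgraph(n, tprob = p, mode = "digraph")
seed <- network(m, directed = TRUE)

## Seed the fit with the Bernoulli draw rather than an empty graph:
fit <- ergm(seed ~ edges + mutual, target.stats = targets)
```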
The goal > here is to help the annealer get you to a high-potential graph, rather than > expecting it to carry you there from a remote location. It is possible > that this turns out not to be a problem in your particular case, but it > seems worth ruling out. > > Hope that helps, > > -Carter > > > Aditya > > On Fri, Aug 30, 2024 at 4:37?AM Carter T. Butts via statnet_help < > statnet_help@u.washington.edu> wrote: > >> Hi, Aditya - >> >> I'll be interested in Pavel's take on the convergence issues, but just to >> verify, you are assessing convergence based on a *second* MCMC run, >> correct? The MCMC statistics in the ergm object are from the penultimate >> iteration, and may thus be out of equilibrium (but this does *not* >> necessarily mean that the *model* did not converge). However, if you >> simulate a new set of draws from the fitted model and the mean stats do not >> match, *then* you have an issue. (This is why we now point folks to >> gof() for that purpose.) It looks like your plots are from the ergm object >> and not from a gof() run (or other secondary simulation), so I want to >> verify that first. >> >> I also note that a quick glance at the plots from your more exhaustive >> simulation case don't seem all that far off, which could indicate either >> that the model did converge (and per above, we're not looking at a draw >> from the final model), or that it converged within the tolerances that were >> set, and you may need to tighten them. But best to first know if there's a >> problem in the first place. >> >> Another observation is that, per my earlier email, you may need O(N^2) >> toggles per draw to get good performance if your model has a nontrivial >> level of dependence. You are using a thinning interval of 1e6, which is in >> your case around 30*N. It's possible that you've got too much dependence >> for that: O(N^2) here would mean some multiple of about 1e9, which is about >> a thousand times greater than what you're using. 
Really large, sparse >> networks sometimes *can* be modeled well without that much thinning, but >> it's not a given. Relatedly, your trace plots from the 1e6 run suggest a >> fair amount of autocorrelation on some statistics, which suggests a lack of >> efficiency. (Autocorrelation by itself isn't necessarily a problem, but it >> means that your effective MCMC sample size is smaller than it seems, and >> this can reduce the effectiveness of the MCMCMLE procedure. The ones from >> the 1e6 run aren't bad enough that I would be alarmed, but if I were >> looking for things to tighten up and knew this could be a problem, they >> suggest possible room for improvement.) So anyway, I wouldn't crank this >> up until verifying that it's needed, but you are still operating on the low >> end of computational effort (whether it seems like it or not!). >> >> Finally, I would note that for the stochastic approximation method, >> convergence is to some degree (and it's a bit complex) determined by how >> many subphases are run, and how many iterations are used per subphase. >> This algorithm is due to Tom in his classic JoSS paper (but without the >> complement moves), which is still a good place to look for details. It is >> less fancy than some more modern algorithms of its type, but is extremely >> hard to beat (I've tried and failed more than once!). In any event, there >> are several things that can tighten that algorithm relative to its >> defaults, including increasing thinning, increasing the iterations per >> subphase, and increasing the number of subphases. Some of these sharply >> increase computational cost, because e.g. the number of actual subphase >> iterations doubles (IIRC) at each subphase - so sometimes one benefits by >> increasing the phase number but greatly reducing the base number of >> iterations per phase. The learning rate ("SA.initial.gain") can also >> matter, although I would probably avoid messing with it if the model is >> well-behaved (as here). 
I will say that, except under exotic conditions in >> which I am performing Unspeakable ERGM Experiments (TM) of which we tell >> neither children nor grad students, I do not recall ever needing to do much >> with the base parameters - adjusting thinning, as needs must, has almost >> always done the trick. Still, if other measures fail, tinkering with these >> settings can/will certainly affect convergence. >> >> I'd check on those things first, and then see if you still have a >> problem.... >> >> Hope that helps, >> >> -Carter >> On 8/29/24 12:13 PM, Khanna, Aditya wrote: >> >> Hi Carter and All, >> >> Thank you so much for the helpful guidance here. I think following your >> suggestions has brought us very close to reproducing the target statistics >> in the simulated networks, but there are still some gaps. >> >> Our full previous exchange is below, but to summarize: I have an ERGM >> that I fit previously with ERGM v3.10.4 on a directed network with 32,000 >> nodes. The model consisted of in- and out-degrees in addition to other >> terms, including a custom distance term. In trying to reproduce this fit >> with ergm v4.6, the model did not initially converge. >> >> Your suggestion to try setting the main.method = "Stochastic >> Approximation" considerably improved the fitting. Specifying the >> convergence detection to "Hotelling" on top of that brought us almost to >> simulated networks that capture all the mean statistics. (Following an old discussion >> thread >> >> on the statnet github, I also tried setting the termination criteria to >> Hummel and MCMLE.effectiveSize = NULL. I think, for me, in practice, >> Hotelling worked a bit better than Hummel though.) >> >> In general, I tried fitting the model with variants of this >> >> specification. I got the best results with setting both MCMC samplesize=1e6 >> and interval = 1e6 (see table below).
>> MCMC interval | MCMC sample size | Convergence detection | Results/Outcome | Note
>> 1e6 | 1e6 | Hotelling | Closest agreement between simulated and target statistics | Max. Lik. fit summary and simulation Rout; violin plots showing the simulated and target statistics for each parameter
>>
>> But, I found that this was the closest I could get to producing simulated >> statistics that matched the target statistics. In general, any further >> increasing or decreasing of either the samplesize or interval did not help >> generate a closer result, i.e., this looked to be some optimum in the fit >> parameter space. I can provide further details on the results of those >> fits, which for some configurations didn't converge, and if they did >> converge, the goodness-of-fit was worse than what I had with setting the >> MCMC interval and samplesize to 1e6. Based on your experiences, I was >> wondering if this is expected? >> >> For now, my main question is, are there any suggestions on how I can >> further tune the fitting parameters to match my targets more closely? I can >> provide specific details on the outcomes of those fitting processes if that >> would be helpful. >> >> Thanks for your consideration. >> Aditya >> >> On Thu, May 16, 2024 at 2:33 PM Carter T. Butts via statnet_help < >> statnet_help@u.washington.edu> wrote: >> >>> Hi, Aditya - >>> >>> I will defer to the mighty Pavel for the exact best formula to reproduce >>> 3.x fits with the latest codebase. (You need to switch convergence >>> detection to "Hotelling," and there are some other things that must be >>> modified.) However, as a general matter, for challenging models where >>> Geyer-Thompson-Hummel has a hard time converging (particularly on a large >>> node set), you may find it useful to try the stochastic approximation >>> method (main="Stochastic" in your control argument will activate it).
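[Editor's note: in ergm 4.x syntax, the settings discussed in this thread might be sketched roughly as follows. The network and formula are placeholders, and the interval/samplesize values are illustrative rather than recommendations; see ?control.ergm for the authoritative argument list.]

```
## Sketch only: switching the main fitting algorithm to stochastic
## approximation, with heavier thinning for a large network.
library(ergm)

fit <- ergm(
  net ~ edges + mutual,                          # placeholder model
  control = control.ergm(
    main.method     = "Stochastic-Approximation", # instead of default MCMLE
    MCMC.interval   = 1e6,  # proposals between retained draws (thinning)
    MCMC.samplesize = 1e6   # retained draws per estimation step
  )
)

## If staying with MCMLE, the convergence-detection switch mentioned in
## this thread is roughly:
## control.ergm(MCMLE.termination = "Hotelling", MCMLE.effectiveSize = NULL)
```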
>>> G-T-H can (in principle) have sharper convergence when near the solution, >>> but in practice SA fails more gracefully. I would suggest increasing your >>> default MCMC thinning interval (MCMC.interval), given your network size; >>> depending on density, extent of dependence, and other factors, you may need >>> O(N^2) toggles per step. It is sometimes possible to get away with as few >>> as k*N (for some k in, say, the 5-100 range), but if your model has >>> substantial dependence and is not exceptionally sparse then you will >>> probably need to be in the quadratic regime. One notes that it can >>> sometimes be helpful when getting things set up to run "pilot" fits with >>> the default or otherwise smaller thinning intervals, so that you can >>> discover if e.g. you have a data issue or other problem before you spend >>> the waiting time on a high-quality model fit. >>> >>> To put in the obligatory PSA, both G-T-H and SA are simply different >>> strategies for computing the same thing (the MLE, in this case), so both >>> are fine - they just have different engineering tradeoffs. So use >>> whichever proves more effective for your model and data set. >>> >>> Hope that helps, >>> >>> -Carter >>> >>> >>> On 5/16/24 7:52 AM, Khanna, Aditya via statnet_help wrote: >>> >>> Dear Statnet Dev and User Community: >>> >>> I have an ERGM that I fit previously with ERGM v3.10.4 on a directed >>> network with 32,000 nodes. The model included in- and out-degrees, in >>> addition to other terms. The complete Rout from this fit can be seen >>> here >>> . >>> I am now trying to reproduce this fit with ergm v4.6, but the model does >>> not converge. (See here >>> >>> .) >>> >>> I am looking for ideas on how to troubleshoot this. One suggestion I >>> got was to set values for the "tuning parameters" in v4.6 to their >>> defaults from v3.11.4. But ERGM v4.6 has a lot more parameters that can be >>> specified, and I am not sure which ones make most sense to consider.
>>> >>> I would be grateful for any suggestions on this or alternate ideas to >>> try. >>> >>> Many thanks, >>> Aditya >>> >>> >>> >>> >>> -- >>> >>> >>> >>> >>> >>> >>> >>> Aditya S. Khanna, Ph.D. >>> >>> Assistant Professor >>> >>> Department of Behavioral and Social Sciences >>> >>> Center for Alcohol and Addiction Studies >>> >>> Brown University School of Public Health >>> >>> Pronouns: he/him/his >>> >>> 401-863-6616 >>> >>> sph.brown.edu >>> >>> >>> https://vivo.brown.edu/display/akhann16 >>> >>> >>> _______________________________________________ >>> statnet_help mailing liststatnet_help@u.washington.eduhttps://urldefense.com/v3/__http://mailman13.u.washington.edu/mailman/listinfo/statnet_help__;!!CzAuKJ42GuquVTTmVmPViYEvSg!KsbhvmLlx8TkLK7y2NKz59hK4-4H7KXVV7dEyUG4vcQzi4Mh7nO-9HupA7_ep2V2p9KkD_i00tcg6nDqczDwObRNh35k$ >>> >>> _______________________________________________ >>> statnet_help mailing list >>> statnet_help@u.washington.edu >>> http://mailman13.u.washington.edu/mailman/listinfo/statnet_help >>> >>> >> _______________________________________________ >> statnet_help mailing list >> statnet_help@u.washington.edu >> http://mailman13.u.washington.edu/mailman/listinfo/statnet_help >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From statnet_help at u.washington.edu Sun Nov 17 18:05:39 2024 From: statnet_help at u.washington.edu (Steven M. Goodreau via statnet_help) Date: Sun Nov 17 18:05:47 2024 Subject: [statnet_help] Upgrading from ERGM v3.10 to v4.6 In-Reply-To: References: <3bbc0b12-8e50-4ece-a72e-46904a40d811@uci.edu> <7f088ef1-8b4f-4613-b7f2-52d81d3ecb3f@uci.edu> Message-ID: Hi all, So I was working through this email chain re Aditya's issue when I heard from Sam, and then spent time exploring both, given that there is some conceptual overlap.? You should have just seen the email re Sam's issues. Re Aditya's: here are my various thoughts. 
I don't actually have advice on how to improve your model further, Aditya, sorry -- I'll have to leave that to Carter and Pavel. This is about the issues related to the process getting to this point. The email chain is long, so if I missed anything that answers one of these questions, forgive me:

- Aditya: it seems you've spent a long time trying to fit this model on ergm 4.x? If I understand correctly, you were able to fit it in ergm 3.x. In the interest of moving your research forward, is there any reason you can't use the fit from 3.x for your simulations? I understand that ideally you would want it to fit just as easily in the latest version, but are you actually stuck in your research?

- Carter: you had mentioned somewhere about how complex models on large networks sometimes just take a really long time to fit, and there's no way around it. Totally agree. I think the issue, IIUC, is that in ergm 3.x, it didn't take nearly that long to converge. So that certainly suggests that it's possible to do it. Aditya, do I have that right?

All: I understand that ergm 4.x's defaults are trying to tackle a wide range of models adaptively. And that might mean that any given model might not be as smooth as it was in ergm 3.x, while ergm 4.x still represents a step forward overall. That said, it seems as if Philip Leifeld had issues with a model fitting with 3.x defaults and not 4.x, and the advice that Chad gave then is similar to what Carter did now. And it's similar to what also seemed to help somewhat with Sam's model. So there does seem to be a large space of useful models that worked more smoothly in 3 than 4.

So now I'm wondering: is there a way to set the parameters in ergm 4.x to create a fitting algorithm that perfectly matches what were the defaults in 3.x? If so, does a list of the parameters and their values needed to do this exist somewhere? Is it worth adding a way to easily make that happen with a single argument (e.g. ergm3.defaults = TRUE)?
Ideally one could set that and then still tweak on top of that, so that one could re-create a given model from ergm 3 that was mostly defaults but a few not. Thanks, Steve On 10/17/2024 9:52 PM, Khanna, Aditya via statnet_help wrote: > Hi Carter and Pavel, > > > Thank you so much for your helpful suggestions. Following your > feedback (and Carter's numbering scheme below), I report the following: > > > 1. > > The GOF plot is here > and > the numerical summary is here > . > Based on these, it seems that the model has not converged. When we > simulate from our best fit, even though it is not from a converged > model, the simulated networks seem not far off from the mean > target statistics. > > 2. > > We have looked into the underlying uncertainty in the target > statistics themselves, and these uncertainty regions are quite > large. My question is: what is the best approach to handle these > uncertain ranges in the target statistics? Is there a way to > specify not just the mean but a range of target values for each of > the specified ERGM parameters? I can conceptualize other approaches, for > instance, sampling values from the uncertainty ranges for each > parameter and fitting ERGMs to those configurations. But I am not > sure if there is a recommended heuristic to pursue. > > 3. > > I also tried your suggestion to start with a non-empty graph > (whereas case 1 above is the GOF output on a fit that was > generated where I started from an empty graph). The GOF plot is > here > . > I also combined the idea of starting with a non-empty graph with > Pavel's suggestion to not specify any arguments in control.ergm and > let the algorithm figure it out (or just specify the SAN settings in > control.ergm). That didn't work either (see Rout > from > the session). > > > Thank you in advance for any thoughts you are able to share, > > Aditya > > > > On Sat, Sep 7, 2024 at 8:28 PM Carter T.
Butts wrote: > > Hi, Aditya - > > On 9/5/24 12:57 PM, Khanna, Aditya wrote: >> >> Hi Carter, >> >> >> Thank you so much for your helpful response as always. I have >> organized my report in terms of the various things you suggest. >> >> >> Verifying MCMC, GOF and the ?second? MCMC: Yes, the ERGM for the >> model described below does converge, but, despite having >> converged, the simulated networks don?t seem to statistically >> capture the targets. I did make the GOF >> plots >> as well. Most of the terms look good, though some have peaks that >> are off from the zero. In general, however, I have come to rely >> more on actually simulating? networks from the fitted ERGM object >> (what I think you mean by ?second MCMC run?) in addition to the >> GOF plots. Usually I consider my goal fulfilled if the simulated >> network objects capture the targets, even if the GOF plots don?t >> look perfect. >> >> > There seems to be some confusion here: setting aside exotica, if > your mean in-model statistics don't match your observed in-model > statistics, then your model has not converged.? The matching here > that matters is from an MCMC run from the fitted model, which is > what the gof() function does (but this is /not/ what you get from > MCMC diagnostics on the fitted ergm() object, which should show > the penultimate run - those are handy for diagnosing general MCMC > issues, but need not show convergence even when the final > coefficients yield a convergent model).? The plots you linked to > under "GOF" above seem to be MCMC diagnostics from an ergm() > object, not gof() output.? I'll come back to "exotica" below, but > first it is important to be sure that we are discussing the same > thing. > >> Model Convergence and tightening the MCMC tolerances:In terms of >> tightening the MCMC tolerances, I did increase the MCMC interval >> to 1e9, of the order of O(N^2). 
But this particular specification >> timed out after 120 hours, and I didn't try to run it for longer >> than that. >> >> > Unfortunately, there's no "that's longer than I want it to be" > card that convinces mixing to happen more quickly: if your model > is really going to take 1e9 or more updates per step to converge, > and if that takes longer than 120 hours on your hardware, then > that's how long it's going to take. If there were magic sauce > that could guarantee great convergence with minimal computational > effort every time, it would already be in the code. That said, I > don't know that all other options have yet been ruled out; and, > when folks encounter expensive models on large networks, there are > various approximate solutions that they may be willing to live > with. But if one is working with a dependence model on >=1e4 > nodes, one must accept that one may be in a regime in which > gold-standard MCMC-MLE is very expensive. Just sayin'. > >> Alternate parameters to tighten the MCMC: I have experimented >> with the MCMC sample size and interval parameters, but have not >> been able to improve the quality of the simulated network. I am >> not as familiar with what options are available within the bounds >> of some reasonable computational cost. >> >> In summary, the problem remains that despite the ERGM >> convergence, the quality of the simulated networks suggests room >> for improvement, since the specified targets are not captured >> within the distribution of the simulated networks. >> >> > OK, let me see if I can offer some further advice, based on your > email and also something that came up in your exchange with Pavel: > > 1. We should be clear that, assuming no exotica, you should be > assessing convergence from an MCMC run on the fitted model (as > produced by gof() or done manually). So far, the plots I've seen > appear not to be runs from the fitted model, so I have not > actually seen evidence of the alleged phenomenon.
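[Editor's note: the "manual" version of the check described above might look roughly like the following sketch; `fit` stands for the fitted ergm object, `nsim` is illustrative, and accessor names should be checked against the installed ergm version.]

```
## Sketch: a secondary simulation from the fitted model, comparing mean
## simulated statistics against the targets the fit was supposed to match.
library(ergm)

sim.stats <- simulate(fit, nsim = 100, output = "stats")  # statistics only
rbind(target    = fit$target.stats,
      simulated = colMeans(sim.stats))

## Or let gof() do the secondary run and the comparison:
plot(gof(fit))
```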
Also, to be > clear, (absent exotica) if your simulated mean stats don't match > the observed stats (up to numerical and sometimes statistical > noise), your model hasn't converged.? A model that isn't > converging is not the same as a model that has converged but that > is inadequate, and the fixes are very different. > > 2. The exchange with Pavel led me to dig into your code a bit > more, and I realized that you are not fitting to an observed > network, but to target stats presumably based on design > estimation.? This could put you into the "exotica" box, because it > is likely that - due to errors in your estimated targets - there > exists no ERGM in your specified family whose expected statistics > exactly match the target statistics.? So long as they aren't too > far off, you still ought to be able to get close, but > hypothetically one could have a situation where someone gets an > unusually bad estimate for one or a small number of targets, and > their matches are persistently off; in this case, the issue is > that the MLE no longer satisfies the first moment condition > (expected statistics do not match the target statistics), so this > is no longer a valid criterion for assessing convergence.? If one > is willing/able to make some distributional statements about one's > target estimates, there are some natural relaxations of the usual > convergence criteria, and almost surely Pavel has written them > down, so I defer to him.? :-)? But anyway, /if/ your model really > seems not to be converging (by the criteria of (1)), and /if/ you > are using estimated target stats, then I would certainly want to > investigate the possibility that your model has actually converged > (and that you're just seeing measurement error in your target > stats) before going much further.? 
To write reckless words (that > you should read recklessly), one naive heuristic that could > perhaps be worth trying would be to look at the Z-scores > (t_o-t_s)/(s2_o^2+s2_s^2)^0.5, where t_o is the observed > (estimated) target, t_s is the simulated mean statistic, s2_o is > the standard error of your target estimator, and s2_s is the > standard error of the simulation mean.? (If you are e.g. using > Horvitz-Thompson, you can approximate s2_o using standard results, > and you can likewise use autocorrelation-corrected approximations > to s2_s.)? If these are not large, then this suggests that the > discrepancies between the targets and the mean stats are not very > large compared to what you would expect from the variation in your > simulation outcomes and in your measurement process.? This does > not take into account e.g. correlations among statistics, nor > support constraints, but it seems like a plausible starting > point.? (Pavel and Martina have been working with these kinds of > problems a lot of late, so doubtless can suggest better heuristics.) > > 3. Pavel's comments pointed to SAN, which also led me to observe > that you are starting by fitting to an empty graph.? I recommend > against that.? In principle, the annealer should get you to a > not-too-bad starting point, but in my own informal simulation > tests I have observed that this doesn't always work well if the > network is very large; in particular, if SAN dumps you out with a > starting point that is far from equilibrium, you are wasting a lot > of MCMC steps wandering towards the high-density region of the > graph space, and this can sometimes lead to poor results > (especially if you can't afford to run some (large k)*N^2 burn-in > - and recall that the default MCMC algorithm tends to preserve > density, so if the seed is poor in that regard, it can take a lot > of iterations to fix).? 
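[Editor's note: the naive Z-score heuristic described above is simple to compute; in the sketch below all numbers are placeholders, with t.o/t.s the target and simulated-mean statistics and s.o/s.s their respective standard errors, as defined in the email.]

```
## Sketch of the Z-score heuristic: z = (t_o - t_s) / sqrt(s2_o^2 + s2_s^2)
t.o <- 425; s.o <- 12   # estimated target and SE of the target estimator
t.s <- 440; s.s <- 9    # simulated mean and SE of the simulation mean

z <- (t.o - t.s) / sqrt(s.o^2 + s.s^2)
## A modest |z| suggests the target/simulation discrepancy is consistent
## with measurement and simulation noise; a large |z| suggests a real gap.
```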
My suggestion is to use rgraph() to get a > Bernoulli graph draw from a model whose mixing characteristics > (and, above all, density) approximate the target, and start with > that. An easy way to set the parameters is to fit a pilot ERGM > using only independence terms, use these to construct a tie > probability matrix, and pass that to the tprob argument of rgraph(). > Your case makes for a very large matrix, but it's still within the > range of the feasible. (rgraph() does not use adjacency matrices > internally, and so long as you set the return value to be an > edgelist it is not constrained by the sizes of feasible adjacency > matrices, but if you want to pass an explicit tie probability > matrix then obviously that puts you in the adjacency matrix > regime.) Anyway, it's better to use rgraph() for this than a > simulate() call, because it will be both faster and an exact > simulation (no MCMC). A poorer approach is to not bother with mixing > structure, and just to draw an initial state with the right > density (which at least reduces the risk that SAN exits with a > graph that is too sparse)...but you might as well put your > starting point as close to the right neighborhood as you can. The > goal here is to help the annealer get you to a high-potential > graph, rather than expecting it to carry you there from a remote > location. It is possible that this turns out not to be a problem > in your particular case, but it seems worth ruling out. > > Hope that helps, > > -Carter > > >> Aditya >> >> >> On Fri, Aug 30, 2024 at 4:37 AM Carter T. Butts via statnet_help >> wrote: >> >> Hi, Aditya - >> >> I'll be interested in Pavel's take on the convergence issues, >> but just to verify, you are assessing convergence based on a >> /second/ MCMC run, correct? The MCMC statistics in the ergm >> object are from the penultimate iteration, and may thus be >> out of equilibrium (but this does /not/ necessarily mean that >> the /model/ did not converge).
However, if you simulate a new >> set of draws from the fitted model and the mean stats do not >> match, /then/ you have an issue. (This is why we now point >> folks to gof() for that purpose.)? It looks like your plots >> are from the ergm object and not from a gof() run (or other >> secondary simulation), so I want to verify that first. >> >> I also note that a quick glance at the plots from your more >> exhaustive simulation case don't seem all that far off, which >> could indicate either that the model did converge (and per >> above, we're not looking at a draw from the final model), or >> that it converged within the tolerances that were set, and >> you may need to tighten them.? But best to first know if >> there's a problem in the first place. >> >> Another observation is that, per my earlier email, you may >> need O(N^2) toggles per draw to get good performance if your >> model has a nontrivial level of dependence.? You are using a >> thinning interval of 1e6, which is in your case around 30*N.? >> It's possible that you've got too much dependence for that: >> O(N^2) here would mean some multiple of about 1e9, which is >> about a thousand times greater than what you're using.? >> Really large, sparse networks sometimes /can/ be modeled well >> without that much thinning, but it's not a given. Relatedly, >> your trace plots from the 1e6 run suggest a fair amount of >> autocorrelation on some statistics, which suggests a lack of >> efficiency.? (Autocorrelation by itself isn't necessarily a >> problem, but it means that your effective MCMC sample size is >> smaller than it seems, and this can reduce the effectiveness >> of the MCMCMLE procedure.?? The ones from the 1e6 run aren't >> bad enough that I would be alarmed, but if I were looking for >> things to tighten up and knew this could be a problem, they >> suggest possible room for improvement.) 
So anyway, I wouldn't >> crank this up until verifying that it's needed, but you are >> still operating on the low end of computational effort >> (whether it seems like it or not!). >> >> Finally, I would note that for the stochastic approximation >> method, convergence is to some degree (and it's a bit >> complex) determined by how many subphases are run, and how >> many iterations are used per subphase.? This algorithm is due >> to Tom in his classic JoSS paper (but without the complement >> moves), which is still a good place to look for details.? It >> is less fancy than some more modern algorithms of its type, >> but is extremely hard to beat (I've tried and failed more >> than once!).? In any event, there are several things that can >> tighten that algorithm relative to its defaults, including >> increasing thinning, increasing the iterations per subphase, >> and increasing the number of subphases.? Some of these >> sharply increase computational cost, because e.g. the number >> of actual subphase iterations doubles (IIRC) at each subphase >> - so sometimes one benefits by increasing the phase number >> but greatly reducing the base number of iterations per >> phase.? The learning rate ("SA.initial.gain") can also >> matter, although I would probably avoid messing with it if >> the model is well-behaved (as here).? I will say that, except >> under exotic conditions in which I am performing Unspeakable >> ERGM Experiments (TM) of which we tell neither children nor >> grad students, I do not recall ever needing to do much with >> the base parameters - adjusting thinning, as needs must, has >> almost always done the trick.? Still, if other measures fail, >> tinkering with these settings can/will certainly affect >> convergence. >> >> I'd check on those things first, and then see if you still >> have a problem.... 
>> >> Hope that helps, >> >> -Carter >> >> On 8/29/24 12:13 PM, Khanna, Aditya wrote: >>> >>> Hi Carter and All, >>> >>> >>> Thank you so much for the helpful guidance here. I think >>> following your suggestions has brought us very close to >>> reproducing the target statistics in the simulated networks, >>> but there are still some gaps. >>> >>> >>> Our full previous exchange is below, but to summarize: I >>> have an ERGM that I fit previously with ERGM v3.10.4 on a >>> directed network with 32,000 nodes. The model consisted of >>> in- and out-degrees in addition to other terms, including a >>> custom distance term. In trying to reproduce this fit with >>> ergm v4.6, the model did not initially converge. >>> >>> >>> Your suggestion to try setting the main.method = "Stochastic >>> Approximation" considerably improved the fitting. Specifying >>> the convergence detection to "Hotelling" on top of that >>> brought us almost to simulated networks that capture all the >>> mean statistics. (Following an old discussion thread >>> on >>> the statnet github, I also tried setting the termination >>> criteria to Hummel and MCMLE.effectiveSize = NULL. I think, >>> for me, in practice, Hotelling worked a bit better than >>> Hummel though.) >>> >>> >>> In general, I tried fitting the model with variants of this >>> specification. >>> I got the best results with setting both MCMC samplesize=1e6 >>> and interval = 1e6 (see table below).
>>>
>>> MCMC interval | MCMC sample size | Convergence detection | Results/Outcome | Note
>>> 1e6 | 1e6 | Hotelling | Closest agreement between simulated and target statistics | Max. Lik. fit summary and simulation Rout; violin plots showing the simulated and target statistics for each parameter
>>>
>>> But, I found that this was the closest I could get to producing >>> simulated statistics that matched the target statistics. In >>> general, any further increasing or decreasing of either the >>> samplesize or interval did not help generate a closer >>> result, i.e., this looked to be some optimum in the fit >>> parameter space. I can provide further details on the >>> results of those fits, which for some configurations didn't >>> converge, and if they did converge, the goodness-of-fit was >>> worse than what I had with setting the MCMC interval and >>> samplesize to 1e6. Based on your experiences, I was >>> wondering if this is expected? >>> >>> >>> For now, my main question is, are there any suggestions on >>> how I can further tune the fitting parameters to match my >>> targets more closely? I can provide specific details on the >>> outcomes of those fitting processes if that would be helpful. >>> >>> >>> Thanks for your consideration. >>> >>> Aditya >>> >>> On Thu, May 16, 2024 at 2:33 PM Carter T. Butts via >>> statnet_help wrote: >>> >>> Hi, Aditya - >>> >>> I will defer to the mighty Pavel for the exact best >>> formula to reproduce 3.x fits with the latest codebase. >>> (You need to switch convergence detection to >>> "Hotelling," and there are some other things that must >>> be modified.) However, as a general matter, for >>> challenging models where Geyer-Thompson-Hummel has a >>> hard time converging (particularly on a large node set), >>> you may find it useful to try the stochastic >>> approximation method (main="Stochastic" in your control >>> argument will activate it). G-T-H can (in principle) >>> have sharper convergence when near the solution, but in >>> practice SA fails more gracefully.
I would suggest >>> increasing your default MCMC thinning interval >>> (MCMC.interval), given your network size; depending on >>> density, extent of dependence, and other factors, you >>> may need O(N^2) toggles per step. It is sometimes >>> possible to get away with as few as k*N (for some k in, >>> say, the 5-100 range), but if your model has substantial >>> dependence and is not exceptionally sparse then you will >>> probably need to be in the quadratic regime. One notes >>> that it can sometimes be helpful when getting things set >>> up to run "pilot" fits with the default or otherwise >>> smaller thinning intervals, so that you can discover if >>> e.g. you have a data issue or other problem before you >>> spend the waiting time on a high-quality model fit. >>> >>> To put in the obligatory PSA, both G-T-H and SA are >>> simply different strategies for computing the same thing >>> (the MLE, in this case), so both are fine - they just >>> have different engineering tradeoffs. So use whichever >>> proves more effective for your model and data set. >>> >>> Hope that helps, >>> >>> -Carter >>> >>> >>> On 5/16/24 7:52 AM, Khanna, Aditya via statnet_help wrote: >>>> Dear Statnet Dev and User Community: >>>> >>>> I have an ERGM that I fit previously with ERGM v3.10.4 >>>> on a directed network with 32,000 nodes. The model >>>> included in- and out-degrees, in addition to other >>>> terms. The complete Rout from this fit can be seen here >>>> . >>>> I am now trying to reproduce this fit with ergm v4.6, >>>> but the model does not converge. (See here >>>> .) >>>> >>>> I am looking for ideas on how to troubleshoot this. >>>> One suggestion I got was to set values for the "tuning >>>> parameters" in v4.6 to their defaults from v3.11.4. >>>> But ERGM v4.6 has a lot more parameters that can be >>>> specified, and I am not sure which ones make most sense >>>> to consider.
>>>> >>>> Many thanks, >>>> Aditya >>>> >>>> >>>> >>>> >>>> -- >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> Aditya S. Khanna, Ph.D. >>>> >>>> Assistant Professor >>>> >>>> Department of Behavioral and Social Sciences >>>> >>>> Center for Alcohol and Addiction Studies >>>> >>>> Brown University School of Public Health >>>> >>>> Pronouns: he/him/his >>>> >>>> >>>> 401-863-6616 >>>> >>>> sph.brown.edu >>>> >>>> >>>> https://vivo.brown.edu/display/akhann16 >>>> >>>> >>>> >>>> _______________________________________________ >>>> statnet_help mailing list >>>> statnet_help@u.washington.edu >>>> https://urldefense.com/v3/__http://mailman13.u.washington.edu/mailman/listinfo/statnet_help__;!!CzAuKJ42GuquVTTmVmPViYEvSg!KsbhvmLlx8TkLK7y2NKz59hK4-4H7KXVV7dEyUG4vcQzi4Mh7nO-9HupA7_ep2V2p9KkD_i00tcg6nDqczDwObRNh35k$ >>> _______________________________________________ >>> statnet_help mailing list >>> statnet_help@u.washington.edu >>> http://mailman13.u.washington.edu/mailman/listinfo/statnet_help >>> >>> >> _______________________________________________ >> statnet_help mailing list >> statnet_help@u.washington.edu >> http://mailman13.u.washington.edu/mailman/listinfo/statnet_help >> >> > > _______________________________________________ > statnet_help mailing list > statnet_help@u.washington.edu > http://mailman13.u.washington.edu/mailman/listinfo/statnet_help -- *********************************************************************************************************************** Steven M. Goodreau / Professor / Dept. of Anthropology / Adjunct Prof. / Dept. of Epidemiology (STEE-vun GOOD-roe) / he-him /https://faculty.washington.edu/goodreau / dzidz?lali?, x???l? Physical address: Denny Hall M236; Mailing address: Campus Box 353100 / 4216 Memorial Way NE Univ. 
of Washington / Seattle WA 98195 / 1-206-685-3870 (phone) / 1-206-543-3285 (fax)
"Fight for the things that you care about, but do it in a way that will lead others to join you" - Justice RB Ginsburg
***********************************************************************************************************************
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From statnet_help at u.washington.edu Sun Nov 17 18:09:29 2024
From: statnet_help at u.washington.edu (Steven M. Goodreau via statnet_help)
Date: Sun Nov 17 18:09:37 2024
Subject: [statnet_help] Upgrading from ERGM v3.10 to v4.6
In-Reply-To: References: <3bbc0b12-8e50-4ece-a72e-46904a40d811@uci.edu> <7f088ef1-8b4f-4613-b7f2-52d81d3ecb3f@uci.edu>
Message-ID:

Sorry, statnet_help listserv members - I meant to send this to the statnet development team listserv, not the users listserv - my apologies!!

Steve

On 11/17/2024 9:05 PM, Steven M. Goodreau wrote:
>
> Hi all,
>
> So I was working through this email chain re Aditya's issue when I heard from Sam, and then spent time exploring both, given that there is some conceptual overlap. You should have just seen the email re Sam's issues.
>
> Re Aditya's: here are my various thoughts. I don't actually have advice on how to improve your model further, Aditya, sorry -- I'll have to leave that to Carter and Pavel. This is about the issues related to the process getting to this point. The email chain is long, so if I missed anything that answers one of these questions, forgive me:
>
> - Aditya: it seems you've spent a long time trying to fit this model on ergm 4.x. If I understand correctly, you were able to fit it in ergm 3.x. In the interest of moving your research forward, is there any reason you can't use the fit from 3.x for your simulations? I understand that ideally you would want it to fit just as easily in the latest version, but are you actually stuck in your research?
>
> - Carter: you had mentioned somewhere that complex models on large networks sometimes just take a really long time to fit, and there's no way around it. Totally agree. I think the issue, IIUC, is that in ergm 3.x, it didn't take nearly that long to converge. So that certainly suggests that it's possible to do it. Aditya, do I have that right?
>
> All: I understand that ergm 4.x's defaults are trying to tackle a wide range of models adaptively. And that might mean that any given model might not fit as smoothly as it did in ergm 3.x, while ergm 4.x still represents a step forward overall. That said, it seems as if Philip Leifeld had issues with a model fitting with 3.x defaults and not 4.x, and the advice that Chad gave then is similar to what Carter did now. And it's similar to what also seemed to help somewhat with Sam's model. So there does seem to be a large space of useful models that worked more smoothly in 3 than 4. So now I'm wondering: is there a way to set the parameters in ergm 4.x to create a fitting algorithm that perfectly matches the defaults in 3.x? If so, does a list of the parameters and their values needed to do this exist somewhere? Is it worth adding a way to easily make that happen with a single argument (e.g. ergm3.defaults = TRUE)? Ideally one could set that and then still tweak on top of it, so that one could re-create a given model from ergm 3 that was mostly defaults but a few not.
>
> Thanks,
>
> Steve
>
> On 10/17/2024 9:52 PM, Khanna, Aditya via statnet_help wrote:
>
>> Hi Carter and Pavel,
>>
>> Thank you so much for your helpful suggestions. Following your feedback (and Carter's numbering scheme below), I report the following:
>>
>> 1. The GOF plot is here and the numerical summary is here. Based on these, it seems that the model has not converged.
When we simulate from our best fit, even though it is not from a converged model, the simulated networks seem not far off from the mean target statistics.
>>
>> 2. We have looked into the underlying uncertainty in the target statistics themselves, and these uncertainty regions are quite large. My question is: what is the best approach to handle these uncertain ranges in the target statistics? Is there a way to specify not just the mean but a range of target values for each of the specified ERGM parameters? I can conceptualize other ways - for instance, sampling values from the uncertainty ranges for each parameter and fitting ERGMs to those configurations. But I am not sure if there is a recommended heuristic to pursue?
>>
>> 3. I also tried your suggestion to start with a non-empty graph (whereas case 1 above is the GOF output on a fit that was generated where I started from an empty graph). The GOF plot is here. I also combined the idea of starting with a non-empty graph with Pavel's suggestion to not specify any arguments in control.ergm and let the algorithm figure it out (or just specify the san in control.ergm). That didn't work either (see Rout from the session).
>>
>> Thank you in advance for any thoughts you are able to share,
>>
>> Aditya
>>
>> On Sat, Sep 7, 2024 at 8:28 PM Carter T. Butts wrote:
>>
>> Hi, Aditya -
>>
>> On 9/5/24 12:57 PM, Khanna, Aditya wrote:
>>>
>>> Hi Carter,
>>>
>>> Thank you so much for your helpful response as always. I have organized my report in terms of the various things you suggest.
>>>
>>> Verifying MCMC, GOF and the "second" MCMC: Yes, the ERGM for the model described below does converge, but, despite having converged, the simulated networks don't seem to statistically capture the targets. I did make the GOF plots as well.
Most of the terms look good, though some have peaks that are off from zero. In general, however, I have come to rely more on actually simulating networks from the fitted ERGM object (what I think you mean by "second MCMC run") in addition to the GOF plots. Usually I consider my goal fulfilled if the simulated network objects capture the targets, even if the GOF plots don't look perfect.
>>>
>> There seems to be some confusion here: setting aside exotica, if your mean in-model statistics don't match your observed in-model statistics, then your model has not converged. The matching here that matters is from an MCMC run from the fitted model, which is what the gof() function does (but this is /not/ what you get from MCMC diagnostics on the fitted ergm() object, which should show the penultimate run - those are handy for diagnosing general MCMC issues, but need not show convergence even when the final coefficients yield a convergent model). The plots you linked to under "GOF" above seem to be MCMC diagnostics from an ergm() object, not gof() output. I'll come back to "exotica" below, but first it is important to be sure that we are discussing the same thing.
>>
>>> Model convergence and tightening the MCMC tolerances: In terms of tightening the MCMC tolerances, I did increase the MCMC interval to 1e9, on the order of O(N^2). But this particular specification timed out after 120 hours, and I didn't try to run it for longer than that.
>>>
>> Unfortunately, there's no "that's longer than I want it to be" card that convinces mixing to happen more quickly: if your model is really going to take 1e9 or more updates per step to converge, and if that takes longer than 120 hours on your hardware, then that's how long it's going to take.
If there were magic sauce that could guarantee great convergence with minimal computational effort every time, it would already be in the code. That said, I don't know that all other options have yet been ruled out; and, when folks encounter expensive models on large networks, there are various approximate solutions that they may be willing to live with. But if one is working with a dependence model on >=1e4 nodes, one must accept that one may be in a regime in which gold-standard MCMC-MLE is very expensive. Just sayin'.
>>
>>> Alternate parameters to tighten the MCMC: I have experimented with the MCMC sample size and interval parameters, but have not been able to improve the quality of the simulated network. I am not as familiar with what options are available within the bounds of some reasonable computational cost.
>>>
>>> In summary, the problem remains that despite the ERGM convergence, the quality of the simulated networks suggests room for improvement, since the specified targets are not captured within the distribution of the simulated networks.
>>>
>> OK, let me see if I can offer some further advice, based on your email and also something that came up in your exchange with Pavel:
>>
>> 1. We should be clear that, assuming no exotica, you should be assessing convergence from an MCMC run on the fitted model (as produced by gof() or done manually). So far, the plots I've seen appear not to be runs from the fitted model, so I have not actually seen evidence of the alleged phenomenon. Also, to be clear, (absent exotica) if your simulated mean stats don't match the observed stats (up to numerical and sometimes statistical noise), your model hasn't converged. A model that isn't converging is not the same as a model that has converged but that is inadequate, and the fixes are very different.
>>
>> 2.
The exchange with Pavel led me to dig into your code a bit more, and I realized that you are not fitting to an observed network, but to target stats presumably based on design estimation. This could put you into the "exotica" box, because it is likely that - due to errors in your estimated targets - there exists no ERGM in your specified family whose expected statistics exactly match the target statistics. So long as they aren't too far off, you still ought to be able to get close, but hypothetically one could have a situation where someone gets an unusually bad estimate for one or a small number of targets, and their matches are persistently off; in this case, the issue is that the MLE no longer satisfies the first moment condition (expected statistics do not match the target statistics), so this is no longer a valid criterion for assessing convergence. If one is willing/able to make some distributional statements about one's target estimates, there are some natural relaxations of the usual convergence criteria, and almost surely Pavel has written them down, so I defer to him. :-) But anyway, /if/ your model really seems not to be converging (by the criteria of (1)), and /if/ you are using estimated target stats, then I would certainly want to investigate the possibility that your model has actually converged (and that you're just seeing measurement error in your target stats) before going much further. To write reckless words (that you should read recklessly), one naive heuristic that could perhaps be worth trying would be to look at the Z-scores (t_o - t_s)/(s2_o^2 + s2_s^2)^0.5, where t_o is the observed (estimated) target, t_s is the simulated mean statistic, s2_o is the standard error of your target estimator, and s2_s is the standard error of the simulation mean. (If you are e.g.
using Horvitz-Thompson, you can approximate s2_o using standard results, and you can likewise use autocorrelation-corrected approximations to s2_s.) If these are not large, then this suggests that the discrepancies between the targets and the mean stats are not very large compared to what you would expect from the variation in your simulation outcomes and in your measurement process. This does not take into account e.g. correlations among statistics, nor support constraints, but it seems like a plausible starting point. (Pavel and Martina have been working with these kinds of problems a lot of late, so doubtless can suggest better heuristics.)
>>
>> 3. Pavel's comments pointed to SAN, which also led me to observe that you are starting by fitting to an empty graph. I recommend against that. In principle, the annealer should get you to a not-too-bad starting point, but in my own informal simulation tests I have observed that this doesn't always work well if the network is very large; in particular, if SAN dumps you out with a starting point that is far from equilibrium, you are wasting a lot of MCMC steps wandering towards the high-density region of the graph space, and this can sometimes lead to poor results (especially if you can't afford to run some (large k)*N^2 burn-in - and recall that the default MCMC algorithm tends to preserve density, so if the seed is poor in that regard, it can take a lot of iterations to fix). My suggestion is to use rgraph() to get a Bernoulli graph draw from a model whose mixing characteristics (and, above all, density) approximate the target, and start with that. An easy way to set the parameters is to fit a pilot ERGM using only independence terms, use these to construct a tie probability matrix, and pass that to the tp argument of rgraph(). Your case makes for a very large matrix, but it's still within the range of the feasible.
(rgraph() does not use adjacency matrices internally, and so long as you set the return value to be an edgelist it is not constrained by the sizes of feasible adjacency matrices, but if you want to pass an explicit tie probability matrix then obviously that puts you in the adjacency matrix regime.) Anyway, it's better to use rgraph() for this than a simulate() call, because it will be both faster and an exact simulation (no MCMC). A poorer approach is not to bother with mixing structure, and just to draw an initial state with the right density (which at least reduces the risk that SAN exits with a graph that is too sparse)... but you might as well put your starting point as close to the right neighborhood as you can. The goal here is to help the annealer get you to a high-potential graph, rather than expecting it to carry you there from a remote location. It is possible that this turns out not to be a problem in your particular case, but it seems worth ruling out.
>>
>> Hope that helps,
>>
>> -Carter
>>
>>> Aditya
>>>
>>> On Fri, Aug 30, 2024 at 4:37 AM Carter T. Butts via statnet_help wrote:
>>>
>>> Hi, Aditya -
>>>
>>> I'll be interested in Pavel's take on the convergence issues, but just to verify, you are assessing convergence based on a /second/ MCMC run, correct? The MCMC statistics in the ergm object are from the penultimate iteration, and may thus be out of equilibrium (but this does /not/ necessarily mean that the /model/ did not converge). However, if you simulate a new set of draws from the fitted model and the mean stats do not match, /then/ you have an issue. (This is why we now point folks to gof() for that purpose.) It looks like your plots are from the ergm object and not from a gof() run (or other secondary simulation), so I want to verify that first.
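The "second MCMC run" check described above might be sketched as below. Here `fit` is a placeholder for an already-fitted ergm object, and the manual comparison loosely mirrors the Z-score heuristic mentioned earlier in the thread (it ignores the uncertainty in estimated targets, so it is rougher than that heuristic):

```r
# Sketch: compare fresh draws from the fitted model against the observed
# (or target) statistics. 'fit' is a hypothetical fitted ergm object.
library(ergm)

g <- gof(fit)   # diagnostics based on a fresh simulation off the fitted coefficients
plot(g)

# Manual route: simulate the in-model statistics directly
sims <- as.matrix(simulate(fit, nsim = 200, output = "stats"))
obs  <- summary(fit$formula)                   # observed in-model statistics
z    <- (obs - colMeans(sims)) / apply(sims, 2, sd)
round(z, 2)   # consistently large |z| values suggest non-convergence
```

Note that for a fit to estimated target statistics (rather than an observed network), `obs` would be the targets themselves, and their measurement error would need to enter the denominator as discussed above.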
>>>
>>> I also note that a quick glance at the plots from your more exhaustive simulation case don't seem all that far off, which could indicate either that the model did converge (and per above, we're not looking at a draw from the final model), or that it converged within the tolerances that were set, and you may need to tighten them. But best to first know if there's a problem in the first place.
>>>
>>> Another observation is that, per my earlier email, you may need O(N^2) toggles per draw to get good performance if your model has a nontrivial level of dependence. You are using a thinning interval of 1e6, which is in your case around 30*N. It's possible that you've got too much dependence for that: O(N^2) here would mean some multiple of about 1e9, which is about a thousand times greater than what you're using. Really large, sparse networks sometimes /can/ be modeled well without that much thinning, but it's not a given. Relatedly, your trace plots from the 1e6 run suggest a fair amount of autocorrelation on some statistics, which suggests a lack of efficiency. (Autocorrelation by itself isn't necessarily a problem, but it means that your effective MCMC sample size is smaller than it seems, and this can reduce the effectiveness of the MCMC-MLE procedure. The ones from the 1e6 run aren't bad enough that I would be alarmed, but if I were looking for things to tighten up and knew this could be a problem, they suggest possible room for improvement.) So anyway, I wouldn't crank this up until verifying that it's needed, but you are still operating on the low end of computational effort (whether it seems like it or not!).
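The thinning arithmetic above, made concrete for the 32,000-node network in question:

```r
# Thinning scale for N = 32,000, per the discussion above
N <- 32000
30 * N   # roughly the 1e6 interval currently in use: 960000
N^2      # the O(N^2) regime: 1024000000, i.e. about 1e9
```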
This algorithm is due to Tom in his classic JoSS paper (but without the complement moves), which is still a good place to look for details. It is less fancy than some more modern algorithms of its type, but is extremely hard to beat (I've tried and failed more than once!). In any event, there are several things that can tighten that algorithm relative to its defaults, including increasing thinning, increasing the iterations per subphase, and increasing the number of subphases. Some of these sharply increase computational cost, because e.g. the number of actual subphase iterations doubles (IIRC) at each subphase - so sometimes one benefits by increasing the phase number but greatly reducing the base number of iterations per phase. The learning rate ("SA.initial.gain") can also matter, although I would probably avoid messing with it if the model is well-behaved (as here). I will say that, except under exotic conditions in which I am performing Unspeakable ERGM Experiments (TM) of which we tell neither children nor grad students, I do not recall ever needing to do much with the base parameters - adjusting thinning, as needs must, has almost always done the trick. Still, if other measures fail, tinkering with these settings can/will certainly affect convergence.
>>>
>>> I'd check on those things first, and then see if you still have a problem....
>>>
>>> Hope that helps,
>>>
>>> -Carter
>>>
>>> On 8/29/24 12:13 PM, Khanna, Aditya wrote:
>>>>
>>>> Hi Carter and All,
>>>>
>>>> Thank you so much for the helpful guidance here. I think following your suggestions has brought us very close to reproducing the target statistics in the simulated networks, but there are still some gaps.
>>>>
>>>> Our full previous exchange is below, but to summarize: I have an ERGM that I fit previously with ERGM v3.10.4 on a directed network with 32,000 nodes.
The model consisted of in- and out-degrees in addition to other terms, including a custom distance term. In trying to reproduce this fit with ergm v4.6, the model did not initially converge.
>>>>
>>>> Your suggestion to try setting main.method = "Stochastic Approximation" considerably improved the fitting. Specifying the convergence detection as "Hotelling" on top of that brought us almost to simulated networks that capture all the mean statistics. (Following an old discussion thread on the statnet GitHub, I also tried setting the termination criterion to Hummel and MCMLE.effectiveSize = NULL. For me, in practice, Hotelling worked a bit better than Hummel, though.)
>>>>
>>>> In general, I tried fitting the model with variants of this specification. I got the best results by setting both MCMC samplesize = 1e6 and interval = 1e6 (see table below).
>>>>
>>>> MCMC interval | MCMC sample size | Convergence detection | Results/Outcome                                            | Note
>>>> 1e6           | 1e6              | Hotelling             | Closest agreement between simulated and target statistics | Max. Lik. fit summary and simulation Rout; violin plots showing the simulated and target statistics for each parameter
>>>>
>>>> But I found that this was the closest I could get to producing simulated statistics that matched the target statistics. In general, any further increasing or decreasing of either the samplesize or interval did not help generate a closer result, i.e., this looked to be some optimum in the fit parameter space.
I can provide further details on the results of those fits, some of which did not converge; for those that did converge, the goodness-of-fit was worse than what I had with the MCMC interval and samplesize set to 1e6. Based on your experiences, I was wondering if this is expected?
>>>>
>>>> For now, my main question is: are there any suggestions on how I can further tune the fitting parameters to match my targets more closely? I can provide specific details on the outcomes of those fitting processes if that would be helpful.
>>>>
>>>> Thanks for your consideration.
>>>>
>>>> Aditya
>>>>
>>>> On Thu, May 16, 2024 at 2:33 PM Carter T. Butts via statnet_help wrote:
>>>>
>>>> Hi, Aditya -
>>>>
>>>> I will defer to the mighty Pavel for the exact best formula to reproduce 3.x fits with the latest codebase. (You need to switch convergence detection to "Hotelling," and there are some other things that must be modified.) However, as a general matter, for challenging models where Geyer-Thompson-Hummel has a hard time converging (particularly on a large node set), you may find it useful to try the stochastic approximation method (main="Stochastic" in your control argument will activate it). G-T-H can (in principle) have sharper convergence when near the solution, but in practice SA fails more gracefully. I would suggest increasing your default MCMC thinning interval (MCMC.interval), given your network size; depending on density, extent of dependence, and other factors, you may need O(N^2) toggles per step. It is sometimes possible to get away with as few as k*N (for some k in, say, the 5-100 range), but if your model has substantial dependence and is not exceptionally sparse then you will probably need to be in the quadratic regime.
One notes that it can sometimes be helpful when getting things set up to run "pilot" fits with the default or otherwise smaller thinning intervals, so that you can discover if e.g. you have a data issue or other problem before you spend the waiting time on a high-quality model fit.
>>>>
>>>> To put in the obligatory PSA, both G-T-H and SA are simply different strategies for computing the same thing (the MLE, in this case), so both are fine - they just have different engineering tradeoffs. So use whichever proves more effective for your model and data set.
>>>>
>>>> Hope that helps,
>>>>
>>>> -Carter
>>>>
>>>> On 5/16/24 7:52 AM, Khanna, Aditya via statnet_help wrote:
>>>>> Dear Statnet Dev and User Community:
>>>>>
>>>>> I have an ERGM that I fit previously with ERGM v3.10.4 on a directed network with 32,000 nodes. The model included in- and out-degrees, in addition to other terms. The complete Rout from this fit can be seen here. I am now trying to reproduce this fit with ergm v4.6, but the model does not converge. (See here.)
>>>>>
>>>>> I am looking for ideas on how to troubleshoot this. One suggestion I got was to set values for the "tuning parameters" in v4.6 to their defaults from v3.11.4. But ERGM v4.6 has a lot more parameters that can be specified, and I am not sure which ones make the most sense to consider.
>>>>>
>>>>> I would be grateful for any suggestions on this or alternate ideas to try.
>>>>>
>>>>> Many thanks,
>>>>> Aditya
>>>>>
>>>>> --
>>>>> Aditya S. Khanna, Ph.D.
>>>>> Assistant Professor
>>>>> Department of Behavioral and Social Sciences
>>>>> Center for Alcohol and Addiction Studies
>>>>> Brown University School of Public Health
>>>>> Pronouns: he/him/his
>>>>> 401-863-6616
>>>>> sph.brown.edu
>>>>> https://vivo.brown.edu/display/akhann16
>>>>>
>>>>> _______________________________________________
>>>>> statnet_help mailing list
>>>>> statnet_help@u.washington.edu
>>>>> http://mailman13.u.washington.edu/mailman/listinfo/statnet_help

> --
> ***********************************************************************************************************************
> Steven M. Goodreau / Professor / Dept. of Anthropology / Adjunct Prof. / Dept. of Epidemiology
> (STEE-vun GOOD-roe) / he-him / https://faculty.washington.edu/goodreau / dzidzəlal̓ič, x̌ʷəlč
> Physical address: Denny Hall M236; Mailing address: Campus Box 353100 / 4216 Memorial Way NE
> Univ.
> of Washington / Seattle WA 98195 / 1-206-685-3870 (phone) / 1-206-543-3285 (fax)
> "Fight for the things that you care about, but do it in a way that will lead others to join you" - Justice RB Ginsburg
> ***********************************************************************************************************************
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From statnet_help at u.washington.edu Sun Dec 1 14:47:08 2024
From: statnet_help at u.washington.edu (Federico Salvati via statnet_help)
Date: Sun Dec 1 14:47:23 2024
Subject: [statnet_help] Question about the time series network data
Message-ID:

Dear statnet team,

As an IR (international relations) practitioner, I would like to analyze interactions within state groups over time. However, I am encountering some major limitations in the models I have available so far. My greatest problem is that the number of states making up the networks changes from stage to stage. All the tutorials and manuals I could get my hands on so far assume that the node set of time-series network data remains constant over time, which is a serious drawback for international politics as an assumption.
(States do disappear, after all, and new ones come into being.) Do you have any recommendations on how to deal with this problem?

Thank you in advance for your help,
Federico Salvati

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From statnet_help at u.washington.edu Sun Dec 1 14:51:52 2024
From: statnet_help at u.washington.edu (James Holland Jones via statnet_help)
Date: Sun Dec 1 14:52:01 2024
Subject: [statnet_help] Question about the time series network data
In-Reply-To: References: Message-ID: <99659D44-359D-4FF3-9F0F-8BACC09B3CAC@stanford.edu>

Check out Zack Almquist and Carter Butts, Dynamic Network Regression:

https://depts.washington.edu/zalmquist/articles/almquist_SM.pdf
https://cran.r-project.org/web/packages/dnr/index.html

Scalable, allows vertex dynamics.

--
James Holland Jones
Professor, Environmental Behavioral Sciences
Stanford Doerr School of Sustainability
https://heeh.stanford.edu

On Dec 1, 2024, at 2:47 PM, Federico Salvati via statnet_help wrote:

> Dear statnet team,
>
> As an IR (international relations) practitioner, I would like to analyze interactions within state groups over time. However, I am encountering some major limitations in the models I have available so far. My greatest problem is that the number of states making up the networks changes from stage to stage. All the tutorials and manuals I could get my hands on so far assume that the node set of time-series network data remains constant over time, which is a serious drawback for international politics as an assumption. (States do disappear, after all, and new ones come into being.) Do you have any recommendations on how to deal with this problem?
>
> Thank you in advance for your help,
> Federico Salvati
>
> _______________________________________________
> statnet_help mailing list
> statnet_help@u.washington.edu
> http://mailman13.u.washington.edu/mailman/listinfo/statnet_help

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From statnet_help at u.washington.edu Wed Dec 4 01:54:00 2024
From: statnet_help at u.washington.edu (Jesus Rodriguez Pomeda via statnet_help)
Date: Wed Dec 4 01:54:08 2024
Subject: [statnet_help] Please help with an 'ergm.dll' error
Message-ID:

Dear Sirs, Madams,

I just started to work with ergm, because I want to develop an ERGM model on a directed, valued, bimodal network with different attributes for each mode. But I failed to do so because I obtained the following message from my RStudio console:

> library(ergm)
Error: package or namespace load failed for 'ergm' in inDL(x, as.logical(local), as.logical(now), ...):
 no es posible cargar el objeto compartido 'C:/Users/MYNAME/AppData/Local/R/win-library/4.4/ergm/libs/x64/ergm.dll':
 LoadLibrary failure: No se encontró el proceso especificado.
[Translation from Spanish: "(...) unable to load the shared object 'C:/Users/...': (...) the specified process was not found"]
Además: Aviso: [In addition: Warning:]
package 'ergm' was built under R version 4.4.2

Would you kindly offer some advice?

Thanks a lot.

All the best,

Jesús Rodríguez Pomeda
Catedrático de Universidad (Professor)
Departamento de Organización de Empresas (Department of Management and Organization Studies)
Universidad Autónoma de Madrid - Campus de Cantoblanco
Facultad de Ciencias Económicas y Empresariales, E-8-304 - Avda. Francisco Tomás y Valiente, 5. 28049 Madrid
Tel.: (+34) 91 497 4323 - jesus.pomeda@uam.es, www.uam.es

Recent publications:
"Motivaciones de los académicos españoles para publicar en revistas de acceso abierto: un análisis sociodemográfico", in Revista Española de Documentación Científica. Please read and share (open access): https://redc.revistas.csic.es/index.php/redc/article/view/1555
"Reflections on the diffusion of Management and Organization Research in the context of open science in Europe" in European Management Journal.
Please, read and share (open access): https://www.sciencedirect.com/science/article/pii/S0263237323001093

"An Essay about a Philosophical Attitude in Management and Organization Studies Based on Parrhesia", in Philosophy of Management. Please, read and share (open access): https://rdcu.be/c9Axj

-------------- next part --------------
An HTML attachment was scrubbed...
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Outlook-jxxag5yu.png
Type: image/png
Size: 18379 bytes
Desc: Outlook-jxxag5yu.png
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Outlook-https___ww.png
Type: image/png
Size: 1016 bytes
Desc: Outlook-https___ww.png
URL:

From statnet_help at u.washington.edu Wed Dec 4 02:19:47 2024
From: statnet_help at u.washington.edu (Steffen Triebel via statnet_help)
Date: Wed Dec 4 02:20:04 2024
Subject: [statnet_help] Please help with an 'ergm.dll' error
In-Reply-To:
References:
Message-ID: <1A162D24-F5AB-444A-A0F0-853050CC8DFC@icloud.com>

An HTML attachment was scrubbed...
URL:

From statnet_help at u.washington.edu Wed Dec 4 02:22:18 2024
From: statnet_help at u.washington.edu (Jaber.belkhiria via statnet_help)
Date: Wed Dec 4 02:22:33 2024
Subject: [statnet_help] Please help with an 'ergm.dll' error
In-Reply-To: <1A162D24-F5AB-444A-A0F0-853050CC8DFC@icloud.com>
References: <1A162D24-F5AB-444A-A0F0-853050CC8DFC@icloud.com>
Message-ID:

Any chance I can unsubscribe from this mailing list?

On Wed, Dec 4, 2024 at 10:20, Steffen Triebel via statnet_help <statnet_help@u.washington.edu> wrote:

> Hi Jesús,
>
> That does not seem to be an ergm-specific problem to me but rather a problem with your R setup or the ergm package you installed.
> Are you otherwise familiar with using RStudio?
>
> I'm certainly not an expert on this but have had my fair share of errors loading R packages. Here are a couple of things you could try:
> - reinstall the ergm package, including its dependencies; make sure to properly remove it prior to reinstalling
> - check whether there is a version mismatch (and update R/RStudio if necessary)
> - check whether you have the necessary permissions on your system (e.g., read and write access to the relevant folders)
>
> For all of these approaches, a quick Google search should give you detailed instructions on how to go about them.
>
> Best of luck & best wishes,
> Steffen
>
> On 04.12.2024 at 10:54, Jesus Rodriguez Pomeda via statnet_help <statnet_help@u.washington.edu> wrote:
>
> > Dear Sirs, Madams,
> >
> > I have just started to work with ergm, because I want to develop an ERGM model for a directed, valued, bimodal network with different attributes for each mode. But I could not get started, because I obtained the following message in my RStudio console [...]
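The reinstall suggestion above can be sketched as follows. These are generic R commands rather than an ergm-specific fix; run them in a fresh session, before library(ergm) has been called, so the DLL is not locked by the current process:

```r
# Remove the broken installation first, then reinstall with dependencies.
remove.packages("ergm")
install.packages("ergm", dependencies = TRUE)

# Sanity checks for a version mismatch between the installed
# package and the running R:
packageDescription("ergm")$Built  # R version the package was built under
R.version.string                  # R version currently running
```

If the Built string names a newer R than the one running, updating R (rather than reinstalling the package) is usually the fix.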
> _______________________________________________
> statnet_help mailing list
> statnet_help@u.washington.edu
> http://mailman13.u.washington.edu/mailman/listinfo/statnet_help

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From statnet_help at u.washington.edu Wed Dec 4 04:56:57 2024
From: statnet_help at u.washington.edu (Michał Bojanowski via statnet_help)
Date: Wed Dec 4 04:57:13 2024
Subject: [statnet_help] Please help with an 'ergm.dll' error
In-Reply-To:
References: <1A162D24-F5AB-444A-A0F0-853050CC8DFC@icloud.com>
Message-ID:

On Wed, Dec 4, 2024 at 11:22 AM, Jaber.belkhiria via statnet_help wrote:
>
> Any chance I can unsubscribe from this mailing list?
To unsubscribe, see the bottom of
https://mailman13.u.washington.edu/mailman/listinfo/statnet_help

Best,
Michal