I've been playing around with PROC OPTGRAPH recently. I have a data set consisting of district representatives in CA and the committees that they serve on. Here are the first 10 observations sorted by committee assignment.
Obs  District  Name                     Committee Assignment
1    16th      Costa, Jim               Agriculture
2    10th      Denham, Jeff             Agriculture
3    1st       LaMalfa, Doug            Agriculture
4    20th      Panetta, Jimmy           Agriculture
5    31st      Aguilar, Pete            Appropriations
6    42nd      Calvert, Ken             Appropriations
7    13th      Lee, Barbara             Appropriations
8    40th      Roybal-Allard, Lucille   Appropriations
9    21st      Valadao, David           Appropriations
10   24th      Carbajal, Salud          Armed Services
I've analyzed the bipartite graph whose nodes come from the name and committee assignment variables. Edges appear from name to committee assignment if that member serves on said committee.
I now want to create a new graph whose nodes are the names, with an edge between two names if they sit on a common committee. For example, Agriculture will generate 4 choose 2 = 6 edges in the new graph, one of which is Costa-Denham. So, I want to create a new data set from my current data set with two variables: rep1 and rep2. I think that I want to use nested for loops to check the committee assignment for every pair of observations and, if they match, output the two names into my new data set as rep1 and rep2. My first 7 observations in the new data set would be:
Obs  rep1     rep2
1    Costa    Denham
2    Costa    LaMalfa
3    Costa    Panetta
4    Denham   LaMalfa
5    Denham   Panetta
6    LaMalfa  Panetta
7    Aguilar  Calvert
What I don't know how to do, if possible in SAS, is look at the specific entries in observations to compare and then write to a new data set. Suggestions?
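One possible way to get the pairs without explicit nested loops is a PROC SQL self-join on the committee variable. This is only a sketch: it assumes the input data set is called bipartite and that the variables are named Name and CommitteeAssignment (neither is stated in the question), and the output name pairs is made up for illustration. The condition a.Name < b.Name keeps each pair once and drops self-pairs.

```sas
/* Sketch only: bipartite, Name and CommitteeAssignment are assumed names. */
proc sql;
   create table pairs as
   select a.CommitteeAssignment,
          a.Name as rep1,
          b.Name as rep2
   from bipartite as a, bipartite as b
   where a.CommitteeAssignment = b.CommitteeAssignment
     and a.Name < b.Name          /* one row per unordered pair, no self-pairs */
   order by a.CommitteeAssignment, rep1, rep2;
quit;
```

Note that rep1 and rep2 hold the full Name values here (for example "Costa, Jim"), and that two representatives who share more than one committee appear once per shared committee, so the result may need de-duplicating before it is used as a link list.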
And here's another approach that uses PROC OPTGRAPH:
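data names(keep=node source);
   set bipartite;   /* assumed input: the same data set PROC OPTGRAPH reads below */
   node = Name;     /* each representative becomes a source node */
   source = 1;
run;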
proc optgraph data_links=bipartite data_nodes_sub=names;
   data_links_var from=Name to=CommitteeAssignment;
   /* two representatives share a committee when their shortest path has 2 links */
   shortpath out_weights=outdata(rename=(source=rep1 sink=rep2) where=(rep1 < rep2 and path_weight=2));
run;