I have been trying to enter a list of network graph edge data (from -> to) into SAS Enterprise Miner and use the 'link analysis' node to calculate the degree and centrality measures for each vertex, which I would then like to export.
Is the 'link analysis' node used in this way?
Can someone tell me if this is possible?
input file
[from, to]
tom, linda
tom, bob
linda, bob
output file
[name, degree, in-degree, out-degree, ...]
tom, 2, 0, 2, ...
bob, 2, 2, 0, ...
linda, 2, 1, 1, ...
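For reference, the degree columns asked for above can be computed directly from the edge list. The following is a minimal Python sketch and does not use Enterprise Miner or the Link Analysis node. The variable names are illustrative, and it assumes "degree" means in-degree plus out-degree.

```python
from collections import Counter

# Edge list from the question: (from, to)
edges = [("tom", "linda"), ("tom", "bob"), ("linda", "bob")]

# Count how often each vertex appears as a source and as a destination.
out_degree = Counter(src for src, _ in edges)
in_degree = Counter(dst for _, dst in edges)

# Every vertex that appears on either side of an edge.
nodes = sorted(set(out_degree) | set(in_degree))

# Assumption: degree = in-degree + out-degree.
rows = [(n, in_degree[n] + out_degree[n], in_degree[n], out_degree[n])
        for n in nodes]

print("name, degree, in-degree, out-degree")
for name, deg, ind, outd in rows:
    print(f"{name}, {deg}, {ind}, {outd}")
```

Centrality measures beyond degree would need a graph library or a procedure such as the one the Link Analysis node calls.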
The Link Analysis node is designed not to take graph data directly, since we do not want its functionality (centrality measures and community detection) to overlap with the social network solution.
But there is a workaround. Note, however, that this example can only reconstruct the link data, not the node weight data.
1) ID and Target are set up as follows. ID: a fake ID, one per pair (= link). Target: the ‘from’ and ‘to’ nodes of that pair.
2) Proc assoc then starts finding rules. In this situation it finds all pairs within an ID, and in turn the ‘from→to’ rules. The LA node sends the frequency of each rule to proc optgraph as a weight, which is why I added a loop that replicates each pair as many times as needed to express the weight of the link.
3) Result: the pairs (= rules) are the links, in this case the original from-to links. The nodes are the Targets, which here form the unique set of {from} ∪ {to} values, i.e. the nodes that have at least one link.
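The counting idea in the steps above can be imitated in a few lines of Python. This is only a conceptual sketch under stated assumptions: it does not call or model proc assoc or proc optgraph, the function name `links_from_transactions` is hypothetical, and it assumes the first Target row under an ID is the 'from' node and the second is the 'to' node.

```python
from collections import Counter, defaultdict

# Transactional rows (ID, Target): one fake ID per replicated link.
transactions = [
    (1, "A"), (1, "B"),
    (2, "A"), (2, "B"),
    (3, "C"), (3, "D"),
]

def links_from_transactions(rows):
    """Group Targets by ID and count each (from, to) pair as a link weight."""
    by_id = defaultdict(list)
    for tid, target in rows:
        by_id[tid].append(target)
    weights = Counter()
    for targets in by_id.values():
        if len(targets) == 2:  # each fake ID carries exactly one from/to pair
            weights[(targets[0], targets[1])] += 1
    return weights

links = links_from_transactions(transactions)

# The node set is the union of all 'from' and 'to' values that appear in a link.
nodes = sorted({n for pair in links for n in pair})

for (src, dst), weight in sorted(links.items()):
    print(src, dst, weight)
```

The pair frequency plays the role of the link weight, which is why each link has to be replicated once per unit of weight in the input.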
Also, I tested the code, and it is easy to see that the graph inside the LA node is the same as the original before my transformation (the first 30 links are printed, sorted by the from and to variables).
ORIG GRAPH BEFORE TRANSFORMATION | GRAPH RECONSTRUCTED INSIDE LA NODE
Obs | weight | from | to | Obs | weight | from | to
1 | 6.00 | 0 | 858 | 1 | 6.00 | 0 | 858
2 | 6.00 | 0 | 872 | 2 | 6.00 | 0 | 872
3 | 6.00 | 0 | 874 | 3 | 6.00 | 0 | 874
4 | 6.00 | 0 | 88 | 4 | 6.00 | 0 | 88
5 | 6.00 | 0 | 934 | 5 | 6.00 | 0 | 934
6 | 6.00 | 0 | 95 | 6 | 6.00 | 0 | 95
7 | 6.00 | 0 | 954 | 7 | 6.00 | 0 | 954
8 | 6.00 | 0 | 959 | 8 | 6.00 | 0 | 959
9 | 7.00 | 1 | 452 | 9 | 7.00 | 1 | 452
10 | 6.00 | 1 | 841 | 10 | 6.00 | 1 | 841
11 | 6.00 | 1 | 848 | 11 | 6.00 | 1 | 848
12 | 6.00 | 1 | 90 | 12 | 6.00 | 1 | 90
13 | 6.00 | 1 | 941 | 13 | 6.00 | 1 | 941
14 | 6.00 | 1 | 947 | 14 | 6.00 | 1 | 947
15 | 6.00 | 1 | 952 | 15 | 6.00 | 1 | 952
16 | 6.00 | 1 | 955 | 16 | 6.00 | 1 | 955
17 | 7.00 | 10 | 154 | 17 | 7.00 | 10 | 154
18 | 6.00 | 10 | 856 | 18 | 6.00 | 10 | 856
19 | 6.00 | 10 | 901 | 19 | 6.00 | 10 | 901
20 | 6.00 | 10 | 907 | 20 | 6.00 | 10 | 907
21 | 6.00 | 100 | 868 | 21 | 6.00 | 100 | 868
22 | 6.00 | 100 | 895 | 22 | 6.00 | 100 | 895
23 | 6.00 | 100 | 901 | 23 | 6.00 | 100 | 901
24 | 6.00 | 100 | 924 | 24 | 6.00 | 100 | 924
25 | 6.00 | 100 | 945 | 25 | 6.00 | 100 | 945
26 | 6.00 | 100 | 962 | 26 | 6.00 | 100 | 962
27 | 7.00 | 1000 | 181 | 27 | 7.00 | 1000 | 181
28 | 6.00 | 1000 | 911 | 28 | 6.00 | 1000 | 911
29 | 6.00 | 1000 | 913 | 29 | 6.00 | 1000 | 913
30 | 7.00 | 101 | 782 | 30 | 7.00 | 101 | 782
Here is a small example:
Say you have links data:
A->B 4
C->D 1
A->E 3
You need to transform it into transactional data like the following, and feed the transactional data into the LA node.
ID Target
1 A
1 B
2 A
2 B
3 A
3 B
4 A
4 B
---------4 (A,B)
5 C
5 D
---------1 (C,D)
6 A
6 E
7 A
7 E
8 A
8 E
----------3 (A, E)
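The transformation shown above can be sketched in Python as follows. This is an illustrative sketch rather than the code used in the test described earlier; the function name `links_to_transactions` is hypothetical, and it assumes the weights are whole numbers so that each link can be replicated an exact number of times.

```python
def links_to_transactions(links):
    """Expand (from, to, weight) links into (ID, Target) transaction rows.

    Each link is replicated `weight` times, and every replica gets its own
    fake ID with two rows: one for the 'from' node and one for the 'to' node.
    """
    rows = []
    next_id = 1
    for src, dst, weight in links:
        for _ in range(int(weight)):  # assumption: integer-valued weights
            rows.append((next_id, src))
            rows.append((next_id, dst))
            next_id += 1
    return rows

# Links data from the small example: A->B 4, C->D 1, A->E 3
links = [("A", "B", 4), ("C", "D", 1), ("A", "E", 3)]
rows = links_to_transactions(links)

print("ID Target")
for tid, target in rows:
    print(tid, target)
```

For the three links above this yields 8 fake IDs and 16 rows, in the same order as the listing.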
Hope it helps,
Ye
If the LA node runs successfully, you can export the centrality measures table. But the workaround does not work perfectly with large links data, since proc assoc (which generates the links from the transactional data) may run out of memory.
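Because every unit of weight becomes its own ID with two rows, the transactional table grows with the sum of the weights rather than with the number of links. A rough Python sketch for checking the size before running the node (the function name `transaction_row_count` is hypothetical, and this only counts rows; it says nothing about how much memory proc assoc actually needs):

```python
def transaction_row_count(weights):
    """Rows produced by the expansion: two rows (from, to) per replicated link."""
    return 2 * sum(int(w) for w in weights)

# Weights from the small example in this thread: A->B 4, C->D 1, A->E 3
small_example_rows = transaction_row_count([4, 1, 3])
print(small_example_rows)  # 16
```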
Ye, I have run into another problem. After transforming the graph data to transactional data (csv format), I import it with the "File Import" node (and change the Role to Transaction), then connect the LA node, which gives the following Run Status: "Run time error was encountered. Please see the log in the node Results window for more details. Diagram: transactions." The errors in the log begin when the OPTGRAPH procedure starts, and the data sets are incomplete, with 0 observations and 0 variables.
Could you please send the full log to ye.liu@sas.com?
If you want to learn how to use Link Analysis node, here is a paper: https://support.sas.com/rnd/app/data-mining/enterprise-miner/papers/2014/linkAnalysis2014.pdf