adm
Calcite | Level 5

I have been trying to enter a list of network graph edge data (from -> to) into SAS Enterprise Miner and use the 'link analysis' node to calculate the degree and centrality measures for each vertex, which I would then like to export.

 

Is the 'link analysis' node meant to be used this way? Can someone tell me whether this is possible?

 

Input file:

[from, to]
tom, linda
tom, bob
linda, bob

Output file:

[name, in-degree, out-degree, ...]
tom, 2, 1, 1, ...
bob, 2, 2, 0, ...
linda, 2, 1, 1, ...

6 REPLIES
yeliu
SAS Employee

The Link Analysis node is designed not to take graph data directly, because we do not want its functions (centrality measures and community detection) to overlap with the social network solution.

But there is a workaround. Note that this example can only reconstruct the link data, not the node weights.

1) ID and Target: ID is a fake ID, one ID per pair (= link). Target holds the ‘from’ and ‘to’ nodes.

2) PROC ASSOC then finds the rules. In this situation it finds all pairs within each ID, which in turn gives the ‘from→to’ rules. The LA node sends the frequency of each rule to PROC OPTGRAPH as a weight, which is why I add a loop that replicates each pair as many times as needed to express the weight of the link.

3) Result: the pairs (= rules) are the links, in this case the original from-to links. The nodes are the Target values, which form the unique set {from} ∪ {to}, i.e. the nodes that have at least one link.
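
To make steps 2) and 3) concrete, here is a minimal Python sketch of the idea. It is not what PROC ASSOC or the LA node actually runs; it only illustrates how counting identical pairs across IDs gives back the link weights and how the node set is the union of the 'from' and 'to' values. The function name `links_from_transactions` is hypothetical, and the sketch assumes each ID has exactly two rows, with the 'from' node listed first.

```python
from collections import Counter


def links_from_transactions(rows):
    """Rebuild weighted links from (id, target) transaction rows.

    Assumption: every ID has exactly two rows, 'from' first and 'to' second.
    """
    # Group the Target values by their (fake) transaction ID.
    by_id = {}
    for tid, target in rows:
        by_id.setdefault(tid, []).append(target)

    # Each ID contributes one occurrence of its (from, to) pair;
    # the number of occurrences plays the role of the link weight.
    weights = Counter()
    for tid, items in by_id.items():
        if len(items) != 2:
            raise ValueError(f"ID {tid!r} has {len(items)} rows, expected 2")
        weights[(items[0], items[1])] += 1

    # Nodes are the union of all 'from' and 'to' values.
    nodes = sorted({node for pair in weights for node in pair})
    return dict(weights), nodes


rows = [(1, "A"), (1, "B"), (2, "A"), (2, "B"), (3, "C"), (3, "D")]
links, nodes = links_from_transactions(rows)
print(links)  # {('A', 'B'): 2, ('C', 'D'): 1}
print(nodes)  # ['A', 'B', 'C', 'D']
```

A node that never appears in any pair cannot show up in the result, which matches the remark that only nodes with at least one link are recovered.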

 

I also tested the code, and it is easy to see that the graph inside the LA node is the same as the original graph before my transformation (first 30 links printed, sorted by the from and to variables).

 

ORIG GRAPH BEFORE TRANSFORMATION    GRAPH RECONSTRUCTED INSIDE LA NODE

Obs  weight  from  to               Obs  weight  from  to
1    6.00    0     858              1    6.00    0     858
2    6.00    0     872              2    6.00    0     872
3    6.00    0     874              3    6.00    0     874
4    6.00    0     88               4    6.00    0     88
5    6.00    0     934              5    6.00    0     934
6    6.00    0     95               6    6.00    0     95
7    6.00    0     954              7    6.00    0     954
8    6.00    0     959              8    6.00    0     959
9    7.00    1     452              9    7.00    1     452
10   6.00    1     841              10   6.00    1     841
11   6.00    1     848              11   6.00    1     848
12   6.00    1     90               12   6.00    1     90
13   6.00    1     941              13   6.00    1     941
14   6.00    1     947              14   6.00    1     947
15   6.00    1     952              15   6.00    1     952
16   6.00    1     955              16   6.00    1     955
17   7.00    10    154              17   7.00    10    154
18   6.00    10    856              18   6.00    10    856
19   6.00    10    901              19   6.00    10    901
20   6.00    10    907              20   6.00    10    907
21   6.00    100   868              21   6.00    100   868
22   6.00    100   895              22   6.00    100   895
23   6.00    100   901              23   6.00    100   901
24   6.00    100   924              24   6.00    100   924
25   6.00    100   945              25   6.00    100   945
26   6.00    100   962              26   6.00    100   962
27   7.00    1000  181              27   7.00    1000  181
28   6.00    1000  911              28   6.00    1000  911
29   6.00    1000  913              29   6.00    1000  913
30   7.00    101   782              30   7.00    101   782

 

Here is a small example. Say you have the following link data (link, then weight):

A->B 4
C->D 1
A->E 3

You need to transform it into transactional data like the following, and feed the transactional data into the LA node. Each link is repeated under a new ID as many times as its weight:

ID Target
1 A
1 B
2 A
2 B
3 A
3 B
4 A
4 B
--------- 4 (A,B)
5 C
5 D
--------- 1 (C,D)
6 A
6 E
7 A
7 E
8 A
8 E
--------- 3 (A,E)
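
The transformation in this example can be sketched in a few lines of Python before importing the result into Enterprise Miner. This is only an illustration, not code from the LA node: the function name `links_to_transactions` is hypothetical, and it assumes the weights are positive integers, since a link is repeated once per unit of weight.

```python
def links_to_transactions(links):
    """Expand (from, to, weight) links into (ID, Target) transaction rows.

    Assumption: weight is a positive integer; each unit of weight becomes
    one transaction ID holding the 'from' row followed by the 'to' row.
    """
    rows = []
    next_id = 1
    for source, target, weight in links:
        for _ in range(weight):
            rows.append((next_id, source))
            rows.append((next_id, target))
            next_id += 1
    return rows


links = [("A", "B", 4), ("C", "D", 1), ("A", "E", 3)]
rows = links_to_transactions(links)

print("ID Target")
for tid, node in rows:
    print(tid, node)
```

For the three links above this yields 16 rows with IDs 1 through 8, the same ID/Target listing as in the example. Because the row count grows with the sum of the weights, heavily weighted graphs can become large transactional tables.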

 

Hope it helps,

Ye

adm
Calcite | Level 5
Thanks. Can the metrics (degree, betweenness, closeness, etc.) then be exported?
yeliu
SAS Employee

If the LA node runs successfully, you can export the centrality measures table. However, the workaround does not work well with large link data sets, because PROC ASSOC (which generates the links from the transactional data) may run out of memory.

adm
Calcite | Level 5

Ye, I have run into another problem. After transforming the graph data to transactional data (CSV format), I import it with the "File Import" node (and change the Role to Transaction), then connect the LA node, which gives the following Run Status: "Run time error was encountered. Please see the log in the node Results window for more details. Diagram: transactions." The errors in the log begin when the OPTGRAPH procedure starts, and the data sets are incomplete, with 0 observations and 0 variables.

yeliu
SAS Employee

Could you please send the full log to ye.liu@sas.com?
