Solved: Re: SAS EM: How to get the most important peers and calculating Centrality

nelson_lee · Posted 09-07-2017 12:29 AM

hi all,

I have the following dataset and I want to find the most important peers / neighbourhood for each "Person" in the "From" column. I also want to calculate the different types of centrality for those peers. May I know how?

From  To  Transactions Count  Transaction Amount
A     B   10                  10000
A     C   5                   2000
B     A   2                   100000
.....

I have tried using "Link Analysis" with the following settings for my dataset, but it pops up an error:

Dataset's Role set to be "Transaction"
"From" set to be "Referrer"
"To" to be "Target"
"Transaction Count" to be "Sequence"

Please help

DougWielenga · Posted 09-08-2017 11:55 AM

The Link Analysis node in SAS Enterprise Miner can calculate centrality measures for transactional data (multiple rows per 'transaction', as identified by a transaction id field) or observational data (each row corresponds to an entire 'transaction' or sequence), but you appear to have summarized data, which does not fit either input format. Do you have the data set that was used to create your summary data? If so, it should be fairly easy to specify the appropriate roles for the data.
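To make the distinction concrete, here is a minimal Python sketch (outside of SAS Enterprise Miner) of how transaction-level rows collapse into the kind of summary shown in the question. The rows and the `summarize` helper are hypothetical and only illustrate the shape of the data; they are not part of any SAS tool.

```python
from collections import defaultdict

# Hypothetical transaction-level rows: (from_person, to_person, amount).
transactions = [
    ("A", "B", 1000),
    ("A", "B", 9000),
    ("A", "C", 2000),
    ("B", "A", 100000),
]

def summarize(rows):
    """Collapse transaction-level rows into one row per (from, to) pair."""
    totals = defaultdict(lambda: [0, 0])  # pair -> [transaction count, total amount]
    for src, dst, amount in rows:
        totals[(src, dst)][0] += 1
        totals[(src, dst)][1] += amount
    return {pair: tuple(values) for pair, values in totals.items()}

summary = summarize(transactions)
for (src, dst), (count, amount) in sorted(summary.items()):
    print(src, dst, count, amount)
```

The summary keeps one row per pair and loses the individual transactions, which is why the original transaction-level data is the better input here.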

It seems like you are viewing each distinct value in the 'From' or 'To' field as a 'person' and that you are summarizing the total number of transactions and the total amount of those transactions for each combination of individuals. If you have a data set containing all of the transactions and their amounts rather than the summarized values, you would specify

From ---> Input

To ---> Target

but would also need to set the value of

Session_ID --> ID

Session_Sequence --> Sequence

but there is no role to specify for the amount of the transaction. The number of transactions would be captured when the data is summarized by SAS Enterprise Miner since it would count the transactions as it processes the data. You can find more about the Link Analysis node by opening SAS Enterprise Miner and clicking on

Help --> Contents

and then navigating in the panel on the left to

Node Reference

Explore

Link Analysis Node

After clicking on Link Analysis Node in the panel on the left, you can select from several relevant links in the panel on the right including

Input Data Requirements for the Link Analysis Node

Link Analysis Node Properties

Link Analysis Node Train Properties: Centrality Measures Properties

Link Analysis Node Examples
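As a rough illustration of what a centrality measure captures, and not a description of how the Link Analysis node computes its own measures, the sketch below counts outgoing and incoming links per person in plain Python. The edge list reuses the first rows of the question; the `degree_centrality` helper is hypothetical, and it deliberately ignores the transaction counts.

```python
from collections import defaultdict

# Edge list from the first rows of the question: (from, to, transaction count).
edges = [("A", "B", 10), ("A", "C", 5), ("B", "A", 2)]

def degree_centrality(edge_list):
    """Return {node: (out_degree, in_degree)}, counting links and ignoring weights."""
    out_deg = defaultdict(int)
    in_deg = defaultdict(int)
    nodes = set()
    for src, dst, _count in edge_list:
        out_deg[src] += 1
        in_deg[dst] += 1
        nodes.update((src, dst))
    return {node: (out_deg[node], in_deg[node]) for node in sorted(nodes)}

centrality = degree_centrality(edges)
for node, (out_d, in_d) in centrality.items():
    print(node, "out:", out_d, "in:", in_d)
```

In this toy example A has the most outgoing links, so A would rank highest on out-degree.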

I hope this helps!

Doug

nelson_lee · Posted 09-11-2017 07:52 AM

Thanks Doug,

That really helps, but I receive the error below after changing my dataset and roles.

Now my dataset becomes

From  To  Txn_Amt  Txn_ID  Seq_ID
A     B   1000     1       1
A     B   10000    2       2
A     C   10000    3       3
B     C   10000    4       1
B     C   10000    5       2
B     D   10000    6       3

I set the dataset role to "Transaction" and the variable roles as follows:

"From" --> Input
"To" --> Target
"Txn_Amt" --> Input
"Txn_ID" --> ID
"Seq_ID" --> Sequence

I then linked the dataset to the "Link Analysis Node" and ran it, but it popped up the error below:

Error: Minimum support level is either too high to detect any rules or too low that runs into out of memory issue.

For reference, my EM_TRAIN_MAXLEVELS is set to 100,000.

Thanks,

Nelson

DougWielenga · Posted 09-11-2017 12:39 PM

There are two things that might be problematic with your transactional approach.

1 - A transaction id typically has several rows associated with it. You have a separate transaction for each row, which is fine if you want to treat each row as being completely separate.

2 - I experimented with some mock data and found that I needed to use a variable name other than 'To'. Using the same data after renaming the variable 'To' to 'Towards' allowed it to run.

Also, you are including the value Txn_Amt in your data, but it is not going to be used in Link Analysis. No frequency or weight variable appears to be used, since I get the same results either way. Here are the first few rows of my test data:

From Towards ID Seq

A U 1 1
U K 1 2
K U 1 3
U M 1 4
A U 5 1
U K 5 2
K U 5 3
U O 5 4
O O 5 5
A U 10 1
U U 10 2
O O 10 3
A Y 13 1
Y K 13 2
K Y 13 3
Y Y 13 4

You will notice that there are several rows per id and that the sequence variable restarts at 1 for each new id. Of course, any set of ordinally equivalent sequence values should give similar results.
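If the raw data does not yet have a sequence column, one way to build this layout outside of SAS Enterprise Miner is sketched below in Python. It assumes the rows are already in event order; the `add_sequence` helper and the tuple layout are hypothetical, and the rows reuse the first few lines of the test data above.

```python
from itertools import groupby

# Hypothetical raw rows: (from_person, to_person, session_id), already in
# the order in which the events happened.
raw = [
    ("A", "U", 1), ("U", "K", 1), ("K", "U", 1), ("U", "M", 1),
    ("A", "U", 5), ("U", "K", 5),
]

def add_sequence(rows):
    """Return dict rows with a Seq column that restarts at 1 for each ID."""
    out = []
    # sorted() is stable, so the event order within each ID is preserved.
    ordered = sorted(rows, key=lambda r: r[2])
    for session_id, group in groupby(ordered, key=lambda r: r[2]):
        for seq, (src, dst, _id) in enumerate(group, start=1):
            # 'Towards' is used instead of 'To', matching the renamed variable.
            out.append({"From": src, "Towards": dst, "ID": session_id, "Seq": seq})
    return out

rows = add_sequence(raw)
for r in rows:
    print(r["From"], r["Towards"], r["ID"], r["Seq"])
```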

Hope this helps!

Doug

nelson_lee · Posted 09-12-2017 03:04 AM

Many thanks, Doug,

Following your instructions, I got the result I wanted.

My dataset now looks like the following, and there is no error:

From  Toward  Reference_No  Seq_ID
A     B       xxxxxx1       1
A     B       xxxxxx2       2
A     C       xxxxxx3       3
B     C       xxxxxx4       1
B     C       xxxxxx5       2
B     D       xxxxxx6       3
........

Thanks again,

Nelson
