Posted 04-09-2014 08:04 AM

Hello,

Currently we have some problems using SAS.

We have used the following code:

```sas
PROC IMPORT OUT= WORK.FRAUD
    DATAFILE= "C:\..."
    DBMS=CSV REPLACE;
    GETNAMES=YES;
    DATAROW=2;
RUN;
```

This datafile contains 3958352 records.

Now we want to make a sparse matrix from this datafile.

Do you guys know how we can do this?

We found some example code on the internet:

```sas
x = {3   1.1  0   0  ,
     1.1 4    0   3.2,
     0   1   10   0  ,
     0   3.2  0   3  };
a = sparse(x, "sym");
print a[colname={"Value" "Row" "Col"}];
```

But we don't know how to specify that x should contain the data set WORK.FRAUD.

We couldn't find anything on the SAS community.

Kind regards

This might help.

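```sas
/* Create a small test data set that holds the dense matrix */
data xdata;
input x1-x4;
cards;
3 1.1 0 0
1.1 4 0 3.2
0 1 10 0
0 3.2 0 3
;;;;
run;

proc iml;
/* Read all numeric variables of the data set into the matrix x */
use xdata;
read all var _num_ into x;
close xdata;
print x;
/* Convert the dense matrix to the three-column sparse format */
a = sparse(x, "sym");
print a[colname={"Value" "Row" "Col"}];
quit;
```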

We want to do some calculations such as PageRank, cliques, and betweenness, but before we can do this, we need a sparse matrix data set.

Here are a few lines of the matrix (3 columns):

```
"Row","Col","Value"
830,3,1
852,3,1
52591,3,1
114337,3,1
148326,3,1
196849,3,1
...
```

So how can we make a sparse matrix from this, so that we can do our calculations on the sparse matrix data set?

Kind regards

Bart
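
A note on the format: the three columns shown above (row, column, value) are already the same triplet layout that the SPARSE function returns, only in a different column order. So one possible approach, sketched here under assumptions, is to read the triplets directly instead of building a dense matrix first. The sketch assumes that PROC IMPORT created the numeric variables `Row`, `Col`, and `Value` in WORK.FRAUD from the CSV header, and that the data set has at least five observations; the matrix name `s` is only illustrative.

```sas
proc iml;
/* Assumption: WORK.FRAUD has numeric variables Row, Col, Value
   (taken from the CSV header by GETNAMES=YES). */
use work.fraud;
/* Read the columns in the order SPARSE() uses: Value, Row, Col */
read all var {"Value" "Row" "Col"} into s;
close work.fraud;

/* Each row of s is one stored nonzero element */
print (nrow(s))[label="Number of stored nonzero elements"];
print (s[1:5, ])[colname={"Value" "Row" "Col"}];
quit;
```

On a small test subset, the result can be checked by converting it back to a dense matrix with the FULL function. Whether the full data set fits in memory this way depends on the number of triplets and the available RAM.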

Based on your response, see the discussion at https://communities.sas.com/message/204360#204360

You might also want to read the papers by Hector Rodriguez-Deniz in the proceedings of SAS Global Forum 2012 and 2013.

Hello, we can't use SNA because we don't have a license for it.

We need to calculate PageRank, betweenness, and cliques for our (fraud) matrix, but we can't do this because we cannot make a sparse matrix.

Any suggestions?

I suggest you think carefully about what you are attempting. Whether a sparse representation of a 4M x 4M matrix can even fit in memory depends on the percentage of nonzero elements and the RAM of your system. The matrix has 1.6 x 10^13 elements. If 1% of those are nonzero, you are still looking at 160 BILLION elements. Stored as a sparse matrix (three doubles per nonzero element), this requires 480 billion doubles, which is 3.84 trillion bytes, or about 3.84 TERABYTES of data.

That amount of RAM is needed just to store the data. The algorithms that you mention have nontrivial computational complexity and require additional memory. This is why many people use specialized tools and algorithms for network analysis of very large networks.

Our matrix is actually just 400 000 x 400 000, but I guess that is also too big?

Are these calculations right?

400 000 x 400 000 = 1,6 x 10^11, and we have 0,4% nonzero elements, so we are looking at 640 000 000 elements.

Stored as a sparse matrix, this requires 1 920 000 000 doubles, which is 1,79 TB of data.

Am I right that we can only use specialized tools and algorithms for network analysis?

Thanks for answering.

Bart
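
As a quick check of the arithmetic above, assuming 8 bytes per double and three stored values (value, row, column) per nonzero element, the numbers work out as follows:

$$
\begin{aligned}
400\,000 \times 400\,000 &= 1.6 \times 10^{11} \ \text{cells} \\
0.004 \times 1.6 \times 10^{11} &= 6.4 \times 10^{8} \ \text{nonzero elements} \\
3 \times 6.4 \times 10^{8} &= 1.92 \times 10^{9} \ \text{doubles} \\
8 \times 1.92 \times 10^{9} &= 1.536 \times 10^{10} \ \text{bytes} \approx 15.4 \ \text{GB} \ (\approx 14.3 \ \text{GiB})
\end{aligned}
$$

Under these assumptions the storage is on the order of 15 GB rather than terabytes. The 1,79 figure appears to come from dividing the number of doubles, not the number of bytes, by 2^30. This covers only the storage of the matrix; the algorithms themselves need additional working memory.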

*Star Wars* or *Lord of the Rings*. But for huge networks with hundreds of thousands or millions of nodes, special-purpose tools such as SAS Social Network Analysis and SAS Fraud Network Analysis are more efficient.

Thanks!

We would like to calculate four metrics for a small network (the matrix is only 15 x 15): betweenness centrality, PageRank centrality, cliques, and assortativity.

Is it possible to calculate them without using IML?

We guess that it is currently not possible.

Regards
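
For a network this small, one of the four metrics (PageRank) can be sketched with a plain power iteration on a dense matrix. Note that this sketch does use SAS/IML, so it does not answer the "without IML" part of the question; it only illustrates that the computation itself is short at this size. The 4-node adjacency matrix, the damping factor 0.85, the iteration limit, and the tolerance are all illustrative assumptions, and the sketch assumes every node has at least one outgoing link.

```sas
proc iml;
/* Hypothetical directed network: A[i,j] = 1 if node i links to node j */
A = {0 1 1 0,
     0 0 1 0,
     1 0 0 1,
     0 0 1 0};
n = nrow(A);
d = 0.85;                       /* damping factor (assumed value) */

outdeg = A[ ,+];                /* row sums = out-degree of each node */
P = A / outdeg;                 /* row-stochastic transition matrix;
                                   assumes no node has out-degree 0 */

r = j(n, 1, 1/n);               /* start from a uniform rank vector */
do iter = 1 to 100 until(delta < 1e-10);
   rNew  = (1 - d)/n + d * (t(P) * r);   /* one power-iteration step */
   delta = max(abs(rNew - r));           /* largest change in any rank */
   r = rNew;
end;

print iter delta, r[label="PageRank"];
quit;
```

For the real 15-node network, `A` would be replaced by the actual adjacency matrix. Betweenness, cliques, and assortativity need different algorithms and are not covered by this sketch.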
