ggramajo
Calcite | Level 5

Dear SAS community,

My issue: I am trying to import a data file in svm-light format. I have attached an example of such a file (which I downloaded from the UCI Machine Learning Repository). I would like to read such a file into SAS, but I am at a loss as to how to do this.

What I intend to do with the data set: I would like to analyze this dataset using GAM. I am pretty sure the data is provided in svm-light form because the data matrix is extremely sparse.

I googled this topic in various ways and could not find a solution. I sincerely apologize if this has already been solved, and I missed it.

Thank you in advance

2 REPLIES
Tom
Super User

Not sure what that data represents, but you could read it into a vertical table pretty easily.

I am not sure what the first column represents. Since it seems to have a +/- sign in front, I will call it OFFSET.

The rest appear to be index:value pairs.  You can read those by using space and colon as the delimiter.

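data want ;
  length row col value offset 8;
  /* Blank and colon are both delimiters, so each index:value pair reads as two numbers */
  infile 'Day120.svm' dlm=' :' truncover lrecl=1000000 ;
  /* First field on the line; the trailing @ holds the line for the loop below */
  input offset @ ;
  row+1;  /* running count of input lines */
  /* Read pairs until the line runs out; TRUNCOVER then leaves COL missing */
  do until (col=.) ;
     input col value @ ;
     if col ne . then output;
  end;
run;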
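
To see the layout this produces, here is a minimal sketch that runs the same logic on two made-up lines with DATALINES. The SAMPLE data set name and the values are invented for illustration and are not taken from the attached file.

```sas
/* Hypothetical two-line sample in svm-light layout: label, then index:value pairs */
data sample;
  length row col value offset 8;
  infile datalines dlm=' :' truncover;
  input offset @;
  row + 1;
  do until (col = .);
    input col value @;
    if col ne . then output;
  end;
  datalines;
+1 4:0.5 17:1 230:0.25
-1 2:1 17:1
;
run;

/* One observation per index:value pair, with ROW and OFFSET repeated */
proc print data=sample noobs;
run;
```

Each nonzero entry becomes one observation (ROW, COL, VALUE, OFFSET), so the sparse matrix is stored in a long "triplet" form rather than as millions of mostly empty columns.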

ggramajo
Calcite | Level 5

Thank you very much, Tom. I will give your solution a try very soon. To your point, I should have elaborated on the data more. This data set is a 20000 x 3231961 matrix that categorizes websites as either benign or malicious. Each row represents a website, and the 3+ million columns describe website features.

  • The first column of +1/-1 (the response) is there for classification purposes. It indicates whether the website is malicious or benign.
  • The remaining columns are a combination of categorical {0,1} and real-valued features.
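
Since the response ends up in the variable Tom's step calls OFFSET, the class balance could be checked with something like the sketch below. It assumes the WANT data set from that step already exists. LABELS and RESPONSE are names introduced here for illustration only.

```sas
/* Keep one record per website (ROW) and rename OFFSET to RESPONSE */
proc sort data=want(keep=row offset)
          out=labels(rename=(offset=response))
          nodupkey;
  by row;
run;

/* Count how many websites carry each label (+1 or -1) */
proc freq data=labels;
  tables response;
run;
```

One caveat: a website with no nonzero features writes no observations to WANT, so it would not be counted here.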

Thank you again,

Gary
