Loading svm-light files in SAS

New Contributor
Posts: 3

Dear SAS community,

My issue: I am trying to load a data file in svm-light format into SAS. I have attached an example of such a file (which I downloaded from the UCI Machine Learning Repository), but I am at a loss as to how to read it in.

What I intend to do with the data set: I would like to analyze it using a GAM. I am pretty sure the data is provided in svm-light form because the data matrix is extremely sparse.

I googled this topic in various ways and could not find a solution. I sincerely apologize if this has already been solved, and I missed it.

Thank you in advance

Super User
Posts: 6,497

Re: Loading svm-light files in SAS

I'm not sure what that data represents, but you could read it into a vertical table (one observation per index:value pair) pretty easily.

I am not sure what the first column represents; since it seems to have a +/- sign in front, I will call it OFFSET.

The rest appear to be index:value pairs. You can read those by using both space and colon as delimiters.

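data want ;
  length row col value offset 8;
  /* blank and colon are both delimiters, so index:value splits into two fields */
  infile 'Day120.svm' dlm=' :' truncover lrecl=1000000 ;
  input offset @ ;          /* leading signed field; hold the line */
  row+1;                    /* running row number */
  do until (col=.) ;
     input col value @ ;    /* next index:value pair; missing at end of line */
     if col ne . then output;
  end;
run;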
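To see what the resulting vertical table looks like without the real file, here is a minimal sketch that runs the same logic on two made-up records read from in-stream data. The data set name DEMO and the sample values are invented for illustration and are not taken from Day120.svm.

```sas
/* Same reading logic as above, applied to two hypothetical svm-light records */
data demo;
  length row col value offset 8;
  infile datalines dlm=' :' truncover;
  input offset @;            /* leading signed field */
  row + 1;                   /* running row number */
  do until (col = .);
    input col value @;       /* next index:value pair */
    if col ne . then output; /* one observation per pair */
  end;
datalines;
+1 4:0.5 7:1 12:0.25
-1 2:1 7:0.125
;

proc print data=demo noobs;
run;
```

This should produce five observations: three with ROW=1 and OFFSET=1, and two with ROW=2 and OFFSET=-1, each carrying one COL/VALUE pair.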

New Contributor
Posts: 3

Re: Loading svm-light files in SAS

Thank you very much, Tom. I will give your solution a try very soon. And to your point, I should have elaborated on the data more. This data set is a 20000 x 3231961 matrix used to classify websites as either benign or malicious. Each row represents a website, and the more than 3 million columns describe website features.

  • The first column of +1/-1 (the response) is there for classification purposes. It indicates whether the website is malicious or benign.
  • The remaining columns are a combination of categorical {0,1} and real-valued features.
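Given that layout, the response can be pulled out of the vertical table into one record per website. The following is only a sketch: the small WANT table is a hypothetical stand-in for the real one, the names LABELS and MALICIOUS are invented, and the mapping of +1 to "malicious" is an assumption, since which sign means which is not stated here.

```sas
/* Tiny hypothetical stand-in for the vertical table WANT */
data want;
  input row col value offset;
datalines;
1 4 0.5 1
1 7 1 1
2 2 1 -1
;

/* One record per website with a 0/1 response */
proc sql;
  create table labels as
  select distinct row,
         (offset > 0) as malicious  /* assumption: +1 = malicious */
  from want;
quit;
```

Note that a website with no non-zero features produces no observations in the vertical table, so it would be missing from LABELS as well.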

Thank you again,

Gary
