ggramajo
Calcite | Level 5

Dear SAS community,

My issue: I am trying to read a data file in svm-light format into SAS. I have attached an example of such a file (which I downloaded from the UCI Machine Learning Repository), but I am at a loss as to how to do this.

What I intend to do with the data set: I would like to analyze this dataset using GAM. I am pretty sure the data is provided in svm-light form because the data matrix is extremely sparse.

I googled this topic in various ways and could not find a solution. I sincerely apologize if this has already been solved, and I missed it.

Thank you in advance

Tom
Super User

I am not sure what that data represents, but you could read it into a vertical table pretty easily.

I am not sure what the first column represents. Since it seems to have a +/- sign in front, I will call it OFFSET.

The rest appear to be index:value pairs. You can read those by using both space and colon as delimiters.

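data want ;
  length row col value offset 8;
  /* Treat both blank and colon as delimiters; TRUNCOVER returns missing at end of line */
  infile 'Day120.svm' dlm=' :' truncover lrecl=1000000 ;
  /* Read the leading +/- field and hold the line for further input */
  input offset @ ;
  /* Running count of input lines */
  row+1;
  /* Read index:value pairs until the line is exhausted */
  do until (col=.) ;
     input col value @ ;
     /* Write one observation per pair */
     if col ne . then output;
  end;
run;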
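
To check the logic without the full file, the same step can be run on a couple of in-line records. This is only a sketch: the two records below are made up for illustration (they are not taken from Day120.svm), and the dataset name DEMO is arbitrary.

```sas
/* Hypothetical sample in the same layout: a signed first field, */
/* followed by index:value pairs. Not real data from the file.   */
data demo ;
  length row col value offset 8;
  infile datalines dlm=' :' truncover ;
  input offset @ ;            /* leading +1/-1 field */
  row+1;                      /* running line counter */
  do until (col=.) ;
     input col value @ ;      /* next pair; missing once the line ends */
     if col ne . then output;
  end;
datalines;
+1 4:0.5 17:1 230:0.25
-1 2:1 17:0.75
;
run;

/* List the vertical (row, col, value) table */
proc print data=demo noobs;
run;
```

With these two records, the listing should show one observation per index:value pair (three for the first line, two for the second), each carrying the ROW number and the OFFSET value of the line it came from.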

ggramajo
Calcite | Level 5

Thank you very much, Tom. I will give your solution a try very soon. To your point, I should have elaborated on the data more. This data set is a 20000 x 3231961 matrix that categorizes websites as either benign or malicious. Each row represents a website, and the 3+ million columns describe website features.

  • The first column of +1/-1 (the response) is there for classification purposes. It indicates whether the website is malicious or benign.
  • The remaining columns are a combination of categorical {0,1} and real-valued features.

Thank you again,

Gary
