ggramajo
Calcite | Level 5

Dear SAS community,

My issue: I am trying to upload a data file of type svm-light. I have attached an example of such a file (which I downloaded from the UCI Machine Learning Repository). I would like to upload such a file into SAS, but I am at a loss as to how to do this.

What I intend to do with the data set: I would like to analyze this dataset using GAM. I am pretty sure the data is provided in svm-light form because the data matrix is extremely sparse.

I googled this topic in various ways and could not find a solution. I sincerely apologize if this has already been solved, and I missed it.

Thank you in advance

2 REPLIES
Tom
Super User

Not sure what that data represents, but you could read it into a vertical table pretty easily.

I am not sure what the first column represents. Since it seems to have a +/- sign in front, I will call it OFFSET.

The rest appear to be index:value pairs. You can read those by using space and colon as the delimiters.

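data want ;
  length row col value offset 8;
  /* Space and colon are both delimiters; TRUNCOVER sets variables to missing at end of line. */
  /* LRECL is set large because each line can be very long. */
  infile 'Day120.svm' dlm=' :' truncover lrecl=1000000 ;
  /* Read the leading +/- value and hold the line with the trailing @. */
  input offset @ ;
  /* Sum statement: counts the input lines. */
  row+1;
  /* Read index:value pairs until the line is exhausted (col comes back missing). */
  do until (col=.) ;
     input col value @ ;
     if col ne . then output;
  end;
run;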
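
A quick sanity check on the result can help confirm the file was read as intended. This is only a sketch: it assumes the WANT data set created by the step above, and the column aliases (n_rows, max_col, n_nonzero) are names made up for illustration.

```sas
/* Sketch only: assumes the WANT data set (row, col, value, offset) from the step above. */
/* The aliases n_rows, max_col and n_nonzero are illustrative names. */
proc sql;
  select count(distinct row) as n_rows,    /* lines that contributed at least one pair */
         max(col)            as max_col,   /* largest feature index seen */
         count(*)            as n_nonzero  /* total index:value pairs stored */
  from want;
quit;
```

Lines that contain no index:value pairs produce no observations in WANT, so n_rows may be smaller than the number of lines in the file.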

ggramajo
Calcite | Level 5

Thank you very much, Tom. I will give your solution a try very soon. And to your point, I should have elaborated on the data more. This data set is a 20000 x 3231961 matrix that categorizes websites as either benign or malicious. Each row represents a website, and the more than 3 million columns describe website features.

  • The first column of +1/-1 (the response) is there for classification purposes. It indicates whether the website is malicious or benign.
  • The remaining columns are a combination of categorical {0,1} and real-valued features.
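
Given that description, the variable that the data step in the earlier reply calls OFFSET holds the +1/-1 response. Since that value repeats on every observation of a row in the vertical table, it may be convenient to keep one copy per website in a separate table. The following is a minimal sketch under that assumption; the table name LABELS and the columns RESPONSE and N_FEATURES are hypothetical names, not anything from the original file.

```sas
/* Sketch only: assumes the vertical table WANT (row, col, value, offset) */
/* created by the data step in the earlier reply.                         */
/* LABELS, RESPONSE and N_FEATURES are hypothetical names.                */
proc sql;
  create table labels as
  select row,
         max(offset) as response,    /* offset is constant within a row, so max() just picks it */
         count(*)    as n_features   /* number of index:value pairs stored for the row */
  from want
  group by row;
quit;
```

A website whose line has no index:value pairs never reaches WANT, so it would be missing from LABELS as well.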

Thank you again,

Gary
