BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
ericdrosano
Obsidian | Level 7

My data comprises multiple IDs and grades for those IDs in multiple terms. I want to condense the dataset so that each unique ID has only one line while all of the columns retain the full grade information. 

 

It currently looks like this:

ID SP20 SM20 FA20 SP21 SM21 FA21 SP22
1001             A+
1002       D      
1002           F  
1003     B        
1004 A+            
1005     B-        
1006           A+  
1007     F        
1007         B+    

 

Please note that IDs 1002 and 1007 each have two lines because they took the course twice, i.e., in two different semesters.

I want it to become this:

 

ID SP20 SM20 FA20 SP21 SM21 FA21 SP22
1001             A+
1002       D   F  
1003     B        
1004 A+            
1005     B-        
1006           A+  
1007     F   B+    

 

Please note that IDs 1002 and 1007 are now on only one line while the grades from each semester are now included.

Attached is a csv of the same data above.

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

Something like this should work:

proc sql;
   create table UniqueId as
   select distinct Id 
   from yourdatasetname
   ;
quit;

proc sort data=yourdatasetname;
   by id;
run;

data want;
   update uniqueid yourdatasetname;
   by id;
run;

Note that if the "grades" were numeric values this could be done with proc summary in one step.

 

It is best to provide data in the form of a working data step. If I have to make a data set from another format I may may things that don't actually match your variable types and may not have the same variable names.

 

 

View solution in original post

3 REPLIES 3
ballardw
Super User

Something like this should work:

proc sql;
   create table UniqueId as
   select distinct Id 
   from yourdatasetname
   ;
quit;

proc sort data=yourdatasetname;
   by id;
run;

data want;
   update uniqueid yourdatasetname;
   by id;
run;

Note that if the "grades" were numeric values this could be done with proc summary in one step.

 

It is best to provide data in the form of a working data step. If I have to make a data set from another format I may may things that don't actually match your variable types and may not have the same variable names.

 

 

ericdrosano
Obsidian | Level 7
Will send proper format of data next time.
I wish I could use numeric values, but it's more complicated than just A-F scale, including QN, INC, NC, P, and many more "grades".
Tom
Super User Tom
Super User

You don't need the separate dataset.  Just use the one dataset for both of the inputs that the UPDATE statement needs.  Use the OBS=0 dataset option to start with 0 observations in the original dataset and treat all of the observations as transactions to it.

proc sort data=yourdatasetname;
   by id;
run;

data want;
   update yourdatasetname(obs=0) yourdatasetname;
   by id;
run;

 

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 1537 views
  • 1 like
  • 3 in conversation