BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
ericdrosano
Obsidian | Level 7

My data comprises multiple IDs and grades for those IDs in multiple terms. I want to condense the dataset so that each unique ID has only one line while all of the columns retain the full grade information. 

 

It currently looks like this:

ID SP20 SM20 FA20 SP21 SM21 FA21 SP22
1001             A+
1002       D      
1002           F  
1003     B        
1004 A+            
1005     B-        
1006           A+  
1007     F        
1007         B+    

 

Please note that IDs 1002 and 1007 each have two lines because they took the course twice, i.e., in two different semesters.

I want it to become this:

 

ID SP20 SM20 FA20 SP21 SM21 FA21 SP22
1001             A+
1002       D   F  
1003     B        
1004 A+            
1005     B-        
1006           A+  
1007     F   B+    

 

Please note that IDs 1002 and 1007 are now on only one line while the grades from each semester are now included.

Attached is a csv of the same data above.

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

Something like this should work:

proc sql;
   create table UniqueId as
   select distinct Id 
   from yourdatasetname
   ;
quit;

proc sort data=yourdatasetname;
   by id;
run;

data want;
   update uniqueid yourdatasetname;
   by id;
run;

Note that if the "grades" were numeric values this could be done with proc summary in one step.

 

It is best to provide data in the form of a working data step. If I have to make a data set from another format I may may things that don't actually match your variable types and may not have the same variable names.

 

 

View solution in original post

3 REPLIES 3
ballardw
Super User

Something like this should work:

proc sql;
   create table UniqueId as
   select distinct Id 
   from yourdatasetname
   ;
quit;

proc sort data=yourdatasetname;
   by id;
run;

data want;
   update uniqueid yourdatasetname;
   by id;
run;

Note that if the "grades" were numeric values this could be done with proc summary in one step.

 

It is best to provide data in the form of a working data step. If I have to make a data set from another format I may may things that don't actually match your variable types and may not have the same variable names.

 

 

ericdrosano
Obsidian | Level 7
Will send proper format of data next time.
I wish I could use numeric values, but it's more complicated than just A-F scale, including QN, INC, NC, P, and many more "grades".
Tom
Super User Tom
Super User

You don't need the separate dataset.  Just use the one dataset for both of the inputs that the UPDATE statement needs.  Use the OBS=0 dataset option to start with 0 observations in the original dataset and treat all of the observations as transactions to it.

proc sort data=yourdatasetname;
   by id;
run;

data want;
   update yourdatasetname(obs=0) yourdatasetname;
   by id;
run;

 

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 526 views
  • 1 like
  • 3 in conversation