Solved: Condense multiple observations into one retaining all variables

ericdrosano · Posted 06-13-2024 01:20 PM

My data comprises multiple IDs and grades for those IDs in multiple terms. I want to condense the dataset so that each unique ID has only one line while all of the columns retain the full grade information.

It currently looks like this:

ID	SP20	SM20	FA20	SP21	SM21	FA21	SP22
1001							A+
1002				D
1002						F
1003			B
1004	A+
1005			B-
1006						A+
1007			F
1007					B+

Please note that IDs 1002 and 1007 each have two lines because they took the course twice, i.e., in two different semesters.

I want it to become this:

ID	SP20	SM20	FA20	SP21	SM21	FA21	SP22
1001							A+
1002				D		F
1003			B
1004	A+
1005			B-
1006						A+
1007			F		B+

Please note that IDs 1002 and 1007 are now on only one line while the grades from each semester are now included.

Attached is a csv of the same data above.

ballardw · Posted 06-13-2024 01:36 PM

Something like this should work:

proc sql;
   create table UniqueId as
   select distinct Id 
   from yourdatasetname
   ;
quit;

proc sort data=yourdatasetname;
   by id;
run;

data want;
   update uniqueid yourdatasetname;
   by id;
run;

Note that if the "grades" were numeric values this could be done with proc summary in one step.

It is best to provide data in the form of a working data step. If I have to make a data set from another format I may may things that don't actually match your variable types and may not have the same variable names.

View solution in original post

ballardw · Posted 06-13-2024 01:36 PM

Something like this should work:

proc sql;
   create table UniqueId as
   select distinct Id 
   from yourdatasetname
   ;
quit;

proc sort data=yourdatasetname;
   by id;
run;

data want;
   update uniqueid yourdatasetname;
   by id;
run;

Note that if the "grades" were numeric values this could be done with proc summary in one step.

It is best to provide data in the form of a working data step. If I have to make a data set from another format I may may things that don't actually match your variable types and may not have the same variable names.

ericdrosano · Posted 06-13-2024 01:42 PM

Will send proper format of data next time.
I wish I could use numeric values, but it's more complicated than just A-F scale, including QN, INC, NC, P, and many more "grades".

Tom · Posted 06-13-2024 03:00 PM

You don't need the separate dataset. Just use the one dataset for both of the inputs that the UPDATE statement needs. Use the OBS=0 dataset option to start with 0 observations in the original dataset and treat all of the observations as transactions to it.

proc sort data=yourdatasetname;
   by id;
run;

data want;
   update yourdatasetname(obs=0) yourdatasetname;
   by id;
run;

Condense multiple observations into one retaining all variables

Re: Condense multiple observations into one retaining all variables

Re: Condense multiple observations into one retaining all variables

Re: Condense multiple observations into one retaining all variables

Re: Condense multiple observations into one retaining all variables

Condense multiple observations into one retaining all variables

Re: Condense multiple observations into one retaining all variables

Re: Condense multiple observations into one retaining all variables

Re: Condense multiple observations into one retaining all variables

Re: Condense multiple observations into one retaining all variables

Register Today!

SAS Training: Just a Click Away