Hi
I have a dataset where each row contains information for one individual and the columns show that individual's academic degrees. Column 1 is the individual's ID and column 2 is the individual's name. Columns 3, 4, and 5 are for Undergrad, Grad, and Ph.D. degrees. Each of these takes the value "1" if the individual holds the respective degree, and "0" otherwise.
I want to retrieve the highest degree earned by each individual. To do that, I want to create three new columns in the dataset with names like Undergrad_New, Grad_New, and Ph.D_New, where only the column for the highest degree is set to "1". For example, if an individual has Undergrad, Grad, and Ph.D. degrees, only the Ph.D_New column should take the value "1" and the other two columns should be equal to "0".
Please suggest code for this purpose. Thanks.
I created a dummy dataset and made some assumptions about the structure of your starting dataset, but I believe the logic below will produce the results you need:
/* Dummy input: one row per individual, with a 0/1 flag for each degree */
DATA WORK.Have;
    FORMAT ID 8. Name $25. UnderGrad 1. Grad 1. PhD 1.;
    INFORMAT ID 8. Name $25. UnderGrad 1. Grad 1. PhD 1.;
    INFILE DATALINES DLM=',' DSD;
    INPUT ID Name UnderGrad Grad PhD;
DATALINES;
12345,Adam Ant,1,1,0
23456,Bob Batty,1,0,0
34567,Chuck Cross,1,1,1
45678,David Dunn,1,1,0
56789,Ed Eagers,0,0,0
;
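/* Check degrees from highest to lowest; the first match sets the flags */
DATA WORK.WANT;
    SET WORK.HAVE;
    FORMAT Undergrad_New Grad_New PhD_New 1.;
    IF PhD=1 THEN DO; Undergrad_New=0; Grad_New=0; PhD_New=1; END;
    ELSE IF Grad=1 THEN DO; Undergrad_New=0; Grad_New=1; PhD_New=0; END;
    ELSE IF UnderGrad=1 THEN DO; Undergrad_New=1; Grad_New=0; PhD_New=0; END;
    /* No degree at all: every new flag is 0. A row that has a missing flag */
    /* and no 1 matches none of the branches, so its new columns stay missing. */
    ELSE IF PhD=0 AND Grad=0 AND UnderGrad=0 THEN DO; Undergrad_New=0; Grad_New=0; PhD_New=0; END;
RUN;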
Proc Print Output:
Obs | ID | Name | UnderGrad | Grad | PhD | Undergrad_New | Grad_New | PhD_New
1 | 12345 | Adam Ant | 1 | 1 | 0 | 0 | 1 | 0
2 | 23456 | Bob Batty | 1 | 0 | 0 | 1 | 0 | 0
3 | 34567 | Chuck Cross | 1 | 1 | 1 | 0 | 0 | 1
4 | 45678 | David Dunn | 1 | 1 | 0 | 0 | 1 | 0
5 | 56789 | Ed Eagers | 0 | 0 | 0 | 0 | 0 | 0
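If you prefer something more compact, the same idea can be sketched with Boolean expressions, since a comparison in SAS evaluates to 1 when true and 0 when false. This is only a sketch under the same assumptions as the dummy data (flags coded 0/1), and the output dataset name WORK.Want_Alt is just a placeholder I made up:

```sas
/* Alternative sketch: each new flag is the result of a comparison (1 or 0). */
/* WORK.Want_Alt is a hypothetical name; WORK.Have is the dummy dataset.     */
DATA WORK.Want_Alt;
    SET WORK.Have;
    FORMAT Undergrad_New Grad_New PhD_New 1.;
    /* Highest degree is a Ph.D. */
    PhD_New = (PhD = 1);
    /* Highest degree is Grad: has Grad but no Ph.D. */
    Grad_New = (Grad = 1 AND PhD NE 1);
    /* Highest degree is Undergrad: has Undergrad but neither Grad nor Ph.D. */
    Undergrad_New = (UnderGrad = 1 AND Grad NE 1 AND PhD NE 1);
RUN;
```

One difference to be aware of: this version treats a missing flag the same as 0, so the new columns are never missing, whereas the IF/ELSE version leaves them missing when a row has a missing flag and no 1.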
Hope this helps.
You're welcome