Hi
I have a dataset where each row holds information for one individual and the columns record that individual's academic degrees. Column 1 is the individual's ID and column 2 is the individual's name. Columns 3, 4, and 5 are for Undergrad, Grad, and Ph.D. degrees. Each of these takes the value "1" if the individual holds the respective degree, and "0" otherwise.
I want to retrieve the highest degree earned by each individual. To do that, I want to create three new columns in the dataset with names like Undergrad_New, Grad_New, and Ph.D_New, where only the column for the highest degree is "1". For example, if an individual has Undergrad, Grad, and Ph.D. degrees, only the Ph.D_New column should take the value "1" and the other two columns should be "0".
Please suggest code for this purpose. Thanks.
I created a dummy dataset and made some assumptions about the structure of your starting dataset, but I believe the logic below will produce the result you need:
DATA WORK.Have;
FORMAT ID 8. Name $25. UnderGrad 1. Grad 1. PhD 1.;
INFORMAT ID 8. Name $25. UnderGrad 1. Grad 1. PhD 1.;
INFILE DATALINES DLM=',' DSD;
INPUT ID Name UnderGrad Grad PhD;
DATALINES;
12345,Adam Ant,1,1,0
23456,Bob Batty,1,0,0
34567,Chuck Cross,1,1,1
45678,David Dunn,1,1,0
56789,Ed Eagers,0,0,0
;
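DATA WORK.Want;
    SET WORK.Have;
    FORMAT Undergrad_New Grad_New PhD_New 1.;
    /* Test from the highest degree down; the first branch that matches sets all three flags. */
    IF PhD=1 THEN DO; Undergrad_New=0; Grad_New=0; PhD_New=1; END;
    ELSE IF Grad=1 THEN DO; Undergrad_New=0; Grad_New=1; PhD_New=0; END;
    ELSE IF UnderGrad=1 THEN DO; Undergrad_New=1; Grad_New=0; PhD_New=0; END;
    /* No degree at all: every flag is 0. Rows that match no branch (e.g. all inputs missing) keep missing flags. */
    ELSE IF PhD=0 AND Grad=0 AND UnderGrad=0 THEN DO; Undergrad_New=0; Grad_New=0; PhD_New=0; END;
RUN;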
PROC PRINT output:
Obs | ID | Name | UnderGrad | Grad | PhD | Undergrad_New | Grad_New | PhD_New
1 | 12345 | Adam Ant | 1 | 1 | 0 | 0 | 1 | 0
2 | 23456 | Bob Batty | 1 | 0 | 0 | 1 | 0 | 0
3 | 34567 | Chuck Cross | 1 | 1 | 1 | 0 | 0 | 1
4 | 45678 | David Dunn | 1 | 1 | 0 | 0 | 1 | 0
5 | 56789 | Ed Eagers | 0 | 0 | 0 | 0 | 0 | 0
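If you prefer something shorter than the IF/ELSE chain above, here is a sketch of a more compact variant. It relies on the fact that a comparison in SAS evaluates to 1 when true and 0 when false. The output dataset name WORK.Want_Compact is just a placeholder I made up, and I am assuming the same WORK.Have dataset as above.

```sas
DATA WORK.Want_Compact;   /* hypothetical output name, chosen for illustration */
    SET WORK.Have;
    FORMAT Undergrad_New Grad_New PhD_New 1.;
    /* Each flag is 1 only when that degree is held and no higher degree is. */
    PhD_New       = (PhD = 1);
    Grad_New      = (Grad = 1 AND PhD NE 1);
    Undergrad_New = (UnderGrad = 1 AND Grad NE 1 AND PhD NE 1);
RUN;
```

One difference to be aware of: if all three degree columns are missing for a row, this version sets the three new flags to 0, whereas the IF/ELSE version leaves them missing. With data that only contains 0 and 1, as described in the question, the two should give the same result.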
Hope this helps.
You're welcome