Hello!
I have a dataset that contains diagnosis for individuals. Each individual can appear on multiple rows depending on the number of diagnosis they have.
Data Have;
Ind_ID$ DiagCode$ Gender$ Age$
1 I11 1 55
1 C12 1 55
1 C01 1 55
2 Y11 2 67
2 F99 2 67
3 X12 1 92
4 C05 2 81
I want to transpose the dataset and count the number of diagnosis ( var. num_diag) each individual have. I also want to place each diagnosis in separate variable.:
Data want;
Ind_ID$ Num_diag DCode1$ DCode2$ DCode2$ Gender$ Age$
1 3 I11 C12 C01 1 55
2 2 Y11 F99 2 67
3 1 X12 1 92
4 1 C01 2 81
I would do it like this:
data have;
input Ind_ID$ DiagCode$ Gender$ Age$;
cards;
1 I11 1 55
1 C12 1 55
1 C01 1 55
2 Y11 2 67
2 F99 2 67
3 X12 1 92
4 C05 2 81
;run;
proc sql noprint; /* get max no of Diags for an ID */
select max(n) into :nCodes trimmed
from (select count(diagcode) as n from have group by Ind_ID);
quit;
data want;
do num_diag=1 to &nCodes until(last.Ind_ID);
set have;
by Ind_ID;
array DCodes (*) $8 DCode1-DCode&nCodes;
DCodes(num_diag)=DiagCode;
end;
drop DiagCode;
run;
proc sql;
select count(distinct DiagCode) into: ndiagcode from Data_Have;
quit;
proc sort data_have;
by ind_ID;
run;
data DataWant(Drop=DiagCode);
length DCode1 -- Dcode&ndiagcode. $8.;
Set Data_Have;
by ind_ID;
array s{} DCode1 -- Dcode&ndiagcode.;
retain DCode1 -- Dcode&ndiagcode. '' Num_diag 0;
if first.ind_ID then do;
Num_diag=0;
end;
Num_diag=Num_diag+1;
s{Num_diag} = DiagCode;
if last.ind_ID then output;
run;
I would do it like this:
data have;
input Ind_ID$ DiagCode$ Gender$ Age$;
cards;
1 I11 1 55
1 C12 1 55
1 C01 1 55
2 Y11 2 67
2 F99 2 67
3 X12 1 92
4 C05 2 81
;run;
proc sql noprint; /* get max no of Diags for an ID */
select max(n) into :nCodes trimmed
from (select count(diagcode) as n from have group by Ind_ID);
quit;
data want;
do num_diag=1 to &nCodes until(last.Ind_ID);
set have;
by Ind_ID;
array DCodes (*) $8 DCode1-DCode&nCodes;
DCodes(num_diag)=DiagCode;
end;
drop DiagCode;
run;
Keep it simple:
proc transpose data=have prefix=DCode out=temp(drop=_name_);
by ind_id gender age notsorted;
var diagCode;
run;
data want;
set temp;
array dc_ DCode:;
numDiag = dim(dc_) - cmiss(of DCode:);
run;
Thank you but I am biased against proc transpose. However I agree that is a simple solution.
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.