I have the code below to identify psychiatric diagnoses from a principal diagnosis variable (prindiag). Instead of identifying them as yes or no, I would like to uniquely identify them by sub-category. That is, each of the 15 psychiatric diagnoses would be a different number (e.g., 1, 2, 3, 4, 5 ...). Is this possible to do within the data step?
data ip14aNTA;
set ip14;
array diagvars {1} prindiag;
psyc_flag=0;
do i=1 to 1;
  if diagvars{i} in: ('295', '296', '297', '298', '300', '301', '302', '306',
                      '307', '308', '309', '311', '312', '313', '314')
     and diagvars{i} not in ('30252') then psyc_flag=1;
end;
run;
The WHICHC function does this.
What I don't see is how having numbers 1 through 15 is superior for further analyses, or for reporting, than having the character strings '295' , '296' , etc.
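As a rough sketch of the WHICHC idea: WHICHC returns the position of the first argument that equals the search value, or 0 when nothing matches, and it compares whole values rather than prefixes. The step below therefore passes only the first three characters of prindiag. It assumes prindiag is a character variable at least three characters long, holding codes without a decimal point; psyc_cat is a hypothetical variable name.

```sas
data ip14aNTA;
  set ip14;
  /* Position (1-15) of the 3-character prefix in the list, 0 if not listed */
  psyc_cat = whichc(substr(prindiag, 1, 3),
                    '295', '296', '297', '298', '300', '301', '302', '306',
                    '307', '308', '309', '311', '312', '313', '314');
  /* Keep the original exclusion of code 30252 */
  if prindiag = '30252' then psyc_cat = 0;
  /* Yes/no flag derived from the category number */
  psyc_flag = (psyc_cat > 0);
run;
```

With this layout, psyc_cat carries the sub-category number and psyc_flag remains available as the original 0/1 indicator.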
It's mildly complicated by the truncated (prefix) comparison that the in: operator performs, but it can definitely be done. Here's an approach:
data want;
array temp {999} _temporary_;
if _n_=1 then do subcat = '295', '296', '297', '298', '300', '301', '302', '306',
'307', '308', '309', '311', '312', '313', '314';
k + 1;
temp{k} = k;
end;
set have;
if prindiag not in : ('000', '30252', ' ') then
psyc_flag = temp{input(prindiag, 3.)};
drop k subcat;
run;
The top part of the DATA step places the numbers 1 through 15 into matching spots of the temporary array.
The final computations omit diagnoses 30252, blank, and anything that begins with 000. Then it looks up the value 1 through 15 or missing, depending on the value of PRINDIAG.
Your code doesn't really make much sense. You are looping over the character string codes but then ignoring them: the loop stores k in temp{k}, so only elements 1 through 15 of the array are ever filled.
Were you trying to make a sparse array using the numeric value of the codes as the index?
array temp {0:999} _temporary_;
if _n_=1 then do x = 295 to 298, 300 to 302, 306 to 309, 311 to 314;
k + 1;
temp{x} = k;
end;
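Combined with a lookup step, a complete version of this sparse-array idea might look like the sketch below. It assumes prindiag is a character variable at least three characters long, and it reuses the data set names have and want from the earlier reply; psyc_cat is a hypothetical output variable. The NOTDIGIT check is an addition here, meant to keep non-numeric codes from being converted and used as an array index.

```sas
data want;
  /* Lookup table indexed by the 3-digit code; unlisted slots stay missing */
  array temp {0:999} _temporary_;
  if _n_ = 1 then do x = 295 to 298, 300 to 302, 306 to 309, 311 to 314;
    k + 1;
    temp{x} = k;
  end;
  set have;
  /* Look up only when the first three characters are all digits */
  if notdigit(substr(prindiag, 1, 3)) = 0 and prindiag ne '30252' then
    psyc_cat = temp{input(substr(prindiag, 1, 3), 3.)};
  drop x k;
run;
```

Codes outside the list leave psyc_cat missing, because the corresponding array elements are never assigned.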
If you're just after re-coding the values in prindiag then code like below should do (untested as no sample data provided).
proc format;
value $recodePrindiag (default=2)
'295' = '01'
'296' = '02'
'297' = '03'
'298' = '04'
'300' = '05'
'301' = '06'
'302' = '07'
'306' = '08'
'307' = '09'
'308' = '10'
'309' = '11'
'311' = '12'
'312' = '13'
'313' = '14'
'314' = '15'
other = '-1'
;
run;
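data ip14aNTA;
set ip14;
/* match on the first three characters, as the original in: comparison did */
prindiag_recoded=put(substr(prindiag,1,3),$recodePrindiag.);
/* keep the original exclusion of code 30252 */
if prindiag='30252' then prindiag_recoded='-1';
psyc_flag= (prindiag_recoded ne '-1');
run;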