I have a list of diagnostic codes in variables dx1-dx9 for each observation. I'm trying to extract the first diagnostic code that matches a specified subset for each observation. Here's the basic structure of my data:
data test;
input dx1 $ dx2 $ dx3 $ dx4 $ dx5 $ dx6 $ dx7 $ dx8 $ dx9 $;
datalines;
D2002 O2829 V1002 T2000 W2018 . . . .
B1080 V1001 S400X R2910 44323 I1088 FB372 X3007 A5850
M5992 R6602 U2710 S400X D0412 V1002 C4010 . .
;
run;
This is the code I wrote:
data pwid.test;
length first_dx $6;
set pwid.test;
array dxnum[9] dx1-dx9;
do i = 1 to 9;
if dxnum[i] not in ("V1001", "V1002", "S400X") then continue;
else first_dx=dxnum[i] and leave;
end;
run;
I'm getting a message for my "else" line that character values have been converted to numeric values and numeric values have been converted to character values, then I get an error message saying "invalid numeric data" (see screenshot). As far as I can tell everything should be a character variable, so I'm not sure why anything is getting converted to numeric. What's causing this issue? If there's a better way to write this code I'd appreciate that as well. Thank you!
I'm not sure of your exact objective. Can I assume this is what you want?
data test1;
length first_dx $6;
set test;
array dxnum[9] dx1-dx9;
do i = 1 to 9;
if dxnum[i] not in ("V1001", "V1002", "S400X") then continue;
else first_dx=dxnum[i];
end;
run;
I'm trying to set first_dx to the first matching diagnostic code listed in dx1-dx9. In the above example, that would be V1002 for obs 1, V1001 for obs 2, and S400X for obs 3. I think your code keeps running the do loop even after it finds a match, so first_dx is set to the last matching code.
I would recommend this variation:
do i = 1 to 9 until (first_dx > ' ');
if dxnum[i] in ("V1001", "V1002", "S400X") then
first_dx=dxnum[i];
end;
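Please read the entire log. Your problem is at line 189, column 23. What is there? At that location SAS sees a NUMERIC variable named LEAVE, not the LEAVE statement. AND is a logical operator that works on numeric values, but DXNUM[i] is character, so SAS tries to convert DXNUM[i] to a number (which produces the "invalid numeric data" message) and then converts the numeric result of DXNUM[i] AND LEAVE back to character to store it in FIRST_DX. That is where both conversion notes come from.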
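To see this in isolation, here is a small sketch (the variable names code and result are made up for illustration). The log should show the same two conversion notes, along with a note that the variable leave is uninitialized:

```sas
data _null_;
  length code $5 result $6;
  code = "V1002";
  /* Here LEAVE is parsed as a variable name, not as the LEAVE statement. */
  /* AND needs numeric operands, so SAS tries to convert CODE to a number. */
  result = code and leave;
  put result=;
run;
```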
Thanks, I figured that might be my problem. How can I exit the do loop at that point since leave doesn't work?
Forget about LEAVE and CONTINUE.
Tell the DO statement what criteria to use to end the looping.
data pwid.test;
set pwid.test;
array dxnum[9] dx1-dx9;
length first_dx $6;
do i = 1 to 9 until(not missing(first_dx));
if dxnum[i] in ("V1001", "V1002", "S400X") then first_dx=dxnum[i];
end;
run;
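If you do want an explicit early exit, LEAVE can also work, as long as it is written as its own statement inside a DO group instead of being joined to the assignment with AND. A minimal sketch, assuming the test data set from the question and a hypothetical output name test_first (writing to a new data set, so the input is not overwritten):

```sas
/* Sketch only: test_first is a hypothetical output data set name. */
data test_first;
  set test;
  length first_dx $6;
  array dxnum[9] dx1-dx9;
  do i = 1 to 9;
    if dxnum[i] in ("V1001", "V1002", "S400X") then do;
      first_dx = dxnum[i];
      leave;  /* separate statement: exits the DO loop at the first match */
    end;
  end;
  drop i;  /* the loop counter is not needed in the output */
run;
```

Either form stops at the first match. The UNTIL version is shorter, while the LEAVE version makes the exit point explicit.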
Thanks for the help, this syntax is much more straightforward.