Hi,
I opened my big mouth and said that I used to work with SAS years and years ago, so here we are at 10:00 on a Saturday night reaching out for help from you guys. Any help you can give will be greatly appreciated.
I have multiple years of patient data that has 28 fields (ID ($2.), DOB (mmddyyyy), DxDate (mmddyyyy), ICD_1, ICD_2, ... ICD_25 ($7.))
The output data that I want is only the ID based on an ICD code of V0171, 0520, or 0522 with an age >= 30.
data have;
input ID DOB DxDate icd_1 icd_2 icd_3 icd_4 icd_5 icd_6 icd_7 icd_8 icd_9 icd_10 icd_11 icd_12 icd_13 icd_14 icd_15 icd_16 icd_17 icd_18 icd_19 icd_20 icd_21 icd_22 icd_23 icd_24 icd_25 $7.;
format DOB mmddyyyy10. DxDate mmddyyyy10.;
age = INT(YRDIF(DOB, DxDate, 'ACTUAL'));
Have I started this off right? How would I proceed from here? Thanks!
So just a where clause?
where prxmatch('/V0171|V0520|V0522/',catx('|',of ICD_: )) ;
or
where index(catx('|',of ICD_: ),'V0171') | index(catx('|',of ICD_: ),'V0520') | index(catx('|',of ICD_: ),'V0522') ;
Complete the clause with the age condition
and yrdif(DOB, today(), 'age') >= 30
where prxmatch('/0520|0522|V0171/',catx('|',of icd_: )) ;
Thanks. When I try and run this line, it underlines icd_ and flags it as:
Show the log please.
The syntax is correct.
25 data t; 26 ICD_1='XXX0520'; ICD_2='XXX0521'; 27 X = prxmatch('/0520|0522|V0171/', catx('|', of ICD_: )) ; 28 put X=; 29 run; X=4 NOTE: The data set WORK.T has 1 observations and 3 variables. NOTE: Compressing data set WORK.T increased size by 100.00 percent. Compressed is 2 pages; un-compressed would require 1 pages. NOTE: DATA statement used (Total process time): real time 0.01 seconds cpu time 0.01 seconds
Sorry, it has been over 12 years since I have messed with SAS, so I want to apologize again for my ignorance, but I am still having issues with your program.
My data looks like this (ID, DOB, Dx_Date and 25 icd_ variables with various data)
ID | DOB | Dx_Date | icd_1 | icd_2 | icd_3 | icd_4 | icd_5 | ... icd_25 |
1 | 1/1/1945 | 2/3/2020 | 0543 | V1111 | 0545 | 0546 | 0520 | 0548 |
59 | 2/2/1955 | 2/4/2020 | 0544 | 0545 | 0546 | V0171 | 0548 | 0549 |
672 | 3/3/1965 | 2/5/2020 | 0545 | 0546 | 0548 | 0548 | 0549 | 0550 |
1055 | 4/4/1975 | 2/6/2020 | 0546 | 0547 | 0522 | V0999 | 0550 | 0551 |
When I run the code that you provided, I get the following output:
Obs icd_1 icd_2 X age DOB Dx_Date
1 XXX0520 XXX0522 4 0 . .
What I need is to have a new output, just display the IDs of the clients that meet the criteria (the three ICD codes, 0520, 0522, V0171, and >=30). So, using the above data snipet.
Output_A:
ID
1
59
1055
1 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
68
69 data work.test;
70 informat ID DOB dx_date age icd_1 icd_2 icd_3 icd_4 icd_5 icd_6 icd_7 icd_8 icd_9 icd_10 icd_11 icd_12 icd_13 icd_14
70 ! icd_15 icd_16 icd_17 icd_18 icd_19 icd_20 icd_21 icd_22 icd_23 icd_24 icd_25;
71 age = INT(YRDIF(DOB, dx_date, 'ACTUAL')) >= 30;
72 where prxmatch('/0520|0522|V0171/', catx('|', of icd_: )) ;
____
22
202
ERROR: Syntax error while parsing WHERE clause.
ERROR: No input data sets available for WHERE statement.
ERROR 22-322: Syntax error, expecting one of the following: !, !!, &, (, ), *, **, +, ',', -, /, <, <=, <>, =, >, >=, ?, AND,
BETWEEN, CONTAINS, EQ, GE, GT, IN, IS, LE, LIKE, LT, NE, NOT, NOTIN, OR, ^, ^=, |, ||, ~, ~=.
ERROR 202-322: The option or parameter is not recognized and will be ignored.
73 run;
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.TEST may be incomplete. When this step was stopped there were 0 observations and 29 variables.
WARNING: Data set WORK.TEST was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
user cpu time 0.00 seconds
system cpu time 0.01 seconds
memory 672.03k
OS Memory 24996.00k
Timestamp 01/21/2022 01:46:40 PM
Step Count 63 Switch Count 0
Page Faults 0
Page Reclaims 70
Page Swaps 0
Voluntary Context Switches 0
Involuntary Context Switches 0
Block Input Operations 0
Block Output Operations 8
Read the log.
ERROR: No input data sets available for WHERE statement.
What data are you reading? You need a SET statement if reading a SAS table.
If reading flat data, you need an INPUT statement. In that case, the INPUT statement should come before the filter statement, and WHERE should be replaced with IF.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.