I have a list of codes that pull corresponding numbers from a dataset, which I pasted from some sql code so I guess the format isn't important.
((med_code between '295' and '29999') or
(med_code between '311' and '31499') or
(med_code between '301' and '30199') or
(med_code between '308' and '30999') or
(med_code between '3004' and '30049') or
(med_code between '3003' and '30039') );
I have an input that has fields like (about 7000 observations) called pat_details
Pat_ID discharge_date med_code
AAA 01/02/2012 29280
AAB 02/04/2012 29800
AAC 05/20/2011 2925
I need to check whether any of the med_codes which are phrased in the logic of between two ranges matches the first 3 numbers of med_code column in the dataset and if not, which ones don't match. I have no idea how to approach this. Thanks!
I realize you said the first three digits but your last two codes are actually a four digit match?
Anyways, consider a format where you can list the ranges. You'll need to expand the proc format ranges, I didn't want to type it all out.
data have;
format pat_id $3. discharge_date date9. med_code $5.;
input Pat_ID $ discharge_date mmddyy10. med_code $;
cards;
AAA 01/02/2012 29280
AAB 02/04/2012 29800
AAC 05/20/2011 2925
;
run;
proc format;
value $ match_fmt
'295'-'29999',
'311'-'31499',
'3003'-'30039' = 1
other = 0;
run;
data want;
set have;
match=put(med_code, $match_fmt.);
run;
What do you want the output to look like?
Since I'm not sure how to go about this, would it be easier if there were two index columns?
Like if the med_code col doesn't match, then match = 0 else match = 1. Then I can do another step and just extract the match = 0 and match = 1 observations.
I realize you said the first three digits but your last two codes are actually a four digit match?
Anyways, consider a format where you can list the ranges. You'll need to expand the proc format ranges, I didn't want to type it all out.
data have;
format pat_id $3. discharge_date date9. med_code $5.;
input Pat_ID $ discharge_date mmddyy10. med_code $;
cards;
AAA 01/02/2012 29280
AAB 02/04/2012 29800
AAC 05/20/2011 2925
;
run;
proc format;
value $ match_fmt
'295'-'29999',
'311'-'31499',
'3003'-'30039' = 1
other = 0;
run;
data want;
set have;
match=put(med_code, $match_fmt.);
run;
Hi,
Yes the codes in the med_code column may have 4 or 5 digits. But I am asked to compare only the first three digits to the list.
Thank you I will try that and get back to you.
Hi, i'm new to SAS programming and I have a very similar problem to solve. The difference is that in my case the list of codes to be inserted at the match_fmt is stored in 2 different columns on a specific database. How can I make SAS to recognize the values on both columns as being the inner and outer limits instead of hard coding them?
Thank!
You should post a new question and link to this as reference if required.
Please include some sample data, input and output to help clarify your question.
Assuming your med_code in the dataset is character,
1. Substr function - if trim(left(right(med_code,1,3) eq '<insert med code>' then <some other logic>. If you are particular just aboutthe first 3 characters, then use substr.
Or
2. Use the index function (index searches the combination of characters anywhere within a given string)
http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a000212242.htm
Now there are other considerations - How are you getting the list of codes from the SQL logic to compare against the dataset field? Once you are clear about the format of both data sources, the logic outlined above can be used to compare and the output can be redirected to different datasets based on your particular requirement.
If you want to compare the two values in each AND expression as well as all values falling within the range. Use PROC format to say
295 - 299 = M (Using only first 3 characters of each value)
311 - 31499 = M
..
other = NM
M is for match, NM for no match.Then use the substr on the first 3 characters of med_code on the dataset and convert it to either M or NM using the format you just created.
Remember these are pointers for approach, there are other things you might have to insert to get exactly what you want. (like format of output, renaming variables etc.)
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.