BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
pinkyc
Calcite | Level 5

I have a list of codes that pull corresponding numbers from a dataset, which I pasted from some sql code so I guess the format isn't important.

((med_code between '295'  and '29999') or

(med_code between '311'  and '31499') or

(med_code between '301'  and '30199') or

(med_code between '308'  and '30999') or

(med_code between '3004' and '30049') or

(med_code between '3003' and '30039') );


I have an input that has fields like (about 7000 observations) called pat_details

Pat_ID     discharge_date     med_code

AAA         01/02/2012             29280

AAB          02/04/2012            29800

AAC           05/20/2011          2925

I need to check whether any of the med_codes which are phrased in the logic of between two ranges matches the first 3 numbers of med_code column in the dataset and if not, which ones don't match.  I have no idea how to approach this.  Thanks!

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

I realize you said the first three digits but your last two codes are actually a four digit match?

Anyways, consider a format where you can list the ranges. You'll need to expand the proc format ranges, I didn't want to type it all out.

data have;

format pat_id $3. discharge_date date9. med_code $5.;

input Pat_ID  $   discharge_date mmddyy10.    med_code $;

cards;

AAA 01/02/2012 29280

AAB 02/04/2012 29800

AAC 05/20/2011 2925

;

run;

proc format;

value $ match_fmt

'295'-'29999',

'311'-'31499',

'3003'-'30039' = 1

other = 0;

run;

data want;

set have;

match=put(med_code, $match_fmt.);

run;

View solution in original post

7 REPLIES 7
Reeza
Super User

What do you want the output to look like?

pinkyc
Calcite | Level 5

Since I'm not sure how to go about this, would it be easier if there were two index columns?

Like if the med_code col doesn't match, then match = 0 else match = 1.  Then I can do another step and just extract the match = 0 and match = 1 observations.

Reeza
Super User

I realize you said the first three digits but your last two codes are actually a four digit match?

Anyways, consider a format where you can list the ranges. You'll need to expand the proc format ranges, I didn't want to type it all out.

data have;

format pat_id $3. discharge_date date9. med_code $5.;

input Pat_ID  $   discharge_date mmddyy10.    med_code $;

cards;

AAA 01/02/2012 29280

AAB 02/04/2012 29800

AAC 05/20/2011 2925

;

run;

proc format;

value $ match_fmt

'295'-'29999',

'311'-'31499',

'3003'-'30039' = 1

other = 0;

run;

data want;

set have;

match=put(med_code, $match_fmt.);

run;

pinkyc
Calcite | Level 5

Hi,

Yes the codes in the med_code column may have 4 or 5 digits.  But I am asked to compare only the first three digits to the list.

Thank you I will try that and get back to you.

LucasM
Calcite | Level 5

Hi, i'm new to SAS programming and I have a very similar problem to solve. The difference is that in my case the list of codes to be inserted at the match_fmt is stored in 2 different columns on a specific database. How can I make SAS to recognize the values on both columns as being the inner and outer limits instead of hard coding them?

Thank!

Reeza
Super User

You should post a new question and link to this as reference if required.

Please include some sample data, input and output to help clarify your question.

PoornimaRavishankar
Quartz | Level 8

Assuming your med_code in the dataset is character,

1.  Substr function - if trim(left(right(med_code,1,3) eq '<insert med code>'  then <some other logic>. If you are particular just aboutthe first 3 characters, then use substr.

Or

2. Use the index function (index searches the combination of characters anywhere within a given string)

http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a000212242.htm

Now there are other considerations - How are you getting the list of codes from the SQL logic to compare against the dataset field? Once you are clear about the format of both data sources, the logic outlined above can be used to compare and the output can be redirected to different datasets based on your particular requirement.

If you want to compare the two values in each AND expression as well as all values falling within the range. Use  PROC format to  say 

295 - 299 = M  (Using only first 3 characters of each value)

311 - 31499 = M

..

other = NM

M is for match, NM for no match.Then use the substr on the first 3 characters of med_code on the dataset and convert it to either M or NM using the format you just created.

Remember these are pointers for approach, there are other things you might have to insert to get exactly what you want. (like format of output, renaming variables etc.)

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 2607 views
  • 3 likes
  • 4 in conversation