BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
pinkyc
Calcite | Level 5

I have a list of codes that pull corresponding numbers from a dataset, which I pasted from some sql code so I guess the format isn't important.

((med_code between '295'  and '29999') or

(med_code between '311'  and '31499') or

(med_code between '301'  and '30199') or

(med_code between '308'  and '30999') or

(med_code between '3004' and '30049') or

(med_code between '3003' and '30039') );


I have an input that has fields like (about 7000 observations) called pat_details

Pat_ID     discharge_date     med_code

AAA         01/02/2012             29280

AAB          02/04/2012            29800

AAC           05/20/2011          2925

I need to check whether any of the med_codes which are phrased in the logic of between two ranges matches the first 3 numbers of med_code column in the dataset and if not, which ones don't match.  I have no idea how to approach this.  Thanks!

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

I realize you said the first three digits but your last two codes are actually a four digit match?

Anyways, consider a format where you can list the ranges. You'll need to expand the proc format ranges, I didn't want to type it all out.

data have;

format pat_id $3. discharge_date date9. med_code $5.;

input Pat_ID  $   discharge_date mmddyy10.    med_code $;

cards;

AAA 01/02/2012 29280

AAB 02/04/2012 29800

AAC 05/20/2011 2925

;

run;

proc format;

value $ match_fmt

'295'-'29999',

'311'-'31499',

'3003'-'30039' = 1

other = 0;

run;

data want;

set have;

match=put(med_code, $match_fmt.);

run;

View solution in original post

7 REPLIES 7
Reeza
Super User

What do you want the output to look like?

pinkyc
Calcite | Level 5

Since I'm not sure how to go about this, would it be easier if there were two index columns?

Like if the med_code col doesn't match, then match = 0 else match = 1.  Then I can do another step and just extract the match = 0 and match = 1 observations.

Reeza
Super User

I realize you said the first three digits but your last two codes are actually a four digit match?

Anyways, consider a format where you can list the ranges. You'll need to expand the proc format ranges, I didn't want to type it all out.

data have;

format pat_id $3. discharge_date date9. med_code $5.;

input Pat_ID  $   discharge_date mmddyy10.    med_code $;

cards;

AAA 01/02/2012 29280

AAB 02/04/2012 29800

AAC 05/20/2011 2925

;

run;

proc format;

value $ match_fmt

'295'-'29999',

'311'-'31499',

'3003'-'30039' = 1

other = 0;

run;

data want;

set have;

match=put(med_code, $match_fmt.);

run;

pinkyc
Calcite | Level 5

Hi,

Yes the codes in the med_code column may have 4 or 5 digits.  But I am asked to compare only the first three digits to the list.

Thank you I will try that and get back to you.

LucasM
Calcite | Level 5

Hi, i'm new to SAS programming and I have a very similar problem to solve. The difference is that in my case the list of codes to be inserted at the match_fmt is stored in 2 different columns on a specific database. How can I make SAS to recognize the values on both columns as being the inner and outer limits instead of hard coding them?

Thank!

Reeza
Super User

You should post a new question and link to this as reference if required.

Please include some sample data, input and output to help clarify your question.

PoornimaRavishankar
Quartz | Level 8

Assuming your med_code in the dataset is character,

1.  Substr function - if trim(left(right(med_code,1,3) eq '<insert med code>'  then <some other logic>. If you are particular just aboutthe first 3 characters, then use substr.

Or

2. Use the index function (index searches the combination of characters anywhere within a given string)

http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a000212242.htm

Now there are other considerations - How are you getting the list of codes from the SQL logic to compare against the dataset field? Once you are clear about the format of both data sources, the logic outlined above can be used to compare and the output can be redirected to different datasets based on your particular requirement.

If you want to compare the two values in each AND expression as well as all values falling within the range. Use  PROC format to  say 

295 - 299 = M  (Using only first 3 characters of each value)

311 - 31499 = M

..

other = NM

M is for match, NM for no match.Then use the substr on the first 3 characters of med_code on the dataset and convert it to either M or NM using the format you just created.

Remember these are pointers for approach, there are other things you might have to insert to get exactly what you want. (like format of output, renaming variables etc.)

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 2282 views
  • 3 likes
  • 4 in conversation