BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
chinna0369
Pyrite | Level 9

Hi,

 

Hope you are doing well!

I have variable called mdotsp, contains values of missing number of tablets but it has different keywords such as "pill", "TAB", etc. I am trying to create numeric variable called mnmtb to assign a value from mdotsp which contains a number of missing pills.

I am using below code to check series of conditional statements based on keywords. I have large data but here I have provided some example data. In below I am getting mnmtb value as missing. I am not sure why, can anyone help me with this logic where I am missing?

 

data have;
input mdotsp $50.;
datalines;
1PILL
2TAB
3MINI
4PIECE
#5
5#
;
run;

data want;
    length mnmtb 8;
    set have;
    mdotsp = compress(upcase(mdotsp)); /* Normalize mdotsp for case and spaces */

    /* Loop through 1 to 300 for mnmtb */
    do n = 5 to 1 by -1;
        /* Create potential keywords dynamically */
        pill_keyword = cats(n, "PILL");
        one_pill_keyword = cats(scan(put(n, best.), 1), "PILL");
        tab_keyword = cats(n, "TAB");
        one_tab_keyword = cats(scan(put(n, best.), 1), "TAB");
        mini_keyword = cats(n, "MINI");
        one_mini_keyword = cats(scan(put(n, best.), 1), "MINI");
        piece_keyword = cats(n, "PIECE");
        one_piece_keyword = cats(scan(put(n, best.), 1), "PIECE");
        hash_keyword = cats(n, "#");
        hash_rev_keyword = cats("#", n);

        /* Check if any of these keywords are present in mdotsp */
        if index(mdotsp, pill_keyword) or index(mdotsp, one_pill_keyword) or 
           index(mdotsp, tab_keyword) or index(mdotsp, one_tab_keyword) or 
           index(mdotsp, mini_keyword) or index(mdotsp, one_mini_keyword) or 
           index(mdotsp, piece_keyword) or index(mdotsp, one_piece_keyword) or 
           index(mdotsp, hash_keyword) or index(mdotsp, hash_rev_keyword) then do;
           mnmtb=n;
           leave; /* Exit the loop once a match is found */
        end;
    end;

run;

Thanks,

chin

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

First problem: you only assign a value when the value is not missing:

if mnmtb ne . and 

If your intent is to assign a value when MNMTB is missing then

if missing (mnmtb) and 

Second is when you use a variable as the second parameter of INDEX (and many other character functions) then entire variable is used. Since  you have not defined a length for any of your key word variables they are defaulting to 200 because of the CATS function. So the trailing spaces in the variable are used in the comparison and not found. So all of the index function calls are returning 0.

 

If N is never two digits you could get away with variable lengths to prevent that. OR remove the trailing spaces.

 

A drastically shortened example that returns a 1 for the 1PILL:

data want;
   length mnmtb 8;
   set have;
   mdotsp = compress(upcase(mdotsp)); /* Normalize mdotsp for case and spaces */
   pill_keyword = cats(1, "PILL");
   /* Check if any of these keywords are present in mdotsp */
   if missing (mnmtb) and index(mdotsp, strip(pill_keyword))  then do;
      mnmtb=1;
   end;

run;

View solution in original post

7 REPLIES 7
PaigeMiller
Diamond | Level 26
        /* Check if any of these keywords are present in mdotsp */
        if mnmtb ne . and (index(mdotsp, pill_keyword) or index(mdotsp, one_pill_keyword) or 

 

MNMTB has not yet been assigned a value, so when you test to see if it is not equal to missing , it is equal to missing and so this IF condition is always false and the rest of the DO; END; never executes.

 

It always helps, and it would help here tremendously, if you explain what you are trying to do in words, explain the purpose of this code. Do not discuss code or SAS, we need to know the purpose.

--
Paige Miller
ballardw
Super User

First problem: you only assign a value when the value is not missing:

if mnmtb ne . and 

If your intent is to assign a value when MNMTB is missing then

if missing (mnmtb) and 

Second is when you use a variable as the second parameter of INDEX (and many other character functions) then entire variable is used. Since  you have not defined a length for any of your key word variables they are defaulting to 200 because of the CATS function. So the trailing spaces in the variable are used in the comparison and not found. So all of the index function calls are returning 0.

 

If N is never two digits you could get away with variable lengths to prevent that. OR remove the trailing spaces.

 

A drastically shortened example that returns a 1 for the 1PILL:

data want;
   length mnmtb 8;
   set have;
   mdotsp = compress(upcase(mdotsp)); /* Normalize mdotsp for case and spaces */
   pill_keyword = cats(1, "PILL");
   /* Check if any of these keywords are present in mdotsp */
   if missing (mnmtb) and index(mdotsp, strip(pill_keyword))  then do;
      mnmtb=1;
   end;

run;
chinna0369
Pyrite | Level 9
What about other keywords and I have mdotsp values from 1 to 300.
ballardw
Super User

Use the STRIP function.

 

IF your next steps don't really need all those composite "keywords" I would drop them from the data set as that is a lot of baggage of very little use that I can see.

 

Or perhaps research REGULAR expressions which can search for things like <digits>WORD values and return the Digit part if done correctly.

chinna0369
Pyrite | Level 9
I have updated my code as below:
data step3;
set step2;
if .< length(mdotsp) le 3 and index(mdotsp,"#") eq 0 then do; mnmtb=input(mdotsp,best.); end;
if strip(mdotsp)="1 TBALET WAS MISSED TO TAKE, REASON WAS UNKNOWN." then mnmtb=1;
mdotsp_comp=compress(upcase(mdotsp)); /* Normalize mdotsp for case and spaces */
do n = 300 to 1 by -1;
description=upcase(put(n,words100.));
pill_keyword = cats(n, "PILL");
word_pill_keyword = cats(description, "PILL");
tab_keyword = cats(n, "TAB");
word_tab_keyword = cats(description, "TAB");
mini_keyword = cats(n, "MINI");
word_mini_keyword = cats(description, "MINI");
piece_keyword = cats(n, "PIECE");
word_piece_keyword = cats(description, "PIECE");
hash_keyword = cats(n, "#");
word_hash_keyword = cats(description, "#");
hash_rev_keyword = cats("#", n);
word_hash_rev_keyword = cats("#", description);
slice_keyword = cats(n, "SLICE");
word_slice_keyword = cats(description, "SLICE");
stdydrg_keyword = cats(n, "STUDYDRUG");
word_stdydrg_keyword = cats(description, "STUDYDRUG");
keyword_missed = cats("MISSED",n);
word_keyword_missed = cats("MISSED",description);
caps_keyword = cats(n, "CAPSULE");
word_caps_keyword = cats(description, "CAPSULE");
/* Check if any of these keywords are present in mdotsp_comp */
if missing (mnmtb) and (
index(mdotsp_comp, strip(pill_keyword)) or index(mdotsp_comp, strip(word_pill_keyword)) or
index(mdotsp_comp, strip(tab_keyword)) or index(mdotsp_comp, strip(word_tab_keyword)) or
index(mdotsp_comp, strip(mini_keyword)) or index(mdotsp_comp, strip(word_mini_keyword)) or
index(mdotsp_comp, strip(piece_keyword)) or index(mdotsp_comp, strip(word_piece_keyword)) or
index(mdotsp_comp, strip(hash_keyword)) or index(mdotsp_comp, strip(word_hash_keyword)) or
index(mdotsp_comp, strip(hash_rev_keyword)) or index(mdotsp_comp, strip(word_hash_rev_keyword)) or
index(mdotsp_comp, strip(slice_keyword)) or index(mdotsp_comp, strip(word_slice_keyword)) or
index(mdotsp_comp, strip(stdydrg_keyword)) or index(mdotsp_comp, strip(word_stdydrg_keyword)) or
index(mdotsp_comp, strip(keyword_missed)) or index(mdotsp_comp, strip(word_keyword_missed)) or
index(mdotsp_comp, strip(caps_keyword)) or index(mdotsp_comp, strip(word_caps_keyword))
) then do;
mnmtb=n;
end;
end;
run;
quickbluefish
Barite | Level 11

I think you can just use the magic of regular expressions for this:

data want;
set have;
n=prxchange('s/.*?(\d+).*/\1/', 1, mdotsp)*1;
run;

hackathon24-white-horiz.png

The 2025 SAS Hackathon Kicks Off on June 11!

Watch the live Hackathon Kickoff to get all the essential information about the SAS Hackathon—including how to join, how to participate, and expert tips for success.

YouTube LinkedIn

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 1253 views
  • 0 likes
  • 4 in conversation