BookmarkSubscribeRSS Feed
KC_16
Fluorite | Level 6

Hi, 

 

I have inherited some code (sample below) which is very manual whereby I am having to type in the text within the double quotation marks each time we get an updated file

 

data work.all_order_1;

set work.all_order_2;

format order_name_grouped $300.;

order_name_grouped=order_name;

if find(order_name,"Top 100", 'i') ge 1 then order_name_grouped="Top 100";

if find(order_name,"Cancelled", 'i') ge 1 then order_name_grouped="Cancelled";

run;

 

Is there a better way to write this out please so it automatically looks at the criteria and groups up please?

 

Thanks

6 REPLIES 6
PaigeMiller
Diamond | Level 26

@KC_16 wrote:

I have inherited some code (sample below) which is very manual whereby I am having to type in the text within the double quotation marks each time we get an updated file

 

data work.all_order_1;

set work.all_order_2;

format order_name_grouped $300.;

order_name_grouped=order_name;

if find(order_name,"Top 100", 'i') ge 1 then order_name_grouped="Top 100";

if find(order_name,"Cancelled", 'i') ge 1 then order_name_grouped="Cancelled";

run;

 

Is there a better way to write this out please so it automatically looks at the criteria and groups up please?

 


Do you mean (because you didn't say this) that there are a lot more than just the two lines that begin with IF FIND(... ? 

 

And the real code has hundreds of these lines and they change from run to run?

--
Paige Miller
PeterClemmensen
Tourmaline | Level 20

How many criteria do you have?

KC_16
Fluorite | Level 6

Sorry, there are in total over 100 lines that follow this logic and each month, this continues to grow. However, each month I probably need to update 20 new rows to the existing list so it's not hugely time consuming but would be great to automate if possible

PaigeMiller
Diamond | Level 26

@KC_16 wrote:

Sorry, there are in total over 100 lines that follow this logic and each month, this continues to grow. However, each month I probably need to update 20 new rows to the existing list so it's not hugely time consuming but would be great to automate if possible


Well, I think it is possible. However, you haven't mentioned how a programmer who will be automating this could know what the new rows will be? Where is this information stored?

--
Paige Miller
ballardw
Super User

@KC_16 wrote:

Hi, 

 

I have inherited some code (sample below) which is very manual whereby I am having to type in the text within the double quotation marks each time we get an updated file

 

data work.all_order_1;

set work.all_order_2;

format order_name_grouped $300.;

order_name_grouped=order_name;

if find(order_name,"Top 100", 'i') ge 1 then order_name_grouped="Top 100";

if find(order_name,"Cancelled", 'i') ge 1 then order_name_grouped="Cancelled";

run;

 

Is there a better way to write this out please so it automatically looks at the criteria and groups up please?

 

Thanks


My personal take on this where you are having to search text to find changing values is that your process is suboptimal or even seriously flawed. I ask "Why do these important values change so frequently?" I say important because if you are searching for them there must be some importance attached.

Perhaps this is an indication of poor data entry design, training or data specification.

 

IF, and this a pretty big if given that you say you have a hundred lines similar to this, where the found value is used to set to the value of the variable order_named_group there are likely many ways. However you do know that if the same response contains "Top 100" and "Cancelled" only the cancelled is set don't you?

 

One method would be a temporary array:

data example;
   input order_name $ 1-50;
   array t {2} $ 15 _temporary_ ('Top 100','Cancelled');
   length order_name_grouped $ 15;
   do i= 1 to dim(t);
      if find(order_name,strip(t[i]), 'i') ge 1 then order_name_grouped=t[i]; 
   end;
   drop i;
datalines;
Something Top 100 is in this text
Or cancelled is part of this line
;

which moves all of the "typing" into one place. The number of items in the array, the lengths and the value typed have to match.

 

I am also a bit concerned about the 300 length for order_named_grouped as that seems to indicate your values have that length. Which is truly cumbersome for "typing" in the first place.

 

If the values are in a data set some where then other options open up.

s_lassen
Meteorite | Level 14

Just a dumb question? What if your ORDER_NAME variable contains both "Top 100" and "Cancelled"? Will this never happen? Or is there a priority? As your program is written, an ORDER_NAME containing both strings will get the value "Cancelled", as you have no ELSE statement after the first IF.

 

Anyway, if you have your values in SAS data set, e.g.:

 

data search_texts:
  infile cards truncover;
  input priority text $30.;
cards;
1 Cancelled
2 Top 100
;run;

You can use it like this:

filename tempsas temp;
data _null_;
  set search_texts;
  text=quote(trim(text),"'");
  file tempsas;
  put 'when(find(order_name,' text ", 'i') ge 1) order_name_grouped=" text ';';
run;

data want;
  set have;
  length Order_Name_grouped $300;
  select;
    %include tempsas;
    otherwise Order_Name_Grouped=Order_Name;
    end;
run;

This assumes that your data is sorted with the highest priority first (if there are any ORDER_NAME values with more than one search string in it).

 

If you have your strings in a text file, you can try something like this:

%macro search(text);
when(find(order_name,"&text" , 'i') ge 1) order_name_grouped="&text" ;
%mend;

data want;
  length Order_Name_Grouped $300;
  select;
    %search(Top 100               );
    %search(Cancelled             );
    otherwise Order_Name_Grouped=Order_Name;
    end;
run;

The idea is just to use the block mark capability in SAS.

First repeat the %search lines so many times in your program that you have enough lines for all the strings

Open your text file in a SAS editor, and mark the strings with the mouse while holding down the ALT key (that's the block mark).

Copy that (Ctrl-C).

Block mark the area between the parantheses in the macro calls your SAS program.

Paste (Ctrl-V).

Remove macro calls with empty parameters, if you made too many lines. If you made too few lines with macro calls, your program will look garbled in the end - in that case, press undo (Ctrl-Z), make some more macro call lines, and try again.

Remove the extra lines you made in the start, if you made too many.

 

Just remember that this way, again, the strings with highest priority should be first.

Of course, you could also read your text file with a SAS data step, and then apply the first solution.

 

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 1633 views
  • 0 likes
  • 5 in conversation