BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
OPHD1
Fluorite | Level 6

I am looking to productionalize some code and would like to make the process more data driven and efficient. I am looking for a way to use do loops, or some other procedure, to drive creation of variables, without specifically identifying each level of a variable.

Basically, I want to take this step:

data example_2;

     set example;

     if practice_type = 'FQHC' then type = 1;

     else if practice_type = 'CAH' then type = 2;

     else if practice_type = 'RHC' then type = 3;

     else if practice_type = 'Independent_Multi_Site' then type = 4;

     else if practice_type = 'Independent_Single_Site' then type = 5;

     else type = 6;

run;

And turn it into a step that creates a value for type based on unique values of practice_type, without specifying the level of practice_type; since values may change with additional updates. This will also serve as a QC check to make sure the data is being entered in the same manner (Independent_Single_Site is different from Independent_Single). I've tried using do = 1 to 5 and a combination of first.practice_type and last.practice_type and haven't had any luck. Thank you in advance for any suggestions or reference materials.

Here's a practice data set:

data example;

     length practice_type $ 25;

     input practice practice_type;

     datalines;

     001 FQHC

     002 Independent_Single_Site

     003 RHC

     004 Independent_Single_Site

     005 Independent_Multi_Site

     006 Independent_Single_Site

     007 CAH

     008 RHC

     009 Independent_Single_Site

     010 Independent_Multi_Site

     011 Independent_Multi_Site

     012 CAH

     013 Independent_Multi_Site

     014 Independent_Single_Site

     015 Independent_Multi_Site

     016 FQHC

     017 CAH

     018 Independent_Multi_Site

     019 Independent_Single

     020 Independent_Multi_Site

     021 Independent_Single_Site

     022 FQHC

     023 FQHC

     024 FQHC

     025 RHC

     ;

run;

1 ACCEPTED SOLUTION

Accepted Solutions
LarryWorley
Fluorite | Level 6

I would start with creating a lookup data set with the relationship of practice_type and type.  Depending upon your requirements you might add effectiveness dates, an effectiveness flag, etc.

Something like this for your data:

practice_type_level            type effectiveness_flag

   'FQHC'                             1               Yes

    'CAH'                               2                Yes               

     'RHC'                              3               Yes

     'Independent_Multi_Site'   4               Yes

    'Independent_Single_Site'   5              Yes

   'new practice type tbd'         7               No

     'Else value'                       9                Yes

You can modify this table as needed.

Your dynamic code can read this table, filtering on the value of the effectiveness_flag if needed.

Then you have sevaral options within SAS to do this:

1.  Use this dataset to create a format which is then used in datastep with assignment statement to assign a value to the new type variable.  This might be the simplest given my interpretation of your requirements.

2.  Use the dataset to create macro vaiable 'arrays' which are then fed to a generic macro.  This is similar to Anca approach, but allows dynamic changes to made to the control table instead of being hard-coded in that example.

3.  Use the dataset to drive a data _null_ step which uses call execute to generate program,

4.  Use the dataset to drive a data _null_ step which writes code to a temporary file which is then %included in job.

You can find references to all of those approaches online and in the SAS documentation.

Let me know if you need more details.

Hope this helps.

Larry

View solution in original post

8 REPLIES 8
Tom
Super User Tom
Super User

Not sure what you mean here.

If you just want to assign a number to each unique value of PRACTICE_TYPE then do something like this.

proc sort data=sample;

  by practice_type;

run;

data want ;

  set

sample;

  by practice_type;

  type + (first.practice_type) ;

run;

OPHD1
Fluorite | Level 6

Thank you Tom, for this simple trick to create the flag in shortened code.

AncaTilea
Pyrite | Level 9

Hi, how about something like this:

%let practice_type_values = FQHC,CAH,RHC,Independent_Multi_Site,Independent_Single_Site,Independent_Single;

%let num_options = %eval(%sysfunc(countw("&practice_type_values.")));

%put &num_options;

%macro quick();

  %do i = 1 %to &num_options.;

  if practice_type ="%scan('&practice_type_values.', &i., ',')" then type = &i.;

  %end;

  output;

%mend quick;

data example_3;

     set example;

  %quick;

run;

Smiley Happy

OPHD1
Fluorite | Level 6

Thank you Anca, this is a very useful macro. I was not familiar with some of this macro functions, so this was very helpful.

LarryWorley
Fluorite | Level 6

I would start with creating a lookup data set with the relationship of practice_type and type.  Depending upon your requirements you might add effectiveness dates, an effectiveness flag, etc.

Something like this for your data:

practice_type_level            type effectiveness_flag

   'FQHC'                             1               Yes

    'CAH'                               2                Yes               

     'RHC'                              3               Yes

     'Independent_Multi_Site'   4               Yes

    'Independent_Single_Site'   5              Yes

   'new practice type tbd'         7               No

     'Else value'                       9                Yes

You can modify this table as needed.

Your dynamic code can read this table, filtering on the value of the effectiveness_flag if needed.

Then you have sevaral options within SAS to do this:

1.  Use this dataset to create a format which is then used in datastep with assignment statement to assign a value to the new type variable.  This might be the simplest given my interpretation of your requirements.

2.  Use the dataset to create macro vaiable 'arrays' which are then fed to a generic macro.  This is similar to Anca approach, but allows dynamic changes to made to the control table instead of being hard-coded in that example.

3.  Use the dataset to drive a data _null_ step which uses call execute to generate program,

4.  Use the dataset to drive a data _null_ step which writes code to a temporary file which is then %included in job.

You can find references to all of those approaches online and in the SAS documentation.

Let me know if you need more details.

Hope this helps.

Larry

OPHD1
Fluorite | Level 6

Thank you Larry, you offered helpful approaches -- I will try Approach # 1. Seems like you know about practice data too -- having a look up table with effective information is a great suggestion.

OPHD1
Fluorite | Level 6

Thanks All, I'd like to mark this question as answered, but I'm not sure how. My apologies for my newbie status with these discussion boards.

LarryWorley
Fluorite | Level 6

You see some buttons inside each reply, which let you mark as correct or as helpfulSmiley Happy

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 8 replies
  • 1576 views
  • 6 likes
  • 4 in conversation