BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
luvscandy27
Quartz | Level 8

I have six main variables which are: ddate rdate rcompdate mdate mpdate hdate each
variable is associated with a number of days as in the table below

 

ID ddate rdate rcompdate mdate mpdate hdate
1 642
2 74           61        61
3 614

 

I would like to flag the variables when they fall within a certain number
of days for each of the main variables (example in the table below)

the range is 0-6  7-30  31-90  91-180  gt181

ID ddate rdate rcompdate mdate mpdate hdate ddate_0to6 ddate_7to30 ddate_31to90 ddate_91to180 ddate_gt181
1 642                                                                                                                                                                     1
2 74 61 61                                                                                                                     1
3 614                                                                                                                                                                     1

 

I can flag one of the variables such as DDate using the code below. In order to flag
my other main variables such as(rdate rcomdate .....) I would have to write the could below
several times. Is there anyway I could do this without having to repeat the code below six times?

 

data flag;
set import;
If 0 le ddate le 6 then ddate_0to6 =1;
If 7 le ddate le 30 then ddate_7to30 =1;
If 31 le ddate le 90 then ddate_31to90 =1;
If 91 le ddate le 180 then ddate_91to180 =1;
If ddate ge 181 then ddate_gt181 =1;
run;

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

Why are you trying to define two arrays now? That does not match your new data structure.

You could use one array.

data have ;
  input ID $ STAGE :$32. DAYSDIFF ;
cards;
1 ddate     642
1 rdate     .
1 rcompdate .
1 mdate     .
1 mpdate    .
1 hdate     .
2 ddate     74
2 rdate     61
2 rcompdate 61
2 mdate     .
2 mpdate    .
2 hdate     .
3 ddate     614
3 rdate     .
3 rcompdate .
3 mdate     .
3 mpdate    .
3 hdate     .
;

data want;
  set have ;
  array flags ZERO_TO_SIX SEVEN_TO_THIRTY THIRTYONE_TO_NINETY NINETYONE_TO_ONEEIGHTY GT_ONEEIGHTYONE ;
  if 0 le DAYSDIFF le 6 then flags[1]=1;
  if 7 le DAYSDIFF le 30 then flags[2]=1;
  if 31 le DAYSDIFF le 90 then flags[3]=1;
  if 91 le DAYSDIFF le 180 then flags[4]=1;
  if DAYSDIFF ge 181 then flags[5]=1;
run;

proc print; run;

But that code is not really getting much benefit from an array.  Why not just set the variables directly?

data want;
  set have ;
  if 0 le DAYSDIFF le 6 then ZERO_TO_SIX=1;
  if 7 le DAYSDIFF le 30 then SEVEN_TO_THIRTY=1;
  if 31 le DAYSDIFF le 90 then THIRTYONE_TO_NINETY=1;
  if 91 le DAYSDIFF le 180 then NINETYONE_TO_ONEEIGHTY=1;
  if DAYSDIFF ge 181 then GT_ONEEIGHTYONE=1;
run;

That will set the flags to be either 1 or missing.  If you want the flags to be either TRUE (1) or FALSE (0) then instead of using IF/THEN just use an assignment statement.

data want;
  set have ;
  ZERO_TO_SIX            =  0 le DAYSDIFF le 6 ;
  SEVEN_TO_THIRTY        =  7 le DAYSDIFF le 30 ;
  THIRTYONE_TO_NINETY    = 31 le DAYSDIFF le 90 ;
  NINETYONE_TO_ONEEIGHTY = 91 le DAYSDIFF le 180 ;
  GT_ONEEIGHTYONE        =       DAYSDIFF ge 181 ;
run;

But why are you creating those flag variables anyway?  You could create a format that converts DAYSDIFF into your categories and use that with your original dataset.

proc format ;
value daysdiff 
 0 -  6    = '  0 to 6'
 7 - 30    = '  7 to 30'
31 - 90    = ' 31 to 90'
91 - 180   = ' 91 to 180;
181 - high = '181+'
;

View solution in original post

6 REPLIES 6
Tom
Super User Tom
Super User

To do the same operation over a number of different variables you will want to use ARRAY.  This will let you define an alias for the group of variables. Then you use that alias with an index to find a particular value.

 

Looks like you want to make two arrays.  A 1-D array for the dates and a 2-D array for the flag variables.  Looks your dimensions are 6 and 5.

data want;
  set have;
  array d ddate rdate rcompdate mdate mpdate hdate ;
  array f [6,5] ddate_0to6 ddate_7to30 ddate_31to90 ddate_91to180 ddate_gt181
   ... type out the rest of the variable names here ...
  ;
  do idx=1 to dim(d);
    If 0 le d[idx] le 6 then f[idx,1]=1;
    If 7 le d[idx] le 30 then f[idx,2]=1;
    If 31 le d[idx] le 90 then f[idx,3]=1;
    If 91 le d[idx] le 180 then f[idx,4]=1;
    If d[idx] ge 181 then f[idx,5]=1;    
  end;
run;

If you change your variable names to use numeric suffixes then typing the variable names will be easier.  If you change your data model it will be even easier.  You probably will want to use a FORMAT instead of that series of IF statements.

luvscandy27
Quartz | Level 8

When you say change the variable name you mean something along the lines of ddate_1-ddate_6?

 

I'm really not sure what you mean by  change the data model but If it will make things easier I am very open to the suggestion. What route should I take to change the data model? 

Reeza
Super User

Your current data structure is what's known as a wide format. If you transpose it to a long format, you only have one column you need to deal with and you can easily apply a single format to the one column. If you need it wide for reporting, PROC REPORT will do that as well, so there isn't a need to store your data in this fashion. It's also considered against the 'tidy data' format structure because you have information in your column names.

 

Transposing data tutorials:

Wide to Long:
https://stats.idre.ucla.edu/sas/modules/how-to-reshape-data-wide-to-long-using-proc-transpose/

https://stats.idre.ucla.edu/sas/modules/reshaping-data-wide-to-long-using-a-data-step/


luvscandy27
Quartz | Level 8

I have transposed my data so now my data looks like the table below:

ID STAGE DAYSDIFF
1 ddate           642
1 rdate
1 rcompdate
1 mdate
1 mpdate
1 hdate
2 ddate           74
2 rdate           61
2 rcompdate  61
2 mdate
2 mpdate
2 hdate
3 ddate          614
3 rdate
3 rcompdate
3 mdate
3 mpdate
3 hdate


I tried to modify the code above to fit the new structure but now I get flags in all
columns when I only want the flag in the range that the daysdiff
falls in for that stage. Could someone please assist me ?

 

data want;
set have;
array d daysdiff ;
array f [1,5] ZERO_TO_SIX SEVEN_TO_THIRTY THIRTYONE_TO_NINETY NINETYONE_TO_ONEEIGHTY GT_ONEEIGHTYONE

do idx=1 to dim(d);
If 0 le d[idx] le 6 then f[idx,1]=1;
If 7 le d[idx] le 30 then f[idx,2]=1;
If 31 le d[idx] le 90 then f[idx,3]=1;
If 91 le d[idx] le 180 then f[idx,4]=1;
If d[idx] ge 181 then f[idx,5]=1;
end;
run;

 

Tom
Super User Tom
Super User

Why are you trying to define two arrays now? That does not match your new data structure.

You could use one array.

data have ;
  input ID $ STAGE :$32. DAYSDIFF ;
cards;
1 ddate     642
1 rdate     .
1 rcompdate .
1 mdate     .
1 mpdate    .
1 hdate     .
2 ddate     74
2 rdate     61
2 rcompdate 61
2 mdate     .
2 mpdate    .
2 hdate     .
3 ddate     614
3 rdate     .
3 rcompdate .
3 mdate     .
3 mpdate    .
3 hdate     .
;

data want;
  set have ;
  array flags ZERO_TO_SIX SEVEN_TO_THIRTY THIRTYONE_TO_NINETY NINETYONE_TO_ONEEIGHTY GT_ONEEIGHTYONE ;
  if 0 le DAYSDIFF le 6 then flags[1]=1;
  if 7 le DAYSDIFF le 30 then flags[2]=1;
  if 31 le DAYSDIFF le 90 then flags[3]=1;
  if 91 le DAYSDIFF le 180 then flags[4]=1;
  if DAYSDIFF ge 181 then flags[5]=1;
run;

proc print; run;

But that code is not really getting much benefit from an array.  Why not just set the variables directly?

data want;
  set have ;
  if 0 le DAYSDIFF le 6 then ZERO_TO_SIX=1;
  if 7 le DAYSDIFF le 30 then SEVEN_TO_THIRTY=1;
  if 31 le DAYSDIFF le 90 then THIRTYONE_TO_NINETY=1;
  if 91 le DAYSDIFF le 180 then NINETYONE_TO_ONEEIGHTY=1;
  if DAYSDIFF ge 181 then GT_ONEEIGHTYONE=1;
run;

That will set the flags to be either 1 or missing.  If you want the flags to be either TRUE (1) or FALSE (0) then instead of using IF/THEN just use an assignment statement.

data want;
  set have ;
  ZERO_TO_SIX            =  0 le DAYSDIFF le 6 ;
  SEVEN_TO_THIRTY        =  7 le DAYSDIFF le 30 ;
  THIRTYONE_TO_NINETY    = 31 le DAYSDIFF le 90 ;
  NINETYONE_TO_ONEEIGHTY = 91 le DAYSDIFF le 180 ;
  GT_ONEEIGHTYONE        =       DAYSDIFF ge 181 ;
run;

But why are you creating those flag variables anyway?  You could create a format that converts DAYSDIFF into your categories and use that with your original dataset.

proc format ;
value daysdiff 
 0 -  6    = '  0 to 6'
 7 - 30    = '  7 to 30'
31 - 90    = ' 31 to 90'
91 - 180   = ' 91 to 180;
181 - high = '181+'
;

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 1501 views
  • 1 like
  • 3 in conversation