Solved: Re: How to Conditionally repeat values of a variable by ID and by Year

Saba1 · Posted 10-09-2017 01:37 AM

Hi

I have following data set with different IDs and each ID has sorted regular monthly dates of various years. There are two variables X and X1 with 0 and 1 values based on some event. See a small sample below :

data have;
input ID Date_month X X1 ;
datalines;
1     12/31/2001    1    0
1     01/31/2005    0    0
1     02/28/2005    1    0
1     03/31/2005    1    0
1     04/30/2005    0    0
1     05/31/2005    0    0
1     06/30/2005    0    0
1     07/31/2005    1    1
1     08/31/2005    0    0
1     09/30/2005    1    0
1     10/31/2005    1    0
1     11/30/2005    0    1
1     12/31/2005    0    0
1     01/31/2008    1    0
1     02/28/2008    1    0
1     03/31/2008    0    0
1     10/31/2008    1    0
2     01/31/2004    1    0
2     02/28/2004    1    1
2     03/31/2004    0    1
2     08/31/2004    0    0
2     11/30/2004    1    0
2     12/31/2004    1    0
;
run;

Now I want to create a column which should consider each ID as a group (By Group). If the value of X1 is equal to "1" in a particular month then the values of "New" column should be equal to X, BUT ONLY for 12 months of the particular year and particular ID.

So in the sample below, for ID1 the value of X1 is equal to "1" in a month of year 2005, therefore column New has the values of column X only for all the months of year 2005, rest is "0". the same is happening for ID2 and all the months of only year 2004.

Please guide me in this regard.

data want;
input ID Date_month X X1 New ;
datalines;
1     12/31/2001    1    0    0
1     01/31/2005    0    0    0
1     02/28/2005    1    0    1
1     03/31/2005    1    0    1
1     04/30/2005    0    0    0
1     05/31/2005    0    0    0
1     06/30/2005    0    0    0
1     07/31/2005    1    1    1
1     08/31/2005    0    0    0
1     09/30/2005    1    0    1
1     10/31/2005    1    0    1
1     11/30/2005    0    1    0
1     12/31/2005    0    0    0
1     01/31/2008    1    0    0
1     02/28/2008    1    0    0
1     03/31/2008    0    0    0
1     10/31/2008    1    0    0
2     01/31/2004    1    0    1
2     02/28/2004    1    1    1
2     03/31/2004    0    1    0
2     08/31/2004    0    0    0
2     11/30/2004    1    0    1
2     12/31/2004    1    0    1
;
run;

Ksharp · Posted 10-09-2017 08:46 AM

Assuming I understood what you mean.

data have;
input ID Date_month : mmddyy12. X X1  ;
year=year(Date_month);
format Date_month mmddyy10.;
datalines;
1     12/31/2001    1    0    0
1     01/31/2005    0    0    0
1     02/28/2005    0    0    0
1     03/31/2005    1    0    1
1     04/30/2005    0    0    0
1     05/31/2005    0    0    0
1     06/30/2005    0    0    0
1     07/31/2005    1    1    1
1     08/31/2005    0    0    0
1     09/30/2005    1    0    1
1     10/31/2005    1    0    1
1     11/30/2005    0    0    0
1     12/31/2005    0    0    0
1     01/31/2008    1    0    0
1     02/28/2008    1    0    0
1     03/31/2008    0    0    0
;
run;
proc sql;
create table temp as
 select *,sum(x1) as sum
  from have 
   group by id,year;
quit;
data want;
 set temp;
 if sum=0 then new=0;
  else new=x;
drop sum;
run;

View solution in original post

Ksharp · Posted 10-09-2017 08:46 AM

Assuming I understood what you mean.

data have;
input ID Date_month : mmddyy12. X X1  ;
year=year(Date_month);
format Date_month mmddyy10.;
datalines;
1     12/31/2001    1    0    0
1     01/31/2005    0    0    0
1     02/28/2005    0    0    0
1     03/31/2005    1    0    1
1     04/30/2005    0    0    0
1     05/31/2005    0    0    0
1     06/30/2005    0    0    0
1     07/31/2005    1    1    1
1     08/31/2005    0    0    0
1     09/30/2005    1    0    1
1     10/31/2005    1    0    1
1     11/30/2005    0    0    0
1     12/31/2005    0    0    0
1     01/31/2008    1    0    0
1     02/28/2008    1    0    0
1     03/31/2008    0    0    0
;
run;
proc sql;
create table temp as
 select *,sum(x1) as sum
  from have 
   group by id,year;
quit;
data want;
 set temp;
 if sum=0 then new=0;
  else new=x;
drop sum;
run;

Saba1 · Posted 10-09-2017 06:46 PM

@Ksharp I am really grateful to you. The output is exactly what I want. Thanks.

Saba1 · Posted 10-22-2017 01:46 AM

@Ksharp I would like to ask for your assistance regarding the same mentioned data set. This time I need to create a new column in WANT data set where if X1=1 then all remaining observations of a particular ID should be equal to X, regardless of the month or year group. So basically for each ID, the observations of new column must be zero till X1 becomes equal to 1, and soon after X1=1 all values of that year and remaining ones for that ID should be equal to the values of X.

For example, in my data set Have for ID 1, on 07/31/2005 X1=1, so from 01/31/2005 till the end of ID 1 i.e. 10/31/2008 all the values of new column should be equal to X. The same is required for all the IDs. Please guide me in this regard. Thanks.

Ksharp · Posted 10-22-2017 07:34 AM

The first 1 in X1 is 5/31/2005 ,not  07/31/2005 .


data have;
input ID Date_month : mmddyy12. X X1  ;
format Date_month mmddyy10.;
datalines;
1     12/31/2001    1    0    0
1     01/31/2005    0    0    0
1     02/28/2005    0    0    0
1     03/31/2005    1    0    1
1     04/30/2005    0    0    0
1     05/31/2005    0    0    0
1     06/30/2005    0    0    0
1     07/31/2005    1    1    1
1     08/31/2005    0    0    0
1     09/30/2005    1    0    1
1     10/31/2005    1    0    1
1     11/30/2005    0    0    0
1     12/31/2005    0    0    0
1     01/31/2008    1    0    0
1     02/28/2008    1    0    0
1     03/31/2008    0    0    0
;
data want;
 set have;
 by id;
 retain found .;
 if first.id then found=.;
 if x1=1 then found=1;
 if found=1 then want=x;
  else want=0;
run;

Saba1 · Posted 10-23-2017 05:20 PM

@Ksharp Thank you so much. The results are exactly the same as I want. Your help is much appreciated.

Saba1 · Posted 10-23-2017 07:07 PM

@Ksharp Sorry to bother you again, as I have mentioned this code is giving the exact output but there are still two issues.

First one is that if X1=1 more than one time for an ID, "Want" column is not equal to "X" in such a case. My objective is that for the very first time as X1=1 for an ID, remaining all observations of Want should be equal to X irrespective of "X1=1" occurs twice, thrice or even more than that in a particular ID.

Secondly, I am looking for Want column where "want=x" NOT from the observation where X1=1 but from the START of the YEAR in which X1=1 for a specific ID. For example, in my data set "have" for ID1 X1=1 in a month of year 2005 (07/31/2005). So now "want=X" should be starting from the very first date of this year i.e. 01/31/2005 and should then continue till 10/31/2008. Please see below the output required:

data want;
input ID Date_month X X1 Want ;
datalines;
1     12/31/2001    1    0    0
1     01/31/2005    0    0    0
1     02/28/2005    1    0    1
1     03/31/2005    1    0    1
1     04/30/2005    0    0    0
1     05/31/2005    0    0    0
1     06/30/2005    0    0    0
1     07/31/2005    1    1    1
1     08/31/2005    0    0    0
1     09/30/2005    1    0    1
1     10/31/2005    1    0    1
1     11/30/2005    0    1    0
1     12/31/2005    0    0    0
1     01/31/2008    1    0    1
1     02/28/2008    1    0    1
1     03/31/2008    0    0    0
1     10/31/2008    1    0    1
2     01/31/2004    1    0    1
2     02/28/2004    1    1    1
2     03/31/2004    0    1    0
2     08/31/2004    0    0    0
2     11/30/2004    1    0    1
2     12/31/2004    1    0    1;
run;

I will be thankful for your guidance.

Ksharp · Posted 10-24-2017 10:45 AM

OK. I understood your question now.




data have;
input ID Date_month : mmddyy12. X X1  ;
year=year(date_month);
format Date_month mmddyy10.;
datalines;
1     12/31/2001    1    0    0
1     01/31/2005    0    0    0
1     02/28/2005    0    0    0
1     03/31/2005    1    0    1
1     04/30/2005    0    0    0
1     05/31/2005    0    0    0
1     06/30/2005    0    0    0
1     07/31/2005    1    1    1
1     08/31/2005    0    0    0
1     09/30/2005    1    0    1
1     10/31/2005    1    0    1
1     11/30/2005    0    0    0
1     12/31/2005    0    0    0
1     01/31/2008    1    0    0
1     02/28/2008    1    0    0
1     03/31/2008    0    0    0
;
data temp;
 set have;
 if x1=1;
 keep id year;
run;
data temp1;
 merge have temp(in=inb);
 by id year;
 in_temp=inb;
run;


data want;
 set temp1;
 by id;
 retain found .;
 if first.id then found=.;
 if in_temp then found=1;
 if found=1 then want=x;
  else want=0;
run;

Classroom Training Available!