How to Conditionally repeat values of a variable by ID and by Year

Accepted Solution Solved
Reply
Contributor
Posts: 28
Accepted Solution

How to Conditionally repeat values of a variable by ID and by Year

[ Edited ]

Hi

I have following data set with different IDs and each ID has sorted regular monthly dates of various years. There are two variables X and X1 with 0 and 1 values based on some event. See a small sample below :

data have;
input ID Date_month X X1 ;
datalines;
1     12/31/2001    1    0
1     01/31/2005    0    0
1     02/28/2005    1    0
1     03/31/2005    1    0
1     04/30/2005    0    0
1     05/31/2005    0    0
1     06/30/2005    0    0
1     07/31/2005    1    1
1     08/31/2005    0    0
1     09/30/2005    1    0
1     10/31/2005    1    0
1     11/30/2005    0    1
1     12/31/2005    0    0
1     01/31/2008    1    0
1     02/28/2008    1    0
1     03/31/2008    0    0
1     10/31/2008    1    0
2     01/31/2004    1    0
2     02/28/2004    1    1
2     03/31/2004    0    1
2     08/31/2004    0    0
2     11/30/2004    1    0
2     12/31/2004    1    0
;
run;

Now I want to create a column which should consider each ID as a group (By Group). If the value of X1 is equal to "1" in a particular month then the values of "New" column should be equal to X, BUT ONLY for 12 months of the particular year and particular ID.

So in the sample below, for ID1 the value of X1 is equal to "1" in a month of year 2005, therefore column New has the values of column X only for all the months of year 2005, rest is "0". the same is happening for ID2 and all the months of only year 2004.

 

Please guide me in this regard.

data want;
input ID Date_month X X1 New ;
datalines;
1     12/31/2001    1    0    0
1     01/31/2005    0    0    0
1     02/28/2005    1    0    1
1     03/31/2005    1    0    1
1     04/30/2005    0    0    0
1     05/31/2005    0    0    0
1     06/30/2005    0    0    0
1     07/31/2005    1    1    1
1     08/31/2005    0    0    0
1     09/30/2005    1    0    1
1     10/31/2005    1    0    1
1     11/30/2005    0    1    0
1     12/31/2005    0    0    0
1     01/31/2008    1    0    0
1     02/28/2008    1    0    0
1     03/31/2008    0    0    0
1     10/31/2008    1    0    0
2     01/31/2004    1    0    1
2     02/28/2004    1    1    1
2     03/31/2004    0    1    0
2     08/31/2004    0    0    0
2     11/30/2004    1    0    1
2     12/31/2004    1    0    1
;
run;

 


Accepted Solutions
Solution
‎10-09-2017 07:08 PM
Super User
Posts: 10,130

Re: How to Conditionally repeat values of a variable by ID and by Year

Assuming I understood what you mean.

 

data have;
input ID Date_month : mmddyy12. X X1  ;
year=year(Date_month);
format Date_month mmddyy10.;
datalines;
1     12/31/2001    1    0    0
1     01/31/2005    0    0    0
1     02/28/2005    0    0    0
1     03/31/2005    1    0    1
1     04/30/2005    0    0    0
1     05/31/2005    0    0    0
1     06/30/2005    0    0    0
1     07/31/2005    1    1    1
1     08/31/2005    0    0    0
1     09/30/2005    1    0    1
1     10/31/2005    1    0    1
1     11/30/2005    0    0    0
1     12/31/2005    0    0    0
1     01/31/2008    1    0    0
1     02/28/2008    1    0    0
1     03/31/2008    0    0    0
;
run;
proc sql;
create table temp as
 select *,sum(x1) as sum
  from have 
   group by id,year;
quit;
data want;
 set temp;
 if sum=0 then new=0;
  else new=x;
drop sum;
run;

View solution in original post


All Replies
Solution
‎10-09-2017 07:08 PM
Super User
Posts: 10,130

Re: How to Conditionally repeat values of a variable by ID and by Year

Assuming I understood what you mean.

 

data have;
input ID Date_month : mmddyy12. X X1  ;
year=year(Date_month);
format Date_month mmddyy10.;
datalines;
1     12/31/2001    1    0    0
1     01/31/2005    0    0    0
1     02/28/2005    0    0    0
1     03/31/2005    1    0    1
1     04/30/2005    0    0    0
1     05/31/2005    0    0    0
1     06/30/2005    0    0    0
1     07/31/2005    1    1    1
1     08/31/2005    0    0    0
1     09/30/2005    1    0    1
1     10/31/2005    1    0    1
1     11/30/2005    0    0    0
1     12/31/2005    0    0    0
1     01/31/2008    1    0    0
1     02/28/2008    1    0    0
1     03/31/2008    0    0    0
;
run;
proc sql;
create table temp as
 select *,sum(x1) as sum
  from have 
   group by id,year;
quit;
data want;
 set temp;
 if sum=0 then new=0;
  else new=x;
drop sum;
run;
Contributor
Posts: 28

Re: How to Conditionally repeat values of a variable by ID and by Year

@Ksharp I am really grateful to you. The output is exactly what I want. Thanks.

Contributor
Posts: 28

Re: How to Conditionally repeat values of a variable by ID and by Year

[ Edited ]

@Ksharp I would like to ask for your assistance regarding the same mentioned data set. This time I need to create a new column in WANT data set where if X1=1 then all remaining  observations of a particular ID should be equal to X, regardless of the month or year group. So basically for each ID, the observations of new column must be zero till X1 becomes equal to 1, and soon after X1=1 all values of that year and remaining ones for that ID should be equal to the values of X. 

 

For example, in my data set Have for ID 1, on 07/31/2005 X1=1, so from 01/31/2005 till the end of ID 1 i.e. 10/31/2008 all the values of new column should be equal to X. The same is required for all the IDs. Please guide me in this regard. Thanks.

Super User
Posts: 10,130

Re: How to Conditionally repeat values of a variable by ID and by Year

The first 1 in X1 is 5/31/2005 ,not  07/31/2005 .


data have;
input ID Date_month : mmddyy12. X X1  ;
format Date_month mmddyy10.;
datalines;
1     12/31/2001    1    0    0
1     01/31/2005    0    0    0
1     02/28/2005    0    0    0
1     03/31/2005    1    0    1
1     04/30/2005    0    0    0
1     05/31/2005    0    0    0
1     06/30/2005    0    0    0
1     07/31/2005    1    1    1
1     08/31/2005    0    0    0
1     09/30/2005    1    0    1
1     10/31/2005    1    0    1
1     11/30/2005    0    0    0
1     12/31/2005    0    0    0
1     01/31/2008    1    0    0
1     02/28/2008    1    0    0
1     03/31/2008    0    0    0
;
data want;
 set have;
 by id;
 retain found .;
 if first.id then found=.;
 if x1=1 then found=1;
 if found=1 then want=x;
  else want=0;
run;

Contributor
Posts: 28

Re: How to Conditionally repeat values of a variable by ID and by Year

@Ksharp Thank you so much. The results are exactly the same as I want. Your help is much appreciated.

Contributor
Posts: 28

Re: How to Conditionally repeat values of a variable by ID and by Year

[ Edited ]

@Ksharp Sorry to bother you again, as I have mentioned this code is giving the exact output but there are still two issues.

First one is that if X1=1 more than one time for an ID, "Want" column is not equal to "X" in such a case. My objective is that for the very first time as X1=1 for an ID, remaining all observations of Want should be equal to X irrespective of "X1=1" occurs twice, thrice or even more than that in a particular ID. 

 

Secondly, I am looking for Want column where "want=x" NOT from the observation where X1=1 but from the START of the YEAR in which X1=1 for a specific ID. For example, in my data set "have" for ID1 X1=1 in a month of year 2005 (07/31/2005). So now "want=X" should be starting from the very first date of this year i.e. 01/31/2005 and should then continue till 10/31/2008. Please see below the output required:

 

data want;
input ID Date_month X X1 Want ;
datalines;
1 12/31/2001 1 0 0 1 01/31/2005 0 0 0 1 02/28/2005 1 0 1 1 03/31/2005 1 0 1 1 04/30/2005 0 0 0 1 05/31/2005 0 0 0 1 06/30/2005 0 0 0 1 07/31/2005 1 1 1 1 08/31/2005 0 0 0 1 09/30/2005 1 0 1 1 10/31/2005 1 0 1 1 11/30/2005 0 1 0 1 12/31/2005 0 0 0 1 01/31/2008 1 0 1 1 02/28/2008 1 0 1 1 03/31/2008 0 0 0 1 10/31/2008 1 0 1 2 01/31/2004 1 0 1 2 02/28/2004 1 1 1 2 03/31/2004 0 1 0 2 08/31/2004 0 0 0 2 11/30/2004 1 0 1 2 12/31/2004 1 0 1;
run;

 I will be thankful for your guidance.

Super User
Posts: 10,130

Re: How to Conditionally repeat values of a variable by ID and by Year

OK. I understood your question now.




data have;
input ID Date_month : mmddyy12. X X1  ;
year=year(date_month);
format Date_month mmddyy10.;
datalines;
1     12/31/2001    1    0    0
1     01/31/2005    0    0    0
1     02/28/2005    0    0    0
1     03/31/2005    1    0    1
1     04/30/2005    0    0    0
1     05/31/2005    0    0    0
1     06/30/2005    0    0    0
1     07/31/2005    1    1    1
1     08/31/2005    0    0    0
1     09/30/2005    1    0    1
1     10/31/2005    1    0    1
1     11/30/2005    0    0    0
1     12/31/2005    0    0    0
1     01/31/2008    1    0    0
1     02/28/2008    1    0    0
1     03/31/2008    0    0    0
;
data temp;
 set have;
 if x1=1;
 keep id year;
run;
data temp1;
 merge have temp(in=inb);
 by id year;
 in_temp=inb;
run;


data want;
 set temp1;
 by id;
 retain found .;
 if first.id then found=.;
 if in_temp then found=1;
 if found=1 then want=x;
  else want=0;
run;

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 7 replies
  • 217 views
  • 0 likes
  • 2 in conversation