BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Saba1
Quartz | Level 8

Hi

I have following data set with different IDs and each ID has sorted regular monthly dates of various years. There are two variables X and X1 with 0 and 1 values based on some event. See a small sample below :

data have;
input ID Date_month X X1 ;
datalines;
1     12/31/2001    1    0
1     01/31/2005    0    0
1     02/28/2005    1    0
1     03/31/2005    1    0
1     04/30/2005    0    0
1     05/31/2005    0    0
1     06/30/2005    0    0
1     07/31/2005    1    1
1     08/31/2005    0    0
1     09/30/2005    1    0
1     10/31/2005    1    0
1     11/30/2005    0    1
1     12/31/2005    0    0
1     01/31/2008    1    0
1     02/28/2008    1    0
1     03/31/2008    0    0
1     10/31/2008    1    0
2     01/31/2004    1    0
2     02/28/2004    1    1
2     03/31/2004    0    1
2     08/31/2004    0    0
2     11/30/2004    1    0
2     12/31/2004    1    0
;
run;

Now I want to create a column which should consider each ID as a group (By Group). If the value of X1 is equal to "1" in a particular month then the values of "New" column should be equal to X, BUT ONLY for 12 months of the particular year and particular ID.

So in the sample below, for ID1 the value of X1 is equal to "1" in a month of year 2005, therefore column New has the values of column X only for all the months of year 2005, rest is "0". the same is happening for ID2 and all the months of only year 2004.

 

Please guide me in this regard.

data want;
input ID Date_month X X1 New ;
datalines;
1     12/31/2001    1    0    0
1     01/31/2005    0    0    0
1     02/28/2005    1    0    1
1     03/31/2005    1    0    1
1     04/30/2005    0    0    0
1     05/31/2005    0    0    0
1     06/30/2005    0    0    0
1     07/31/2005    1    1    1
1     08/31/2005    0    0    0
1     09/30/2005    1    0    1
1     10/31/2005    1    0    1
1     11/30/2005    0    1    0
1     12/31/2005    0    0    0
1     01/31/2008    1    0    0
1     02/28/2008    1    0    0
1     03/31/2008    0    0    0
1     10/31/2008    1    0    0
2     01/31/2004    1    0    1
2     02/28/2004    1    1    1
2     03/31/2004    0    1    0
2     08/31/2004    0    0    0
2     11/30/2004    1    0    1
2     12/31/2004    1    0    1
;
run;

 

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User

Assuming I understood what you mean.

 

data have;
input ID Date_month : mmddyy12. X X1  ;
year=year(Date_month);
format Date_month mmddyy10.;
datalines;
1     12/31/2001    1    0    0
1     01/31/2005    0    0    0
1     02/28/2005    0    0    0
1     03/31/2005    1    0    1
1     04/30/2005    0    0    0
1     05/31/2005    0    0    0
1     06/30/2005    0    0    0
1     07/31/2005    1    1    1
1     08/31/2005    0    0    0
1     09/30/2005    1    0    1
1     10/31/2005    1    0    1
1     11/30/2005    0    0    0
1     12/31/2005    0    0    0
1     01/31/2008    1    0    0
1     02/28/2008    1    0    0
1     03/31/2008    0    0    0
;
run;
proc sql;
create table temp as
 select *,sum(x1) as sum
  from have 
   group by id,year;
quit;
data want;
 set temp;
 if sum=0 then new=0;
  else new=x;
drop sum;
run;

View solution in original post

7 REPLIES 7
Ksharp
Super User

Assuming I understood what you mean.

 

data have;
input ID Date_month : mmddyy12. X X1  ;
year=year(Date_month);
format Date_month mmddyy10.;
datalines;
1     12/31/2001    1    0    0
1     01/31/2005    0    0    0
1     02/28/2005    0    0    0
1     03/31/2005    1    0    1
1     04/30/2005    0    0    0
1     05/31/2005    0    0    0
1     06/30/2005    0    0    0
1     07/31/2005    1    1    1
1     08/31/2005    0    0    0
1     09/30/2005    1    0    1
1     10/31/2005    1    0    1
1     11/30/2005    0    0    0
1     12/31/2005    0    0    0
1     01/31/2008    1    0    0
1     02/28/2008    1    0    0
1     03/31/2008    0    0    0
;
run;
proc sql;
create table temp as
 select *,sum(x1) as sum
  from have 
   group by id,year;
quit;
data want;
 set temp;
 if sum=0 then new=0;
  else new=x;
drop sum;
run;
Saba1
Quartz | Level 8

@Ksharp I am really grateful to you. The output is exactly what I want. Thanks.

Saba1
Quartz | Level 8

@Ksharp I would like to ask for your assistance regarding the same mentioned data set. This time I need to create a new column in WANT data set where if X1=1 then all remaining  observations of a particular ID should be equal to X, regardless of the month or year group. So basically for each ID, the observations of new column must be zero till X1 becomes equal to 1, and soon after X1=1 all values of that year and remaining ones for that ID should be equal to the values of X. 

 

For example, in my data set Have for ID 1, on 07/31/2005 X1=1, so from 01/31/2005 till the end of ID 1 i.e. 10/31/2008 all the values of new column should be equal to X. The same is required for all the IDs. Please guide me in this regard. Thanks.

Ksharp
Super User
The first 1 in X1 is 5/31/2005 ,not  07/31/2005 .


data have;
input ID Date_month : mmddyy12. X X1  ;
format Date_month mmddyy10.;
datalines;
1     12/31/2001    1    0    0
1     01/31/2005    0    0    0
1     02/28/2005    0    0    0
1     03/31/2005    1    0    1
1     04/30/2005    0    0    0
1     05/31/2005    0    0    0
1     06/30/2005    0    0    0
1     07/31/2005    1    1    1
1     08/31/2005    0    0    0
1     09/30/2005    1    0    1
1     10/31/2005    1    0    1
1     11/30/2005    0    0    0
1     12/31/2005    0    0    0
1     01/31/2008    1    0    0
1     02/28/2008    1    0    0
1     03/31/2008    0    0    0
;
data want;
 set have;
 by id;
 retain found .;
 if first.id then found=.;
 if x1=1 then found=1;
 if found=1 then want=x;
  else want=0;
run;

Saba1
Quartz | Level 8

@Ksharp Thank you so much. The results are exactly the same as I want. Your help is much appreciated.

Saba1
Quartz | Level 8

@Ksharp Sorry to bother you again, as I have mentioned this code is giving the exact output but there are still two issues.

First one is that if X1=1 more than one time for an ID, "Want" column is not equal to "X" in such a case. My objective is that for the very first time as X1=1 for an ID, remaining all observations of Want should be equal to X irrespective of "X1=1" occurs twice, thrice or even more than that in a particular ID. 

 

Secondly, I am looking for Want column where "want=x" NOT from the observation where X1=1 but from the START of the YEAR in which X1=1 for a specific ID. For example, in my data set "have" for ID1 X1=1 in a month of year 2005 (07/31/2005). So now "want=X" should be starting from the very first date of this year i.e. 01/31/2005 and should then continue till 10/31/2008. Please see below the output required:

 

data want;
input ID Date_month X X1 Want ;
datalines;
1 12/31/2001 1 0 0 1 01/31/2005 0 0 0 1 02/28/2005 1 0 1 1 03/31/2005 1 0 1 1 04/30/2005 0 0 0 1 05/31/2005 0 0 0 1 06/30/2005 0 0 0 1 07/31/2005 1 1 1 1 08/31/2005 0 0 0 1 09/30/2005 1 0 1 1 10/31/2005 1 0 1 1 11/30/2005 0 1 0 1 12/31/2005 0 0 0 1 01/31/2008 1 0 1 1 02/28/2008 1 0 1 1 03/31/2008 0 0 0 1 10/31/2008 1 0 1 2 01/31/2004 1 0 1 2 02/28/2004 1 1 1 2 03/31/2004 0 1 0 2 08/31/2004 0 0 0 2 11/30/2004 1 0 1 2 12/31/2004 1 0 1;
run;

 I will be thankful for your guidance.

Ksharp
Super User
OK. I understood your question now.




data have;
input ID Date_month : mmddyy12. X X1  ;
year=year(date_month);
format Date_month mmddyy10.;
datalines;
1     12/31/2001    1    0    0
1     01/31/2005    0    0    0
1     02/28/2005    0    0    0
1     03/31/2005    1    0    1
1     04/30/2005    0    0    0
1     05/31/2005    0    0    0
1     06/30/2005    0    0    0
1     07/31/2005    1    1    1
1     08/31/2005    0    0    0
1     09/30/2005    1    0    1
1     10/31/2005    1    0    1
1     11/30/2005    0    0    0
1     12/31/2005    0    0    0
1     01/31/2008    1    0    0
1     02/28/2008    1    0    0
1     03/31/2008    0    0    0
;
data temp;
 set have;
 if x1=1;
 keep id year;
run;
data temp1;
 merge have temp(in=inb);
 by id year;
 in_temp=inb;
run;


data want;
 set temp1;
 by id;
 retain found .;
 if first.id then found=.;
 if in_temp then found=1;
 if found=1 then want=x;
  else want=0;
run;

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 3728 views
  • 0 likes
  • 2 in conversation