Solved: Changing the value of a variable for a particular ID

TrueTears · Posted 07-17-2020 07:09 PM

Say I have a subsample of my data that looks as follows:

year	event_year	change	id
2007	2009		3
2008	2009		3
2009	2009	1	3
2010	2009		3
2011	2009		3
2012	2009		3
2002	2004		55
2003	2004		55
2004	2004	0	55
2005	2004		55

Whenever the variable change takes a value of 1 for year = event_ye, I want to create a new column for that id only called event_year_new where it is equal to event_year+1. If change = 0, then nothing happens and event_year = event_year_new:

year	event_year	change	id	event_year_new
2007	2009		3	2010
2008	2009		3	2010
2009	2009	1	3	2010
2010	2009		3	2010
2011	2009		3	2010
2012	2009		3	2010
2002	2004		55	2004
2003	2004		55	2004
2004	2004	0	55	2004
2005	2004		55	2004

How could I do this? Of course the below code does not work because it only changes it for the case when year = event_year, but I want it to apply to each ID.


if year = event_year and change=1 then
      do;
event_year_new = event_ye+1;
      end;

mkeintz · Posted 07-18-2020 12:13 AM

Assuming each id never has more than one change value, and that the data are sorted by id, then:

data have;
  input year	event_year	@17 change 1.	id;
datalines;
2007	2009	 	3
2008	2009	 	3
2009	2009	1	3
2010	2009	 	3
2011	2009	 	3
2012	2009	 	3
2002	2004	 	55
2003	2004	 	55
2004	2004	0	55
2005	2004	 	55
run;

data want;
  set have (where=(change^=.) in=inchange)
      have (in=inwanted);
  by id;

  if first.id then do;
    if change=1 then event_year_new=event_year+1; else
    if change=0 then event_year_new=event_year;  else
    event_year_new=.;
  end;
  retain event_year_new;
  if inwanted;
run;

For each id, this reads the change record prior to reading all the records. So the event_year_new can be established at the start of each id. Leaving the event_year_new variable retained propagates it through all the records for that id.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

View solution in original post

mkeintz · Posted 07-17-2020 11:58 PM

Question: what if a given ID has more than one non-missing value for change? Do you plan to make multiple new columns in such cases?

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

TrueTears · Posted 07-18-2020 12:09 AM

No, the change variable only has values for when year = event_year. It is always missing for other cases.

mkeintz · Posted 07-18-2020 12:13 AM

Assuming each id never has more than one change value, and that the data are sorted by id, then:

data have;
  input year	event_year	@17 change 1.	id;
datalines;
2007	2009	 	3
2008	2009	 	3
2009	2009	1	3
2010	2009	 	3
2011	2009	 	3
2012	2009	 	3
2002	2004	 	55
2003	2004	 	55
2004	2004	0	55
2005	2004	 	55
run;

data want;
  set have (where=(change^=.) in=inchange)
      have (in=inwanted);
  by id;

  if first.id then do;
    if change=1 then event_year_new=event_year+1; else
    if change=0 then event_year_new=event_year;  else
    event_year_new=.;
  end;
  retain event_year_new;
  if inwanted;
run;

For each id, this reads the change record prior to reading all the records. So the event_year_new can be established at the start of each id. Leaving the event_year_new variable retained propagates it through all the records for that id.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

Patrick · Posted 07-18-2020 12:17 AM

Does below work for your actual data?

data have;
  infile datalines dsd truncover;
  input year event_year change id;
  datalines;
2007,2009, ,3
2008,2009, ,3
2009,2009,1,3
2010,2009, ,3
2011,2009, ,3
2012,2009, ,3
2002,2004, ,55
2003,2004, ,55
2004,2004,0,55
2005,2004, ,55
;

data want(drop=_:);
  merge 
    have 
    have( keep=id change
          where=(not missing(_change))
          rename=(change=_change)
        );
  by id;
  event_year_new=sum(event_year,_change);
run;

proc print data=want;
run;

Changing the value of a variable for a particular ID

Re: Changing the value of a variable for a particular ID

Re: Changing the value of a variable for a particular ID

Re: Changing the value of a variable for a particular ID

Re: Changing the value of a variable for a particular ID

Re: Changing the value of a variable for a particular ID

Changing the value of a variable for a particular ID

Re: Changing the value of a variable for a particular ID

Re: Changing the value of a variable for a particular ID

Re: Changing the value of a variable for a particular ID

Re: Changing the value of a variable for a particular ID

Re: Changing the value of a variable for a particular ID

Registration is open

SAS Training: Just a Click Away