Solved: Re: Creating a new variable based on corresponding IDs from another da...

TrueTears · Posted 11-26-2019 12:13 AM

I have the following dataset (there are more years, but I am just showing the first two years):

data have;
     input year  ID  shock;
     cards;
     2000  1001  1
     2000  1002  0
     2000  1004  0
     2000  1006  1
     2000  1008  0
     2000  1010  0
     2000  1011  0
     2001  1001  0
     2001  1002  0
     2001  1004  1
     2001  1006  1
     2001  1008  1
     2001  1010  0
     2001  1011  0
;
run;

I have another dataset that lists "adjacent" IDs as follows:

data list;
     input ID  ID_adj;
     cards;
         1001  1002
         1001  1006
         1002  1001
	 1004  1008
	 1004  1011
	 1006  1010
	 1008  1004
	 1008  1011
	 1010  1006
	 1010  1011
	 1011  1004
	 1011  1008
	 1011  1010
;
run;

What I wish to do is create the following dataset:

data want;
     input year  ID  shock treat $;
     cards;
         2000  1001  1  T
         2000  1002  0  A  
         2000  1004  0  N
	 2000  1006  1  T
	 2000  1008  0  N
	 2000  1010  0  A
	 2000  1011  0  N
	 2001  1001  0  A
         2001  1002  0  N
         2001  1004  1  T
	 2001  1006  1  T
	 2001  1008  1  T
	 2001  1010  0  A
	 2001  1011  0  A
;
run;

The variable treat is defined as follows:

1) If shock takes a value of 1, then treat = T.

2) If shock takes a value of 0, then we look across all of the ID_adj in the list dataset that corresponds to the ID. If the value of shock is 1 for at least one ID_adj in the same year, then treat = A. It's best to illustrate this with an example.

In the original dataset 'have', look at ID 1002 in year 2000. In the dataset 'list', ID 1002 corresponds to 1001 and in year 2000, ID 1001 has shock = 1. Thus, we set treat = A.

Similarly, in the original dataset 'have', look at ID 1010 in year 2000. In the dataset 'list', ID 1010 corresponds to 1006 and 1011. Although ID 1011 has shock = 0 in year 2000, ID 1006 has shock = 1 in year 2000, so treat for ID 1010 = A.

3) If shock takes a value of 0, then we look across all of the ID_adj in the list dataset that corresponds to the ID. If the value of shock is 0 for all ID_adj in the same year, then treat = N. Again, it's best to illustrate with an example:

In the original dataset 'have', look at ID 1008 in year 2000. In the dataset 'list', ID 1008 corresponds to 1004 and 1011. Both ID 1004 and 1011 have shock = 0 in year 2000, so treat = N for ID 1008 in year 2000.

Since the value of shock changes across different years, we need to do the above across all years, thereby producing the dataset 'want'.

PGStats · Posted 11-26-2019 12:48 AM

Proc SQL is well suited for this operation:

proc sql;
create table want as
select
	a.*,
	case when a.shock then "T" when max(c.shock) then "A" else "N" end as treat
from
	have as a left join
	list as b on a.id=b.id left join
	have as c on b.id_adj=c.id and a.year=c.year
group by a.year, a.id, a.shock;
quit;

PG

View solution in original post

PGStats · Posted 11-26-2019 12:48 AM

Proc SQL is well suited for this operation:

proc sql;
create table want as
select
	a.*,
	case when a.shock then "T" when max(c.shock) then "A" else "N" end as treat
from
	have as a left join
	list as b on a.id=b.id left join
	have as c on b.id_adj=c.id and a.year=c.year
group by a.year, a.id, a.shock;
quit;

PG

Creating a new variable based on corresponding IDs from another dataset

Re: Creating a new variable based on corresponding IDs from another dataset

Re: Creating a new variable based on corresponding IDs from another dataset

Registration is open

Creating a new variable based on corresponding IDs from another dataset

Re: Creating a new variable based on corresponding IDs from another dataset

Re: Creating a new variable based on corresponding IDs from another dataset

Registration is open

SAS Training: Just a Click Away