I have the following dataset:
data have;
input year firm id;
cards;
2008 28013 1003
2008 28013 1004
2008 28013 1007
2008 28013 1009
2008 28013 1010
2008 28013 1013
2008 28013 1053
2008 28013 1074
2008 28013 1075
2009 28013 1009
2009 28013 1010
2009 28013 1053
2009 28013 1074
2009 28013 1075
2009 28332 1004
2009 28332 1007
2009 28332 1823
;
run;
For each "id", I have a list of adjacent "id" called id_adj in the following dataset:
data list;
input id id_adj use;
cards;
1003 1009 1
1003 1010 1
1003 1085 1
1004 1007 0
1004 1009 0
1004 1099 0
1004 1100 0
1007 1004 0
1007 1823 1
1009 1003 1
1009 1004 0
1009 1010 0
1010 1003 1
1010 1009 0
1013 1053 1
1013 1074 1
1053 1013 1
1074 1013 1
1075 1743 1
1075 1744 1
1823 1007 1
;
run;
I wish to create the following dataset with an additional variable "treat":
data want;
input year firm id treat;
cards;
2008 28013 1003 0
2008 28013 1004 4
2008 28013 1007 5
2008 28013 1009 0
2008 28013 1010 0
2008 28013 1013 0
2008 28013 1053 0
2008 28013 1074 0
2008 28013 1075 5
2009 28013 1009 5
2009 28013 1010 5
2009 28013 1053 5
2009 28013 1074 5
2009 28013 1075 5
2009 28332 1004 4
2009 28332 1007 0
2009 28332 1823 0
;
run;
where treat is defined as:
Hi,
You can use the following code:
/* First step: Build the "appears_as_id_in_have" indicator, for each id_adj group by year and firm*/
proc sql noprint;
create table list_ext as
select c.year,
c.firm,
a.id,
a.id_adj,
a.use,
ifn(id_adj in (select b.id from have as b where c.year=b.year and c.firm=b.firm),1,0)
as appears_as_id_in_have,
a.use*(calculated appears_as_id_in_have) as indic_treat
from list as a left join have as c
on a.id=c.id
order by year,firm,id;
quit;
/* Second step: determine "treat" based on the previous indicators */
proc sql noprint;
create table output as
select distinct
year,firm,id,treat
rom (select *,
sum(use) as sum_use,
sum(indic_treat) as sum_indic_treat,
case
when (calculated sum_use=0) then 4
when (calculated sum_use>0 and calculated sum_indic_treat=0) then 5
else 0
end as treat
from list_ext
group by year, firm, id);
quit;
Let me know if you have questions !
Regards,
JD
Like this?
proc sql;
select a.*, 4*(NOBS=ONE) as CASE1 , 5*(sum(c.ID_ADJ)^=sum(d.ID)) as CASE2
from HAVE a
left join
(select ID, count(*) as NOBS, sum(USE=0)as ONE from LIST group by ID) b
on a.ID=b.ID
left join
(select ID, ID_ADJ from LIST where USE=1) c
on a.ID=c.ID
left join
HAVE d
on c.ID_ADJ=d.ID
and a.FIRM =d.FIRM
and a.YEAR =d.YEAR
group by a.YEAR, a.FIRM, a.ID
order by a.YEAR, a.FIRM, a.ID
year | firm | id | CASE1 | CASE2 |
---|---|---|---|---|
2008 | 28013 | 1003 | 0 | 5 |
2008 | 28013 | 1004 | 4 | 0 |
2008 | 28013 | 1007 | 0 | 5 |
2008 | 28013 | 1009 | 0 | 0 |
2008 | 28013 | 1010 | 0 | 0 |
2008 | 28013 | 1013 | 0 | 0 |
2008 | 28013 | 1053 | 0 | 0 |
2008 | 28013 | 1074 | 0 | 0 |
2008 | 28013 | 1075 | 0 | 5 |
2009 | 28013 | 1009 | 0 | 5 |
2009 | 28013 | 1010 | 0 | 5 |
2009 | 28013 | 1053 | 0 | 5 |
2009 | 28013 | 1074 | 0 | 5 |
2009 | 28013 | 1075 | 0 | 5 |
2009 | 28332 | 1004 | 4 | 0 |
2009 | 28332 | 1007 | 0 | 0 |
2009 | 28332 | 1823 | 0 | 0 |
Yes, laptop without mouse is no fun. Look now.
You can add variable TREAT once all is confirmed to work as expected:
max(calculated CASE1, calculated CASE2) as TREAT
Hi,
You can use the following code:
/* First step: Build the "appears_as_id_in_have" indicator, for each id_adj group by year and firm*/
proc sql noprint;
create table list_ext as
select c.year,
c.firm,
a.id,
a.id_adj,
a.use,
ifn(id_adj in (select b.id from have as b where c.year=b.year and c.firm=b.firm),1,0)
as appears_as_id_in_have,
a.use*(calculated appears_as_id_in_have) as indic_treat
from list as a left join have as c
on a.id=c.id
order by year,firm,id;
quit;
/* Second step: determine "treat" based on the previous indicators */
proc sql noprint;
create table output as
select distinct
year,firm,id,treat
rom (select *,
sum(use) as sum_use,
sum(indic_treat) as sum_indic_treat,
case
when (calculated sum_use=0) then 4
when (calculated sum_use>0 and calculated sum_indic_treat=0) then 5
else 0
end as treat
from list_ext
group by year, firm, id);
quit;
Let me know if you have questions !
Regards,
JD
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.