BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
TrueTears
Obsidian | Level 7

I have the following dataset:

 

data have;
input year firm id;
cards;
2008	28013	1003
2008	28013	1004
2008	28013	1007
2008	28013	1009
2008	28013	1010
2008	28013	1013
2008	28013	1053
2008	28013	1074
2008	28013	1075
2009	28013	1009
2009	28013	1010
2009	28013	1053
2009	28013	1074
2009	28013	1075
2009	28332	1004
2009	28332	1007
2009	28332	1823

;
run;

For each "id", I have a list of adjacent "id" called id_adj in the following dataset:

 

data list;
input id id_adj use;
cards;
1003	1009	1
1003	1010	1
1003	1085	1
1004	1007	0
1004	1009	0
1004	1099	0
1004	1100	0
1007	1004	0
1007	1823	1
1009	1003	1
1009	1004	0
1009	1010	0
1010	1003	1
1010	1009	0
1013	1053	1
1013	1074	1
1053	1013	1
1074	1013	1
1075	1743	1
1075	1744	1
1823	1007	1
;
run;

I wish to create the following dataset with an additional variable "treat":

 

data want;
input year firm id treat;
cards;
2008	28013	1003	0
2008	28013	1004	4
2008	28013	1007	5
2008	28013	1009	0
2008	28013	1010	0
2008	28013	1013	0
2008	28013	1053	0
2008	28013	1074	0
2008	28013	1075	5
2009	28013	1009	5
2009	28013	1010	5
2009	28013	1053	5
2009	28013	1074	5
2009	28013	1075	5
2009	28332	1004	4
2009	28332	1007	0
2009	28332	1823	0


;
run;

where treat is defined as:

 

  • Within each year/firm, for each id in "have", if use = 0 for all id_adj that corresponds to the id in "list", then treat = 4.
  • Within each year/firm, for each id in "have", consider all of the id_adj that corresponds to the id with use = 1 in "list". If none of these id_adj appears as id in "have", then treat = 5. For example, consider year 2008, firm 28013 with id 1007. In "list", id 1007 corresponds to 1823 that has use=1. But 1823 doesn't appear in "have" as an id for year/firm 2008/28013. So treat = 5.
  • For all other cases, treat = 0. 
1 ACCEPTED SOLUTION

Accepted Solutions
JeanDo
Obsidian | Level 7

Hi,

You can use the following code:

 

/* First step: Build the "appears_as_id_in_have" indicator, for each id_adj group by year and firm*/
proc sql noprint;
create table list_ext as
select 	c.year,
	c.firm,
	a.id,
	a.id_adj,
	a.use,
	ifn(id_adj in (select b.id from have as b where c.year=b.year and c.firm=b.firm),1,0)		
	as appears_as_id_in_have,
	a.use*(calculated appears_as_id_in_have) as indic_treat
from list as a left join have as c
on a.id=c.id
order by year,firm,id;
quit;

/* Second step: determine "treat" based on the previous indicators */
proc sql noprint;
create table output as
select distinct 
    year,firm,id,treat
    rom (select *,
	sum(use) as sum_use,
	sum(indic_treat) as sum_indic_treat,
	case
		when (calculated sum_use=0) then 4
		when (calculated sum_use>0 and calculated sum_indic_treat=0) then 5
		else 0
		end as treat
from list_ext
group by year, firm, id);
quit;

Let me know if you have questions !

 

Regards,

 

JD

View solution in original post

5 REPLIES 5
ChrisNZ
Tourmaline | Level 20

Like this?

proc sql;
 select a.*, 4*(NOBS=ONE) as CASE1  , 5*(sum(c.ID_ADJ)^=sum(d.ID)) as CASE2
 from HAVE a
        left join
      (select ID, count(*) as NOBS, sum(USE=0)as ONE from LIST group by ID) b
        on a.ID=b.ID
        left join
      (select ID, ID_ADJ from LIST where USE=1) c
        on a.ID=c.ID
        left join
      HAVE d 
        on  c.ID_ADJ=d.ID      
        and a.FIRM  =d.FIRM 
        and a.YEAR  =d.YEAR
 group by a.YEAR, a.FIRM, a.ID
 order by a.YEAR, a.FIRM, a.ID
 

year firm id CASE1 CASE2
2008 28013 1003 0 5
2008 28013 1004 4 0
2008 28013 1007 0 5
2008 28013 1009 0 0
2008 28013 1010 0 0
2008 28013 1013 0 0
2008 28013 1053 0 0
2008 28013 1074 0 0
2008 28013 1075 0 5
2009 28013 1009 0 5
2009 28013 1010 0 5
2009 28013 1053 0 5
2009 28013 1074 0 5
2009 28013 1075 0 5
2009 28332 1004 4 0
2009 28332 1007 0 0
2009 28332 1823 0 0

 

 

 

TrueTears
Obsidian | Level 7
@ChrisNZ Was there meant to be some code in your response? haha
ChrisNZ
Tourmaline | Level 20

Yes, laptop without mouse is no fun. Look now.

ChrisNZ
Tourmaline | Level 20

You can add variable TREAT once all is confirmed to work as expected:

max(calculated CASE1, calculated CASE2) as TREAT

 

JeanDo
Obsidian | Level 7

Hi,

You can use the following code:

 

/* First step: Build the "appears_as_id_in_have" indicator, for each id_adj group by year and firm*/
proc sql noprint;
create table list_ext as
select 	c.year,
	c.firm,
	a.id,
	a.id_adj,
	a.use,
	ifn(id_adj in (select b.id from have as b where c.year=b.year and c.firm=b.firm),1,0)		
	as appears_as_id_in_have,
	a.use*(calculated appears_as_id_in_have) as indic_treat
from list as a left join have as c
on a.id=c.id
order by year,firm,id;
quit;

/* Second step: determine "treat" based on the previous indicators */
proc sql noprint;
create table output as
select distinct 
    year,firm,id,treat
    rom (select *,
	sum(use) as sum_use,
	sum(indic_treat) as sum_indic_treat,
	case
		when (calculated sum_use=0) then 4
		when (calculated sum_use>0 and calculated sum_indic_treat=0) then 5
		else 0
		end as treat
from list_ext
group by year, firm, id);
quit;

Let me know if you have questions !

 

Regards,

 

JD

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 759 views
  • 0 likes
  • 3 in conversation