BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
aj_goodnews
Calcite | Level 5

Dear all,

 

I have a dataset as follow,

 

ID year lag_4var lag_5var lag_6var lag_7var treat
1 1990 6 7 8 9 0
1 1991 11 7 8 9 1
1 1992 9 4 5 13 1
1 1993 12 13 14 11 1
2 2001 8 11 12 8 1
2 2002 6 12 5 6 1
2 2003 13 5 11 14 1

 

lag4_var = lag4(var) where var is a measurement of firm size for each ID/quarter. Similarly, lag5_var is the value of the 5th previous quarterly ‘var’ and lag6_var is the value of the 6th previous quarterly ‘var’, and so on.

Treat is a dummy variable equals 1 if any of lag#_var greater than or equals to 10,  or it equals 0 if all the lag#_var are less than 10.

 

I want to create a new variable, K. For observations with treat=1, K equals the earliest lag#_var that is greater than 10. For observations with treat=0. K equals the latest (most current) lag#_var.

 

For example, for observation with ID=1 and year=1990, treat=0, so in this case K should be lag4_var as lag4_var is the most recent number for lag#_var.

For observation with ID=2 and year=2003, treat=1, and in this case K should be lag7_var as lag7_var is the first and earliest lag#_var that is greater than 10.

So as a result, I want to have the following dataset:

 

ID year lag_4var lag_5var lag_6var lag_7var treat K
1 1990 6 7 8 9 0 6
1 1991 11 7 8 9 1 11
1 1992 9 4 5 13 1 13
1 1993 12 13 14 11 1 11
2 2001 8 11 12 8 1 12
2 2002 6 12 5 6 1 12
2 2003 13 5 11 14 1 14

 

1 ACCEPTED SOLUTION

Accepted Solutions
novinosrin
Tourmaline | Level 20

data have;
input ID	year	lag_4var	lag_5var	lag_6var	lag_7var	treat;
cards;
1	1990	6	7	8	9	0
1	1991	11	7	8	9	1
1	1992	9	4	5	13	1
1	1993	12	13	14	11	1
2	2001	8	11	12	8	1
2	2002	6	12	5	6	1
2	2003	13	5	11	14	1
;
data want;
 set have;
 array l lag_4var--lag_7var;
 if treat then do over l;
  if l>10 then k=l;
 end;
 else k=lag_4var;
run;

To address  

ID	year	lag_4var	lag_5var	lag_6var	lag_7var	treat
3      2000    10             9                8           7             1

 

So in this case, K=1 and the earliest lag#_var greater or equal to 10 is lag_4var, so K=10

data want;
 set have;
 array l lag_4var--lag_7var;
 if treat then do over l;
  if l>10 then k=l;
  else k=lag_4var;
 end;
 else k=lag_4var;
run;

View solution in original post

7 REPLIES 7
novinosrin
Tourmaline | Level 20

data have;
input ID	year	lag_4var	lag_5var	lag_6var	lag_7var	treat;
cards;
1	1990	6	7	8	9	0
1	1991	11	7	8	9	1
1	1992	9	4	5	13	1
1	1993	12	13	14	11	1
2	2001	8	11	12	8	1
2	2002	6	12	5	6	1
2	2003	13	5	11	14	1
;
data want;
 set have;
 array l lag_4var--lag_7var;
 if treat then do over l;
  if l>10 then k=l;
 end;
 else k=lag_4var;
run;

To address  

ID	year	lag_4var	lag_5var	lag_6var	lag_7var	treat
3      2000    10             9                8           7             1

 

So in this case, K=1 and the earliest lag#_var greater or equal to 10 is lag_4var, so K=10

data want;
 set have;
 array l lag_4var--lag_7var;
 if treat then do over l;
  if l>10 then k=l;
  else k=lag_4var;
 end;
 else k=lag_4var;
run;
aj_goodnews
Calcite | Level 5

Hi 

novinosrin
Tourmaline | Level 20

Yes, slight tweak if I understand you correctly

 

data want;
 set have;
 array l lag_4var--lag_7var;
 if treat then do over l;
  if l>=10 then do;
   sum=sum(sum,l);
   n=sum(n,1);
  end;
 end;
 else do over l;
  if l<10 then do;
   sum=sum(sum,l);
   n=sum(n,1);
  end;
 end;
 k=sum/n;
run;
art297
Opal | Level 21

How would you score the following record?

 

ID	year	lag_4var	lag_5var	lag_6var	lag_7var	treat
3      2000    10             9                8           7             1

Art, CEO, AnalystFinder.com

 

aj_goodnews
Calcite | Level 5
So in this case, K=1 and the earliest lag#_var greater or equal to 10 is lag_4var, so K=10
aj_goodnews
Calcite | Level 5
I think I forgot to put greater than or equal to 10 in my original posts, which caused your question. I apologize for that
AMSAS
SAS Super FREQ

Hi, based on your explanation I think you are looking for something like the following.
Note I used 7 as my cut off to fit my sample input data

 

/* Set up sample input data */

data input ;
	infile cards ;
	input var ;	
	lagVar1=lag1(var) ;
	lagVar2=lag2(var) ;
	lagVar3=lag3(var) ;
	lagVar4=lag4(var) ;
cards;
3
2
8
4
6
7
1
3
2
9
3
4
6
7
;

run ;

/* generate output */
data ouput (drop=i) ;
	set input ;
	/* Load lag values into an array */
	array lags{4} lagVar1-lagVar4 ;
	/* Set treat and k to 0 */
	treat=0 ;
	k=0 ;
	/* Loop through array backwards, starting with oldest lag value to most recent lag value */
	do i=4 to 1 by -1 ;
		/* if the lag value exceeds your trigger */
		if lags{i}>7 then do ;
			/* Set treat to 1 */
			treat=1 ;
			/* if k=0 */
			if k=0 then 
				/* then store the lag value */
				k=lags{i} ;
		end ;
	end ;
	/* if k=0 */
	if k=0 then
		/* set k to the most recent lag value */
		k=lags{1} ;
run ;

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 1434 views
  • 0 likes
  • 4 in conversation