BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Lijuu
Obsidian | Level 7

I have data consists ID, V1, and V2 for around 11000 observations. The sample format is given below for six observations.

SAS.PNG

From this data, I wanna create V3, V4, and V5 as follows:

V3 is the inverse order of V2. Example for ID=1, V2= (2,1) then V3=(1,2).

V4 Is the cumulative greater type of V1 (in order of V3). Example for ID=1, V1=(1, 7) then V4=(7,8).

V5 is always equal to 0 if V3=1, else V5 is equal to lag1 of V4.

The expected output that I did manually is given below:

SAS2.PNG

 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
PeterClemmensen
Tourmaline | Level 20

This gives you v3 and v4. 

 

Regarding v5. You say you want the lag of v4. Though your desired results suggest you want the lead of v4?

 

data have;
input ID v1 v2;
datalines;
1 1 1
1 7 2
2 16 1
2 3 2
3 11 1
4 13 1
4 3 2
5 4 1
5 3 2
5 2 3
5 2 4
6 2 1
6 4 2
6 3 3
;

data want(drop = c n);

   do _N_ = 1 by 1 until (last.ID);
      set have;
	  by ID;

      if first.ID then c = 0;

      array vv1{999} _temporary_;
	  array vv2{999} _temporary_;
	  
	  c + v1;

	  vv1 [_N_] = c;
      vv2 [_N_] = v2;
   end;

   n = _N_;
   v4 = c;

   do _N_ = 1 to _N_;
      set have;

      v3 = vv2[1 + n - _N_];
	  v4 = v4 - ifn(_N_ > 1, lag(v1), 0);

	  output;
   end;

   call missing(of vv1[*], of vv2[*]);

run;

 

Result:

 

ID  v1  v2  v4  v3 
1   1   1   8   2 
1   7   2   7   1 
2   16  1   19  2 
2   3   2   3   1 
3   11  1   11  1 
4   13  1   16  2 
4   3   2   3   1 
5   4   1   11  4 
5   3   2   7   3 
5   2   3   4   2 
5   2   4   2   1 
6   2   1   9   3 
6   4   2   7   2 
6   3   3   3   1 

 

View solution in original post

7 REPLIES 7
PeterClemmensen
Tourmaline | Level 20

This gives you v3 and v4. 

 

Regarding v5. You say you want the lag of v4. Though your desired results suggest you want the lead of v4?

 

data have;
input ID v1 v2;
datalines;
1 1 1
1 7 2
2 16 1
2 3 2
3 11 1
4 13 1
4 3 2
5 4 1
5 3 2
5 2 3
5 2 4
6 2 1
6 4 2
6 3 3
;

data want(drop = c n);

   do _N_ = 1 by 1 until (last.ID);
      set have;
	  by ID;

      if first.ID then c = 0;

      array vv1{999} _temporary_;
	  array vv2{999} _temporary_;
	  
	  c + v1;

	  vv1 [_N_] = c;
      vv2 [_N_] = v2;
   end;

   n = _N_;
   v4 = c;

   do _N_ = 1 to _N_;
      set have;

      v3 = vv2[1 + n - _N_];
	  v4 = v4 - ifn(_N_ > 1, lag(v1), 0);

	  output;
   end;

   call missing(of vv1[*], of vv2[*]);

run;

 

Result:

 

ID  v1  v2  v4  v3 
1   1   1   8   2 
1   7   2   7   1 
2   16  1   19  2 
2   3   2   3   1 
3   11  1   11  1 
4   13  1   16  2 
4   3   2   3   1 
5   4   1   11  4 
5   3   2   7   3 
5   2   3   4   2 
5   2   4   2   1 
6   2   1   9   3 
6   4   2   7   2 
6   3   3   3   1 

 

Lijuu
Obsidian | Level 7
Yes Indeed! V5 needs to be the lead of v4 as like as the suggested results. Thank you very much, Dear.
PeterClemmensen
Tourmaline | Level 20

Ok. Try this then

 

data have;
input ID v1 v2;
datalines;
1 1 1
1 7 2
2 16 1
2 3 2
3 11 1
4 13 1
4 3 2
5 4 1
5 3 2
5 2 3
5 2 4
6 2 1
6 4 2
6 3 3
;

data want(drop = c n);

   do _N_ = 1 by 1 until (last.ID);
      set have;
     by ID;

      if first.ID then c = 0;

      array vv1{999} _temporary_;
      array vv2{999} _temporary_;
     
      c + v1;

      vv1 [_N_] = c;
      vv2 [_N_] = v2;
   end;

   n = _N_;
   v4 = c;

   do _N_ = 1 to _N_;
      set have;

      v3 = vv2[1 + n - _N_];
      v4 = v4 - ifn(_N_ > 1, lag(v1), 0);
      v5 = (v4 - v1) * (v3 ne 1);

      output;
   end;

   call missing(of vv1[*], of vv2[*]);

run;

 

Result:

 

Obs	ID	v1	v2	v4	v3	v5
1	1	1	1	8	2	7
2	1	7	2	7	1	0
3	2	16	1	19	2	3
4	2	3	2	3	1	0
5	3	11	1	11	1	0
6	4	13	1	16	2	3
7	4	3	2	3	1	0
8	5	4	1	11	4	7
9	5	3	2	7	3	4
10	5	2	3	4	2	2
11	5	2	4	2	1	0
12	6	2	1	9	3	7
13	6	4	2	7	2	3
14	6	3	3	3	1	0
Lijuu
Obsidian | Level 7
Thank you very much, Dear.
It is successfully working.
PeterClemmensen
Tourmaline | Level 20
Ksharp
Super User
data have;
input ID v1 v2;
datalines;
1 1 1
1 7 2
2 16 1
2 3 2
3 11 1
4 13 1
4 3 2
5 4 1
5 3 2
5 2 3
5 2 4
6 2 1
6 4 2
6 3 3
;
data temp;
 set have;
 by id;
 if first.id then n=0;
 n+1;
run;
proc sort data=temp;by id descending n;run;
data temp1(keep=v3) temp2(keep=id n v4);
 set temp;
 by id;
 v3=v2;
 if first.id then v4=0;
 v4+v1;
 run;
 proc sort data=temp2;by id n;run;
proc sort data=temp2 out=temp3;by id  descending n;run;
data temp3;
 set temp3;
 by id;
 v5=lag(v4);
 if first.id then v5=.;
run;
proc sort data=temp3;by id n;run;

data want;
 merge have temp1 temp2(keep=v4) temp3(keep=v5); 
 if v3=1 then v5=0;
run;
Lijuu
Obsidian | Level 7
Thank you very much, Dear!

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
What is ANOVA?

ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 7 replies
  • 798 views
  • 0 likes
  • 3 in conversation