function dif(x) and last.observation

Contributor
Posts: 28

Hi all,

 

Could you please help with the following issue?

We have a dataset with variables x, y, z for subjects 1, 2, 3, etc.

We need to add the difference dif(x) between the last observation of each subject and the observation before it. A plain dif(x) works on every row, but it does not give that result when it is applied only to the last observation (last.subject). Is there a way to sort this out?

Please see the code and output below.

data multiple;
	infile datalines;
	input subject 1-2 X 4 Y 6 Z 8;
datalines;
01 1 2 3
01 4 5 6
01 7 8 9
01 8 9 5
02 8 7 6
02 5 4 3
02 2 1 0
03 8 7 9
03 7 5 4
;
run;

proc sort data=multiple;
		by subject;
run;

data one;
	set multiple;
	by subject;
		MX=dif(x);
			if last.subject then do; LX=dif(x); LY=dif(y); LZ=dif(z); end;
run;

proc print data=one; run;

[Output screenshots: Untitled.jpg, Untitled2.jpg]


Accepted Solutions
Solution
3 weeks ago
Super User
Posts: 10,516

Re: function dif(x) and last.observation

The DIF and LAG functions each maintain their own separate queue of values. So when one is called inside an IF block, its queue holds the value from the last time the condition was true, not the value from the previous record.

Note that your result for row 9 is the difference from the previous LAST.SUBJECT row, which is the last row of subject 2. And subject 1 had no output because there was no previous "last subject" row.
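To make the queue behavior concrete, here is a minimal sketch that calls DIF both ways on the data from the question. The names dif_demo, dx_every and dx_cond are made up for illustration, and the data is re-read with list input only so the example runs on its own.

```sas
/* Same values as the MULTIPLE data set in the question, already sorted by subject. */
data multiple;
    input subject x y z;
    datalines;
01 1 2 3
01 4 5 6
01 7 8 9
01 8 9 5
02 8 7 6
02 5 4 3
02 2 1 0
03 8 7 9
03 7 5 4
;
run;

data dif_demo;   /* hypothetical name */
    set multiple;
    by subject;

    /* Executed on every row: the queue always holds x from the previous row. */
    dx_every = dif(x);

    /* Executed only on the last row of a subject: the queue holds x from the
       previous time this statement ran, i.e. the previous subject's last row. */
    if last.subject then dx_cond = dif(x);
run;

proc print data=dif_demo; run;
```

On the three last.subject rows (rows 4, 7 and 9), dx_every is 1, -3 and -1, the difference from the row just before. dx_cond is missing on row 4 (first call, empty queue), then 2-8 = -6 on row 7 and 7-2 = 5 on row 9, because each call is compared with the previous subject's last row.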



All Replies
Super User
Posts: 5,085

Re: function dif(x) and last.observation

In general, the way you approach this is to calculate on every observation, then reset values to missing.  For example:

 

LX = MX;
LY = dif(y);
LZ = dif(z);
if last.subject=0 then do;
   lx = .;
   ly = .;
   lz = .;
end;

 

Also note that you might want to re-set MX:

 

MX = dif(x);
if first.subject then mx=.;
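Put together as a complete step, the idea might look like the sketch below. It is only one way to assemble the pieces: the MY and MZ variables and the use of CALL MISSING are my additions, and the data is re-read with list input so the example runs on its own.

```sas
/* Same values as the MULTIPLE data set in the question, already sorted by subject. */
data multiple;
    input subject x y z;
    datalines;
01 1 2 3
01 4 5 6
01 7 8 9
01 8 9 5
02 8 7 6
02 5 4 3
02 2 1 0
03 8 7 9
03 7 5 4
;
run;

data one;
    set multiple;
    by subject;

    /* Call DIF unconditionally so every row passes through each queue. */
    MX = dif(x);
    MY = dif(y);
    MZ = dif(z);

    /* A difference taken across a subject boundary is not meaningful. */
    if first.subject then call missing(MX, MY, MZ);

    /* Copy the differences only on the last row of each subject;
       LX, LY, LZ stay missing on every other row. */
    if last.subject then do;
        LX = MX;
        LY = MY;
        LZ = MZ;
    end;
run;

proc print data=one; run;
```

With this data, LX is filled only on rows 4, 7 and 9, with the values 1, -3 and -1. A subject with a single row would get missing LX, LY and LZ, since that row is both first and last.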

Contributor
Posts: 28

Re: function dif(x) and last.observation

Thank you. It seems below is what I need.

data one (drop= MX MY MZ);
	set multiple;
	by subject;
		MX=dif(x);MY=dif(y);MZ=dif(z);
		if last.subject=0 then do; LX=.; LY=.; LZ=.; end;
				else do; LX=MX; LY=MY; LZ=MZ; output; end;
run;
Super User
Posts: 5,085

Re: function dif(x) and last.observation

If you are planning on outputting just the last observation for each SUBJECT (as in your latest program), you can use much less code:

data one;
   set multiple;
   by subject;
   LX = dif(x);
   LY = dif(y);
   LZ = dif(z);
   if last.subject;
run;

Contributor
Posts: 28

Re: function dif(x) and last.observation


Thank you. Here is the task:

[Image: Untitled.jpg]

And here is my solution:

 

data one (drop= SumX X SumY Y SumZ Z);
	set multiple;
	by subject;
		difX3_2=dif(x);difY3_2=dif(y);difZ3_2=dif(z);
		difX3_1=dif2(x);difY3_1=dif2(y);difZ3_1=dif2(z);
		SumX+X;SumY+Y;SumZ+Z;
	if last.subject then do; MeanX=SumX/3;MeanY=SumY/3;MeanZ=SumZ/3; SumX=0; SumY=0; SumZ=0; output; end;
run;

 

 

Super User
Posts: 5,085

Re: function dif(x) and last.observation

Given that (a) you need to perform additional calculations, and (b) you need all the results on a single observation, I think you need to switch gears:

data want;
   set have;
   by subject;
   retain x1 x2 y1 y2 z1 z2;
   if first.subject then do;
      x1 = x;
      y1 = y;
      z1 = z;
   end;
   else if last.subject=0 then do;
      x2 = x;
      y2 = y;
      z2 = z;
   end;
   if last.subject;

   *** Now you have all 9 values on a single observation.  Perform the final calculations in whatever way you would prefer;
run;
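For what it's worth, here is a sketch of how the final calculations could be finished for x. It assumes exactly three observations per subject (which the /3 in the program above also implies), uses a small made-up HAVE data set, and borrows the names difX3_2, difX3_1 and MeanX from the earlier post. Resetting x2 at the start of each subject is my addition. The y and z variables would follow the same pattern.

```sas
/* Hypothetical input: exactly three rows per subject, sorted by subject. */
data have;
    input subject x y z;
    datalines;
1 1 2 3
1 4 5 6
1 7 8 9
2 8 7 6
2 5 4 3
2 2 1 0
;
run;

data want (keep=subject difX3_2 difX3_1 MeanX);
    set have;
    by subject;
    retain x1 x2;

    if first.subject then do;
        x1 = x;    /* first value for the subject */
        x2 = .;    /* clear any value left over from the previous subject */
    end;
    else if not last.subject then x2 = x;   /* second value */

    /* Keep only the last row; x now holds the third value. */
    if last.subject;

    difX3_2 = x - x2;           /* third minus second */
    difX3_1 = x - x1;           /* third minus first  */
    MeanX   = mean(x1, x2, x);  /* MEAN ignores missing arguments */
run;

proc print data=want; run;
```

For this data, subject 1 gets difX3_2 = 3, difX3_1 = 6 and MeanX = 4, and subject 2 gets -3, -6 and 5.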

Contributor
Posts: 28

Re: function dif(x) and last.observation


Astounding wrote:

... I think you need to switch gears:

 

*** Now you have all 9 values on a single observation.  Perform the final calculations in whatever way you would prefer;

run;


I think I got your idea... 

Actually, the task was to use LAG, DIF, and MEAN. Sorry for not mentioning that.

[Image: Untitled.jpg]

☑ This topic is SOLVED.
