## Math with Observations from different Variables/Data Sets

Solved
Occasional Contributor
Posts: 14

# Math with Observations from different Variables/Data Sets

Hi,

I have two Data Sets (A and B). Data Set A has 1 variable (called A1) and Data Set B has also 1 variable (called B1). Both have many observations.

I want to create a new Data Set, named C and with one variable called C1. This C1 variable should be generate as follow:

Logic:

 A1 B1 C1 Observation_1 Observation_1 Absolute value of { [[A1 (Observation_1)]  -  [B1 (Observation_2)]} Observation_2 Observation_2 Absolute value of { [A1 (Observation_2)]  -  [B1 (Observation_3)]} Observation_3 Observation_3 Absolute value of { [A1 (Observation_3)]  -  [B1 (Observation_4)]} Observation_4 Observation_4 Absolute value of { [A1 (Observation_4)]  -  [B1 (Observation_5)]} Observation_5 Observation_5 Absolute value of { [A1 (Observation_5)]  -  [B1 (Observation_6)]} Observation_6 Observation_6 .

Exemple:

 A1 B1 C1 3 11 0 -7 3 4 4 -3 6 -2 -2 3 3 -5 2 13 1 .

How should I proceed?

Thanks!

Accepted Solutions
Solution
‎07-15-2013 01:43 PM
Super User
Posts: 6,774

## Re: Math with Observations from different Variables/Data Sets

Here's a one-step alternative:

data want;

if set_is_done=0 then set b (firstobs=2 keep=b1 rename=(b1=next_b1)) end=set_is_done;

drop next_b1;

merge a b end=merge_is_done;

if merge_is_done=0 then c1 = abs(a1 - next_b1);

run;

If you don't want to drop NEXT_B1, you would also need to reset its value to missing on the final observation.  You're also responsible for being sure that the data sets contain the same number of observations when you begin.  Good luck.

All Replies
Super User
Posts: 23,724

## Re: Math with Observations from different Variables/Data Sets

Create a count variable on each dataset and then merge with n=n+1;

data a_count;

set a;

count=_n_;

run;

data b_count;

set b;

count=_n_;

run;

proc sql;

create table want as

select a.a1, b.b1, abs(a1-b1) as c1

from a_count as a

join b_count as b

on a.count=b.count+1;

quit;

Super Contributor
Posts: 334

## Re: Math with Observations from different Variables/Data Sets

You beat me to the post ... all well here is my version:

data a;

infile cards dsd;

input A1;

cards;

3

-7

4

-2

3

13

run;

data b;

infile cards dsd;

input B1;

cards;

11

3

-3

-2

-5

1

run;

data a2;

set a;

Obs= _n_ ;

run;

data b2;

set b;

Obs= _n_ ;

run;

proc sql;

create table out as

select a.a1, a.b1,

abs(a.a1 - c.b1) as c1

from (select a.obs, a.a1, b.b1 from a2 as a , b2 as b where a.obs = b.obs) as a left join b2 as c

on a.obs+1 = c.obs

;

quit;

Super Contributor
Posts: 334

## Re: Math with Observations from different Variables/Data Sets

Just a little point of correction ... I think Reeza's on statement should read on a.count+1=b.count;

EJ

Super User
Posts: 23,724

## Re: Math with Observations from different Variables/Data Sets

I think mine is correct, but I'm too lazy to check, I'll leave it up to the OP to use what's best for them

Absolute value of { [[A1 (Observation_1)]  -  [B1 (Observation_2)]}

However I don't think the data would be correct, because they're asking for obs1 and obs1 and then obs1-obs2 in the dataset. Which seems weird, so I'll wait for confirmation for what the OP wants.

Solution
‎07-15-2013 01:43 PM
Super User
Posts: 6,774

## Re: Math with Observations from different Variables/Data Sets

Here's a one-step alternative:

data want;

if set_is_done=0 then set b (firstobs=2 keep=b1 rename=(b1=next_b1)) end=set_is_done;

drop next_b1;

merge a b end=merge_is_done;

if merge_is_done=0 then c1 = abs(a1 - next_b1);

run;

If you don't want to drop NEXT_B1, you would also need to reset its value to missing on the final observation.  You're also responsible for being sure that the data sets contain the same number of observations when you begin.  Good luck.

Posts: 3,852

## Re: Math with Observations from different Variables/Data Sets

And a similar data step alternative.  SQL seems unsuited to this task.

data c;
set a;
if not eof then set b(firstobs=2) end=eof;
C1 = abs(a1-b1);

set b;
output;

call missing(of _all_);
run;
Occasional Contributor
Posts: 14

## Re: Math with Observations from different Variables/Data Sets

Thank you all guys!

Just one more question:

What if instead of  "Absolute value of { [[A1 (Observation_1)]  -  [B1 (Observation_2)]}"

the logic would be "Absolute value of { [[A1 (Observation_1)]  -  [A1 (Observation_2)]}"

?

Super User
Posts: 6,774

## Re: Math with Observations from different Variables/Data Sets

Well, the change would be relatively straightforward.  To bring in the next A observation instead of the next B observation:

if set_is_done=0 then set a (firstobs=2 keep=a1 rename=(a1=next_a1)) end=set_is_done;

There would be a small change to the formula to compute c1.

Posts: 3,852

## Re: Math with Observations from different Variables/Data Sets

I would use DIF function for that similar to LAG but with difference.  The difference would be on a different observation but that shouldnt' matter.

Occasional Contributor
Posts: 14

## Re: Math with Observations from different Variables/Data Sets

Thanks data_null_!

Super User
Posts: 10,778

## Re: Math with Observations from different Variables/Data Sets

Not tested code :

data want;

merge a a(firstobs=2 rename=(a1=_a1));

C1 = abs(a1-_a1);

.......

Ksharp

Occasional Contributor
Posts: 14

## Re: Math with Observations from different Variables/Data Sets

Thanks you too Ksharp!

🔒 This topic is solved and locked.