Math with Observations from different Variables/Data Sets

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 14
Accepted Solution

Math with Observations from different Variables/Data Sets


Hi,

I have two Data Sets (A and B). Data Set A has 1 variable (called A1) and Data Set B has also 1 variable (called B1). Both have many observations.

I want to create a new Data Set, named C and with one variable called C1. This C1 variable should be generate as follow:

Logic:

A1B1C1
Observation_1Observation_1Absolute value of { [[A1 (Observation_1)]  -  [B1 (Observation_2)]}
Observation_2Observation_2Absolute value of { [A1 (Observation_2)]  -  [B1 (Observation_3)]}
Observation_3Observation_3Absolute value of { [A1 (Observation_3)]  -  [B1 (Observation_4)]}
Observation_4Observation_4Absolute value of { [A1 (Observation_4)]  -  [B1 (Observation_5)]}
Observation_5Observation_5Absolute value of { [A1 (Observation_5)]  -  [B1 (Observation_6)]}
Observation_6Observation_6.

Exemple:

A1B1C1
3110
-734
4-36
-2-23
3-52
131.

How should I proceed?

Thanks!


Accepted Solutions
Solution
‎07-15-2013 01:43 PM
Super User
Posts: 5,082

Re: Math with Observations from different Variables/Data Sets

Here's a one-step alternative:

data want;

if set_is_done=0 then set b (firstobs=2 keep=b1 rename=(b1=next_b1)) end=set_is_done;

drop next_b1;

merge a b end=merge_is_done;

if merge_is_done=0 then c1 = abs(a1 - next_b1);

run;

If you don't want to drop NEXT_B1, you would also need to reset its value to missing on the final observation.  You're also responsible for being sure that the data sets contain the same number of observations when you begin.  Good luck.

View solution in original post


All Replies
Super User
Posts: 17,819

Re: Math with Observations from different Variables/Data Sets

Create a count variable on each dataset and then merge with n=n+1;

data a_count;

set a;

count=_n_;

run;

data b_count;

set b;

count=_n_;

run;

proc sql;

create table want as

select a.a1, b.b1, abs(a1-b1) as c1

from a_count as a

join b_count as b

on a.count=b.count+1;

quit;

Super Contributor
Posts: 333

Re: Math with Observations from different Variables/Data Sets

You beat me to the post ... all well here is my version:

data a;

  infile cards dsd;

  input A1;

  cards;

3

-7

4

-2

3

13

run;

data b;

  infile cards dsd;

  input B1;

  cards;

11

3

-3

-2

-5

1

run;

data a2;

  set a;

  Obs= _n_ ;

run;

data b2;

  set b;

  Obs= _n_ ;

run;

proc sql;

  create table out as

  select a.a1, a.b1,

  abs(a.a1 - c.b1) as c1

  from (select a.obs, a.a1, b.b1 from a2 as a , b2 as b where a.obs = b.obs) as a left join b2 as c

  on a.obs+1 = c.obs

  ;

quit;

Super Contributor
Posts: 333

Re: Math with Observations from different Variables/Data Sets

Just a little point of correction ... I think Reeza's on statement should read on a.count+1=b.count;

EJ

Super User
Posts: 17,819

Re: Math with Observations from different Variables/Data Sets

I think mine is correct, but I'm too lazy to check, I'll leave it up to the OP to use what's best for them Smiley Happy

Absolute value of { [[A1 (Observation_1)]  -  [B1 (Observation_2)]}

However I don't think the data would be correct, because they're asking for obs1 and obs1 and then obs1-obs2 in the dataset. Which seems weird, so I'll wait for confirmation for what the OP wants.

Solution
‎07-15-2013 01:43 PM
Super User
Posts: 5,082

Re: Math with Observations from different Variables/Data Sets

Here's a one-step alternative:

data want;

if set_is_done=0 then set b (firstobs=2 keep=b1 rename=(b1=next_b1)) end=set_is_done;

drop next_b1;

merge a b end=merge_is_done;

if merge_is_done=0 then c1 = abs(a1 - next_b1);

run;

If you don't want to drop NEXT_B1, you would also need to reset its value to missing on the final observation.  You're also responsible for being sure that the data sets contain the same number of observations when you begin.  Good luck.

Respected Advisor
Posts: 3,777

Re: Math with Observations from different Variables/Data Sets

And a similar data step alternative.  SQL seems unsuited to this task.

data c;
   set a;
   if not eof then set b(firstobs=2) end=eof;
   C1 = abs(a1-b1);
  
set b;
   output;
  
call missing(of _all_);
   run;
Occasional Contributor
Posts: 14

Re: Math with Observations from different Variables/Data Sets

Thank you all guys!

Just one more question:

What if instead of  "Absolute value of { [[A1 (Observation_1)]  -  [B1 (Observation_2)]}"

the logic would be "Absolute value of { [[A1 (Observation_1)]  -  [A1 (Observation_2)]}"

?

Super User
Posts: 5,082

Re: Math with Observations from different Variables/Data Sets

Well, the change would be relatively straightforward.  To bring in the next A observation instead of the next B observation:

if set_is_done=0 then set a (firstobs=2 keep=a1 rename=(a1=next_a1)) end=set_is_done;

There would be a small change to the formula to compute c1.

Respected Advisor
Posts: 3,777

Re: Math with Observations from different Variables/Data Sets

I would use DIF function for that similar to LAG but with difference.  The difference would be on a different observation but that shouldnt' matter.

Occasional Contributor
Posts: 14

Re: Math with Observations from different Variables/Data Sets

Thanks data_null_!

Super User
Posts: 9,681

Re: Math with Observations from different Variables/Data Sets

Not tested code :

data want;

merge a a(firstobs=2 rename=(a1=_a1));

C1 = abs(a1-_a1);

.......

Ksharp

Occasional Contributor
Posts: 14

Re: Math with Observations from different Variables/Data Sets

Thanks you too Ksharp!

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 12 replies
  • 292 views
  • 13 likes
  • 6 in conversation