How to get difference for all the numerical value in two rows?

Li2024 · Posted 04-24-2025 11:21 AM

Hi,

I have data like this and with almost 500 columns numerical value.

ID	A	B	C	D
1	1	2	3	4
1	2	3	4	5
2	1	3	4	5
2	2	6	3	8
3	1	3	4	7
3	3	4	3	6

I need an output like this, which use the last. ID _numeric_ MINUS first.ID _numeric_, Since there are almost 500 numerical features, I cannot subtract one by one,

ID	A	B	C	D
1	1	1	1	1
2	1	3	-1	3
3	2	1	-1	-1

Stu_SAS · Posted 04-24-2025 12:39 PM

Hey @Li2024! PROC EXPAND can do this very quickly. It's a great swiss-army knife of multi-row operations.

proc expand data=a out=want(where=(time > 0));
   by id;
   convert _NUMERIC_ / method=none transform=(dif 1);
run;

ID	TIME	A	B	C	D
1	1	    1	1	1	1
2	1	    1	3  -1	3
3	1	    2	1  -1  -1

You can ignore the TIME variable and drop it later if needed.

With a DATA Step, it looks like this:

data want;
    set a;
    by id;

    array numvar[*] _NUMERIC_;
    array difvar[5] _TEMPORARY_;

    do i = 2 to dim(difvar);
        difvar[i] = dif(numvar[i]);

        if(NOT first.id) then numvar[i] = difvar[i];
    end;

    if(NOT first.id) then output;

    drop i;
run;

ID	A	B	C	D
1	1	1	1	1
2	1	3  -1	3
3	2	1  -1  -1

Follow me on LinkedIn for SAS tips with #SASTipTuesday
Check out my GitHub for SAS tips and tricks
Read my blog for cool SAS and open source data science projects

Li2024 · Posted 04-24-2025 01:32 PM

The Data step code works perfect. My SAS doesn't support the proc expand.

Thank you so much!

Tom · Posted 04-24-2025 02:27 PM

You don't need the extra array.

And if you make ID character then it is easy to apply to all of the numeric variables.

data have;
  input ID $ A B C D;
cards;
1 1 2 3 4
1 2 3 4 5
2 1 3 4 5
2 2 6 3 8
3 1 3 4 7
3 3 4 3 6
;

data want;
  set have;
  by id;
  array _num_ _numeric_;
  do over _num_;
    _num_=dif(_num_);
  end;
  if not first.id;
run;

Stu_SAS · Posted 04-25-2025 11:05 AM

Great idea for making the ID a character to make the solution very elegant. Nicely done!

Follow me on LinkedIn for SAS tips with #SASTipTuesday
Check out my GitHub for SAS tips and tricks
Read my blog for cool SAS and open source data science projects

Tom · Posted 04-25-2025 11:28 AM

@Stu_SAS wrote:
Great idea for making the ID a character to make the solution very elegant. Nicely done!

If the ID is numeric you can just add a second SET statement with KEEP= dataset option to reset the ID back to the original value.

data want;
  set have;
  by id;
  array _num_ _numeric_;
  do over _num_;
    _num_=dif(_num_);
  end;
  set have(keep=id);
  if not first.id;
run;

mkeintz · Posted 04-24-2025 08:23 PM

Does every ID have exactly 2 records? Then the solution offered by @Tom is the way to go.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

Ksharp · Posted 04-24-2025 09:51 PM

If you have only TWO obs for each ID, I think the Tom's code is a best try. If there were not, you could try this IML code,but you also have license of SAS/IML to run IML code.

P.S. My IML code also could be used to TWO obs for each ID.

data have;
  input ID $ A B C D;
cards;
1 1 2 3 4
1 2 3 4 5
2 1 3 4 5
2 2 6 3 8
3 1 3 4 7
3 3 4 3 6
;
data first last;
 set have;
 by id;
 if first.id then output first;
 if last.id then output last;
run;

proc iml;
use first;
read all var {ID};
close;
use first(drop=id);
read all var _num_ into first[c=vname];
close;
use last(drop=id);
read all var _num_ into last;
want=last-first;
create want from ID want[c=('id'||vname)];
append from ID want;
close;
quit;

mkeintz · Posted 04-25-2025 09:33 AM

I would say that if you can have more than two obs for a given ID, just implement a minor modification to @Tom 's code:

data have;
  input ID $ A B C D;
cards;
1 1 2 3 4
1 2 3 4 5
2 1 3 4 5
2 2 6 3 8
3 1 3 4 7
3 9 9 9 9    /*extra obs for ID 3 */
3 3 4 3 6
;
data want;
 set have;
 by id;
 if first.id or last.id; /*No further processing of "middle" obs*/
 array _n _numeric_;
 do over _n;
   _n=dif(_n);
 end;
 if not (first.id);   /*Avoid instances of 1 obs per id*/
run;

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

Ksharp · Posted 04-25-2025 09:49 PM

Sure. Conditionally execute DIF() function when first.id and last.id .

How to get difference for all the numerical value in two rows?

Re: How to get difference for all the numerical value in two rows?

Re: How to get difference for all the numerical value in two rows?

Re: How to get difference for all the numerical value in two rows?

Re: How to get difference for all the numerical value in two rows?

Re: How to get difference for all the numerical value in two rows?

Re: How to get difference for all the numerical value in two rows?

Re: How to get difference for all the numerical value in two rows?

Re: How to get difference for all the numerical value in two rows?

Re: How to get difference for all the numerical value in two rows?

Registration is open

SAS Training: Just a Click Away