BookmarkSubscribeRSS Feed
Li2024
Fluorite | Level 6

Hi,

I have data like this and with almost 500 columns numerical value.

 

ID ABCD
11234
12345
21345
22638
31347
33436

 

I need an output like this, which use the last. ID  _numeric_ MINUS first.ID _numeric_, Since there are almost 500 numerical features, I cannot subtract one by one,

IDABCD
11111
213-13
321-1-1
9 REPLIES 9
Stu_SAS
SAS Employee

Hey @Li2024! PROC EXPAND can do this very quickly. It's a great swiss-army knife of multi-row operations.

 

proc expand data=a out=want(where=(time > 0));
   by id;
   convert _NUMERIC_ / method=none transform=(dif 1);
run;
ID	TIME	A	B	C	D
1	1	    1	1	1	1
2	1	    1	3  -1	3
3	1	    2	1  -1  -1

You can ignore the TIME variable and drop it later if needed.

 

With a DATA Step, it looks like this:

data want;
    set a;
    by id;

    array numvar[*] _NUMERIC_;
    array difvar[5] _TEMPORARY_;

    do i = 2 to dim(difvar);
        difvar[i] = dif(numvar[i]);

        if(NOT first.id) then numvar[i] = difvar[i];
    end;

    if(NOT first.id) then output;

    drop i;
run;
ID	A	B	C	D
1	1	1	1	1
2	1	3  -1	3
3	2	1  -1  -1

 

Li2024
Fluorite | Level 6

The Data step code works perfect. My SAS doesn't support the proc expand.

Thank you so much!

Tom
Super User Tom
Super User

You don't need the extra array.

And if you make ID character then it is easy to apply to all of the numeric variables.

data have;
  input ID $ A B C D;
cards;
1 1 2 3 4
1 2 3 4 5
2 1 3 4 5
2 2 6 3 8
3 1 3 4 7
3 3 4 3 6
;

data want;
  set have;
  by id;
  array _num_ _numeric_;
  do over _num_;
    _num_=dif(_num_);
  end;
  if not first.id;
run;
Stu_SAS
SAS Employee
Great idea for making the ID a character to make the solution very elegant. Nicely done!
Tom
Super User Tom
Super User

@Stu_SAS wrote:
Great idea for making the ID a character to make the solution very elegant. Nicely done!

If the ID is numeric you can just add a second SET statement with KEEP= dataset option to reset the ID back to the original value.

data want;
  set have;
  by id;
  array _num_ _numeric_;
  do over _num_;
    _num_=dif(_num_);
  end;
  set have(keep=id);
  if not first.id;
run;
mkeintz
PROC Star

Does every ID have exactly 2 records?  Then the solution offered by @Tom is the way to go.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Ksharp
Super User

If you have only TWO obs for each ID, I think the Tom's code is a best try. If there were not, you could try this IML code,but you also have license of SAS/IML to run IML code.

P.S. My IML code also could be used to TWO obs for each ID.

 

data have;
  input ID $ A B C D;
cards;
1 1 2 3 4
1 2 3 4 5
2 1 3 4 5
2 2 6 3 8
3 1 3 4 7
3 3 4 3 6
;
data first last;
 set have;
 by id;
 if first.id then output first;
 if last.id then output last;
run;

proc iml;
use first;
read all var {ID};
close;
use first(drop=id);
read all var _num_ into first[c=vname];
close;
use last(drop=id);
read all var _num_ into last;
want=last-first;
create want from ID want[c=('id'||vname)];
append from ID want;
close;
quit;

Ksharp_0-1745545871842.png

 

mkeintz
PROC Star

I would say that if you can have more than two obs for a given ID, just implement a minor modification to @Tom 's code:

 

data have;
  input ID $ A B C D;
cards;
1 1 2 3 4
1 2 3 4 5
2 1 3 4 5
2 2 6 3 8
3 1 3 4 7
3 9 9 9 9    /*extra obs for ID 3 */
3 3 4 3 6
;
data want;
 set have;
 by id;
 if first.id or last.id; /*No further processing of "middle" obs*/
 array _n _numeric_;
 do over _n;
   _n=dif(_n);
 end;
 if not (first.id);   /*Avoid instances of 1 obs per id*/
run;

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Ksharp
Super User
Sure. Conditionally execute DIF() function when first.id and last.id .

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 9 replies
  • 1750 views
  • 12 likes
  • 5 in conversation