BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
crawfe
Quartz | Level 8

Hi. Row-row calculations seem tougher in SAS...

I need to calculate the difference and % difference between values in adjacent rows (Var2-Var1 values)

but only within specific Name groups. Thanks for any ideas!

I would like to do it within EG but if I need to run a separate program and join back, I will.

 

Have:                                      Want:

  Name      Var        Value         Name           Difference    %Dif

Name1     Var1        4              Name1           -2                -50%

Name1     Var2        2              Name2          - 6                -75%

Name2     Var1        8              Name3            1                 25%   

Name2     Var2       2             

Name3     Var1       4                  

Name3     Var2       5              

etc.                   

1 ACCEPTED SOLUTION

Accepted Solutions
Jagadishkatam
Amethyst | Level 16

Please try the below code

 

data have;
inut Name$ Var$ Value;
cards;        
Name1 Var1  4            
Name1 Var2  2            
Name2 Var1   8              
Name2 Var2   2             
Name3 Var1  4                  
Name3 var2  5  
;

proc sort data=have;
by name var;
run;

data want;
set have;
by name var;
if first.name then sum=value;
else sum+value;
dif=value-lag(value);
pct=(value-lag(value))/lag(value);
if first.name then do;dif=.;pct=.;end;
if last.name;
format pct percentn.;
run;
Thanks,
Jag

View solution in original post

11 REPLIES 11
ballardw
Super User

@crawfe wrote:

Hi. Row-row calculations seem tougher in SAS...

I need to calculate the difference and % difference between values in adjacent rows (Var2-Var1 values)

but only within specific Name groups. Thanks for any ideas!

I would like to do it within EG but if I need to run a separate program and join back, I will.

 

Have:                                      Want:

  Name      Var        Value         Name           Difference    %Dif

Name1     Var1        4              Name1           -2                -50%

Name1     Var2        2              Name2          - 6                -75%

Name2     Var1        8              Name3            1                 25%   

Name2     Var2       2             

Name3     Var1       4                  

Name3     Var2       5              

etc.                   


You really need to provide some details what the Var values play in this scenario. Such as are there ever values other than Var1, Var2? If so provide an example of the data and what the result should be in that case.

What you show does not involve the value of the variable Var at all.

 

Providing data in the form of data step helps resolve questions about actual variable names and types. Paste into a code box to preserve formatting of code and values as the message windows my reformat code or data such that it will not actually run.

 

One way to do this:

data have;
   input name $ value;
datalines;
A 4
A 2
B 8
B 2
C 4
C 5
;
 
data want;
   set have;
   by name;
   lv= lag(value);
   dv= dif(value);
   if not (first.name) then do;
      dif = DV;
      pct = dv/lv;
      output;
   end;
   keep name dif pct;
run;

   

Assumes the data is sorted by name. If you have multiple values of Name (more than 2) you will get difference per sequential pair. If there is only one name value the code above will not output the name as there is no difference to calculate.

crawfe
Quartz | Level 8

I was careless! I called it Var1,2. I should have called it Label1, Label2. 

For example a Before/After comparison. 

ballardw
Super User

@crawfe wrote:

I was careless! I called it Var1,2. I should have called it Label1, Label2. 

For example a Before/After comparison. 


Still haven't actually described the role in the output. You don't show either value in the "want".

So it sounds like you may want to go back and revisit what your actual desired output should look like.

 

And consider if it is all possible that you have 3 or more identical name values or only one. If so, what output do you want for those cases?

Jagadishkatam
Amethyst | Level 16

Please try the below code

 

data have;
inut Name$ Var$ Value;
cards;        
Name1 Var1  4            
Name1 Var2  2            
Name2 Var1   8              
Name2 Var2   2             
Name3 Var1  4                  
Name3 var2  5  
;

proc sort data=have;
by name var;
run;

data want;
set have;
by name var;
if first.name then sum=value;
else sum+value;
dif=value-lag(value);
pct=(value-lag(value))/lag(value);
if first.name then do;dif=.;pct=.;end;
if last.name;
format pct percentn.;
run;
Thanks,
Jag
VDD
Ammonite | Level 13 VDD
Ammonite | Level 13

@crawfe are there always 2 records for each calculation?

if so then the solutions provided work.

If not then the solutions provided may give unexpected results if your data is like this.

data have;
input Name$ Var$ Value;
cards;        
Name1 Var1  4            
Name1 Var2  2
Name1 Var3  1 
Name2 Var1   8              
Name2 Var2   2             
Name3 Var1  4                  
Name3 var2  5  
;
crawfe
Quartz | Level 8

LOL. I was just going to mention that. The code is just right if my table is perfectly symmetric.

Browsing my table, I see cases similar to what you

mention where there is not always a pair of Names to calculate the difference or percent change.

Apparently, if there was no input, there was no line.

 

I will apparently have to put in a few IF-THEN patches

like if pct ="" then pct = 100% (or -100% depending on the before/after flag). 😞

 

data have;
input Name$ Var$ Value;
cards;        
Name1 Before  4            
Name1 After   2
Name2 Before  8              
Name2 After   2 
Name3 After 3 Name4 Before 4 Name4 After 5
Name5 Before 6
Name6 Before 3
Name6 After 1 ; 

 

Jagadishkatam
Amethyst | Level 16

so you can try the below code, same as i posted before but with slight changes as per the new data

 

data have;
input Name$ Var$ Value;
if var='Before' then varn=1;
else if var='After' then varn=1;
cards;        
Name1 Before  4            
Name1 After   2
Name2 Before  8              
Name2 After   2 
Name3 After   3            
Name4 Before  4                  
Name4 After   5 
Name5 Before  6
Name6 Before  3
Name6 After   1
;

proc sort data=have;
by name varn;
run;

data want;
set have;
by name varn;
if first.name then sum=value;
else sum+value;
dif=value-lag(value);
pct=(value-lag(value))/lag(value);
if first.name then do;dif=.;pct=.;end;
if last.name;
format pct percentn.;
drop sum;
run;
Thanks,
Jag
ballardw
Super User

Perhaps reshaping the data would simplify things:

Again assuming sort by Name and no more than two possible values of Var:

proc transpose data=have
   out=trans;
   by name;
   id var;
   var value;
run;

Now you have two variables name Before and After with the values of Value assigned. Use a data step to calculate on the row. This would allow likely simpler coding for what to do with either the Before or After missing (not to mention the potential divide by 0 from no change that hadn't been addressed yet).

crawfe
Quartz | Level 8

Interesting. I have never seen transpose before. I can see how it could be useful. Thanks.

PGStats
Opal | Level 21

This should work:

 

data have;
input Name$ Var$ Value;
cards;        
Name1 Before  4            
Name1 After   2
Name2 Before  8              
Name2 After   2 
Name3 After   3            
Name4 Before  4                  
Name4 After   5 
Name5 Before  6
Name6 Before  3
Name6 After   1
;

data want;
merge have(where=(Var="Before") rename=Value=Value1)
    have(where=(Var="After") rename=Value=Value2);
by Name;
if nmiss(Value1, value2) = 0 then do;
    difference = Value2-Value1;
    PctDiff = Difference / Value1;
    end;
format pctDiff percentn7.1;
keep name difference pctDiff;
run;

proc print data=want noobs; run;
                           Name     difference    PctDiff

                           Name1        -2        -50.0%
                           Name2        -6        -75.0%
                           Name3         .           .
                           Name4         1         25.0%
                           Name5         .           .
                           Name6        -2        -66.7%
PG
crawfe
Quartz | Level 8

Several good solution variations here. Thanks!!

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 11 replies
  • 14868 views
  • 0 likes
  • 5 in conversation