BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
hk2013
Fluorite | Level 6

i am trying to subtract rows based on variables 

 

id        Month         TYPE       ENC          count      
11      OCT2019     DO           Pan            249
11      OCT2019     DO           Unpan        12
11      OCT2019     TA             Pan            63
11      OCT2019     TA             Unpan         5

11     OCT2019      VA            Pan             3

11      NOV2019     DO           Pan            202
11      NOV2019     DO           Unpan        23
11      NOV2019     TA             Pan            55
11      NOV2019     TA             Unpan        12    

 

what i want

id        Month         TYPE            diff 
11      OCT2019     DO               -237
11      OCT2019     TA                -58

11      OCT2019    VA                -3 

11      NOV2019     DO               -178
11      NOV2019     TA                 -43

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

If there are always exactly two values of ENC for each Type and they occur so that you want the last difference then this may work:

data want;
   set have;
   by id month type notsorted;
   diff = dif(count);
   if last.type;
   drop enc count;
run;

DIF is a special function that does the difference between the current value and the previous value. You can look at a lot of previous by using DIF2 DIF3 .. DIF99 and such where the number indicates how far back you want to look in the data set but it is ONLY those two specific values. The default is DIF1 if no number is supplied.

 

View solution in original post

6 REPLIES 6
ballardw
Super User

If there are always exactly two values of ENC for each Type and they occur so that you want the last difference then this may work:

data want;
   set have;
   by id month type notsorted;
   diff = dif(count);
   if last.type;
   drop enc count;
run;

DIF is a special function that does the difference between the current value and the previous value. You can look at a lot of previous by using DIF2 DIF3 .. DIF99 and such where the number indicates how far back you want to look in the data set but it is ONLY those two specific values. The default is DIF1 if no number is supplied.

 

hk2013
Fluorite | Level 6

thank you that works however i just noticed that i have another ENC 'VA' which sometimes has a pan and some times has UNpan and i am noticing that its not giving the correct difference for that ENC 

 

 

so that the dat looks like this 

 

 

The data i have 

id        Month         TYPE       ENC          count      
11      OCT2019     DO           Pan            249
11      OCT2019     DO           Unpan        12
11      OCT2019     TA             Pan            63
11      OCT2019     TA             Unpan         5

11     OCT2019      VA            Pan             3

11      NOV2019     DO           Pan            202
11      NOV2019     DO           Unpan        23
11      NOV2019     TA             Pan            55
11      NOV2019     TA             Unpan        12    

 

what i want

id        Month         TYPE            diff 
11      OCT2019     DO               -237
11      OCT2019     TA                -58

11      OCT2019    VA                -3 

11      NOV2019     DO               -178
11      NOV2019     TA                 -43

ballardw
Super User

@hk2013 wrote:

thank you that works however i just noticed that i have another ENC 'VA' which sometimes has a pan and some times has UNpan and i am noticing that its not giving the correct difference for that ENC 

 

 

so that the dat looks like this 

 

 

The data i have 

id        Month         TYPE       ENC          count      
11      OCT2019     DO           Pan            249
11      OCT2019     DO           Unpan        12
11      OCT2019     TA             Pan            63
11      OCT2019     TA             Unpan         5

11     OCT2019      VA            Pan             3

11      NOV2019     DO           Pan            202
11      NOV2019     DO           Unpan        23
11      NOV2019     TA             Pan            55
11      NOV2019     TA             Unpan        12    

 

what i want

id        Month         TYPE            diff 
11      OCT2019     DO               -237
11      OCT2019     TA                -58

11      OCT2019    VA                -3 

11      NOV2019     DO               -178
11      NOV2019     TA                 -43


Please note that my response says : If there are always exactly two values of ENC for each Type and they occur so that you want the last difference then this may work.

 

So you have different data then you provided as an example. What should the value be when there is a type with only one ENC value. Please provide an explicit rule as a single example doesn't really show all the possibilities.

 

This may also be a good time to look if you have more combinations of Type and Enc. What if you have Type with a single Enc of Unpan? Would that have a different rule than a single Enc of Pan? Are there any other values of Enc then Pan and Unpan? Would they have  different rules?

hk2013
Fluorite | Level 6

Sorry i know your reply had said that it will only work for two values. i just noticed the 3rd one just now. The issue is with ENC 'VA' it is sometime with type pan and sometimes with unpan but i have gone through all my data and there are only three ENC "TA" 'DO' and "VA' and type 'pan' and 'unpan' 

 

id        Month         TYPE       ENC          count      
11      OCT2019     DO           Pan            249
11      OCT2019     DO           Unpan        12
11      OCT2019     TA             Pan            63
11      OCT2019     TA             Unpan         5

11     OCT2019      VA            Pan             3

11      NOV2019     DO           Pan            202
11      NOV2019     DO           Unpan        23
11      NOV2019     TA             Pan            55
11      NOV2019     TA             Unpan        12  

11     NOV2019      VA           Unpan          9

 

what i want

id        Month         TYPE            diff 
11      OCT2019     DO               -237
11      OCT2019     TA                -58

11      OCT2019    VA                -3 

11      NOV2019     DO               -178
11      NOV2019     TA                 -43

11      NOV2019     VA                  9 

ballardw
Super User

@hk2013 wrote:

Sorry i know your reply had said that it will only work for two values. i just noticed the 3rd one just now. The issue is with ENC 'VA' it is sometime with type pan and sometimes with unpan but i have gone through all my data and there are only three ENC "TA" 'DO' and "VA' and type 'pan' and 'unpan' 

 

id        Month         TYPE       ENC          count      
11      OCT2019     DO           Pan            249
11      OCT2019     DO           Unpan        12
11      OCT2019     TA             Pan            63
11      OCT2019     TA             Unpan         5

11     OCT2019      VA            Pan             3

11      NOV2019     DO           Pan            202
11      NOV2019     DO           Unpan        23
11      NOV2019     TA             Pan            55
11      NOV2019     TA             Unpan        12  

11     NOV2019      VA           Unpan          9

 

what i want

id        Month         TYPE            diff 
11      OCT2019     DO               -237
11      OCT2019     TA                -58

11      OCT2019    VA                -3 

11      NOV2019     DO               -178
11      NOV2019     TA                 -43

11      NOV2019     VA                  9 


I suspect I guess the rule for the single TYPE cases but I really would like you to state it explicitly.

When you use a BY statement in the data step as in this case there are a number of automatic variables created for each one. You reference these with the First.variablename or Last.varaiblename syntax. These automatic variables have a value of 1 when true First.variablename =1 means this is first record of that group, Last.variablename=1 means it is the last record of a group, otherwise the result is 0. You can test if a group only has one record by:

If First.variablename and Last.variablename 

So an outline would be something like:

data want;
   set have;
   by id month type notsorted;
   diff = dif(count);
   if first.type and last.type then do;
     <something apparently depending on the value of ENC>
   end;
   else if last.type then output;
   drop enc count;
run;

where you would replace the "<something apparently depending on the value of ENC>" with the rules of how to assign the Diff value when only one Type exists. And would likely include an output as the last bit inside the Do/End.

 

Astounding
PROC Star

The original solultion looks like this:

 

data want;
   set have;
   by id month type notsorted;
   diff = dif(count);
   if last.type;
   drop enc count;
run;

It seems a small change ... adding one line ... will satisfy the more complex requirements:

 

data want;
   set have;
   by id month type notsorted;
   diff = dif(count);
   if first.type then diff = -count;
   if last.type;
   drop enc count;
run;

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
SAS Enterprise Guide vs. SAS Studio

What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 4562 views
  • 1 like
  • 3 in conversation