Mar_Lds
Calcite | Level 5

I cannot figure out why the DIF function is going back 2 rows in the second observation.  It works for the first observation and all subsequent observations.  In the second and fourth rows the DIF function is not calculating the value I would like.  I am sure that my lack of understanding of FIRST. is the culprit, but I can't figure it out.  (Row 2 difference should be 0, Row 4 difference should be 1)

 

Any help is greatly appreciated (I use SAS EG 7.1, SAS 9.4).

 

Output data

ClaimNum  Participant_Name  TX_DOS     difference
CLAIM1    NAME1             9/13/2016  0
CLAIM1    NAME1             9/13/2016  .
CLAIM2    NAME2             8/25/2015  0
CLAIM2    NAME2             8/26/2015  -384
CLAIM2    NAME2             8/27/2015  1

 

 

SAS Code

DATA EM_LIST_W_DAYS;
  set WORK.EM_LIST;

       by ClaimNum ParticipantID;
             if first.ClaimNum and first.ParticipantID then 
                      difference = 0;
             else if first.ClaimNum=0 and first.ParticipantID=1 then 
                      difference = 0;
             else if first.ClaimNum=0 and first.ParticipantID=0 then
                      difference = DIF(TX_DOS);
run;

 

1 ACCEPTED SOLUTION

mkeintz
Jade | Level 19

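DIF maintains a queue of values, and the queue is updated only when DIF actually executes.  If you execute DIF only on some observations, each difference is taken against the value from the last observation on which DIF ran, not against the immediately preceding observation.  This program accommodates that property: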

 

DATA EM_LIST_W_DAYS;
  set WORK.EM_LIST;
  by ClaimNum ParticipantID;
  /* IFN evaluates dif(tx_dos) on every observation, so the DIF queue stays current */
  dif=ifn(first.participantid,0,dif(tx_dos));
run;

 

 

Two points:

  1. Using "by claimnum participantid;".  The first. and last. indicators are hierarchical.  The observations with first.claimnum=1 are a subset of those with first.participantid=1.  And (less pertinent in your case) every time first.claimnum=1, all the by-variables to its right will have their first. indicators set to 1.

  2. The IFN function evaluates both outcomes (the 2nd and 3rd arguments), regardless of the results of the first argument.  As a result the DIF function is updated with every observation, even though it is not always assigned to the result variable.
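The two points above can be checked side by side with a small sketch. The sample rows mirror the output in the question; the ParticipantID values (P1, P2) and the names diff_conditional, diff_ifn, first_claim, and first_part are hypothetical and only there for illustration.

```sas
/* Hypothetical sample data mirroring the rows shown in the question */
data work.em_list;
  input ClaimNum :$8. ParticipantID :$8. TX_DOS :mmddyy10.;
  format TX_DOS mmddyy10.;
  datalines;
CLAIM1 P1 9/13/2016
CLAIM1 P1 9/13/2016
CLAIM2 P2 8/25/2015
CLAIM2 P2 8/26/2015
CLAIM2 P2 8/27/2015
;
run;

data work.compare;
  set work.em_list;
  by ClaimNum ParticipantID;

  /* Keep the BY-group flags so the hierarchy is visible in the output */
  first_claim = first.ClaimNum;
  first_part  = first.ParticipantID;

  /* DIF executes only on non-first rows, so its queue skips the first row of each group */
  if first.ParticipantID then diff_conditional = 0;
  else diff_conditional = dif(TX_DOS);

  /* IFN evaluates all of its arguments, so this DIF call executes on every row */
  diff_ifn = ifn(first.ParticipantID, 0, dif(TX_DOS));
run;

proc print data=work.compare;
run;
```

If DIF behaves as described in this thread, diff_conditional should reproduce the values from the question (0, ., 0, -384, 1), while diff_ifn should give 0, 0, 0, 1, 1. Each DIF call in a program keeps its own queue, which is why the two columns can be computed in the same step without interfering with each other.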

2 REPLIES
Astounding
Opal | Level 21

For DIF to operate on consecutive observations, it must execute on every observation.  So calculate something like this:

 

temp = dif(TX_DOS);

 

Then use TEMP in your IF/THEN statements:

 

else if first.ClaimNum=0 and first.ParticipantID=0 then difference=temp;
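Put together, the suggestion might look like the sketch below. It assumes WORK.EM_LIST is already sorted by ClaimNum and ParticipantID, and it collapses the original three branches into one test on first.ParticipantID, since first.ClaimNum=1 implies first.ParticipantID=1. The DROP statement is an optional addition, not part of the reply.

```sas
data work.em_list_w_days;
  set work.em_list;
  by ClaimNum ParticipantID;

  /* Call DIF unconditionally so its queue is updated on every observation */
  temp = dif(TX_DOS);

  /* 0 at the start of each BY group, otherwise the difference from the prior row */
  if first.ParticipantID then difference = 0;
  else difference = temp;

  drop temp; /* helper variable, not needed in the output */
run;
```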

