BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Mirisage
Obsidian | Level 7

Hi Forum,

In the attached dataset, I need to see if the “Balance” of duplicate accounts are identical or not.

E.g. Account_number 111 has two records. “Balance” of each record are 25. So, they are identical.

My approach.

I used diff function (from Cody’s book).

data want;

SET tt;

BY account_number;

*REMOVE accounts WITH ONE record ;

IF FIRST.account_number AND LAST.account_number

THEN DELETE;

RETAIN R_HR ;

IF FIRST.account_number THEN DO;

R_HR = balance;

END;

IF LAST.account_number THEN DO;

DIFF_HR = balance - R_HR;

OUTPUT;

END;

DROP R_: ;

RUN;

proc  freq data = want;

  tables DIFF_HR;

  run;

Question:

I want to determine if the values of “arrears_band” variable are also identical or not in duplicate accounts.

I do not know how to use diff function for “character” variable.

Could any one of you help me?

Thanks

Mirisage

1 ACCEPTED SOLUTION

Accepted Solutions
PGStats
Opal | Level 21

If you don't mind an SQL answer :

proc sql;

select account_number from sasforum.Post_this

group by account_number

having count(distinct Arrears_band) > 1;

quit;

PG

PG

View solution in original post

4 REPLIES 4
PGStats
Opal | Level 21

If you don't mind an SQL answer :

proc sql;

select account_number from sasforum.Post_this

group by account_number

having count(distinct Arrears_band) > 1;

quit;

PG

PG
art297
Opal | Level 21

You said you used the dif function, but that isn't shown in your code.  Anyhow, to answer your question, the dif function is limited to numeric variables.  You could accomplish the same thing with the lag function.  e.g.:

data want (keep=account_number identical);

  SET tt;

  BY account_number;

  retain identical;

  last_arrears_band=lag(arrears_band);

  if not(FIRST.account_number and LAST.account_number);

  IF FIRST.account_number then identical=1;

  else if arrears_band ne last_arrears_band then identical=0;

  IF last.account_number  THEN output;

RUN;

Haikuo
Onyx | Level 15

If you only care about first or last of each group like your original code shows, then a small tweak will be sufficient do what you want for character variable:

data want;

SET tt;

BY account_number;

*REMOVE accounts WITH ONE record ;

IF FIRST.account_number AND LAST.account_number

THEN DELETE;

RETAIN r_ab ;

IF FIRST.account_number then r_ab = Arrears_Band;

IF LAST.account_number THEN DO;

DIFF_AB = (Arrears_Band = r_ab);

OUTPUT;

END;

DROP R_:;

RUN;

Basically, beside changing variable names, the only change need to be made is to replace the minus sign '-' with equal sign '='.

Haikuo

Mirisage
Obsidian | Level 7

Hi PGStats, Art and Hai.Kuo,

Many thanks for every one of you. All three codes are working well.

(I cannot click "correct answer" for all three responses for some reason)

Hi Art,

Now only I realized that what I have used is not diff function.

Thanks again!

Mirisage

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 1317 views
  • 6 likes
  • 4 in conversation