sophia_SAS
Obsidian | Level 7

Hi SAS experts,

Please advise on a SAS procedure for a large dataset that will let me identify subjects whose ID numbers are similar but not identical (i.e., the IDs are the same except for the last 2 digits).

For example, a study has the following 8 subject ID numbers:

888709

234294

888710

098762

546849

888721

234276

888733

The SAS procedure should be able to identify the following matched groups:

Group 1 -- 888709, 888710, 888721, 888733  (same 8887 string)

Group 2 -- 234294, 234276 (same 2342 string)

ID numbers 098762, 546849 do not have matches.

Thanks,

SS

Reeza
Super User

Assuming ID numbers are character:

*Create the group of 4 characters;
data want;
    set have;
    first_four = substr(id, 1, 4);
run;

*Sort it by the group;
proc sort data=want; by first_four; run;

*Identify each group uniquely;
*The BY statement is required so that first.first_four exists;
data group;
    set want;
    by first_four;
    retain group 0;
    if first.first_four then group + 1;
run;
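For reference, here is a minimal sketch of the same steps applied to the eight IDs from the question. It assumes the IDs are stored as character (so the leading zero in 098762 survives). The `matched` flag is a hypothetical addition, not part of the steps above: it marks IDs whose four-character prefix is shared with at least one other ID.

```sas
/* Sample data: the eight IDs from the question, read as character. */
data have;
    input id $;
    datalines;
888709
234294
888710
098762
546849
888721
234276
888733
;
run;

/* Extract the first four characters as the grouping key. */
data want;
    set have;
    length first_four $ 4;
    first_four = substr(id, 1, 4);
run;

/* BY-group processing requires the data to be sorted by the key. */
proc sort data=want;
    by first_four id;
run;

data group;
    set want;
    by first_four;
    /* Sum statement: group is retained and starts at 0. */
    if first.first_four then group + 1;
    /* Hypothetical flag: 0 when the prefix occurs only once, 1 otherwise. */
    matched = not (first.first_four and last.first_four);
run;

proc print data=group;
run;
```

Group numbers follow the sort order of the prefix, so here 0987 is group 1, 2342 is group 2, 5468 is group 3, and 8887 is group 4; the two single-member groups (098762 and 546849) get matched=0. If the IDs were numeric instead, one option would be to convert them first, for example first_four = substr(put(id, z6.), 1, 4), assuming every ID has six digits.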

sophia_SAS
Obsidian | Level 7

Thanks Reeza. I'm a bit confused by the last lines of the code.

I can't seem to figure out how the grouped (matched?) values are assigned in the last block of code:

data group;

set want;

retain group 0;

if first.first_four then group+1;

else group;

run;

Thanks.

Linlin
Lapis Lazuli | Level 10

Is this example helpful?

data have;
    input id $ @@;
    cards;
a b c d a b c d d e
;

proc sort data=have;
    by id;
run;

proc print data=have;
run;

data grouped;
    set have;
    by id;
    if first.id then group + 1;
run;

proc print data=grouped;
run;

Obs    id    group

  1    a       1
  2    a       1
  3    b       2
  4    b       2
  5    c       3
  6    c       3
  7    d       4
  8    d       4
  9    d       4
 10    e       5
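To see why the counter only advances at the start of each group, it can help to copy the automatic first. and last. flags into real variables, since they are not written to the output dataset on their own. The sketch below repeats the example data so it runs by itself; `first_id` and `last_id` are hypothetical variable names used only for illustration.

```sas
data have;
    input id $ @@;
    datalines;
a b c d a b c d d e
;

proc sort data=have;
    by id;
run;

data flags;
    set have;
    by id;
    /* first.id is 1 on the first row of each id value, otherwise 0. */
    first_id = first.id;
    /* last.id is 1 on the last row of each id value, otherwise 0. */
    last_id  = last.id;
    /* The sum statement retains group across rows, so it only
       changes when a new id value begins. */
    if first.id then group + 1;
run;

proc print data=flags;
run;
```

A row with first_id=1 and last_id=1 (such as id e here) is a group of one, which corresponds to an ID without a match in the original question.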
