lydiawawa
Lapis Lazuli | Level 10

Hi All,

 

I have a dataset with duplicates, and I'm trying to study the duplicated records. The dataset is shaped as follows, and I'm trying to create the dup variable shown in the last column.

 

The dup count is grouped by session_id, device_name and time. It restarts at 1 within each session_id, increases by 1 for each new combination of device_name and time, and repeats for rows whose combination is identical.

 

session_id  device_name  time  dup
    1         desktop     12    1
    1         desktop     12    1
    1         desktop     12    1
    1         tablet      10    2
    2         tablet      11    1
    2         tablet      11    1
    2         mobile      10    2
    2         desktop     10    3
    3         desktop     10    1
    3         desktop     10    1

 

I'd appreciate any help.

1 ACCEPTED SOLUTION

Oligolas
Barite | Level 11

Yes, or just:

 

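data want;
   set have;
   /* NOTSORTED: rows only need to be grouped, not in sorted order */
   by session_id device_name notsorted time notsorted;

   /* restart the counter at the first row of each session */
   if first.session_id then dup=0;
   /* new device_name/time combination; the sum statement retains dup across rows */
   if first.time then dup+1;
run;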
________________________

- Cheers -
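
To see why the two IF statements above are enough, it can help to copy the automatic FIRST. flags into ordinary variables and print them. The following is only a sketch: the dataset name `trace` and the variables `f_session`, `f_device` and `f_time` are made up for illustration, and the input rows are the ones from the question.

```sas
/* Sample rows copied from the question */
data have;
   input session_id device_name $ time;
   datalines;
1 desktop 12
1 desktop 12
1 desktop 12
1 tablet 10
2 tablet 11
2 tablet 11
2 mobile 10
2 desktop 10
3 desktop 10
3 desktop 10
;

data trace;
   set have;
   by session_id device_name time notsorted;

   /* FIRST. variables are temporary, so copy them to keep them in the output */
   f_session = first.session_id;
   f_device  = first.device_name;
   f_time    = first.time;

   if first.session_id then dup = 0;   /* new session: restart the counter */
   if first.time then dup + 1;         /* new key within the session */
run;

proc print data=trace noobs;
run;
```

Because time is the last BY variable, first.time is set whenever time or any BY variable to its left changes. That is why a single check on first.time also picks up changes in device_name.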

2 REPLIES
PeterClemmensen
Tourmaline | Level 20

You can do something like this:

 

data have;
input session_id device_name $ time;
datalines;
1 desktop 12
1 desktop 12
1 desktop 12
1 tablet 10
2 tablet 11
2 tablet 11
2 mobile 10
2 desktop 10
3 desktop 10
3 desktop 10
;

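data want(drop=_:);   /* drop the helper variables whose names start with an underscore */
   set have;
   by session_id device_name notsorted time notsorted;

   /* values from the previous row */
   _session_id=lag1(session_id);
   _device_name=lag1(device_name);
   _time=lag1(time);

   /* any change in the key starts a new group */
   if session_id ne _session_id | device_name ne _device_name | time ne _time then do;
      dup+1;
   end;

   /* restart the count at the first row of each session */
   if first.session_id then dup=1;
run;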
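
If the goal is to study the duplicates rather than only label them, one possible follow-up is to count how many rows share each key. This is a sketch, assuming the `have` dataset created above; the table name `dup_counts` and the column `n_rows` are hypothetical.

```sas
proc sql;
   /* one output row per key that occurs more than once */
   create table dup_counts as
   select session_id,
          device_name,
          time,
          count(*) as n_rows
   from have
   group by session_id, device_name, time
   having count(*) > 1;
quit;

proc print data=dup_counts noobs;
run;
```

For the sample data this keeps three keys: (1, desktop, 12) with 3 rows, and (2, tablet, 11) and (3, desktop, 10) with 2 rows each. Unlike the BY ... NOTSORTED approach, GROUP BY ignores row order, so identical keys that are not adjacent would be counted together.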