hein68
Quartz | Level 8

Hello.  I have a dataset like this:

 

ID       date
1        5/10/2017
1        5/10/2017
1        8/24/2017
2        3/23/2017
2        3/26/2017
2        3/26/2017

 

I want to create a variable that numbers the unique ID and date combinations, like this:

 

ID       date             newvar
1        5/10/2017        1
1        5/10/2017        2
1        8/24/2017        1
2        3/23/2017        1
2        3/26/2017        1
2        3/26/2017        2

 

Basically, I want to end up with a data file with one row for each ID and date combination and I need this new variable to achieve this.

 

Thanks!

 

ACCEPTED SOLUTION
Reeza
Super User

There's a tutorial here on creating enumeration variables:

https://stats.idre.ucla.edu/sas/faq/how-can-i-create-an-enumeration-variable-by-groups/

This is what you're looking for; example 2 is pretty much exactly your situation.

 



10 REPLIES
HB
Barite | Level 11

"I need this new variable to achieve this."

 

Maybe not.

 

What happens to the rest of the data in the collapsed rows?
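
To make that concern concrete, here is a minimal sketch. It assumes the real data carries at least one more column; `score`, `have_extra`, and `collapsed` are invented names for illustration, not part of the original post. Instead of silently dropping the extra rows, it summarizes them per ID/date pair:

```sas
/* Hypothetical data: SCORE is an invented variable, not from the original post. */
data have_extra;
    input id date :mmddyy10. score;
    format date mmddyy10.;
    datalines;
1 5/10/2017 10
1 5/10/2017 20
1 8/24/2017 30
2 3/23/2017 40
2 3/26/2017 50
2 3/26/2017 60
;

/* One row per ID/date pair, keeping a row count and a summary of SCORE. */
proc sql;
    create table collapsed as
    select id, date,
           count(*)    as n_rows,
           mean(score) as mean_score
    from have_extra
    group by id, date;
quit;
```

Whether a mean, a first value, or something else is the right summary depends on what the other columns hold.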

 

 

novinosrin
Tourmaline | Level 20
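data have;
input ID       date :mmddyy10.;
format date mmddyy10.;
cards;
1        5/10/2017
1        5/10/2017
1        8/24/2017
2        3/23/2017
2        3/26/2017
2        3/26/2017
;

/* HAVE must already be sorted by ID and date for BY-group processing. */
data want;
if 0 then set have; /* never executes; only fixes the variable order at compile time */
/* DoW loop: newvar restarts at 1 for each ID/date group and counts its rows */
do newvar=1 by 1 until(last.date);
set have;
by id date;
output;
end;
run;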
Reeza
Super User

There's a tutorial here on creating enumeration variables:

https://stats.idre.ucla.edu/sas/faq/how-can-i-create-an-enumeration-variable-by-groups/

This is what you're looking for; example 2 is pretty much exactly your situation.

 



 


 

hein68
Quartz | Level 8
That's exactly what I needed. Great article. Thanks very much.


SuryaKiran
Meteorite | Level 14

If your goal is to remove duplicates, you can use PROC SORT.

proc sort data=have nodupkey dupout=dup out=want;
by id date;
run;

Alternatively, use FIRST. processing to build the counter:

proc sort data=have;
by id date;
run;

data want;
set have;
by id date;
if first.date then newvar=1;
else newvar+1;
run;
Thanks,
Suryakiran
HB
Barite | Level 11

@Reeza 's solution is probably best.  But just for fun:

 

data have;
	input ID Date:mmddyy10. ;
	format Date date9. ;
	datalines;
1        5/10/2017
1        5/10/2017
1        8/24/2017
2        3/23/2017
2        3/26/2017
2        3/26/2017
;
run;

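data want;
	set have;
	/* Combined key; CATX avoids the numeric-to-character conversion notes that || writes to the log */
	code = catx('_', id, date);
run;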

proc sql;
	create table unique_id_and_date as 
	select distinct code, id, date
 	from want;
quit;

data uidt;
	set unique_id_and_date;
	drop code;
run;

gives you

Obs    ID         Date

  1     1    10MAY2017
  2     1    24AUG2017
  3     2    23MAR2017
  4     2    26MAR2017
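
For comparison, a shorter sketch that should produce the same four ID/date pairs without the intermediate `code` key. It assumes the `have` dataset defined in this post and that only ID and Date matter:

```sas
/* Sketch: one row per distinct ID/date pair, read straight from HAVE. */
proc sql;
    create table unique_id_and_date as
    select distinct id, date
    from have;
quit;
```

DISTINCT applies to the whole select list, so listing `id` and `date` is enough to identify each combination.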

MarkWik
Quartz | Level 8

Hi @HB, @SuryaKiran's solution is much like the document recommended by @Reeza. And yes, I agree that it is simple and easy to follow. However, even for fun (which is fine), your code gives the wrong results. In my humble opinion, it may not help the OP or others.

HB
Barite | Level 11

How do you see wrong results?

 

There are 4 unique date and ID combos.

Reeza
Super User

@MarkWik I believe their code answers the OP's actual question: how to get one row for each ID and date combination. Creating the enumeration variable is the part my answer solves.

Basically, the OP posted a question along with what they thought the approach should be. However, there are easier or more efficient ways to achieve the desired end result. Pointing those out is not problematic, IMO.
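
As a rough sketch of one such shorter route: a subsetting IF on FIRST.date keeps one row per ID/date pair without ever creating newvar. This assumes the `have` dataset posted earlier in the thread; `have_sorted` and `one_per_combo` are made-up names.

```sas
/* BY-group processing needs the data sorted by the BY variables. */
proc sort data=have out=have_sorted;
    by id date;
run;

data one_per_combo;
    set have_sorted;
    by id date;
    if first.date;   /* subsetting IF: keep only the first row of each ID/date group */
run;
```

Keeping rows where `first.date` is true selects the same rows as filtering on `newvar = 1` after building the counter.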

 

 

Astounding
PROC Star

It's possible that we're making an easy problem look difficult.  Have you tried:

 

data want;
set have;
by id date;
if first.date then newvar = 1;
else newvar + 1;
run;
