SAS Data Management

hein68 · Posted 09-04-2018 01:03 PM

Hello. I have a dataset like this:

ID       date
1        5/10/2017
1        5/10/2017
1        8/24/2017
2        3/23/2017
2        3/26/2017
2        3/26/2017

I want to create a variable that numbers the unique ID and date combinations, like this:

ID       date             newvar
1        5/10/2017        1
1        5/10/2017        2
1        8/24/2017        1
2        3/23/2017        1
2        3/26/2017        1
2        3/26/2017        2

Basically, I want to end up with a data file with one row for each ID and date combination and I need this new variable to achieve this.

Thanks!

HB · Posted 09-04-2018 01:07 PM

"I need this new variable to achieve this."

Maybe not.

What happens to the rest of the data in the collapsed rows?
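One way to make that question concrete is to collapse the rows while recording how many were merged. This is only a sketch: it assumes the data sits in a data set called have with the variables ID and date, and the names collapsed and n_rows are hypothetical.

```sas
/* Sketch: one row per ID/date combination, plus a count of the
   original rows behind it. collapsed and n_rows are made-up names. */
proc sql;
    create table collapsed as
    select id,
           date,
           count(*) as n_rows   /* how many rows were collapsed into this one */
    from have
    group by id, date;
quit;
```

Any other columns would need their own rule (sum, max, first value, and so on), which is why the question matters before picking an approach.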

novinosrin · Posted 09-04-2018 01:13 PM

data have;
input ID       date :mmddyy10.;
format date mmddyy10.;
cards;
1        5/10/2017
1        5/10/2017
1        8/24/2017
2        3/23/2017
2        3/26/2017
2        3/26/2017
;

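data want;
    if 0 then set have;                    /* never executes; only fixes the variable order so newvar comes last */
    do newvar = 1 by 1 until (last.date);  /* DoW loop: counter restarts for each ID/date group */
        set have;
        by id date;                        /* assumes have is already sorted by id and date */
        output;
    end;
run;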

Reeza · Posted 09-04-2018 01:18 PM

There's a tutorial here on creating enumeration variables:

https://stats.idre.ucla.edu/sas/faq/how-can-i-create-an-enumeration-variable-by-groups/

This is what you're looking for; example 2 is pretty much exactly your situation.

hein68 · Posted 09-04-2018 03:00 PM

That's exactly what I needed. Great article. Thanks very much.

SuryaKiran · Posted 09-04-2018 01:30 PM

If your goal is to remove duplicates, you can use PROC SORT.

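proc sort data=have nodupkey dupout=dup out=want;
    by id date;
run;

Alternatively, use FIRST. processing to build the counter:

proc sort data=have;
    by id date;
run;

data want;
    set have;
    by id date;
    if first.date then newvar = 1;  /* restart at 1 for each new ID/date group */
    else newvar + 1;                /* sum statement: value is retained across rows */
run;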

Thanks,
Suryakiran

HB · Posted 09-04-2018 01:36 PM

@Reeza 's solution is probably best. But just for fun:

data have;
	input ID Date:mmddyy10. ;
	format Date date9. ;
	datalines;
1        5/10/2017
1        5/10/2017
1        8/24/2017
2        3/23/2017
2        3/26/2017
2        3/26/2017
;
run;

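data want;
	set have;
	code = id || date;  /* implicit numeric-to-character conversion; SAS writes a note to the log */
run;

proc sql;
	create table unique_id_and_date as
	select distinct code, id, date  /* DISTINCT keeps one row per ID/date combination */
	from want;
quit;

data uidt;
	set unique_id_and_date;
	drop code;  /* the helper key is no longer needed */
run;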

gives you

Obs    ID         Date

  1      1    10MAY2017
  2      1    24AUG2017
  3      2    23MAR2017
  4      2    26MAR2017

MarkWik · Posted 09-04-2018 01:50 PM

Hi @HB, @SuryaKiran's solution is pretty much like the document recommended by @Reeza. And yes, I agree that it is simple and easy to follow. However, even for fun (which is fine), your code is giving the wrong results. In my humble opinion, it may not help the OP or others.

HB · Posted 09-04-2018 02:11 PM

How do you see wrong results?

There are 4 unique date and ID combos.

Reeza · Posted 09-04-2018 02:33 PM

@MarkWik I believe their code answers the OP's actual question: how to get one row for each ID and date combination. The question of creating enumeration variables is the portion my answer solves.

Basically, the OP posted a question along with what they thought the approach should be; however, there are easier or more efficient ways to achieve the desired end result. Pointing those out is not problematic IMO.
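As an illustration of one such shortcut, here is a minimal sketch that skips the counter and keeps only the first row of each ID/date group. It assumes the input data set is named have; sorted and one_per_combo are hypothetical names, and this is not the OP's or any poster's exact code.

```sas
/* Sketch: keep the first row of each ID/date group without a counter. */
proc sort data=have out=sorted;  /* BY-group processing needs sorted input */
    by id date;
run;

data one_per_combo;              /* hypothetical output name */
    set sorted;
    by id date;
    if first.date;               /* subsetting IF: drops every row except the first in each group */
run;
```

Which row counts as "first" depends on the sort, so add more BY variables to the PROC SORT if a particular row should survive.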

Astounding · Posted 09-04-2018 02:11 PM

It's possible that we're making an easy problem look difficult. Have you tried:

data want;
    set have;
    by id date;
    if first.date then newvar = 1;
    else newvar + 1;
run;
