STACK? ARRAY? What is the best approach?

jduffy · Posted 12-14-2018 10:03 AM

I have a dataset of 100k observations with the variable NAME. It looks as follows.

OBS	NAME
1	John
2	Mary
3	Joe
4	John
5	Steve
6	Joan
7	John
8	Mary
9	Steve
10	shawn

What I would like to do is create a list that has each unique value of the variable NAME. From the sample above the output list would be...

Obs	Name
1	John
2	Mary
3	Joe
4	Steve
5	Joan
6	Shawn

I only want to capture each unique value of the variable NAME. I've thought of doing this with something like the STACK feature found in other languages, and I have read up on the ARRAY statement in SAS. I need to read my input dataset, capture the value of NAME, and check it against a stack or array to see whether I have already stored that value.

Is there a better way of doing this? Is there a PROCEDURE that will do this? I'm thinking the DATA Step is the best approach but wanted to share my problem with this forum.

Best regards

Jagadishkatam · Posted 12-14-2018 10:12 AM

Please try PROC SORT with the NODUPKEY option to remove the duplicates.

proc sort data=have out=want nodupkey;
by name;
run;
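
One thing to keep in mind: because of the BY statement, PROC SORT returns WANT ordered by NAME, not in the order of first appearance shown in the desired output. If that order matters, a DATA step with a hash object is closer to the "check it against a stack" idea from the question. The following is only a sketch, assuming the input dataset is called HAVE and that names differing only in capitalization should count as the same name (the PROPCASE call is that assumption; drop it if "shawn" and "Shawn" should stay distinct).

```sas
/* Sketch: keep the first occurrence of each NAME, in input order.   */
/* HAVE and WANT are assumed dataset names, as in the PROC SORT step. */
data want;
  set have;

  /* Create the lookup table once, on the first iteration. */
  if _n_ = 1 then do;
    declare hash seen();
    seen.defineKey('name');
    seen.defineDone();
  end;

  /* Assumption: treat "shawn" and "Shawn" as the same value. */
  name = propcase(name);

  /* CHECK() returns 0 when the key is already stored. */
  if seen.check() ne 0 then do;
    seen.add();   /* remember this name */
    output;       /* write it only the first time it is seen */
  end;

  keep name;
run;
```

The hash object lives in memory, so HAVE does not need to be sorted first.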

Thanks,
Jag

Astounding · Posted 12-14-2018 10:16 AM

The suggested SORT will work, as will PROC SQL using SELECT DISTINCT.

Before that, however, you have to decide what makes a name different. You have "shawn" in your sample data. Is "shawn" different from "Shawn"? You might want to account for capitalization before applying a procedure.
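
As a rough illustration of both points, here is what the SELECT DISTINCT version could look like with the capitalization handled in the same step. This is a sketch only: HAVE and WANT are assumed dataset names, and using PROPCASE to normalize the values is one possible choice, not the only one (UPCASE would work just as well for comparison purposes).

```sas
/* Sketch: unique names with capitalization normalized first. */
proc sql;
  create table want as
  select distinct propcase(name) as name
  from have;
quit;
```

With the sample data, "shawn" is stored as "Shawn", so the result contains six names, matching the desired list. The row order of the result should not be relied on unless an ORDER BY clause is added.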
