deleted_user
Not applicable
I am asking for help developing an ARRAY. This is my first post, and I am a new SAS user.

I am working with a large health dataset. I used PROC TRANSPOSE to create a new dataset of IDs and dates from a much larger health dataset.

The new dataset has one row per ID (more than 100 IDs), and each ID has many date variables, anywhere from 1 to 1,100. Many of the dates are duplicates, meaning the person received different health services on the same date. I am looking to reduce the dataset to a more manageable size by consolidating the duplicate dates for each ID, resulting in fewer columns.

So the variables should look like date1, date2, date3, and so on. How do I create an ARRAY that eliminates the duplicate dates? Thank you.
Cynthia_sas
SAS Super FREQ
Hi:
Have you looked into PROC SORT with the NODUPS or NODUPKEY option? Running it before your PROC TRANSPOSE might be a useful way to get rid of the duplicate dates.
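
As a rough sketch of that idea, assuming the data still has one row per ID and date before the transpose: the dataset names VISITS, VISITS_UNIQUE and VISITS_WIDE are hypothetical, and PT_ID and DATE follow the fake data in this post. NODUPKEY is used here because it compares only the BY variables, so two services on the same date collapse to one row even if other columns differ.

```sas
/* Assumption: WORK.VISITS is the long dataset, one row per service,
   with variables PT_ID and DATE (names are illustrative only). */
proc sort data=visits out=visits_unique nodupkey;
   by pt_id date;          /* keep one row per PT_ID/DATE combination */
run;

/* Turn the de-duplicated rows into DATE1, DATE2, ... per PT_ID. */
proc transpose data=visits_unique
               out=visits_wide(drop=_name_)
               prefix=date;
   by pt_id;
   var date;
run;
```

Because the duplicates are gone before the transpose, the wide dataset gets only as many DATEn columns as the ID with the most distinct dates needs.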

This is also an instance where seeing some fake data would help. For example, is your data in this structure to start:
[pre]
OPTION A
PT_ID DATE
10 04/03/2009
10 04/03/2009
10 05/06/2009
10 08/18/2009
10 08/18/2009
11 01/01/2009
11 03/02/2009
11 03/02/2009
12 02/12/2009
12 02/12/2009
12 05/15/2009
12 05/15/2009
[/pre]
OR
this structure to start:
[pre]
OPTION B
PT_ID  DATE1       DATE2       DATE3       DATE4       DATE5
10     04/03/2009  04/03/2009  05/06/2009  08/18/2009  08/18/2009
11     01/01/2009  03/02/2009  03/02/2009  .           .
12     02/12/2009  02/12/2009  05/15/2009  05/15/2009  .

[/pre]

And what is your desired result given the above "fake" data?

Also, an ARRAY in SAS is not a physical data construct (as it is in some languages). A SAS Array is just a convenient way to reference a group of variables for the purpose of treating them as a group and possibly processing the individual variables as though they were members of an array. Variables that you use when you declare an ARRAY in a SAS program can be numbered variables or they can just be differently named variables. For example, these are both valid ARRAY statements:
[pre]
array lovelucy fred ethel lucy ricky;
array regsl regsale1 regsale2 regsale3 regsale4 regsale5 regsale6;

[/pre]

The first array, LOVELUCY, treats 4 differently named variables as though they were members of an array. The second array, REGSL, treats the numbered variables REGSALE1-REGSALE6 as though they were members of an array. Either array could then be used within a DO loop, so the same statements are applied to each variable in turn.
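
To make the DO loop idea concrete, here is a minimal sketch. The dataset names REGSALES and REGSALES_ADJ are hypothetical, and replacing missing values with 0 is just an arbitrary example of something you might do to every member of the array.

```sas
data regsales_adj;
   set regsales;                    /* hypothetical input with REGSALE1-REGSALE6 */
   array regsl regsale1-regsale6;   /* same array as above, written as a numbered range */
   do i = 1 to dim(regsl);          /* DIM returns the number of elements (6 here) */
      if regsl{i} = . then regsl{i} = 0;   /* one statement, applied to each variable */
   end;
   drop i;                          /* the loop counter is not needed in the output */
run;
```

Note that the array itself is never written to the output dataset; only the underlying variables REGSALE1-REGSALE6 are.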

cynthia
deleted_user
Not applicable
Hello Cynthia

My dataset looks like Option B. The number of date columns per ID goes as high as Date1800, but most IDs go no higher than Date300.

I am looking to keep each unique date per ID.
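
For a wide layout like Option B, one possible approach is to de-duplicate within each row using an array. The sketch below is not from the replies above and makes several assumptions: the dataset names DATES_WIDE and DATES_UNIQUE are made up, every variable whose name starts with DATE is a numeric SAS date, and CALL SORTN is available in your SAS release. It sorts each row's dates, then packs the distinct values to the left.

```sas
data dates_unique;
   set dates_wide;              /* hypothetical name for the Option B dataset */
   array d{*} date:;            /* assumes all DATE-prefixed variables are numeric dates */

   call sortn(of d{*});         /* ascending sort within the row; missing values come first */

   n_unique = 0;                /* count of distinct non-missing dates for this ID */
   last = .;
   do i = 1 to dim(d);
      if d{i} ne . and d{i} ne last then do;
         last = d{i};
         n_unique = n_unique + 1;
         d{n_unique} = last;    /* safe: n_unique never exceeds i */
      end;
   end;

   do i = n_unique + 1 to dim(d);
      d{i} = .;                 /* blank out the leftover slots */
   end;

   drop i last;
run;
```

This keeps the same number of columns; the trailing ones are simply missing for every ID, so they would still need to be dropped in a later step (the largest value of N_UNIQUE shows how many columns are actually needed). Removing the duplicates before the PROC TRANSPOSE step avoids creating the extra columns in the first place.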
