Hi all,
I have hospital admission data for a two-year period. An example of the data is shown below (I included only the variables that are relevant to my question):
date id
04/05/2012 1
05/07/2012 1
16/07/2012 1
07/12/2013 1
05/08/2012 2
12/10/2012 2
01/05/2012 3
06/05/2012 3
08/06/2012 3
12/10/2012 3
16/11/2012 3
01/01/2013 3
For each patient (id) I need to select the first and the last admission during the time period, and I need to randomly select another one from the rest. For example, for ID=3, the first admission would be 01/05/2012 and the last admission would be 01/01/2013; how can I randomly select one more admission from the rest?
Thank you.
@viollete wrote:
Do I have to use the STRATA option in PROC SURVEYSELECT? There are more patients who had more than two admissions during the time period, and for each patient I want to randomly select one.
Your STRATA variable would be the ID variable, and you tell PROC SURVEYSELECT to select one observation per stratum.
Here is my take on this problem:
data have;
input date ddmmyy10. id $;
format date ddmmyy10.;
datalines;
04/05/2012 1
05/07/2012 1
16/07/2012 1
07/12/2013 1
05/08/2012 2
12/10/2012 2
01/05/2012 3
06/05/2012 3
08/06/2012 3
12/10/2012 3
16/11/2012 3
01/01/2013 3
;
run;

proc sort data=have;
  by id date;
run;

data firstlast others;
  set have;
  by id;
  if first.id or last.id then output firstlast;
  else output others;
run;

proc surveyselect data=others out=othersamp sampsize=1;
  strata id;
run;

data want;
  merge firstlast othersamp;
  by id date;
run;
Note the use of a data step to provide the example data in a form that others can use. Also note the use of a code box, opened with the forum's {I} icon, so that the message window does not reformat the code and the code is visually separated from the narrative.
You will end up with two additional variables that hold the selection probability and the sampling weight. They are populated only for the randomly selected observations and are missing for the first and last observations.
Do not come to us when an id ends up with only one record in the output because it has only a single observation in the starting data...
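As a follow-up to the steps above, here is a minimal sketch of a reproducible variant. It is not the original poster's code: the SEED= value is arbitrary, and the final step uses SET with BY (interleaving) instead of MERGE, on the assumption that a patient could have two admissions on the same date that should stay as separate rows.

```sas
/* Example data copied from the question; dates are dd/mm/yyyy. */
data have;
  input date :ddmmyy10. id $;
  format date ddmmyy10.;
  datalines;
04/05/2012 1
05/07/2012 1
16/07/2012 1
07/12/2013 1
05/08/2012 2
12/10/2012 2
01/05/2012 3
06/05/2012 3
08/06/2012 3
12/10/2012 3
16/11/2012 3
01/01/2013 3
;
run;

proc sort data=have;
  by id date;
run;

/* First and last admission per id go to FIRSTLAST, the rest to OTHERS. */
data firstlast others;
  set have;
  by id;
  if first.id or last.id then output firstlast;
  else output others;
run;

/* SEED= fixes the random number stream so a rerun selects the same rows. */
/* The seed value 20120501 is an arbitrary assumption.                    */
proc surveyselect data=others out=othersamp
                  method=srs sampsize=1 seed=20120501 noprint;
  strata id;
run;

/* SET with BY interleaves the two data sets in id/date order.            */
/* Unlike MERGE, it never combines two rows that share the same id, date. */
data want;
  set firstlast othersamp;
  by id date;
run;
```

With the example data, WANT has 8 rows: three each for ids 1 and 3, and two for id 2, which has no admissions other than its first and last.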
Sort the data, then in a data step output the first and last observations per id to one data set and the remaining observations to another, and run PROC SURVEYSELECT over the remainder:
data firstlast other;
  set have;
  by id;
  if first.id or last.id then output firstlast;
  else output other;
run;

proc surveyselect...;
run;
Do I have to use the STRATA option in PROC SURVEYSELECT? There are more patients who had more than two admissions during the time period, and for each patient I want to randomly select one.
You would really need to explain your issue. Post test data in the form of a data step, and show what the output should look like. You can apply a WHERE clause to the data before taking the first and last, so that only the visits within the window are read; the visits that are neither first nor last are then the ones from which to sample one record per id.
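One way to read the WHERE suggestion is sketched below. The window (calendar year 2012) and the seed are hypothetical values chosen only for illustration; the thread does not state the actual study window.

```sas
/* Example data copied from the question; dates are dd/mm/yyyy. */
data have;
  input date :ddmmyy10. id $;
  format date ddmmyy10.;
  datalines;
04/05/2012 1
05/07/2012 1
16/07/2012 1
07/12/2013 1
05/08/2012 2
12/10/2012 2
01/05/2012 3
06/05/2012 3
08/06/2012 3
12/10/2012 3
16/11/2012 3
01/01/2013 3
;
run;

proc sort data=have;
  by id date;
run;

/* The WHERE= data set option filters rows before BY-group processing, */
/* so FIRST.id and LAST.id refer to the first and last visit inside    */
/* the window. The window dates are hypothetical.                      */
data firstlast other;
  set have(where=('01JAN2012'd <= date <= '31DEC2012'd));
  by id;
  if first.id or last.id then output firstlast;
  else output other;
run;

/* One random visit per id from the visits that are neither first nor last. */
/* The seed value is an arbitrary assumption.                               */
proc surveyselect data=other out=othersamp
                  method=srs sampsize=1 seed=12345 noprint;
  strata id;
run;
```

With this window, id 1 keeps 04/05/2012 and 16/07/2012 as first and last, so its only candidate for the random pick is 05/07/2012.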