Hi!
I really need help with some SAS coding (and some logic thinking perhaps?) 🙂
I have a dataset named followUp_complete, which contains several variables. From this dataset I extracted the variable ID to create a new dataset named bipolar.nr_followUpTemp.
If I group my dataset (bipolar.nr_followUpTemp) by ID, the groups are of different sizes (from 1 to 5 observations per group). My next step is to make the groups the same size, so that only the first two observations recorded in each group are kept and the others are discarded (i.e. observations no. 3, 4 and/or 5, if they exist).
My thought was to create a loop that keeps a count for each observation within a group and starts over when a new group begins in the dataset. I would then discard observations with a count above 2. The only problem is that I can't create this loop on my own. All I get is crap 🙂
This is what I'm working with:
*Counts the number of observations in each group;
PROC SQL;
CREATE TABLE bipolar.nr_followUpTemp AS
SELECT id, count(id) AS nrFollowUp
FROM bipolar.followUp_complete
GROUP BY id;
QUIT;
*Sorts the complete dataset, which contains the variables I want to include in my final dataset;
PROC SORT DATA = bipolar.followUp_complete;
BY id;
RUN;
*Sorts the per-group observation counts by ID;
PROC SORT DATA = bipolar.nr_followUpTemp;
BY id;
RUN;
/*Merging the two datasets and also creating a loop... The loop clearly doesn't work, since it makes a count for each observation.
*/
DATA bipolar.test;
MERGE bipolar.followUp_complete bipolar.nr_followUpTemp;
BY id;
DO counter = 1 TO nrFollowUp;
OUTPUT;
END;
RUN;
Is it possible to adjust the loop somehow so that it runs once per group, and not once per observation?
Thank you for your time!
/Christian.
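
One possible way to get a per-group counter, as a minimal sketch rather than a definitive solution: instead of a DO loop, a DATA step with a BY statement can reset a counter at the first observation of each ID (FIRST.id) and keep only rows where the counter is at most 2. The output dataset name work.followUp_sorted is a hypothetical name chosen for this example. The sketch also assumes that "first two recorded" means the first two rows per ID in the dataset's current order; if a visit date variable exists, it would probably need to be added to the BY statement of the sort.

```sas
/* Sort by ID. PROC SORT keeps the original relative order of rows  */
/* within each ID by default. work.followUp_sorted is a             */
/* hypothetical name, used so the original dataset is not changed.  */
PROC SORT DATA = bipolar.followUp_complete OUT = work.followUp_sorted;
    BY id;
RUN;

DATA bipolar.test;
    SET work.followUp_sorted;
    BY id;
    /* FIRST.id is 1 on the first observation of each ID group */
    IF FIRST.id THEN counter = 0;
    /* Sum statement: counter is retained across observations */
    counter + 1;
    /* Subsetting IF: keep only the first two observations per ID */
    IF counter <= 2;
RUN;
```

With this approach the separate count table (bipolar.nr_followUpTemp) and the MERGE would not be needed for the filtering itself, since the counter is built while reading the sorted data.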