Howdy folks! I have a large dataset with IDs that have anywhere from one to seven observations, and I'm trying to collapse them into one per ID. It's from an Excel file, but I'll try to mimic some of the data here for the sake of the following code: data visits; input dovisit date9. person_id sex :$1. nvisit :$1. fvisit :$1. avisit :$1.; datalines; 18dec2007 444 M T F F 18dec2007 444 M T F F 20dec2008 444 M F F T 23apr2009 444 M T T T 31mar2010 444 M F F F 10jan2007 365 M T F F 10jan2007 365 M T F F 11jan2008 265 M F F T 01feb2007 212 F T T T 01feb2007 212 F T F T ; run; /*create a data set of the duplicates using DUPOUT= option */
proc sort data=visits dupout=visits_dup nodupkey;
by person_id;
run;
/* Create a macro variable with the variable names that are to */
/* be merged. The variables considered BY variables are excluded */
/* from going into the macro variable using the NOT IN operator. */
/* The resulting macro variable is in the format varname=varname_2 */
proc sql noprint;
select trim(name) || '=' || trim(name) || '_2'
into :varlist separated by ' '
from DICTIONARY.COLUMNS
WHERE LIBNAME EQ "WORK" and MEMNAME EQ "VISITS"
and upcase(name) not in ('PERSON_ID' 'DOVISIT' 'SEX');
quit;
/*Merge the two data sets using the macro variable to rename the */
/*common variables in the second (duplicates) data set. */
data merged;
merge visits visits_dup (rename=(&varlist));
by person_id;
run;
proc print;
run; Now what I want to modify this code to do is to have columns nvisit, fvisit, avisit, nvisit_2, fvisit_2, avisit_2, nvisit_3, fvisit_3, avisit_3, and so on, all the way to _7, but when I try to modify the code, specifically this line: select trim(name) || '=' || trim(name) || '_2' ...nothing I do seems to stick. The only change I've been able to make so far without getting an error: select trim(name) || '=' || trim(name) || '_2' || '_3' turns my nvisit_2 into an nvisit_2_3 and so on, instead of actually creating the separate column nvisit_3. I'm sure I'm making a syntax error but I'm not sure how to fix it. Thank you so much for taking a look! (PS - This code is almost identical to what I found at Collapse observations in BY-Group so values from duplicate observations have new names; all I did here was add some more datalines because the code was originally only written to collapse two observations.)
... View more