Hello,
I have a table which has 3 character columns and then another 5 character columns.
I want to collapse the table so there is only 1 row when first 3 columns are same.
For the remaining 5 columns I want to take the value from the only record in the group which will be non-blank.
eg.
A B C xxx ___ ___ ___ ___
A B C ___ yyy ___ ___ ___
A B C ___ ___ ___ ___ ttt
A B C ___ ___ zzz ___ ___
A B C ___ ___ ___ sss ___
X Y Z ___ bbb ___ ___ ___
X Y Z ___ ___ ___ ___ aaa
Desired result:
A B C xxx yyy zzz sss ttt
X Y Z ___ bbb ___ ___ aaa
Any assistance will be liked promptly.
Kind regards,
Mark
Use by-group processing (by your first three columns) in a data step.
retain new variables for each column you want to accumulate.
at first.(third variable of the by), set all your new variables to missing/empty
everytime you encounter a non-missing value, assign it to the corresponding new variable
at last.(third variable of the by), output.
keep the three by-variables an the new variables
To get prompt assistance, provide a detailed example. Your request is very vague at best.
Edit: Much better now. I now see why @Reeza quotes the question in the first reply 🙂
Use by-group processing (by your first three columns) in a data step.
retain new variables for each column you want to accumulate.
at first.(third variable of the by), set all your new variables to missing/empty
everytime you encounter a non-missing value, assign it to the corresponding new variable
at last.(third variable of the by), output.
keep the three by-variables an the new variables
One solution is to use merge with WHERE clauses, assuming that your variables are named var1 through var8:
data want;
merge
have(keep=var1 var2 var3 var4 where=(var4))
have(keep=var1 var2 var3 var5 where=(var5))
have(keep=var1 var2 var3 var6 where=(var6))
have(keep=var1 var2 var3 var7 where=(var7))
have(keep=var1 var2 var3 var8 where=(var8))
;
by var1 var2 var3;
run;
To update records while ignore missing values, the UPDATE statement is usually used.
data want;
update have(obs=0) have;
by id;
run;
@ChrisNZ and @Ksharp are moving in the right direction. SInce you haven't told us what the names of your variables are, I'll just use COL1, COL2, and COL3 (and don't care about the other names). This should do it:
proc sort data=have;
by col1 col2 col3;
run;
data want;
update have (obs=0) have;
by col1 col2 col3;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.