Solved: collapse table

markc · Posted 06-05-2018 12:19 AM

Hello,

I have a table which has 3 character columns and then another 5 character columns.

I want to collapse the table so there is only 1 row when first 3 columns are same.

For the remaining 5 columns I want to take the value from the only record in the group which will be non-blank.

eg.

A B C xxx ___ ___ ___ ___

A B C ___ yyy ___ ___ ___

A B C ___ ___ ___ ___ ttt

A B C ___ ___ zzz ___ ___

A B C ___ ___ ___ sss ___

X Y Z ___ bbb ___ ___ ___

X Y Z ___ ___ ___ ___ aaa

Desired result:

A B C xxx yyy zzz sss ttt

X Y Z ___ bbb ___ ___ aaa

Any assistance will be liked promptly.

Kind regards,

Mark

Kurt_Bremser · Posted 06-05-2018 02:32 AM

Use by-group processing (by your first three columns) in a data step.

retain new variables for each column you want to accumulate.

at first.(third variable of the by), set all your new variables to missing/empty

everytime you encounter a non-missing value, assign it to the corresponding new variable

at last.(third variable of the by), output.

keep the three by-variables an the new variables

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

View solution in original post

ChrisNZ · Posted 06-05-2018 12:35 AM

To get prompt assistance, provide a detailed example. Your request is very vague at best.

Edit: Much better now. I now see why @Reeza quotes the question in the first reply 🙂

High-Performance SAS Coding - Third Edition

Kurt_Bremser · Posted 06-05-2018 02:32 AM

Use by-group processing (by your first three columns) in a data step.

retain new variables for each column you want to accumulate.

at first.(third variable of the by), set all your new variables to missing/empty

everytime you encounter a non-missing value, assign it to the corresponding new variable

at last.(third variable of the by), output.

keep the three by-variables an the new variables

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

s_lassen · Posted 06-05-2018 06:33 AM

One solution is to use merge with WHERE clauses, assuming that your variables are named var1 through var8:

data want;

merge

have(keep=var1 var2 var3 var4 where=(var4))

have(keep=var1 var2 var3 var5 where=(var5))

have(keep=var1 var2 var3 var6 where=(var6))

have(keep=var1 var2 var3 var7 where=(var7))

have(keep=var1 var2 var3 var8 where=(var8))

;

by var1 var2 var3;

run;

ChrisNZ · Posted 06-05-2018 07:44 AM

To update records while ignore missing values, the UPDATE statement is usually used.

High-Performance SAS Coding - Third Edition

Ksharp · Posted 06-05-2018 08:59 AM

data want;

update have(obs=0) have;

by id;

run;

Astounding · Posted 06-05-2018 09:07 AM

@ChrisNZ and @Ksharp are moving in the right direction. SInce you haven't told us what the names of your variables are, I'll just use COL1, COL2, and COL3 (and don't care about the other names). This should do it:

proc sort data=have;

by col1 col2 col3;

run;

data want;

update have (obs=0) have;

by col1 col2 col3;

run;

collapse table

Re: collapse table

Re: collapse table

Re: collapse table

Re: collapse table

Re: collapse table

Re: collapse table

Re: collapse table

Catch up on SAS Innovate 2026

Catch up on SAS Innovate 2026

SAS Training: Just a Click Away