Solved: Re: Combining Variables

Lost_Gary · Posted 10-28-2025 08:45 AM

I am trying to combine 3 sets of variables into a single group of variables and I am struggling to get my head around the most efficient way to do this. I currently have about 150 if then statements and I still can't get this to line up as I would like. Here is what i am after:

	Have
Obs	A1	A2	A3	A4	A5	B1	B2	B3	B4	B5	C1	C2	C3	C4	C5
1	1	2				3	44				22
2						1
3	2					22
4	4	2				22	23
5	5										1
6											4

	Want
Obs	ABC1	ABC2	ABC3	ABC4	ABC5
1	1	2	3	44	22
2	1
3	2	22
4	4	2	22	23
5	5	1
6	4

Any thoughts are appreciated.

Kurt_Bremser · Posted 10-28-2025 10:13 AM

Use a double TRANSPOSE:

data have;
infile datalines dsd truncover;
n = _N_;
input a1 a2 a3 a4 a5 b1 b2 b3 b4 b5 c1 c2 c3 c4 c5;
datalines;
1,2,,,,3,44,,,,22
,,,,,1
2,,,,,22
;

proc transpose data=have out=long (drop=_name_ where=(col1 ne .));
by n;
var a: b: c:;
run;

proc transpose data=long out=want (drop=_name_) prefix=abc;
by n;
var col1;
run;

Please provide example data in the future like this (DATA step with DATALINES), so we don't have to do it for you.

I have a strong suspicion that your original data is not in this wide form; if yes, please show us an example. This might make it much easier to get your result, which looks more like a report than a dataset.

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

View solution in original post

quickbluefish · Posted 10-28-2025 09:05 AM

If I'm understanding this correctly, you could try:

data want;
set have;
array have {*} A1-A5 B1-B5 C1-C5;
array ABC {*} ABC1-ABC5;
pos=0;
do i=1 to dim(have);
    if have[i]>. then do;
        pos+1;
        ABC[pos]=have[i];
    end;
end;
drop i pos;
run;

LinusH · Posted 10-28-2025 10:00 AM

Your example is abstract, but even so your "want" data set looks hard to use.
What is the purpose of the different variables?
Without knowing the application I would suggest transposing the data instead. This removes all blank cells, and keeps the lineage from which columns the values originated.

Data never sleeps

PaigeMiller · Posted 10-28-2025 10:36 AM

@LinusH wrote:
Your example is abstract, but even so your "want" data set looks hard to use.
What is the purpose of the different variables?
Without knowing the application I would suggest transposing the data instead. This removes all blank cells, and keeps the lineage from which columns the values originated.

@LinusH makes a very good point, that sometimes people spend time and effort converting a data set layout that is difficult to use to another data set layout that is equally difficult to use. It is time and effort for no benefit.

So instead of telling us what the re-arrangement of the data is desired, @Lost_Gary should tell us the final result of this programming effort, what tables/plots/analysis are you going to wind up doing with this data. Then we can advise in a much more specific fashion.

--
Paige Miller

Lost_Gary · Posted 10-28-2025 11:52 AM

Ha - this is such a great observation. Now when I tell you that cobbling data in this format is for government reporting and following government specs, maybe you'll give me some slack.

Kurt_Bremser · Posted 10-28-2025 01:28 PM

@Lost_Gary wrote:
Ha - this is such a great observation. Now when I tell you that cobbling data in this format is for government reporting and following government specs, maybe you'll give me some slack.

The keyword is right there: reporting.

If the final goal is to create a report for some agency, this does not mean you have to do all the work in-between the source data and the report with such hilarious, unwieldy wide structures.

Unless you're lookin for job security, of course 😉

Note that SAS's reporting procedures are all designed to create the report from long dataset layouts.

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

Kurt_Bremser · Posted 10-28-2025 10:13 AM

Use a double TRANSPOSE:

data have;
infile datalines dsd truncover;
n = _N_;
input a1 a2 a3 a4 a5 b1 b2 b3 b4 b5 c1 c2 c3 c4 c5;
datalines;
1,2,,,,3,44,,,,22
,,,,,1
2,,,,,22
;

proc transpose data=have out=long (drop=_name_ where=(col1 ne .));
by n;
var a: b: c:;
run;

proc transpose data=long out=want (drop=_name_) prefix=abc;
by n;
var col1;
run;

Please provide example data in the future like this (DATA step with DATALINES), so we don't have to do it for you.

I have a strong suspicion that your original data is not in this wide form; if yes, please show us an example. This might make it much easier to get your result, which looks more like a report than a dataset.

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

Registration is open

SAS Training: Just a Click Away