BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
Lost_Gary
Quartz | Level 8

I am trying to combine 3 sets of variables into a single group of variables and I am struggling to get my head around the most efficient way to do this. I currently have about 150 if then statements and I still can't get this to line up as I would like.  Here is what i am after:

 

 

Have

                           

Obs

A1

A2

A3

A4

A5

B1

B2

B3

B4

B5

C1

C2

C3

C4

C5

1

1

2

 

 

 

3

44

 

 

 

22

 

 

 

 

2

 

 

 

 

 

1

 

 

 

 

 

 

 

 

 

3

2

 

 

 

 

22

 

 

 

 

 

 

 

 

 

4

4

2

 

 

 

22

23

 

 

 

 

 

 

 

 

5

5

 

 

 

 

 

 

 

 

 

1

 

 

 

 

6

 

 

 

 

 

 

 

 

 

 

4

 

 

 

 

                               

 

 

Want

         

Obs

ABC1

ABC2

ABC3

ABC4

ABC5

 

1

1

2

3

44

22

 

2

1

 

 

 

 

 

3

2

22

 

 

 

 

4

4

2

22

23

 

 

5

5

1

 

 

 

 

6

4

 

 

 

 

 

 

Any thoughts are appreciated. 

1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

Use a double TRANSPOSE:

data have;
infile datalines dsd truncover;
n = _N_;
input a1 a2 a3 a4 a5 b1 b2 b3 b4 b5 c1 c2 c3 c4 c5;
datalines;
1,2,,,,3,44,,,,22
,,,,,1
2,,,,,22
;

proc transpose data=have out=long (drop=_name_ where=(col1 ne .));
by n;
var a: b: c:;
run;

proc transpose data=long out=want (drop=_name_) prefix=abc;
by n;
var col1;
run;

Please provide example data in the future like this (DATA step with DATALINES), so we don't have to do it for you.

 

I have a strong suspicion that your original data is not in this wide form; if yes, please show us an example. This might make it much easier to get your result, which looks more like a report than a dataset.

View solution in original post

6 REPLIES 6
quickbluefish
Barite | Level 11

If I'm understanding this correctly, you could try:

data want;
set have;
array have {*} A1-A5 B1-B5 C1-C5;
array ABC {*} ABC1-ABC5;
pos=0;
do i=1 to dim(have);
    if have[i]>. then do;
        pos+1;
        ABC[pos]=have[i];
    end;
end;
drop i pos;
run;
LinusH
Tourmaline | Level 20
Your example is abstract, but even so your "want" data set looks hard to use.
What is the purpose of the different variables?
Without knowing the application I would suggest transposing the data instead. This removes all blank cells, and keeps the lineage from which columns the values originated.
Data never sleeps
PaigeMiller
Diamond | Level 26

@LinusH wrote:
Your example is abstract, but even so your "want" data set looks hard to use.
What is the purpose of the different variables?
Without knowing the application I would suggest transposing the data instead. This removes all blank cells, and keeps the lineage from which columns the values originated.

@LinusH makes a very good point, that sometimes people spend time and effort converting a data set layout that is difficult to use to another data set layout that is equally difficult to use. It is time and effort for no benefit.

 

So instead of telling us what the re-arrangement of the data is desired, @Lost_Gary should tell us the final result of this programming effort, what tables/plots/analysis are you going to wind up doing with this data. Then we can advise in a much more specific fashion.

--
Paige Miller
Lost_Gary
Quartz | Level 8
Ha - this is such a great observation. Now when I tell you that cobbling data in this format is for government reporting and following government specs, maybe you'll give me some slack.
Kurt_Bremser
Super User

@Lost_Gary wrote:
Ha - this is such a great observation. Now when I tell you that cobbling data in this format is for government reporting and following government specs, maybe you'll give me some slack.

The keyword is right there: reporting.

If the final goal is to create a report for some agency, this does not mean you have to do all the work in-between the source data and the report with such hilarious, unwieldy wide structures.

Unless you're lookin for job security, of course 😉

 

Note that SAS's reporting procedures are all designed to create the report from long dataset layouts.

Kurt_Bremser
Super User

Use a double TRANSPOSE:

data have;
infile datalines dsd truncover;
n = _N_;
input a1 a2 a3 a4 a5 b1 b2 b3 b4 b5 c1 c2 c3 c4 c5;
datalines;
1,2,,,,3,44,,,,22
,,,,,1
2,,,,,22
;

proc transpose data=have out=long (drop=_name_ where=(col1 ne .));
by n;
var a: b: c:;
run;

proc transpose data=long out=want (drop=_name_) prefix=abc;
by n;
var col1;
run;

Please provide example data in the future like this (DATA step with DATALINES), so we don't have to do it for you.

 

I have a strong suspicion that your original data is not in this wide form; if yes, please show us an example. This might make it much easier to get your result, which looks more like a report than a dataset.

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 385 views
  • 2 likes
  • 5 in conversation