BookmarkSubscribeRSS Feed
Ashraya
Calcite | Level 5

I want to concatenate the three data sets by its observation, please give me the idea to proceed on it.

data set A;

Num Var1

1     abc

4     xyz

5     mno

Data set B;

Num Var1

2     aaa

5     qqq

7     zzz

Data set C;

Num Var1

3     qwe

6     ert

4     rty

Result dataset should be as below:

Num Var1

1     abc

2     aaa

3     qwe

4     xyz

5     qqq

6     ert

5     mno

7     zzz

4     rty


The result dataset should be read first combine observation of each dataset, then second observations , so on so forth. Its like dataset A have questions data. B has answers data and C has explanation data. The result should be  Question, answer and explanation in the order.


thanks and regards

Ash

15 REPLIES 15
Tom
Super User Tom
Super User

Sort them first and use SET with a BY statement.

data want;

set a b c ;

by num;

run;

Or set them first without the BY and then use PROC SORT to order the results.

Steelers_In_DC
Barite | Level 11

You'll have to add some logical way to order these.  This will work:

data seta ;

input num var1 $;

cards;

1 abc

4 xyz

5 mno

;

data setb ;

input num var1 $;

cards;

2 aaa

5 qqq

7 zzz

;

data setc ;

input num var1 $;

cards;

3 qwe

6 ert

4 rty

;

data seta;

set seta;

count +1;

run;

data setb;

set setb;

count +1;

run;

data setc;

set setc;

count +1;

run;

data want(drop=count);

set seta setb setc;

by count;

run;

ballardw
Super User

A somewhat different approach:

data seta ;
input num var1 $;
order=_n_*3;
cards;
1 abc
4 xyz
5 mno
;
run;

data setb ;
input num var1 $;
order=_n_*3 +1;
cards;
2 aaa
5 qqq
7 zzz
;
run;

data setc ;
input num var1 $;
order=_n_*3+2;
cards;
3 qwe
6 ert
4 rty
;
run;

data want;
   merge seta setb setc;
   by order;

   drop order;

run;

Tom
Super User Tom
Super User

If the datasets all have the same number of observations then you could do this (without any sorting).

data want;

  set a; output;

  set b; output;

  set c; output;

run;

Now if one or more of the datasets has fewer observations than the others this will stop as soon as it reads past the end of the shortest dataset.

Astounding
PROC Star

Even when the data sets have different numbers of observations, you can program around it:

data want;

   if end1=0 then do;

      set a end=end1; output;

   end;

   if end2=0 then do;

      set b end=end2; output;

   end;

   if end3=0 then do;

      set c end=end3; output;

   end;

   if end1 and end2 and end3 then stop;

run;

Haikuo
Onyx | Level 15

Or 3XDOW:

data want;

do _n_=1 to 1 while (not end1);

  set a end=end1;

output;

end;

do _n_=1 to 1 while (not end2);

  set b end=end2;

output;

end;

do _n_=1 to 1 while (not end3);

  set c end=end3;

output;

end;

run;

Tom
Super User Tom
Super User

That won't do what the OP wanted.  Instead it does the same thing as :

data want; set a b c; run;

Haikuo
Onyx | Level 15

Notice '_n_=1 to 1', This time  I stand correctly, Tom Smiley Happy

Tom
Super User Tom
Super User

Normally I would code DO _N_ = 1 BY 1 UNTIL(...), so I just assumed you had done the same.

But in that case the _N_=1 to 1 is doing nothing of value.  Why not just IF not EOF1 then DO; ?

Haikuo
Onyx | Level 15

Absolutely, Tom. Mine resulted from the old habit format that DOW usually takes. The sole purpose, as you noted, is to force 'set' only once at a time, otherwise, it would be identical to what Astounding does, which btw is much clearer regarding the programming logic.

Astounding
PROC Star

I think it won't quite work, but for a different reason.  I think you will get one extra observation added to the data set at the end, with a message about the data step ending due to looping.  (Too busy to test it right now, but that's what I would expect.)

Haikuo
Onyx | Level 15

You are right about 'due to looping', not sure how that affects the final results, maybe just performance?

data class1 class2;

     set sashelp.class;

     if _n_<10 then

           output class1;

     output class2;

run;

data want;

     do _n_=1 to 1 while (not end1);

           set class2 end=end1;

           output;

     end;

     do _n_=1 to 1 while (not end2);

           set class1 end=end2;

           output;

     end;

run;

NOTE: DATA STEP stopped due to looping.

NOTE: There were 19 observations read from the data set WORK.CLASS2.

NOTE: There were 9 observations read from the data set WORK.CLASS1.

2 The SAS System                             07:45 Wednesday, June 10, 2015

NOTE: The data set WORK.WANT has 28 observations and 5 variables.

NOTE: Compressing data set WORK.WANT increased size by 100.00 percent.

      Compressed is 2 pages; un-compressed would require 1 pages.

NOTE: DATA statement used (Total process time):

      real time           0.01 seconds

      cpu time            0.02 seconds

Edit:

Should have added one of your statements at the end: if end1 and end2 and end3 then stop;

Astounding
PROC Star

You're right, it's just performance (plus an annoying message).  Because of the presence of OUTPUT statements, the program doesn't create that extra record at the end.  Contrast that with:

data want;

if not eof then set have end=eof;

run;

Haikuo
Onyx | Level 15

Thanks, Robert.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 15 replies
  • 1327 views
  • 0 likes
  • 6 in conversation