BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Ronein
Meteorite | Level 14

Hello

What is the way to using data sets t1,t2,t3 in order to get the wanted data set ?

As you can see the desire action is set of data sets but need also to fill up empty fields by the values above

 

 

Data t1;
input ID team group Var $ value;
1 555 8 X1 15
1 555 8 X2 30
1 555 8 X3 40
1 555 8 X4 10
1 555 8 X5 28
2 444 7 X1 48
2 444 7 X2 52
2 444 7 X3 20
2 444 7 X4 15
2 444 7 X5 31
;
Run;

Data t2;
input ID Z ;
cards;
1 100
2 200
;
Run;


Data t3;
input ID W ;
cards;
1 50
2 80
;
Run;

/*Wanted data set*/
/*1 555 8 X1 15*/
/*1 555 8 X2 30*/
/*1 555 8 X3 40*/
/*1 555 8 X4 10*/
/*1 555 8 X5 28*/
/*1 555 8 Z 100*/
/*1 555 8 W 50*/
/*2 444 7 X1 48*/
/*2 444 7 X2 52*/
/*2 444 7 X3 20*/
/*2 444 7 X4 15*/
/*2 444 7 X5 31*/
/*2 444 7 Z 200*/
/*2 444 7 W 801*/
1 ACCEPTED SOLUTION

Accepted Solutions
Astounding
PROC Star

There will be several ways.  Here's a way that uses a single DATA step:

data want;
   set t1;
   by id;
   output;
   if last.id;
   group_no + 1;
   set t2 point=group_no;
    var = "Z";
    value = z;
    output;
   set t3 point=group_no;
   var = "W";
   value = w;
   output;
   drop group_no w z;
run;

If ID really does take on sequential values 1, 2, 3, etc., the program could be simplified.  Get rid of GROUP_NO and use ID as the POINT= variable.

View solution in original post

4 REPLIES 4
Tom
Super User Tom
Super User

You should make sure the datasets are using the same structure before using the SET statement to combine them.

So you should have VAR values of Z and W and that actual data in VALUE.

Also Your Z and W datasets only have the ID variable.

Let's assume the other variables are just constant attributes of the ID value.

Data t1;
  input ID team group Var $ value;
cards;
1 555 8 X1 15
1 555 8 X2 30
1 555 8 X3 40
1 555 8 X4 10
1 555 8 X5 28
2 444 7 X1 48
2 444 7 X2 52
2 444 7 X3 20
2 444 7 X4 15
2 444 7 X5 31
;

Data t2;
  input ID Z ;
cards;
1 100
2 200
;

Data t3;
  input ID W ;
cards;
1 50
2 80
;

proc sort data=t1(keep=id team group) out=ids nodupkey;
  by id;
run;

data z ;
  merge ids t2;
  by id;
  length var $8 value 8;
  var='Z';
  value=z;
  drop z;
run;

data w ;
  merge ids t3;
  by id;
  length var $8 value 8;
  var='W';
  value=w;
  drop w;
run;

data want;
  set t1 w z;
  by id;
run;

proc print;
run;

Results

Obs    ID    team    group    Var    value

  1     1     555      8      X1       15
  2     1     555      8      X2       30
  3     1     555      8      X3       40
  4     1     555      8      X4       10
  5     1     555      8      X5       28
  6     1     555      8      W        50
  7     1     555      8      Z       100
  8     2     444      7      X1       48
  9     2     444      7      X2       52
 10     2     444      7      X3       20
 11     2     444      7      X4       15
 12     2     444      7      X5       31
 13     2     444      7      W        80
 14     2     444      7      Z       200

 

Astounding
PROC Star

There will be several ways.  Here's a way that uses a single DATA step:

data want;
   set t1;
   by id;
   output;
   if last.id;
   group_no + 1;
   set t2 point=group_no;
    var = "Z";
    value = z;
    output;
   set t3 point=group_no;
   var = "W";
   value = w;
   output;
   drop group_no w z;
run;

If ID really does take on sequential values 1, 2, 3, etc., the program could be simplified.  Get rid of GROUP_NO and use ID as the POINT= variable.

Ronein
Meteorite | Level 14

Thanks,

The ID in real data is not a sequence number.

May you please tell the terminology that you used in your code.

I would like to understand your code better .

Why do you use "output" statement 3 times?

Why do you use last statement one time?

Why do you use point statement two times?

Data t1;
  input ID team group Var $ value;
cards;
111 555 8 X1 15
111 555 8 X2 30
223 444 7 X3 20
223 444 7 X4 15
111 555 8 X3 40
111 555 8 X4 10
111 555 8 X5 28
223 444 7 X1 48
223 444 7 X2 52
223 444 7 X5 31
;
Run;

Data t2;
  input ID Z ;
cards;
111 100
223 200
;
Run;


Data t3;
  input ID W ;
cards;
111 50
223 80
;
Run;

proc sort data=t1;by ID;Run;
proc sort data=t2;by ID;Run;
proc sort data=t3;by ID;Run;

data want;
set t1;
by id;
output;
if last.id;
group_no + 1;
set t2 point=group_no;
var = "Z";
value = z;
output;
set t3 point=group_no;
var = "W";
value = w;
output;
drop group_no w z;
run;



 
Astounding
PROC Star

There are three OUTPUT statements, because there are three situations where I want to output an observation:  after reading an observation from t1, after reading an observation from t2, and after reading an observation from t3.

 

There is only one BY statement, so there is only one set of first. and last. variables available.  No need to refer to any other such variables, because they don't exist.

 

Using point= twice is overkill.  In retrospect, none of them are needed.  The program could be shortened as follows:

data want;
set t1;
by id;
output;
if last.id;
set t2;
var = "Z";
value = z;
output;
set t3;
var = "W";
value = w;
output;
drop w z;
run;

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 834 views
  • 2 likes
  • 3 in conversation