Hello
What is the way to using data sets t1,t2,t3 in order to get the wanted data set ?
As you can see the desire action is set of data sets but need also to fill up empty fields by the values above
Data t1;
input ID team group Var $ value;
1 555 8 X1 15
1 555 8 X2 30
1 555 8 X3 40
1 555 8 X4 10
1 555 8 X5 28
2 444 7 X1 48
2 444 7 X2 52
2 444 7 X3 20
2 444 7 X4 15
2 444 7 X5 31
;
Run;
Data t2;
input ID Z ;
cards;
1 100
2 200
;
Run;
Data t3;
input ID W ;
cards;
1 50
2 80
;
Run;
/*Wanted data set*/
/*1 555 8 X1 15*/
/*1 555 8 X2 30*/
/*1 555 8 X3 40*/
/*1 555 8 X4 10*/
/*1 555 8 X5 28*/
/*1 555 8 Z 100*/
/*1 555 8 W 50*/
/*2 444 7 X1 48*/
/*2 444 7 X2 52*/
/*2 444 7 X3 20*/
/*2 444 7 X4 15*/
/*2 444 7 X5 31*/
/*2 444 7 Z 200*/
/*2 444 7 W 801*/
There will be several ways. Here's a way that uses a single DATA step:
data want;
set t1;
by id;
output;
if last.id;
group_no + 1;
set t2 point=group_no;
var = "Z";
value = z;
output;
set t3 point=group_no;
var = "W";
value = w;
output;
drop group_no w z;
run;
If ID really does take on sequential values 1, 2, 3, etc., the program could be simplified. Get rid of GROUP_NO and use ID as the POINT= variable.
You should make sure the datasets are using the same structure before using the SET statement to combine them.
So you should have VAR values of Z and W and that actual data in VALUE.
Also Your Z and W datasets only have the ID variable.
Let's assume the other variables are just constant attributes of the ID value.
Data t1;
input ID team group Var $ value;
cards;
1 555 8 X1 15
1 555 8 X2 30
1 555 8 X3 40
1 555 8 X4 10
1 555 8 X5 28
2 444 7 X1 48
2 444 7 X2 52
2 444 7 X3 20
2 444 7 X4 15
2 444 7 X5 31
;
Data t2;
input ID Z ;
cards;
1 100
2 200
;
Data t3;
input ID W ;
cards;
1 50
2 80
;
proc sort data=t1(keep=id team group) out=ids nodupkey;
by id;
run;
data z ;
merge ids t2;
by id;
length var $8 value 8;
var='Z';
value=z;
drop z;
run;
data w ;
merge ids t3;
by id;
length var $8 value 8;
var='W';
value=w;
drop w;
run;
data want;
set t1 w z;
by id;
run;
proc print;
run;
Results
Obs ID team group Var value 1 1 555 8 X1 15 2 1 555 8 X2 30 3 1 555 8 X3 40 4 1 555 8 X4 10 5 1 555 8 X5 28 6 1 555 8 W 50 7 1 555 8 Z 100 8 2 444 7 X1 48 9 2 444 7 X2 52 10 2 444 7 X3 20 11 2 444 7 X4 15 12 2 444 7 X5 31 13 2 444 7 W 80 14 2 444 7 Z 200
There will be several ways. Here's a way that uses a single DATA step:
data want;
set t1;
by id;
output;
if last.id;
group_no + 1;
set t2 point=group_no;
var = "Z";
value = z;
output;
set t3 point=group_no;
var = "W";
value = w;
output;
drop group_no w z;
run;
If ID really does take on sequential values 1, 2, 3, etc., the program could be simplified. Get rid of GROUP_NO and use ID as the POINT= variable.
Thanks,
The ID in real data is not a sequence number.
May you please tell the terminology that you used in your code.
I would like to understand your code better .
Why do you use "output" statement 3 times?
Why do you use last statement one time?
Why do you use point statement two times?
Data t1;
input ID team group Var $ value;
cards;
111 555 8 X1 15
111 555 8 X2 30
223 444 7 X3 20
223 444 7 X4 15
111 555 8 X3 40
111 555 8 X4 10
111 555 8 X5 28
223 444 7 X1 48
223 444 7 X2 52
223 444 7 X5 31
;
Run;
Data t2;
input ID Z ;
cards;
111 100
223 200
;
Run;
Data t3;
input ID W ;
cards;
111 50
223 80
;
Run;
proc sort data=t1;by ID;Run;
proc sort data=t2;by ID;Run;
proc sort data=t3;by ID;Run;
data want;
set t1;
by id;
output;
if last.id;
group_no + 1;
set t2 point=group_no;
var = "Z";
value = z;
output;
set t3 point=group_no;
var = "W";
value = w;
output;
drop group_no w z;
run;
There are three OUTPUT statements, because there are three situations where I want to output an observation: after reading an observation from t1, after reading an observation from t2, and after reading an observation from t3.
There is only one BY statement, so there is only one set of first. and last. variables available. No need to refer to any other such variables, because they don't exist.
Using point= twice is overkill. In retrospect, none of them are needed. The program could be shortened as follows:
data want;
set t1;
by id;
output;
if last.id;
set t2;
var = "Z";
value = z;
output;
set t3;
var = "W";
value = w;
output;
drop w z;
run;
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.