Hello
I am performing vertical merging (UNION) of 3 data sets.
In the resulted data set there are 2 columns with some null values.
I want to ask please what is the way to put the value from the upper rows down (per customerID)?
Data tbl1;
INPUT CustomerID product $ ID Exp_VAR $ coefficient;
cards;
11111 UE 3333322312 Inter 1000
11111 UE 3333322312 Age 1000
11111 UE 3333322312 Educ 1500
11111 UE 3333322312 Income 1500
22222 TE 3333322312 Inter 3000
22222 TE 3333322312 Age 2000
22222 TE 3333322312 Educ 1000
22222 TE 3333322312 Income 2000
;
Run;
Data tbl2;
INPUT CustomerID Exp_VAR $ coefficient;
cards;
11111 TOTAL1 5000
22222 TOTAL1 8000
;
RUN:
Data tbl3;
INPUT CustomerID Exp_VAR $ coefficient;
cards;
11111 TOTAL2 4000
22222 TOTAL2 5000
;
RUN:
Data wanted;
SET tbl1 tbl2 tbl3;
BY CustomerID;
RUN;
Add one more step.
Data tbl1;
INPUT CustomerID product $ ID Exp_VAR $ coefficient;
cards;
11111 UE 3333322312 Inter 1000
11111 UE 3333322312 Age 1000
11111 UE 3333322312 Educ 1500
11111 UE 3333322312 Income 1500
22222 TE 3333322312 Inter 3000
22222 TE 3333322312 Age 2000
22222 TE 3333322312 Educ 1000
22222 TE 3333322312 Income 2000
;
Run;
Data tbl2;
INPUT CustomerID Exp_VAR $ coefficient;
cards;
11111 TOTAL1 5000
22222 TOTAL1 8000
;
RUN:
Data tbl3;
INPUT CustomerID Exp_VAR $ coefficient;
cards;
11111 TOTAL2 4000
22222 TOTAL2 5000
;
RUN:
Data wanted;
SET tbl1 tbl2 tbl3;
BY CustomerID;
RUN;
data final_want;
update wanted(obs=0) wanted;
by CustomerID;
output;
run;
You could change the last datastep to something like:
Data wanted;
SET tbl1(in=in1) tbl2 tbl3;
BY CustomerID;
if in1 then do;
_product=product;
_id=id;
end;
else do;
product=_product;
id=_id;
end;
retain _:;
drop _:;
output;
if last.CustomerID then
call missing(of _:);
RUN;
I put the CALL MISSING statement in after the OUTPUT, in case there are customer IDs not in TBL1 - otherwise they would get values from the previous customer ID.
Add one more step.
Data tbl1;
INPUT CustomerID product $ ID Exp_VAR $ coefficient;
cards;
11111 UE 3333322312 Inter 1000
11111 UE 3333322312 Age 1000
11111 UE 3333322312 Educ 1500
11111 UE 3333322312 Income 1500
22222 TE 3333322312 Inter 3000
22222 TE 3333322312 Age 2000
22222 TE 3333322312 Educ 1000
22222 TE 3333322312 Income 2000
;
Run;
Data tbl2;
INPUT CustomerID Exp_VAR $ coefficient;
cards;
11111 TOTAL1 5000
22222 TOTAL1 8000
;
RUN:
Data tbl3;
INPUT CustomerID Exp_VAR $ coefficient;
cards;
11111 TOTAL2 4000
22222 TOTAL2 5000
;
RUN:
Data wanted;
SET tbl1 tbl2 tbl3;
BY CustomerID;
RUN;
data final_want;
update wanted(obs=0) wanted;
by CustomerID;
output;
run;
It is great but what happened If there is a specific column that on this column I dont want to copy the values from upper rows.
Please see this example please:
In column Ind I dont want to copy values .
Data tbl1;
INPUT CustomerID product $ ID Exp_VAR $ coefficient Ind;
cards;
11111 UE 3333322312 Inter 1000 1
11111 UE 3333322312 Age 1000 1
11111 UE 3333322312 Educ 1500 1
11111 UE 3333322312 Income 1500 1
22222 TE 3333322312 Inter 3000 1
22222 TE 3333322312 Age 2000 1
22222 TE 3333322312 Educ 1000 1
22222 TE 3333322312 Income 2000 1
;
Run;
Data tbl2;
INPUT CustomerID Exp_VAR $ coefficient;
cards;
11111 TOTAL1 5000
22222 TOTAL1 8000
;
RUN:
Data tbl3;
INPUT CustomerID Exp_VAR $ coefficient;
cards;
11111 TOTAL2 4000
22222 TOTAL2 5000
;
RUN:
Data wanted;
SET tbl1 tbl2 tbl3;
BY CustomerID;
RUN;
data final_want;
update wanted(obs=0) wanted;
by CustomerID;
output;
run;
OK. Add one more step .
Data tbl1;
INPUT CustomerID product $ ID Exp_VAR $ coefficient Ind;
cards;
11111 UE 3333322312 Inter 1000 1
11111 UE 3333322312 Age 1000 1
11111 UE 3333322312 Educ 1500 1
11111 UE 3333322312 Income 1500 1
22222 TE 3333322312 Inter 3000 1
22222 TE 3333322312 Age 2000 1
22222 TE 3333322312 Educ 1000 1
22222 TE 3333322312 Income 2000 1
;
Data tbl2;
INPUT CustomerID Exp_VAR $ coefficient;
cards;
11111 TOTAL1 5000
22222 TOTAL1 8000
;
Data tbl3;
INPUT CustomerID Exp_VAR $ coefficient;
cards;
11111 TOTAL2 4000
22222 TOTAL2 5000
;
Data wanted;
SET tbl1 tbl2 tbl3;
BY CustomerID;
RUN;
data final_want(drop=ind);
update wanted(obs=0) wanted;
by CustomerID;
output;
run;
data final_want2;
merge final_want wanted(keep=ind);
run;
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.