Hello Everyone,
I have reimbusement data in which each person file how much they spend on Hotel, Meal, and Airfare for each travel (1, 2, 3,4...).
Now I want to each payment is record in a seperate row.
Say for a travel (number 2 in the sample), the original is 500, 100, 300.
I want to tranform to 3 row
500, 0,0
0,100,0
0,0,300
I am not sure how to get it done.
Any help is very much appreciated.
HC
data have;
input travel hotel meal airfare ;
datalines;
1 0 20 0
2 500 100 300
3 200 100 0
;run;
data want:
day | Hotel | Meal | Airfare |
1 | 0 | 20 | 0 |
2 | 500 | 0 | 0 |
2 | 0 | 100 | 0 |
2 | 0 | 0 | 300 |
3 | 200 | 0 | 0 |
3 | 0 | 100 | 0 |
Not sure where a data structure like you ask for would be useful. It feels like you're after such a structure to solve another problem. Experience tells me: If you need such an uncommon data structure as input to a downstream process then may-be this downstream process needs some re-thinking.
Anyway: Below code should give you what you have been asking for.
data have;
input travel hotel meal airfare ;
datalines;
1 . 20 .
2 500 100 300
3 200 100 .
;
run;
data want(drop=_:);
merge have have(rename=(hotel=_hotel meal=_meal airfare=_airfare));
array amt_in {*} _hotel _meal _airfare;
array amt_out {*} hotel meal airfare;
do _i=1 to dim(amt_out);
amt_out[_i]=0;
end;
do _i=1 to dim(amt_in);
if amt_in[_i]>0 then
do;
amt_out[_i]=amt_in[_i];
output;
amt_out[_i]=0;
end;
end;
run;
Below doesn't give you exactly what you are asking for but it transposes the data into a narrow structure the way that's normally done.
Such a structure is then also what you need for many of the SAS Proc's.
data have;
input travel hotel meal airfare ;
datalines;
1 0 20 0
2 500 100 300
3 200 100 0
;
run;
data want(keep=travel type amount);
set have;
array atype {*} hotel meal airfare;
do i=1 to dim(atype);
if atype[i]>0 then
do;
type=vname(atype[i]);
amount=atype[i];
output;
end;
end;
run;
options missing='0';
proc tabulate data=want;
class travel type;
var amount;
table travel, type=' '*amount=' '*sum=' ';
run;
Mine best is below.
Is there anyway to combine them?
data have;
input travel hotel meal airfare ;
datalines;
1 . 20 .
2 500 100 300
3 200 100 .
;run;
data want;
set have nobs=totalobs;
i+1;
set have point=i;
array _col{*} hotel meal airfare;
do k=1 to dim(_col);
if _col{k}^=. then do;
output;
end;
end;
run;
data want; set want;
drop k kk;
array _col{*} hotel meal airfare;
do kk=1 to dim(_col);
if kk^=k then do;
_col(kk)=0;
end;
end;
run;
Not sure where a data structure like you ask for would be useful. It feels like you're after such a structure to solve another problem. Experience tells me: If you need such an uncommon data structure as input to a downstream process then may-be this downstream process needs some re-thinking.
Anyway: Below code should give you what you have been asking for.
data have;
input travel hotel meal airfare ;
datalines;
1 . 20 .
2 500 100 300
3 200 100 .
;
run;
data want(drop=_:);
merge have have(rename=(hotel=_hotel meal=_meal airfare=_airfare));
array amt_in {*} _hotel _meal _airfare;
array amt_out {*} hotel meal airfare;
do _i=1 to dim(amt_out);
amt_out[_i]=0;
end;
do _i=1 to dim(amt_in);
if amt_in[_i]>0 then
do;
amt_out[_i]=amt_in[_i];
output;
amt_out[_i]=0;
end;
end;
run;
Thanks a lot, members.
HC
If you want to produce the desired result from raw data, you can read lines one variable at a time, using the trailing @ sign:
data want (drop=i);
retain travel hotel meal airfare 0;
input travel @;
array x {*} hotel -- airfare;
do i=1 to dim(x);
input x{I} @;
if x{I}^=0 then do; output; x{I}=0; end;
end;
datalines;
1 0 20 0
2 500 100 300
3 200 100 0
;run;
But if you want to produce WANT from HAVE, you can emulate the above by having multiple SET statements (instead of multiple INPUT statements). Each SET produces an independent stream of data. And because each keeps only one variable at a time you can test for non-zero values, then output (and reset to 0) when appropriate:
data want;
set have (keep=travel hotel);
retain meal airfare 0;
if hotel^=0 then do; output; hotel=0;end;
set have (keep=meal);
if meal ^=0 then do; output; meal=0;end;
set have (keep=airfare);
if airfare ^=0 then do; output; airfare=0;end;
run;
The earlier version, from raw data to WANT, is easier to scale to a large number of variables.
This doesn't add a hell of a lot of value that I can see.
Wouldn't a long format make more sense?
You could get do a double transpose to get it done dynamically, I think you'd have to add some sort of indicator variable though.
This is untested but looks promising:
data want;
set have;
do hotel2 = 0, hotel;
do meal2 = 0, meal;
do airfare2 = 0, airfare;
n_zeros=(hotel2=0 + meal2=0 + airfare2=0);
if n_zeros=2 then output;
end;
end;
end;
drop hotel meal airfare;
rename hotel2=hotel meal2=meal airfare2=airfare;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.