Im trying to add 2 columns to want_wide dataset.
This is my current code:
data have;
infile datalines truncover;
input type $ quarter phase colors $ creation_date :yymmdd6.;
format creation_date yymmdd6.;
datalines;
K-11 202101 1 blue 20200101
K-11 202102 2 blue 20200101
K-11 202003 3 blue 20200101
K-11 202104 3 black 20200101
K-12 202101 3 blue 20200101
K-12 202102 1 white 20200101
K-12 202103 1 white 20200101
K-12 202104 1 yellow 20200101
K-13 202003 2 green 20200101
K-13 202101 3 green 20200101
K-13 202102 2 red 20200101
K-13 202103 2 blue 20200101
;
proc sort data=have;
by type quarter;
run;
data want;
set have;
where phase ne .;
by type;
retain diff_phase;
if first.type then diff_phase = phase;
if last.type
then do;
diff_phase = phase - diff_phase;
output;
end;
diff_phase = phase;
keep type diff_phase;
run;
proc sort data=have;
by type quarter;
run;
data long;
set
have
;
run;
proc transpose
data=long
out=wide (
drop=_name_
where=(type ne "")
)
prefix=q_
;
by type;
var phase;
id quarter;
run;
data want_wide;
merge
wide
want
;
by type;
run;
This is how want_wide looks like right now:
type q_202003 q_202101 q_202102 q_202104 q_202103 diff_phase
K-11 3 1 2 3 . -1
K-12 . 3 1 1 1 0
K-13 2 3 2 . 2 0
This is how I want want_wide to look:
type q_202003 q_202101 q_202102 q_202104 q_202103 diff_phase last_color second_last_color date
K-11 3 1 2 3 . -1 black blue 20200101
K-12 . 3 1 1 1 0 yellow white 20200101
K-13 2 3 2 . 2 0 blue red 20200101
How can I achieve that?
data have;
infile datalines truncover;
input type $ quarter phase colors $ creation_date :yymmdd6.;
format creation_date yymmdd6.;
datalines;
K-11 202101 1 blue 20200101
K-11 202102 2 blue 20200101
K-11 202003 3 blue 20200101
K-11 202104 3 black 20200101
K-12 202101 3 blue 20200101
K-12 202102 1 white 20200101
K-12 202103 1 white 20200101
K-12 202104 1 yellow 20200101
K-13 202003 2 green 20200101
K-13 202101 3 green 20200101
K-13 202102 2 red 20200101
K-13 202103 2 blue 20200101
;
proc sort data=have;
by type quarter;
run;
data long;
set
have
;
run;
proc transpose
data=long
out=wide (
drop=_name_
where=(type ne "")
)
prefix=q_
;
by type;
var phase;
id quarter;
run;
data temp;
merge have have(firstobs=3 keep=type rename=(type=_type));
if type ne _type;
drop _type;
run;
data temp;
set temp;
by type;
if first.type then n=0;
n+1;
if n=1 then id='second_last_color';
else id='last_color ';
proc transpose data=temp out=want(drop=_name_);
by type ;
id id;
var colors;
run;
data want_wide;
merge
wide
want
;
by type;
run;
@Ksharp want_wide looks good but I miss 2 columns, diff_phase and creation date are not in want_wide anymore. Why is that?
data have;
infile datalines truncover;
input type $ quarter phase colors $ creation_date :yymmdd6.;
format creation_date yymmdd6.;
datalines;
K-11 202101 1 blue 20200101
K-11 202102 2 blue 20200101
K-11 202003 3 blue 20200101
K-11 202104 3 black 20200101
K-12 202101 3 blue 20200101
K-12 202102 1 white 20200101
K-12 202103 1 white 20200101
K-12 202104 1 yellow 20200101
K-13 202003 2 green 20200101
K-13 202101 3 green 20200101
K-13 202102 2 red 20200101
K-13 202103 2 blue 20200101
;
proc sort data=have;
by type quarter;
run;
data long;
set
have
;
run;
proc transpose
data=long
out=wide (
drop=_name_
where=(type ne "")
)
prefix=q_
;
by type;
var phase;
id quarter;
run;
data temp;
merge have have(firstobs=3 keep=type rename=(type=_type));
if type ne _type;
drop _type;
run;
data temp;
set temp;
by type;
if first.type then n=0;
n+1;
if n=1 then id='second_last';
else id='last ';
run;
proc transpose data=temp out=want1(drop=_name_) suffix=_color;
by type creation_date;
id id;
var colors ;
run;
proc transpose data=temp out=want2(drop=_name_) suffix=_phase;
by type creation_date;
id id;
var phase ;
run;
data want;
merge want1 want2;
by type creation_date;
run;
data want_wide;
merge
wide
want
;
by type;
diff=second_last_phase-last_phase;
run;
To avoid any re-sorting of the table, run a double DO loop:
data want;
counta = 0;
do until (last.type);
set have;
by type;
counta + 1;
end;
countb = 0;
do until (last.type);
set have;
by type;
countb + 1;
if countb = counta - 1 then second_last_color = color;
if countb = count then last_color = color;
end;
keep type last_color second_last_color;
run;
You can merge this table to the transposed one.
I think the double DO is a bit overkill in this particular case. A lag function does all that is needed:
data want;
set have (keep=type colors rename=(colors=last_color));
by type;
second_last_color=lag(last_color);
if last.type;
/*if first.type=1 then call missing(second_last_color);*/
run;
De-comment the "if first.type=1" statement if you expect any TYPE with only one observation, thereby avoiding an erroneous second_last_color.
Then as @Kurt_Bremser suggested, merge it with the transposed table.
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Check out this tutorial series to learn how to build your own steps in SAS Studio.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.