Im trying to add 2 columns to want_wide dataset.
This is my current code:
data have;
infile datalines truncover;
input type $ quarter phase colors $ creation_date :yymmdd6.;
format creation_date yymmdd6.;
datalines;
K-11 202101 1 blue 20200101
K-11 202102 2 blue 20200101
K-11 202003 3 blue 20200101
K-11 202104 3 black 20200101
K-12 202101 3 blue 20200101
K-12 202102 1 white 20200101
K-12 202103 1 white 20200101
K-12 202104 1 yellow 20200101
K-13 202003 2 green 20200101
K-13 202101 3 green 20200101
K-13 202102 2 red 20200101
K-13 202103 2 blue 20200101
;
proc sort data=have;
by type quarter;
run;
data want;
set have;
where phase ne .;
by type;
retain diff_phase;
if first.type then diff_phase = phase;
if last.type
then do;
diff_phase = phase - diff_phase;
output;
end;
diff_phase = phase;
keep type diff_phase;
run;
proc sort data=have;
by type quarter;
run;
data long;
set
have
;
run;
proc transpose
data=long
out=wide (
drop=_name_
where=(type ne "")
)
prefix=q_
;
by type;
var phase;
id quarter;
run;
data want_wide;
merge
wide
want
;
by type;
run;
This is how want_wide looks like right now:
type q_202003 q_202101 q_202102 q_202104 q_202103 diff_phase
K-11 3 1 2 3 . -1
K-12 . 3 1 1 1 0
K-13 2 3 2 . 2 0
This is how I want want_wide to look:
type q_202003 q_202101 q_202102 q_202104 q_202103 diff_phase last_color second_last_color date
K-11 3 1 2 3 . -1 black blue 20200101
K-12 . 3 1 1 1 0 yellow white 20200101
K-13 2 3 2 . 2 0 blue red 20200101
How can I achieve that?
data have;
infile datalines truncover;
input type $ quarter phase colors $ creation_date :yymmdd6.;
format creation_date yymmdd6.;
datalines;
K-11 202101 1 blue 20200101
K-11 202102 2 blue 20200101
K-11 202003 3 blue 20200101
K-11 202104 3 black 20200101
K-12 202101 3 blue 20200101
K-12 202102 1 white 20200101
K-12 202103 1 white 20200101
K-12 202104 1 yellow 20200101
K-13 202003 2 green 20200101
K-13 202101 3 green 20200101
K-13 202102 2 red 20200101
K-13 202103 2 blue 20200101
;
proc sort data=have;
by type quarter;
run;
data long;
set
have
;
run;
proc transpose
data=long
out=wide (
drop=_name_
where=(type ne "")
)
prefix=q_
;
by type;
var phase;
id quarter;
run;
data temp;
merge have have(firstobs=3 keep=type rename=(type=_type));
if type ne _type;
drop _type;
run;
data temp;
set temp;
by type;
if first.type then n=0;
n+1;
if n=1 then id='second_last_color';
else id='last_color ';
proc transpose data=temp out=want(drop=_name_);
by type ;
id id;
var colors;
run;
data want_wide;
merge
wide
want
;
by type;
run;
@Ksharp want_wide looks good but I miss 2 columns, diff_phase and creation date are not in want_wide anymore. Why is that?
data have;
infile datalines truncover;
input type $ quarter phase colors $ creation_date :yymmdd6.;
format creation_date yymmdd6.;
datalines;
K-11 202101 1 blue 20200101
K-11 202102 2 blue 20200101
K-11 202003 3 blue 20200101
K-11 202104 3 black 20200101
K-12 202101 3 blue 20200101
K-12 202102 1 white 20200101
K-12 202103 1 white 20200101
K-12 202104 1 yellow 20200101
K-13 202003 2 green 20200101
K-13 202101 3 green 20200101
K-13 202102 2 red 20200101
K-13 202103 2 blue 20200101
;
proc sort data=have;
by type quarter;
run;
data long;
set
have
;
run;
proc transpose
data=long
out=wide (
drop=_name_
where=(type ne "")
)
prefix=q_
;
by type;
var phase;
id quarter;
run;
data temp;
merge have have(firstobs=3 keep=type rename=(type=_type));
if type ne _type;
drop _type;
run;
data temp;
set temp;
by type;
if first.type then n=0;
n+1;
if n=1 then id='second_last';
else id='last ';
run;
proc transpose data=temp out=want1(drop=_name_) suffix=_color;
by type creation_date;
id id;
var colors ;
run;
proc transpose data=temp out=want2(drop=_name_) suffix=_phase;
by type creation_date;
id id;
var phase ;
run;
data want;
merge want1 want2;
by type creation_date;
run;
data want_wide;
merge
wide
want
;
by type;
diff=second_last_phase-last_phase;
run;
To avoid any re-sorting of the table, run a double DO loop:
data want;
counta = 0;
do until (last.type);
set have;
by type;
counta + 1;
end;
countb = 0;
do until (last.type);
set have;
by type;
countb + 1;
if countb = counta - 1 then second_last_color = color;
if countb = count then last_color = color;
end;
keep type last_color second_last_color;
run;
You can merge this table to the transposed one.
I think the double DO is a bit overkill in this particular case. A lag function does all that is needed:
data want;
set have (keep=type colors rename=(colors=last_color));
by type;
second_last_color=lag(last_color);
if last.type;
/*if first.type=1 then call missing(second_last_color);*/
run;
De-comment the "if first.type=1" statement if you expect any TYPE with only one observation, thereby avoiding an erroneous second_last_color.
Then as @Kurt_Bremser suggested, merge it with the transposed table.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.
Find more tutorials on the SAS Users YouTube channel.