I have two datasets `HAVE` and `HAVE2`. I'm trying to add the column last_color, second_last_color, third_last_color etc. until eight_last_color for every type. last_color is the color which is mentioned in the last quarter (so max quarter). second_last_color is the color which is mentioned in the second_last_quarter (so one before the max) etc. Now in my current code I was only able to calculate the last and second_last_color, I dont know how to calculate until the eight_last_color. But I think I'm very close...
These are my 2 datasets:
data have;
infile datalines truncover;
input type $ quarter phase colors $ creation_date :yymmdd6.;
format creation_date yymmdd6.;
datalines;
K-11 201904 1 blue 20200101
K-11 202001 1 blue 20200101
K-11 202002 2 blue 20200101
K-11 202003 3 blue 20200101
K-11 202004 3 black 20200101
K-11 202101 1 blue 20200101
K-11 202102 2 blue 20200101
K-11 202103 3 blue 20200101
K-11 202104 3 black 20200101
K-12 201904 3 blue 20200101
K-12 202001 3 blue 20200101
K-12 202002 1 white 20200101
K-12 202003 1 white 20200101
K-12 202004 1 yellow 20200101
K-12 202101 3 blue 20200101
K-12 202102 1 white 20200101
K-12 202103 1 white 20200101
K-12 202104 1 yellow 20200101
K-13 201904 2 green 20200101
K-13 202001 3 green 20200101
K-13 202002 2 red 20200101
K-13 202003 2 blue 20200101
K-13 202004 2 green 20200101
K-13 202101 3 green 20200101
K-13 202102 2 red 20200101
K-13 202103 2 blue 20200101
K-13 202104 2 blue 20200101
;
data have2;
infile datalines truncover;
input type $ quarter;
datalines;
K-11 -1
K-12 2
K-13 3
;
This is my code:
data long;
set
work.HAVE
;
run;
proc transpose
data=long
out=wide (
drop=_name_
where=(type ne "")
)
prefix=q_
;
by type;
var phase;
id quarter;
run;
data calc_yearquarter;
merge
wide
work.have2
;
by type;
run;
data temp;
merge work.HAVE work.HAVE(firstobs=3 keep=type rename=(type=_type));
if type ne _type;
drop _type;
run;
data temp;
set temp;
by type;
if first.type then n=0;
n+1;
if n=1 then id='second_last_advice';
else id='last_advice';
run;
proc transpose data=temp out=calc_advice(drop=_name_);
by type ;
id id;
var colors;
run;
data SAS11
;
merge
calc_yearquarter
calc_advice
;
by type;
run;
So to conclude, right now I have calculated 2 quarters back. I want to calculate 8 quarters back...
something like this?
data have;
infile datalines truncover;
input type $ quarter phase colors $ creation_date :yymmdd6.;
format creation_date yymmdd6.;
datalines;
K-11 201904 1 blue 20200101
K-11 202001 1 blue 20200101
K-11 202002 2 blue 20200101
K-11 202003 3 blue 20200101
K-11 202004 3 black 20200101
K-11 202101 1 blue 20200101
K-11 202102 2 blue 20200101
K-11 202103 3 blue 20200101
K-11 202104 3 black 20200101
K-15 202101 3 green 20200101
K-15 202102 2 red 20200101
K-15 202103 2 blue 20200101
K-15 202104 2 blue 20200101
K-12 201904 3 blue 20200101
K-12 202001 3 blue 20200101
K-12 202002 1 white 20200101
K-12 202003 1 white 20200101
K-12 202004 1 yellow 20200101
K-12 202101 3 blue 20200101
K-12 202102 1 white 20200101
K-12 202103 1 white 20200101
K-12 202104 1 yellow 20200101
K-13 201904 2 green 20200101
K-13 202001 3 green 20200101
K-13 202002 2 red 20200101
K-13 202003 2 blue 20200101
K-13 202004 2 green 20200101
K-13 202101 3 green 20200101
K-13 202102 2 red 20200101
K-13 202103 2 blue 20200101
K-13 202104 2 blue 20200101
K-14 202101 3 green 20200101
K-14 202102 2 red 20200101
K-14 202103 2 blue 20200101
K-14 202104 2 blue 20200101
;
run;
%let lookback = 8;
data want (rename=(lag_type=type));
do point = nobs to 1 by -1, nobs;
set have point = point nobs=nobs;
lag_type = lag(type);
array color_last_[&lookback.] $ 12;
if lag_type ne type then
do;
if lag_type ne " " then output;
_N_ = 1;
call missing(of color_last_{*});
end;
if _N_ <= &lookback. then color_last_[_N_] = colors;
_N_ + 1;
end;
stop;
keep lag_type color_last_:;
run;
B.
But how to do it in combination with my existing code?
what is the purpose of have2 data set?
what final result are you expecting?
Bart
I suck at code but if I interpret your question correctly you can sort your data and create a new column using a 'first' function and count the rows within a specific group...
sort your data descending (last color on top)
count per group with a rank number +1 for every new instance until you encounter the next group (The K-11,K-12 in your data set I think?)
And then only filter the rows labelled 1 till 8 and reverse them again if needed.
data "have";
SET "output tabel";
by "K-11 colunm name";
RANK +1;
if first."K-11 colum name" then RANK = 1;
RUN;
first sort it the right way, and maybe add other column to your BY group when needed.
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Check out this tutorial series to learn how to build your own steps in SAS Studio.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.