BookmarkSubscribeRSS Feed
Andalusia
Obsidian | Level 7

Im trying to add 2 columns to want_wide dataset. 

  1. I'm trying to add the column last_color and second_last_color for every type. last_color is the color which is mentioned in the last quarter (so max quarter). second_last_color is the color which is mentioned in the second_last_quarter (so one before the max)

This is my current code:

data have;
  infile datalines truncover;
  input type $ quarter phase colors $ creation_date :yymmdd6.;
  format creation_date yymmdd6.;
  datalines;
K-11 202101 1 blue 20200101
K-11 202102 2 blue 20200101
K-11 202003 3 blue 20200101
K-11 202104 3 black 20200101
K-12 202101 3 blue 20200101
K-12 202102 1 white 20200101
K-12 202103 1 white 20200101
K-12 202104 1 yellow 20200101
K-13 202003 2 green 20200101
K-13 202101 3 green 20200101
K-13 202102 2 red 20200101
K-13 202103 2 blue 20200101
;

proc sort data=have;
by type quarter;
run;

data want;
set have;
where phase ne .;
by type;
retain diff_phase;
if first.type then diff_phase = phase;
if last.type
then do;
  diff_phase = phase - diff_phase;
  output;
end;
diff_phase = phase;
keep type diff_phase;
run;

proc sort data=have;
by type quarter;
run;

data long;
set
  have
;
run;

proc transpose
  data=long
  out=wide (
    drop=_name_
    where=(type ne "")
  )
  prefix=q_
;
by type;
var phase;
id quarter;
run;


data want_wide;
merge
  wide
  want
;
by type;
run;

This is how want_wide looks like right now:

type    q_202003     q_202101         q_202102        q_202104       q_202103        diff_phase  
K-11    3            1                2               3              .               -1
K-12    .            3                1               1              1                0
K-13    2            3                2               .              2                0

This is how I want want_wide to look:

type    q_202003     q_202101         q_202102        q_202104       q_202103        diff_phase  last_color   second_last_color date
K-11    3            1                2               3              .               -1          black        blue              20200101
K-12    .            3                1               1              1                0          yellow       white             20200101
K-13    2            3                2               .              2                0          blue         red               20200101

How can I achieve that?

 

 

6 REPLIES 6
Ksharp
Super User
data have;
  infile datalines truncover;
  input type $ quarter phase colors $ creation_date :yymmdd6.;
  format creation_date yymmdd6.;
  datalines;
K-11 202101 1 blue 20200101
K-11 202102 2 blue 20200101
K-11 202003 3 blue 20200101
K-11 202104 3 black 20200101
K-12 202101 3 blue 20200101
K-12 202102 1 white 20200101
K-12 202103 1 white 20200101
K-12 202104 1 yellow 20200101
K-13 202003 2 green 20200101
K-13 202101 3 green 20200101
K-13 202102 2 red 20200101
K-13 202103 2 blue 20200101
;

proc sort data=have;
by type quarter;
run;

data long;
set
  have
;
run;

proc transpose
  data=long
  out=wide (
    drop=_name_
    where=(type ne "")
  )
  prefix=q_
;
by type;
var phase;
id quarter;
run;



data temp;
merge have have(firstobs=3 keep=type rename=(type=_type));
if type ne _type;
drop _type;
run;
data temp;
 set temp;
 by type;
 if first.type then n=0;
 n+1;
 if n=1 then id='second_last_color';
  else id='last_color             ';
proc transpose data=temp out=want(drop=_name_);
by type ;
id id;
var colors;
run;





data want_wide;
merge
  wide
  want
;
by type;
run;
Andalusia
Obsidian | Level 7

@Ksharp want_wide looks good but I miss 2 columns, diff_phase and creation date are not in want_wide anymore. Why is that?

Ksharp
Super User
data have;
  infile datalines truncover;
  input type $ quarter phase colors $ creation_date :yymmdd6.;
  format creation_date yymmdd6.;
  datalines;
K-11 202101 1 blue 20200101
K-11 202102 2 blue 20200101
K-11 202003 3 blue 20200101
K-11 202104 3 black 20200101
K-12 202101 3 blue 20200101
K-12 202102 1 white 20200101
K-12 202103 1 white 20200101
K-12 202104 1 yellow 20200101
K-13 202003 2 green 20200101
K-13 202101 3 green 20200101
K-13 202102 2 red 20200101
K-13 202103 2 blue 20200101
;

proc sort data=have;
by type quarter;
run;

data long;
set
  have
;
run;

proc transpose
  data=long
  out=wide (
    drop=_name_
    where=(type ne "")
  )
  prefix=q_
;
by type;
var phase;
id quarter;
run;



data temp;
merge have have(firstobs=3 keep=type rename=(type=_type));
if type ne _type;
drop _type;
run;
data temp;
 set temp;
 by type;
 if first.type then n=0;
 n+1;
 if n=1 then id='second_last';
  else id='last      ';
run;
proc transpose data=temp out=want1(drop=_name_) suffix=_color;
by type creation_date;
id id;
var colors ;
run;
proc transpose data=temp out=want2(drop=_name_) suffix=_phase;
by type creation_date;
id id;
var phase ;
run;
data want;
 merge want1 want2;
by type creation_date;
run;



data want_wide;
merge
  wide
  want
;
by type;
diff=second_last_phase-last_phase;
run;
Kurt_Bremser
Super User

To avoid any re-sorting of the table, run a double DO loop:

data want;
counta = 0;
do until (last.type);
  set have;
  by type;
  counta + 1;
end;
countb = 0;
do until (last.type);
  set have;
  by type;
  countb + 1;
  if countb = counta - 1 then second_last_color = color;
  if countb = count then last_color = color;
end;
keep type last_color second_last_color;
run;

You can merge this table to the transposed one.

 

mkeintz
PROC Star

I think the double DO is a bit overkill in this particular case.  A lag function does all that is needed:

 

data want;
  set have (keep=type colors rename=(colors=last_color));
  by type;
  second_last_color=lag(last_color);
  if last.type;
  /*if first.type=1 then call missing(second_last_color);*/
run;

De-comment the "if first.type=1" statement if you expect any TYPE with only one observation, thereby avoiding an erroneous second_last_color.

 

Then as @Kurt_Bremser suggested, merge it with the transposed table.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Andalusia
Obsidian | Level 7
I ran your snippet and merged it with the transposed table, but I miss 2 essential columns that I had in my HAVE dataset. I miss diff_phase and ceation_date.

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
SAS Enterprise Guide vs. SAS Studio

What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 538 views
  • 4 likes
  • 4 in conversation