Hello,
I'm looking for each state's first week and last week in the sample dataset. Please help. Thanks.
data Have;
length State $5 Week 3;
infile datalines delimiter=',';
input State Week;
datalines;
GA,2101,
GA,2102,
GA,2104,
GA,2105,
GA,2106,
GA,2109,
GA,2110,
GA,2115,
GA,2117,
GA,2118,
GA,2120,
GA,2123,
GA,2124,
NV,2201,
NV,2202,
NV,2204,
NV,2205,
NV,2206,
NV,2209,
NV,2210,
NV,2213,
;
data want_week_1st_last;
length State $5 Week_start 3 Week_end 3;
infile datalines delimiter=',';
input State Week_start Week_end;
datalines;
GA, 2101,2124,
NV, 2201,2213,
;
I would use first.variable and last.variable in BY-Groups.
data want;
set have;
by state;
if not (first.state or last.state) then delete;
retain first_week last_week;
if first.state then first_week=week;
if last.state then last_week=week;
if last.state;
drop week;
proc print; run;
SQL is a good way to do this:
proc sql;
create table Want as
select State
,min(Week) as Week_Start
,max(Week) as Week_End
from Have
group by State
;
quit;
@ybz12003 wrote:
I would like to do it in the data step
Why?
RETAIN week_start. Use BY state. Set week_start at first.state. Use last.state as condition in a subsetting if, and set week_end there.
I would use first.variable and last.variable in BY-Groups.
data want;
set have;
by state;
if not (first.state or last.state) then delete;
retain first_week last_week;
if first.state then first_week=week;
if last.state then last_week=week;
if last.state;
drop week;
proc print; run;
@A_Kh you can get the result with less coding:
data want;
retain state first_week end_week;
set have (rename=(week=end_week9);
by state;
if first.state then first_week = end_week;
if last.state;
run;
Proc summary:
proc summary data=have nway; class state; var week; output out=want (drop=_:) min(week)=week_start max(week)=week_end; run;
Some possible reasons to use proc summary/means:
If you want to know how many weeks are represented don't drop the _freq_ variable that is normally included.
If of interest much easier to add a "mean week" or median.
If the data has observations missing week then asking for NMISS stat . Adding all that to a data step would be ugly.
If the desired output is just to see then use a report procedure.
proc tabulate data=have f=best5.; class state; var week; table state, week=' '*(min='Week start' max='Week end') ; run;
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.