Hello,
I'm looking for each state's first week and last week in the sample dataset. Please help. Thanks.
data Have;
length State $5 Week 3;
infile datalines delimiter=',';
input State Week;
datalines;
GA,2101,
GA,2102,
GA,2104,
GA,2105,
GA,2106,
GA,2109,
GA,2110,
GA,2115,
GA,2117,
GA,2118,
GA,2120,
GA,2123,
GA,2124,
NV,2201,
NV,2202,
NV,2204,
NV,2205,
NV,2206,
NV,2209,
NV,2210,
NV,2213,
;
data want_week_1st_last;
length State $5 Week_start 3 Week_end 3;
infile datalines delimiter=',';
input State Week_start Week_end;
datalines;
GA, 2101,2124,
NV, 2201,2213,
;
I would use first.variable and last.variable in BY-Groups.
data want;
set have;
by state;
if not (first.state or last.state) then delete;
retain first_week last_week;
if first.state then first_week=week;
if last.state then last_week=week;
if last.state;
drop week;
proc print; run;
SQL is a good way to do this:
proc sql;
create table Want as
select State
,min(Week) as Week_Start
,max(Week) as Week_End
from Have
group by State
;
quit;
@ybz12003 wrote:
I would like to do it in the data step
Why?
RETAIN week_start. Use BY state. Set week_start at first.state. Use last.state as condition in a subsetting if, and set week_end there.
I would use first.variable and last.variable in BY-Groups.
data want;
set have;
by state;
if not (first.state or last.state) then delete;
retain first_week last_week;
if first.state then first_week=week;
if last.state then last_week=week;
if last.state;
drop week;
proc print; run;
@A_Kh you can get the result with less coding:
data want;
retain state first_week end_week;
set have (rename=(week=end_week9);
by state;
if first.state then first_week = end_week;
if last.state;
run;
Proc summary:
proc summary data=have nway; class state; var week; output out=want (drop=_:) min(week)=week_start max(week)=week_end; run;
Some possible reasons to use proc summary/means:
If you want to know how many weeks are represented don't drop the _freq_ variable that is normally included.
If of interest much easier to add a "mean week" or median.
If the data has observations missing week then asking for NMISS stat . Adding all that to a data step would be ugly.
If the desired output is just to see then use a report procedure.
proc tabulate data=have f=best5.; class state; var week; table state, week=' '*(min='Week start' max='Week end') ; run;
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.