Hello,
I have a sample data set "Have". I would like to do the following steps:
1. Total Count the week gap greater than 3 based on the state; the result is shown similarly to dataset "want_count".
2. List the start week and end week number based on the state, the result is shown similarly to dataset "want_weekgap_seach".
3. List the start week and end week number based on the following rules, the result is shown similarly to dataset "want_week_gap".
a. the week gap number is the largest in the gap
b. when the week gap number is the same, list the latest week start and end.
data Have;
length State $5 Week 3;
infile datalines delimiter=',';
input State Week;
datalines;
GA,2101,
GA,2102,
GA,2104,
GA,2105,
GA,2106,
GA,2109,
GA,2110,
GA,2115,
GA,2117,
GA,2118,
GA,2120,
GA,2123,
GA,2124,
NV,2201,
NV,2202,
NV,2204,
NV,2205,
NV,2206,
NV,2209,
NV,2210,
NV,2213,
;
data want_count;
length state $5 count_total 3;
infile datalines delimiter=',';
input state Count_total;
datalines;
GA, 3,
NV, 2,
;
data want_weekgap_seach;
length State $5 Week_start 3 Week_end 3;
infile datalines delimiter=',';
input State Week_start Week_end;
datalines;
GA, 2106,2109,
GA, 2110,2115,
GA, 2120,2123,
NV, 2206,2209,
NV, 2210,2213,
;
data want_week_gap;
length State $5 Week_start 3 Week_end 3;
infile datalines delimiter=',';
input State Week_start Week_end;
datalines;
GA, 2110,2115,
NV, 2210,2213,
;
hI @ybz12003
One way could be this:
data Have;
length State $5 Week 8;
infile datalines delimiter=',';
input State Week;
datalines;
GA,2101,
GA,2102,
GA,2104,
GA,2105,
GA,2106,
GA,2109,
GA,2110,
GA,2115,
GA,2117,
GA,2118,
GA,2120,
GA,2123,
GA,2124,
NV,2201,
NV,2202,
NV,2204,
NV,2205,
NV,2206,
NV,2209,
NV,2210,
NV,2213,
;
run;
data
want_count (keep=State Count_total)
want_weekgap_search (keep=State Week_start Week_end gap);
;
set have;
by State;
length Count_total Week_start Week_end 8;
if first.State then call missing(of Count_total Week_start Week_end);
lastWeek = lag(Week);
if not first.state then do;
gap = week - lastWeek;
if gap > 2 then do;
Count_total + 1;
Week_start = lastWeek;
Week_end = Week;
output want_weekgap_search;
end;
end;
if last.state then output want_count;
run;
proc sql;
create table want_week_gap
as select
State,
Week_start,
Week_end,
gap
from want_weekgap_search
group by State
having gap = max(gap)
order by
State,
Week_end
;
quit;
Note that you get 3 records in the data set want_week_gap for State = NV, because there are 2 gaps of equal size, and it is not clear which one to choose in this case.
What is a "week gap" to count? I can't follow your description to determine what that is much less how to count it.
You may well cause problems by setting length 3 for your "week" variable. I would strongly suggest not using Length options for any numeric until you have completed tested ALL your code using the variables as very unexpected truncation of values may result.
I will say that making values like 2101 and calling them "week" without any reference doesn't help.
Date values are generally going to be a lot more flexible when it comes to determining things like intervals between values as there is a function INTCK that does exactly that: count intervals between date, time or datetime values.
@ybz12003 wrote:
Discontinuing numbers from the Week column. For example, there is a missing number, 2103, between 2102 and 2104. Because there are two gaps from 2102 to 2104, the gap numbers didn't match the requirement.
Do the weeks ever continue into the next year? If one number is 2151 and the next number 2203, how many weeks gap is that?
hI @ybz12003
One way could be this:
data Have;
length State $5 Week 8;
infile datalines delimiter=',';
input State Week;
datalines;
GA,2101,
GA,2102,
GA,2104,
GA,2105,
GA,2106,
GA,2109,
GA,2110,
GA,2115,
GA,2117,
GA,2118,
GA,2120,
GA,2123,
GA,2124,
NV,2201,
NV,2202,
NV,2204,
NV,2205,
NV,2206,
NV,2209,
NV,2210,
NV,2213,
;
run;
data
want_count (keep=State Count_total)
want_weekgap_search (keep=State Week_start Week_end gap);
;
set have;
by State;
length Count_total Week_start Week_end 8;
if first.State then call missing(of Count_total Week_start Week_end);
lastWeek = lag(Week);
if not first.state then do;
gap = week - lastWeek;
if gap > 2 then do;
Count_total + 1;
Week_start = lastWeek;
Week_end = Week;
output want_weekgap_search;
end;
end;
if last.state then output want_count;
run;
proc sql;
create table want_week_gap
as select
State,
Week_start,
Week_end,
gap
from want_weekgap_search
group by State
having gap = max(gap)
order by
State,
Week_end
;
quit;
Note that you get 3 records in the data set want_week_gap for State = NV, because there are 2 gaps of equal size, and it is not clear which one to choose in this case.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.