I have a data set below
grp_name is the name of client data_have is like what months we have received the data, start_dt is from when we should we receive the data, end_dt is till when we have to receive the data.
grp_name data_have start_dt end_dt
X Jan2016 Jan2016 Dec2016
X Mar2016 Jan2016 Dec2016
X Dec2016 Jan2016 Dec2016
y Jan2017 Jan2017 Apr2017
y Feb2017 Jan2017 Apr2017
y Mar2017 Jan2017 Apr2017
y Apr2017 Jan2017 Apr2017
z Apr2018 Jan2018 Dec2018
;
What i need is
For x group we have for Jan, Mar, Dec 2016 but we should have data from jan to dec 2016 so a flag should be created and should be marked as 1 else 0.
Thanks in advance
So you need to identify groups that have no complete series of months?
Use the lag() and intck() functions. Like
data flags (keep=grp_name flag);
set have;
by grp_name;
retain flag;
prev_date = lag(data_have);
if first.grp_name then flag = 0;
else do;
if intck('month',prev_date,data_have) > 1 then flag = 1;
end;
if last.grp_name then output;
run;
Actually I wanted to know the data gaps.
What i need is for suppose
x client has given data for jan, Mar, Dec 2017 but we are suppose to receive the data for jan2017 to dec 2017 our code should identify the data gaps for example FEB, APR, MAY, JUN, JUL, AUG, SEP, OCT, NOV is missing data so we should have flag "should reach out"
Thanks in advance
Please show what exactly what you expect as output for the input you've provided.
@srinath3111 wrote:
Actually I wanted to know the data gaps.
What i need is for suppose
x client has given data for jan, Mar, Dec 2017 but we are suppose to receive the data for jan2017 to dec 2017 our code should identify the data gaps for example FEB, APR, MAY, JUN, JUL, AUG, SEP, OCT, NOV is missing data so we should have flag "should reach out"
Thanks in advance
An approach using an array:
data have;
input grp_name $ (data_have start_dt end_dt) (:anydtdte.);
format data_have start_dt end_dt mmyyd7.;
datalines;
X Jan2016 Jan2016 Dec2016
X Mar2016 Jan2016 Dec2016
X Dec2016 Jan2016 Dec2016
y Jan2017 Jan2017 Apr2017
y Feb2017 Jan2017 Apr2017
y Mar2017 Jan2017 Apr2017
y Apr2017 Jan2017 Apr2017
z Apr2018 Jan2018 Dec2018
;
proc sort data=have; by grp_name start_dt end_dt data_have; run;
data gaps;
array m{0:99};
do until (last.end_dt);
set have; by grp_name start_dt end_dt;
month = intck("month", start_dt, data_have);
m{month} = 1;
end;
do month = 0 to intck("month", start_dt, end_dt);
if not m{month} then do;
missing_dt = intnx("month", start_dt, month);
output;
end;
end;
format missing_dt mmyyd7.;
keep grp_name missing_dt;
run;
proc print data=gaps noobs; run;
data have;
input grp_name $ (data_have start_dt end_dt) (:anydtdte.);
format data_have start_dt end_dt mmyyd7.;
datalines;
X Jan2016 Jan2016 Dec2016
X Mar2016 Jan2016 Dec2016
X Dec2016 Jan2016 Dec2016
y Jan2017 Jan2017 Apr2017
y Feb2017 Jan2017 Apr2017
y Mar2017 Jan2017 Apr2017
y Apr2017 Jan2017 Apr2017
z Apr2018 Jan2018 Dec2018
;
proc sort data=have; by grp_name start_dt end_dt data_have; run;
data gaps1;
if _n_=1 then do;
declare hash H (dataset:'have',ordered: "y") ;
h.definekey ('grp_name',"data_have") ;
h.definedone () ;
end;
set have;
by grp_name;
if first.grp_name;
do until(start_dt>end_dt);
if h.check(key:grp_name,key:start_dt) ne 0 then do;
missing_dt=start_dt;
output;
end;
start_dt=intnx('month',start_dt,1,'b');
end;
format missing_dt mmyyd7.;
keep grp_name missing_dt;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.