BookmarkSubscribeRSS Feed
RickyS
Quartz | Level 8

GA everyone,

I have an existing customer status data set that is only updated when a status change change occurs.  I want to do status counts by month, year and segment.  So the challenge is creating a temp table for missing values for the month date periods that are not in the data table in order to enable a count of status by month and year. 

 

Important assumption:

- Customer_Status is the same when it is not reported

 

 

I should be able to count status for the Status_Date that are missing for Company02

 

Dates Missing                    Assumed Status

05/01/2005 -  05/31/2009  Active

07/01/2009 -  09/31/2009  CYRisk

11/01/2009 -  07/31/2013  Active

09/01/2013 - Today()        CYRisk

 

Sample Data 

 

data sample_data;
input @1 Account_id 2.
@3 Account_name $10.
@13Segment $14.
@28 Customer_Status 2.
@30Customer_Status_Name $7.
+1 Status_Date mmddyy10.;
format Status_Date mmddyy10.;
cards;
02 Company01 HighNewtWorth 04 CPLost 12/05/2006
03 Company02 MidSizeMarket 01 Active 04/25/2005
03 Company02 MidSizeMarket 08 CYRisk 06/25/2009
03 Company02 MidSizeMarket 01 Active 10/05/2009
03 Company02 MidSizeMarket 08 CYRisk 08/09/2013
04 Company03 HighNewtWorth 01 Active 01/26/2005
04 Company03 HighNewtWorth 04 CPLost 12/02/2016
05 Company04 HighNewtWorth 01 Active 09/13/2004
06 Company05 HighNewtWorth 01 Active 10/01/2004
06 Company05 HighNewtWorth 02 Inactv 08/18/2011
07 Company06 HighNewtWorth 01 Active 09/06/2006
08 Company07 HighNewtWorth 01 Active 08/01/2005
09 Company08 HighNewtWorth 01 Active 11/01/2005
09 Company08 HighNewtWorth 02 Inactv 11/23/2010
09 Company08 HighNewtWorth 06 Termin 05/07/2011
10 Company09 HighNewtWorth 01 Active 07/31/2007
10 Company09 HighNewtWorth 11 Infreq 10/27/2011
10 Company09 HighNewtWorth 01 Active 11/02/2012
10 Company09 HighNewtWorth 11 Infreq 11/14/2018
11 Company10 HighNewtWorth 01 Active 01/04/2005
11 Company10 HighNewtWorth 04 CPLost 06/05/2015
12 Company11 HighNewtWorth 01 Active 10/26/2004
13 Company12 HighNewtWorth 01 Active 01/14/2005
13 Company12 HighNewtWorth 02 Inactv 07/21/2011
14 Company13 HighNewtWorth 01 Active 01/04/2005
14 Company13 HighNewtWorth 02 Inactv 08/02/2012
15 Company14 HighNewtWorth 01 Active 09/01/2005
15 Company14 HighNewtWorth 02 Inactv 09/19/2008
16 Company15 HighNewtWorth 01 Active 09/28/2006
17 Company15 HighNewtWorth 06 Termin 05/19/2005
18 Company16 HighNewtWorth 01 Active 02/11/2005
19 Company17 HighNewtWorth 01 Active 10/26/2004
20 Company18 HighNewtWorth 01 Active 08/24/2005
;
run;
quit;

 

 

Thanks

4 REPLIES 4
PGStats
Opal | Level 21

You would better calculate the status duration (number of days) and use that as a FREQ variable in summary procedures. Get the duration by lookahead:

 

data want;
set sample_data; by account_id;
if not last then
    set sample_data(firstobs=2 keep=status_date rename=status_date=next_date) end=last;
if last.account_id then status_duration = 1; /* Default duration */ 
else status_duration = next_date - status_date;
drop next_date;
run;
PG
RickyS
Quartz | Level 8

Thank you for the quick turnaround, this works as intended but given how I need to use the data, I really need to retain the values and assumed status for each identified missing date that is created as I will be doing multiple date range analysis.

 

In data set

03 Company02 MidSizeMarket 01 Active 04/25/2005
03 Company02 MidSizeMarket 08 CYRisk 06/25/2009
03 Company02 MidSizeMarket 01 Active 10/05/2009
03 Company02 MidSizeMarket 08 CYRisk 08/09/2013

 

Required output

-keeps the original value and produces the missing data, just listing a few records that would be created  between observations  for Company02

 

03 Company02 MidSizeMarket 01 Active 04/25/2005 - original

03 Company02 MidSizeMarket 01 Active 05/01/2005 - created 

03 Company02 MidSizeMarket 01 Active 06/01/2005 - created

03 Company02 MidSizeMarket 01 Active 07/01/2005 - created 

until we reach month b4 next status date 05/01/2009 - created 

03 Company02 MidSizeMarket 08 CYRisk 06/25/2009 - original

03 Company02 MidSizeMarket 08 CYRisk 07/01/2009 - created

03 Company02 MidSizeMarket 08 CYRisk 08/01/2009 - created

03 Company02 MidSizeMarket 08 CYRisk 09/01/2009 - created

03 Company02 MidSizeMarket 01 Active 10/05/2009  - original

03 Company02 MidSizeMarket 01 Active 11/01/2009  - created

03 Company02 MidSizeMarket 01 Active 12/01/2009 - created 

03 Company02 MidSizeMarket 01 Active 12/01/2009 - created 

03 Company02 MidSizeMarket 01 Active 01/01/2010 - created 

until we reach month b4 next status date 07/01/2013 - created 

03 Company02 MidSizeMarket 08 CYRisk 08/09/2013 - original 

03 Company02 MidSizeMarket 08 CYRisk 09/09/2013 - created

---------------------until we reach most recent mm/01/yyyy

 

I'm standardizing the output to day 01 for new date, hope this makes sense  

TomKari
Onyx | Level 15

This will build you a table that I think you can use to generate whatever transactions you need to complete your list of accounts and months.

 

Tom

 

/* Get the Accounts */
proc sql noprint;
	create table Check_Accounts as
		select distinct Account_id as Check_Account from Sample_Data;
quit;

/* Get the earliest and latest date */
proc sql noprint;
	select min(Status_Date), max(Status_Date) into :Min_Date, :Max_Date
		from Sample_Data;

	/* Get a list of all the needed dates */
data Check_Months(drop=_:);
	_Number_Of_Months = intck('month', &Min_Date., &Max_Date.);

	do _i = 0 to _Number_Of_Months;
		Check_Month = intnx("month", &Min_Date., _i, 'beginning');
		format Check_Month date.;
		output;
	end;
run;

/* Get a cross-combination of accounts and dates */
proc sql noprint;
	create table Check_Table as select Check_Account, Check_Month, 'Missing' as Check_Status from Check_Accounts cross join Check_Months;
quit;

/* Get the Account and Month */
data Existing_Company_Month;
	set Sample_Data;
	Month_Active = intnx("month", Status_Date, 0, 'beginning');
	format Month_Active date.;
	keep Account_id Month_Active;
run;

/* Sort, and then keep the first (get unique) */
proc sort data=Existing_Company_Month;
	by Account_id Month_Active;
run;

data Existing_Company_Month;
	set Existing_Company_Month;
	by Account_id Month_Active;
	Current_Status = "Active";

	if first.Month_Active then
		output;
run;

/* Join the two tables to figure out which months we have and which ones we need to add data for */
proc sql noprint;
	create table Combined_Tables as select m.*, e.Current_Status
		from Check_Table m left outer join Existing_Company_Month e
			on(m.Check_Account = e.Account_id and m.Check_Month = e.Month_Active)
		order by m.Check_Account, m.Check_Month;
quit;
PGStats
Opal | Level 21

OK. You asked for it:

 

data want;
set sample_data; by account_id;
if not last then
    set sample_data(firstobs=2 keep=status_date rename=status_date=next_date) end=last;
if last.account_id then output;
else do status_date = status_date to next_date - 1; output; end;
drop next_date;
run;
PG

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

SAS Enterprise Guide vs. SAS Studio

What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 723 views
  • 0 likes
  • 3 in conversation