I have Enterprise guide 7.1
I have a table, which looks like this:
activity startdate enddate
act 1 01-10-2016 15-01-2017
act 2 24-02-2015
act 1 06-05-2014 23-06-2016
act 3 20-10-2016 23-04-2017
and I want a table, which looks like:
activity 01-2016 02-2016 03-2016 04-2016 05-2016 .... 03-2017
act 1 1 3 3 2 1 6
act 2 3 4 5 2 2 1
act 3 1 1 2 1 0 2
How can transform first table to the second table? I would really like to make it, so it continues 'forever'?
Thank you 🙂
Fix it for the missing end_date. data have; infile cards truncover; informat activity $8. date_start date_end ddmmyy10.; format date_: date9.; input activity $ date_start date_end; cards; act1 01-10-2016 15-01-2017 act2 24-02-2015 act1 06-05-2014 23-06-2016 act3 20-10-2016 23-04-2017 act1 20-01-2016 01-04-2017 ; run; data temp; set have; if not missing(date_start) and not missing(date_end) then do; do i=date_start to date_end; month=month(i); if month ne lag(month) then output; end; end; else if missing(date_end) then do; do i=date_start to today(); month=month(i); if month ne lag(month) then output; end; end; format i date9.; keep activity i ; run; proc freq data=temp noprint; table activity*i/out=temp1 list; format i monyy7.; run; proc transpose data=temp1 out=want(drop=_:); by activity; var count; id i; run;
A couple of questions:
1. What do the numbers under the various date columns represet?
2, Do you want the columns to represent month-year?
3. What do you want to show when there is a start date, but no end date?
Art, CEO, AnalystFinder.com
Not exhaustively tested:
data Inter1(keep=activity IntervalDate);
set Have;
if missing(enddate) then
enddate = startdate;
startdate = intnx('month', startdate, 0, 'b');
enddate = intnx('month', enddate, 0, 'b');
format IntervalDate date9.;
IntervalDate = startdate;
StopFlag = 0;
do while(^StopFlag);
output;
if IntervalDate = enddate then
StopFlag = 1;
IntervalDate = intnx('month', IntervalDate, 1, 'b');
end;
run;
proc means data=Inter1 nway noprint;
class activity IntervalDate;
var IntervalDate;
output out=Inter2(drop=_TYPE_ _FREQ_) n=Count;
run;
/* Get a list of unique activity names */
proc sql noprint;
create table Names as
select distinct i.activity from Inter2 i;
quit;
/* Generate every first of the month from Jan 1 2014 to Dec 31 2017 */
data AllMonths;
do Year = 2014 to 2017;
do Month = 1 to 12;
DateToCheck = mdy(Month, 1, Year);
format DateToCheck date9.;
keep DateToCheck;
output;
end;
end;
/* Cartesian join to get a list of months and company names */
proc sql noprint;
create table AllMonthsNames as
select n.activity, a.DAteToCheck from Names n cross join AllMonths a;
quit;
/* Join the two, to get a full list of months with Type */
proc sql noprint;
create table Inter3 as
select i.*, a.DateToCheck from Inter2 i right join AllMonthsNames a on(i.IntervalDate = a.DateToCheck and i.activity = a.activity)
order by a.activity, a.DateToCheck;
quit;
/* Ripple the company names and fill in the zero counts both before and after the actual data */
data Inter3(drop=_:);
set Inter3(drop=IntervalDate);
length _OldActivity $5;
retain _OldActivity _OldDate;
if _n_ = 1 then do;
_OldActivity = activity;
_OldDate = DateToCheck;
end;
else if DateToCheck < _OldDate then _OldActivity = activity;
if missing(activity) then
activity = _OldActivity;
else _OldActivity = activity;
if missing(Count) then
Count = 0;
_OldDate = DateToCheck;
SeqNum = _n_; /* We need this to turn the dataset upside down */
run;
/* Sort descsending to ripple early records */
proc sort data=Inter3;
by descending SeqNum;
run;
data Inter3(drop=_:);
set Inter3;
length _OldActivity $5;
retain _OldActivity _OldDate;
if _n_ = 1 then do;
_OldActivity = activity;
_OldDate = DateToCheck;
end;
else if DateToCheck > _OldDate then _OldActivity = activity;
if missing(activity) then
activity = _OldActivity;
else _OldActivity = activity;
_OldDate = DateToCheck;
format DateToCheck yymm5.;
run;
proc sort data=Inter3 out=Inter3(drop=SeqNum);
by SeqNum;
run;
proc transpose data=Inter3 out=Want(drop=_NAME_) prefix=D;
by activity;
id DateToCheck;
var Count;
run;
I think something like this may work as well, not sure how much control you need but I suspect PROC TABULATE will generate the table you desire.
*create sample data;
data have;
infile cards truncover;
informat activity $8. date_start date_end ddmmyy10.;
format date_: date9.;
input activity $ date_start date_end;
cards;
act1 01-10-2016 15-01-2017
act2 24-02-2015
act1 06-05-2014 23-06-2016
act3 20-10-2016 23-04-2017
act1 20-01-2016 01-04-2017
;
run;
data long;
set have;
*calculate the end date, use today if no end date;
date_end=coalesce(date_end, date());
*calculate the number of months of 'activity';
nmonths=intck('month', date_start, date_end, 'D');
*create record for every month;
do i=0 to nmonths;
Month=intnx('month', date_start, i, 'b');
output;
end;
format month yymon7.;
drop nmonths date_start date_end;
run;
proc tabulate data=long;
where year(month)=2016;
*limit to specific year for demo;
class activity month;
table activity, month*n / misstext='0';
run;
data have; infile cards truncover; informat activity $8. date_start date_end ddmmyy10.; format date_: date9.; input activity $ date_start date_end; cards; act1 01-10-2016 15-01-2017 act2 24-02-2015 act1 06-05-2014 23-06-2016 act3 20-10-2016 23-04-2017 act1 20-01-2016 01-04-2017 ; run; data temp; set have; if not missing(date_start) and not missing(date_end) then do; do i=date_start to date_end; month=month(i); if month ne lag(month) then output; end; end; else if not missing(date_start) then do;i=date_start;output;end; else if not missing(date_end) then do;i=date_end;output;end; format i date9.; keep activity i ; run; proc freq data=temp noprint; table activity*i/out=temp1 list; format i monyy7.; run; proc transpose data=temp1 out=want(drop=_:); by activity; var count; id i; run;
Fix it for the missing end_date. data have; infile cards truncover; informat activity $8. date_start date_end ddmmyy10.; format date_: date9.; input activity $ date_start date_end; cards; act1 01-10-2016 15-01-2017 act2 24-02-2015 act1 06-05-2014 23-06-2016 act3 20-10-2016 23-04-2017 act1 20-01-2016 01-04-2017 ; run; data temp; set have; if not missing(date_start) and not missing(date_end) then do; do i=date_start to date_end; month=month(i); if month ne lag(month) then output; end; end; else if missing(date_end) then do; do i=date_start to today(); month=month(i); if month ne lag(month) then output; end; end; format i date9.; keep activity i ; run; proc freq data=temp noprint; table activity*i/out=temp1 list; format i monyy7.; run; proc transpose data=temp1 out=want(drop=_:); by activity; var count; id i; run;
Hi again
I can see, that the programming is restricted to the table i maid. It is an example of the data set i have.
To be more specific, i have 30 activities and each activity has 30 or more observations, that I wish to count.
The end_date has '.' and is not missing, when it is still active.
Thank you sooo much for your time and help - it is appreciated 🙂
My code should work fine under the conditions you specify, you just need to adjust the start and end years to accomodate your range of dates.
Please be a little more clear on what it means when end date is missing.
Tom
I had assumed that a missing enddate implied it was a one-day activity.
In my code, replace
if missing(enddate) then enddate = startdate;
with
if missing(enddate) then enddate = today();
Tom
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Check out this tutorial series to learn how to build your own steps in SAS Studio.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.