BookmarkSubscribeRSS Feed
rdum96
Calcite | Level 5

Hi,

I'm trying to get if someone is enrolled in a program for each month from Jan 2020 to Feb 2021. So I'm interested in an array with month_1 to month_14. 

The data is something like this:

 

ID         Start_date            end_date

011      1/1/1996               5/3/2020

003      2/1/2020               12/1/2020

 

So typically the output I'm looking for is 

ID         Start_date            end_date        month_1  month_2  month_3 month_4...............................

011      1/1/1996               5/3/2020           1                  1            1              0          0          0 

003      2/1/2020               12/1/2020         0                  1            1             1            1          1 

 

Would love some help on this!

9 REPLIES 9
PaigeMiller
Diamond | Level 26

@rdum96 wrote:

Hi,

I'm trying to get if someone is enrolled in a program for each month from Jan 2020 to Feb 2021. So I'm interested in an array with month_1 to month_14. 

The data is something like this:

 

ID         Start_date            end_date

011      1/1/1996               5/3/2020

003      2/1/2020               12/1/2020

 

So typically the output I'm looking for is 

ID         Start_date            end_date        month_1  month_2  month_3 month_4...............................

011      1/1/1996               5/3/2020           1                  1            1              0          0          0 

003      2/1/2020               12/1/2020         0                  1            1             1            1          1 

 

Would love some help on this!


This is an incredibly poor choice of data layout, that will make your programming more difficult; and almost never necessary. No array is necessary to determine if "someone is enrolled in a program for each month from Jan 2020 to Feb 2021".

 

To answer that qestion, how about this:

 

data want;
     set have;
     if start_date<='01JAN2020'd and end_date>='01FEB2021'd then result=1;
     else result=0
run;
--
Paige Miller
rdum96
Calcite | Level 5
I'm trying to get a total count for each month. I'm new to this so I'm prolly not thinking the right way. Would appreciate your help
ballardw
Super User

Add an actual DATE variable.  Then use a proper date format to make nice appearing values.

This creates a Have data that looks like your "example" (Hint: data steps are the best way to show values so we can code against them.)

Then adds a date that is the first day of the month in the interval you asked for.

Since your "month" values contained 0/1 coded values I have added a random value that takes 0/1 values.

Then show use of Proc Summary to create a data set for the sum of 1's which I think you imply as the desired result.

Proc Report or Tabulate could make nice tables of either the sums or the counts for each of 1 and 0.

data have;
  input ID $ Start_date :mmddyy10.  end_date :mmddyy10.;
  format start_date end_date mmddyy10.;
datalines;
011      1/1/1996               5/3/2020
003      2/1/2020               12/1/2020
;

data want;
   set have;
   do date = '01Jan2020'd to '01Feb2021'd ;
      randomvalue = rand('bernoulli',.5);
      output;
      date=intnx('month',date,1,'B');
   end;
   format date yymon7.;
run;

/* how to sum (count) 1/0 variable by calendar month*/
proc summary data=want nway;
   class date;
   format date yymon7.;
   var randomvalue;
   output out=summary (drop=_:) sum=sumrandvalues;
run;

When dealing with dates it is almost always a good idea to use actual SAS date values. Formats can create groups that will be honored by analysis, reporting or graphing procedures. The functions INTNX and INTCK can increment or determine intervals between dates (and times or datetimes).

 

https://communities.sas.com/t5/SAS-Communities-Library/Working-with-Dates-and-Times-in-SAS-Tutorial/... has a PDF with much information about dates.

PaigeMiller
Diamond | Level 26

@rdum96 wrote:
I'm trying to get a total count for each month. I'm new to this so I'm prolly not thinking the right way. Would appreciate your help

Could you please state clearly the desired output? Could you please give a realistic example of the input data set (not just two records) that covers the real-world examples that might occur in your data, and the output from that example data?

 

Again, arrays are not the way to get total counts by month.

--
Paige Miller
Kurt_Bremser
Super User

@rdum96 wrote:
I'm trying to get a total count for each month. I'm new to this so I'm prolly not thinking the right way. Would appreciate your help

That's why you want a long dataset (Maxim 19):

data want;
set have;
month = start_date;
format month yymmd7.;
do while (month le end_date);
  output;
  month = intnx('month',month,1,'s');
end;
keep id month;
run;

From that, it is now easy to get a count of enrolled ID's for a given time period:

proc sql;
create table enrolled as
  select
    month,
    count(id) as count
  from want
  where '01jan2020'd le month le '01feb2021'd
  group by month
;
quit;

or

proc freq data=want;
where '01jan2020'd le month le '01feb2021'd;
tables month / nopercent nocum out=enrolled;
run;

 

rdum96
Calcite | Level 5
So I would need to include people who joined somewhere in the middle of the year or left in the middle too. I think this would include only people enrolled during the entire spell?
PaigeMiller
Diamond | Level 26

@rdum96 wrote:
So I would need to include people who joined somewhere in the middle of the year or left in the middle too. I think this would include only people enrolled during the entire spell?

I don't know what this means, or how it relates to the data you have already provided. Please provide example data that covers the real world situations that you are likely to have, and the desired output from this data.

--
Paige Miller
andreas_lds
Jade | Level 19

What do expect as final result?

Ksharp
Super User
data have;
input ID         Start_date : ddmmyy10.           end_date : ddmmyy10.;
format  Start_date     end_date ddmmyy10.;
cards;
011      1/1/1996               5/3/2020
003      2/1/2020               12/1/2020
;
data want;
 set have;
 array x{*} Jan2020 Feb2020 Mar2020 Apr2020;

do j=1 to dim(x);
  x{j}=0;
 end;

do i= Start_date to    end_date;
 do j=1 to dim(x);
   if upcase(put(i,monyy7.))=upcase(vname(x{j})) then x{j}=1;
 end;
end;
drop i j;
run;

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 1189 views
  • 0 likes
  • 6 in conversation