BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
moussa854
Calcite | Level 5

Dear all,

I am new to SAS but I have some programming experience. I would like some help finding the longest consecutive coverage for A and B. As an alternative method I used to include only rows that have yes in all months which made me miss some entries. Appreciate your help.

 

IDyear123456789101112
A2010NoNoYesYesYesYesYesYesYesYesYesYes
A2011YesYesYesYesYesYesYesYesYesYesYesNo
A2012NoNoYesYesYesYesYesYesYesYesYesYes
A2013YesYesNoNoNoNoNoNoNoNoNoNo
B2011YesYesYesYesYesYesYesYesYesYesYesYes
B2012YesYesYesNoNoNoNoNoNoNoNoNo
1 ACCEPTED SOLUTION

Accepted Solutions
PaigeMiller
Diamond | Level 26

As @mkeintz stated, you will find that a tall skinny data set is the best way to handle this type of problem in SAS. So we create a tall skinny data set.

 

Also, as @mkeintz stated, you will get faster and more accurate help if you provide your data as a SAS data step. Fortunately, @novinosrin has done that, but in the future, you should do that yourself.

 

My solution:

data tall_skinny;
    informat ID $1. year 4.;
    array col(12) $3;
    input ID year col(*);
    do i=1 to 12;
        month=i;
        result=col(i);
        output;
    end;
    keep id year result month;
cards;
A	2010	No	No	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes
A	2011	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	No
A	2012	No	No	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes
A	2013	Yes	Yes	No	No	No	No	No	No	No	No	No	No
B	2011	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes
B	2012	Yes	Yes	Yes	No	No	No	No	No	No	No	No	No
;

data tall_skinny1;
    set tall_skinny;
    by id;
    if first.id or upcase(result)='NO' then consecutive=0;
    if upcase(result)="YES" then consecutive+1;
run;
proc summary data=tall_skinny1 nway;
    class id;
    var consecutive;
    output out=stats(drop=_:) max= maxid(consecutive(month))=maxmonth 
        maxid(consecutive(year))=maxyear;
run;

/* Select months starting at maxmonth maxyear and go backwards to first month of sequence with coverage */
data final;
    merge tall_skinny1 stats;
    by id;
    if mdy(month,1,year)>intnx('month',mdy(maxmonth,1,maxyear),-consecutive,'b')
        and mdy(month,1,year)<=mdy(maxmonth,1,maxyear);
run;
--
Paige Miller

View solution in original post

8 REPLIES 8
novinosrin
Tourmaline | Level 20

What should the output be?

moussa854
Calcite | Level 5

For entry A will only include data of 03/2010-11/2011 (21 months coverage)

For entry B will only include data of 01/2011-03/2012 (15 months coverage).

 

IDyear123456789101112
A2010NoNoYesYesYesYesYesYesYesYesYesYes
A2011YesYesYesYesYesYesYesYesYesYesYesNo
A2012NoNoNoNoNoNoNoNoNoNoNoNo
A2013NoNoNoNoNoNoNoNoNoNoNoNo
B2011YesYesYesYesYesYesYesYesYesYesYesYes
B2012YesYesYesNoNoNoNoNoNoNoNoNo

 

No or blank will be OK

Thanks in advance.

mkeintz
PROC Star

Is your sas data already in the layout you show?  You'd be better of with a tall-skinny dataset, with one row per month.

 

Could you provide a DATA step generate the sas data you already have?  Then we can provide more useful answers.

 

Editted addition: as @novinosrin asks, what should the output look like?  One row per ID?

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
moussa854
Calcite | Level 5

The data is an upload of csv file in the provided format. The year in one column and the coverage months (Jan-Dec) each has a separate column.

 

I just want to identify the longest consecutive coverage year and months. Later I will be using it to clean other tables where the date of service is not in the longest consecutive coverage

 

Best

novinosrin
Tourmaline | Level 20

Good morning Mark @mkeintz,  I hope your day is going well. I am afraid I am late to college and off to Professor appointment et al priorities. I didn't expect you to pop in early hours but if you have time and  wanna take a stab, here is the data have. I will catch up on the thread later. Have a good day!

 

 


data have;
informat ID $1. year 4.;
array month(12) $3;
input ID	year month(*);
cards;
A	2010	No	No	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes
A	2011	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	No
A	2012	No	No	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes
A	2013	Yes	Yes	No	No	No	No	No	No	No	No	No	No
B	2011	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes
B	2012	Yes	Yes	Yes	No	No	No	No	No	No	No	No	No
;

 

 

 

 

 

PaigeMiller
Diamond | Level 26

As @mkeintz stated, you will find that a tall skinny data set is the best way to handle this type of problem in SAS. So we create a tall skinny data set.

 

Also, as @mkeintz stated, you will get faster and more accurate help if you provide your data as a SAS data step. Fortunately, @novinosrin has done that, but in the future, you should do that yourself.

 

My solution:

data tall_skinny;
    informat ID $1. year 4.;
    array col(12) $3;
    input ID year col(*);
    do i=1 to 12;
        month=i;
        result=col(i);
        output;
    end;
    keep id year result month;
cards;
A	2010	No	No	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes
A	2011	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	No
A	2012	No	No	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes
A	2013	Yes	Yes	No	No	No	No	No	No	No	No	No	No
B	2011	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes
B	2012	Yes	Yes	Yes	No	No	No	No	No	No	No	No	No
;

data tall_skinny1;
    set tall_skinny;
    by id;
    if first.id or upcase(result)='NO' then consecutive=0;
    if upcase(result)="YES" then consecutive+1;
run;
proc summary data=tall_skinny1 nway;
    class id;
    var consecutive;
    output out=stats(drop=_:) max= maxid(consecutive(month))=maxmonth 
        maxid(consecutive(year))=maxyear;
run;

/* Select months starting at maxmonth maxyear and go backwards to first month of sequence with coverage */
data final;
    merge tall_skinny1 stats;
    by id;
    if mdy(month,1,year)>intnx('month',mdy(maxmonth,1,maxyear),-consecutive,'b')
        and mdy(month,1,year)<=mdy(maxmonth,1,maxyear);
run;
--
Paige Miller
moussa854
Calcite | Level 5

Thank you, Paige, for your answer. I briefly used SAS in the past (6 years ago) and now I am back to it. I am trying to learn it from books and the web. I am trying my best.

  

This worked:

merge tall_skinny stats;

 

 Thanks again Paige

PaigeMiller
Diamond | Level 26

Oh yeah, good catch!

--
Paige Miller

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 1100 views
  • 1 like
  • 4 in conversation