BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
devsas
Pyrite | Level 9

Hi all, I have the data in following format 

medicaid_Id total_months start_month end_month
00001463A 12 22013 112015
00001773A 12 32013 12015
00004763A 1 32014 32014
00006014A 1 62014 32015
00007453A 12 12013 12016
00008433A 5 42013 112013
00009943A 7 72013 12014
00010933A 1 72013 72013
00010933A 1 32014 32014
00011183A 7 12013 12014
00011183A 1 62014 72015
00011184A 2 72013 82013
00043093A 4 72013 102013
00043093A 9 102013 122078

 

I need to count total_months for each medicaid_id. Fairly simple, but here is the catch. Some of those id's- which have multiple records- have the first time end_month as same as second time start_month or the second time end_month same as third time start_month. I need to subtract total months by 1 in those cases. Here is what I have coded and its not working. Help will be really appreciated.

 

data want;
set hAVE;
by medicaid_id start_month;
retain end_month =n;
if first.medicaid_id then total = total_months;
if not first.medicaid_id and start_month NE n -1 then total + total_months; else total +total_months - 1 ;

end;

run;

1 ACCEPTED SOLUTION

Accepted Solutions
Astounding
PROC Star

Ah, that makes a whole lot more sense.  OK, here's one way:

 

data want;

set have;

by medicaid_id;

prior_end = lag(end_month);

if first.medicaid_id then total = total_months;

else do;

   total + total_months;

   if start_month = prior_end then total = total - 1;

end;

drop prior_end;

run;

View solution in original post

7 REPLIES 7
Astounding
PROC Star

First thing you better do is fix your data.  If you have a date like 12115, you can't tell if that's December 1 or January 21.

devsas
Pyrite | Level 9

Thanks. My bad, yes, i didnt notice that. Actually i used following method to get month year combine. On the other hand, it may not matter as its characters we are comparing, so it solves the purpose of identifying those cases where start month and end month of previous iteration is same.

 

data inal (drop = year month year_ month_);
set hh_2014_4;
month = month(start_dt);
year = year(start_dt);
start_month = compress(month||year);

month_ = month(end_dt);
year_ = year(end_dt);
end_month = compress(month_||year_);

run;

 

I'm sure there is better way to extract month and year than above?

Astounding
PROC Star

If that code is producing the sample data you posted, there is definitely something wrong.  Why do some of your dates contain the day, when you are only putting together the month and the year?

 

I wouldn't feel confident in any solution until I know more about your original date variables.  When you run a PROC CONTENTS, what is in them?  I assume they are numeric, but do they have a format connected with them?  What do they look like when they are printed (unformatted vs. formatted)?

 

Once that is sorted out, it would probably be simpler to keep four variables instead of two:  start_month, start_year, end_month, and end_year.

devsas
Pyrite | Level 9

No sir, there are no days in the data i have or posted. First character is month (1 to 12) and after that its year (2013, 2014, 2015, 2016 or 2078)

 

These are the results after proc contents:

 

SAS Output

# Variable Type Len Format Informat
3 end_Dt Num 8 MMDDYY10. MMDDYY10.
6 end_month Char 24    
1 medicaid_Id Char 20   $20.
2 start_Dt Num 8 MMDDYY10. MMDDYY10.
5 start_month Char 24    
4 total_months Num 8    
Astounding
PROC Star

Ah, that makes a whole lot more sense.  OK, here's one way:

 

data want;

set have;

by medicaid_id;

prior_end = lag(end_month);

if first.medicaid_id then total = total_months;

else do;

   total + total_months;

   if start_month = prior_end then total = total - 1;

end;

drop prior_end;

run;

devsas
Pyrite | Level 9

Wonderful! Yes, that did it..so it seems retain was not required here--im still new to lag function..

I have to further do this below to get distinct id's with total months..

proc sql;
create table want_ as select medicaid_id, max(total) as total_ from want group by medicaid_id; quit;

 

Hope that's correct.

Astounding
PROC Star

In a way, RETAIN is required.  This statement is key:

 

total + total_months;

 

That form of statement (variable name + some numeric value) actually has a more complex set of meanings, one of which is to retain TOTAL.

 

Also note, if you only want a single observation per ID, the same DATA step can accomplish that.  Just before the RUN statement, add:

 

if last.medicaid_id;

 

Your SQL code looks like it works as well.

 

 

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 1764 views
  • 3 likes
  • 2 in conversation