BookmarkSubscribeRSS Feed
ChaseA
Calcite | Level 5

I'm fairly new to the idea of macro variables and I think it is the best option I'm looking for right now. I'm working with a data set to track gains and losses in membership rolls. I have my data set sorted by MEMBER_ID and want to look at the beginning and ending dates for each member. We get this data snapshot monthly at the end of the month thus having a field called MONTHLY_DATE containing values like 2018-04-30, 2018-05-31, 2018-06-30, etc. The data starts at 2012-09-31 and currently ends at 2019-04-30. However, there is a possibility that we could get more data prior to 2012-09-31 and we will definitely be receiving data after 2019-04-30. Attached is a spreadsheet with an example data set containing the original data set as well as the two tables I wish to create.

 

I think I need a few DO statements. First it will look at the MEMBER_ID then look at the MONTHLY_DATE field. If the MONTHLY_DATE field of the first record of the MEMBER_ID (in the data set 'A123B78') is not equal to the MIN(MONTHLY_DATE) then it will write that record to the GAINS table. It will keep cycling through the first MEMBER_ID until it gets to the last record of MEMBER_ID, 'A123B78' at which point it will compare MONTHLY_DATE to the MAX(MONTHLY_DATE). If MONTHLY_DATE is not equal to MAX(MONTHLY_DATE) then it will be written to the LOSSES table. It then repeats this process for the next MEMBER_ID, 'J800K99' and so on and so forth. 

 

What I'm not sure about is how to effectively do this loop and if macro variables are need in the code at all. I know this is vague and I'm not sure where to start with the code. Any and all help is greatly appreciated.

 

Starter Data Set

MEMBER_IDMONTHLY_DATECONTRIBUTIONLOCATION
A123B789/30/201240PA
A123B7810/31/201240PA
A123B7811/30/201240PA
A123B7812/31/201240PA
A123B781/31/201240PA
A123B782/29/201240PA
J800K995/31/201525PA
J800K996/30/201525PA
J800K997/31/201525PA
J800K998/31/201525PA
R675T906/30/201315NY
R675T907/31/201315NY
R675T908/31/201315NY
R675T909/30/201315NY
W786L132/28/201925CA
W786L133/31/201925CA
W786L134/30/201925CA
Y889F458/31/201840CA
Y889F459/30/201840CA
Y889F4510/31/201840CA
Y889F4511/30/201840CA
Y889F4512/31/201840CA
Y889F451/31/201940CA
Y889F452/28/201940CA
Y889F453/31/201940CA
Y889F454/30/201940CA

 

 

LOSSES

MEMBER_IDMONTHLY_DATECONTRIBUTIONLOCATION
A123B782/29/201240PA
J800K998/31/201525PA
R675T909/30/201315

NY

 

GAINS

MEMBER_IDMONTHLY_DATECONTRIBUTIONLOCATION
J800K995/31/201525PA
R675T906/30/201315NY
W786L132/28/201925CA
Y889F458/31/201840CA
3 REPLIES 3
ballardw
Super User

You need to provide a cleaner definition of what MIN(MONTHLY_DATE) and MAX (MONTHLY_DATE)  mean. Are they the single smalles/largest dates included in the entire data set? Within a specific period of interest? Specified somewhere? If so, where or how?

The definition needs clarification because each member_id will have their own min and max date. And it not clear which actual value you are comparing the "desired result" dates to.

 

Also please clarify that the date variable in question is actually as SAS date value, not a character value.

 

Your attachment didn't make it. But since this is SAS providing SAS data sets is much better as that answers all the questions about data types and such.

Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the {i} icon or attached as text to show exactly what you have and that we can test code against.

ChaseA
Calcite | Level 5

MIN(MONTHLY_DATE) and MAX(MONTHLY_DATE) would be the minimum and maximum dates in the entire data set.

 

The date variable in question is an official SAS formatted date field.

 

The attachment just contained the tables I had typed out in my original question.

ballardw
Super User

Here's a way to FLAG records for gain or loss that uses a SAS supplied data set you should have available since you haven't provided example data in a form useable for code.

proc summary data=sashelp.stocks;
   var date;
   output out=work.datesum max= min=/autoname;
run;

proc sql noprint;
   select min(date), max(date) into :mindate, :maxdate
   from sashelp.stocks
   ;
quit;

/* sort the data as not in stock, date order
   remove some records to have different date
   ranges as the same for the stocks
   in the base data
*/
proc sort data=sashelp.stocks
    out=work.stocks (where= ( (stock='IBM' and date<'01JUL2005'd)
                            or (stock='Microsoft' and date>'01JUN1987'd)
                            )
                    );
   by stock date;
run;

data example;
  set work.stocks;
  by stock;
  if first.stock and date ne &mindate. then GainFlag=1;
  if last.stock and date ne &maxdate. then LossFlag =1;
run;

If the proper values are flagged then it is a trivial exercise to OUTPUT the flagged records to a data set, separate data sets or write  reports with the information of which records are gains and losses.

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
SAS Enterprise Guide vs. SAS Studio

What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 505 views
  • 0 likes
  • 2 in conversation