BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.

Dear All,

I would describe my knowledge and abilities in programing and macros as basic. This is my problem:

I have a dataset. I will refer to it as "history". In history, each observation is a summary of a report, including a date of issue and various variables. Reports are issued at different times, i.e. the difference in months between a report and a previous report could be any value between 1 and 24.

I would like to create a new dataset. In this dataset the first observation would be the most recent report in history. The second observation would be the next and closest report in history, conditional on the fact it was issued at least 6 months later than the report in the FIRST observation. The third observation would now be the next report in history, conditional on the fact it was issued at least 6 months later than the report in the SECOND observation, and so forth. In other words, I have a reference date that needs to be updated.

1. Can this be done without a macro? How?

2. Must this be done with a macro? How?

Any help would be highly appreciated.

Michael

1 ACCEPTED SOLUTION

Accepted Solutions
dkb
Quartz | Level 8 dkb
Quartz | Level 8

Dr van Straten, didn't you broadcast on LBC some years ago?  What a surprise!

Try this:

data aborthist;

set aborthist;

by descending startdate;

retain last_kept; /* remember its value from row to row  */

drop last_kept;  /* but do not bother to write it to the output  */

if _n_=1 then do; /* definitely want the first one */

     output;

     last_kept = startdate; /* keep a note of the date we just wrote */

end;

else if last_kept - startdate ge 180 /* compare current date to last one written */

     then do;

     output;

     last_kept = startdate; /* and this is the one we compare next time */

end;

run;


View solution in original post

9 REPLIES 9
LinusH
Tourmaline | Level 20

1. Yes. Sort your data on report id and descending report date. Use a data step with BY report date. Use retain (with firs. logic) or lag() to compare report date between observations. Keep records that meets your criteria.

2. No.

Data never sleeps
MichaelvanStraten
Fluorite | Level 6

Thank you. I'll try that.

MichaelvanStraten
Fluorite | Level 6

Sorry, this solution is not working;

I used:

data aborthist;

set aborthist;

by descending startdate;

if _n_=1 OR (lag(startdate)-startdate ge 180) then output;

run;

Apparently, the lag function queues the value of the previous observation. However, the previous observation is not necessarily the observation I need for the reference date. The reference date is the date belonging to the previous observation THAT WAS KEPT. For example, I have the  dates:

May 2014

Dec 2013

Jun 2013

Mar 2013

Jan 2013

Aug 2012

Mar 2012

Jan 2011

Sep 2011

Mar 2011

The following reports SHOULD be extracted: May 2014, Jun 2013, Aug 2012, Dec 2012, Mar 2011.

However, only the following reports ARE extracted: May 2014, Jun 2013, March 2011.

dkb
Quartz | Level 8 dkb
Quartz | Level 8

Dr van Straten, didn't you broadcast on LBC some years ago?  What a surprise!

Try this:

data aborthist;

set aborthist;

by descending startdate;

retain last_kept; /* remember its value from row to row  */

drop last_kept;  /* but do not bother to write it to the output  */

if _n_=1 then do; /* definitely want the first one */

     output;

     last_kept = startdate; /* keep a note of the date we just wrote */

end;

else if last_kept - startdate ge 180 /* compare current date to last one written */

     then do;

     output;

     last_kept = startdate; /* and this is the one we compare next time */

end;

run;


MichaelvanStraten
Fluorite | Level 6

Sorry to disappoint you dkb, I'm not THE Michael van Straten (Health Journalist and much more), I'm just a Michael van Straten (Vet).

Thank you very much for your solution. I'll check it out but am already quite confident it will work as you have addressed the key problem. So I know for the future: if one refers to a variable AFTER using the output command (like you did above), will the value of the variable always be that belonging to the observation that was "output"?

Kurt_Bremser
Super User

The OUTPUT statement does not change any variable values at all; it just writes the contents of the PDV to the dataset.

Variable values change:

- when an input dataset is read (SET or MERGE statement)

- when values are assigned through direct assignment (=), incrementation (like x + 1;) or by a "call" subroutine

- when a new data step iteration begins (all variables not RETAINED are set to missing)

MichaelvanStraten
Fluorite | Level 6

Worked perfect! Thanks.

MichaelvanStraten
Fluorite | Level 6

Yes, sorry. Earlier of course.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 1350 views
  • 0 likes
  • 4 in conversation