BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
John4
Obsidian | Level 7

I have a large data (sorted by recent date) set like this :

 

ID             NAME             BEGIN              END               TOP

191080    John             25/04/2016    01/05/2016        NO
191080    John           08/06/2016    10/06/2016         NO
191080    John           21/06/2016    24/06/2016         Yes

191080    John           21/07/2016    24/09/2016         Yes
191141    Alex           16/01/2017    17/01/2017          NO

191141    Alex           22/02/2017    29/04/2017          NO

 

If "Yes" and "NO" occurs in TOP for an observation, then the most recent dates is choosen

But if only "NO" occurs in TOP for an observation, then the minimum of BEGIN and maximum of END are choosen

I mean how get this :

 

ID             NAME             BEGIN              END             

191080    John           21/07/2016    24/09/2016    
191141    Alex           16/01/2017    29/04/2017      

1 ACCEPTED SOLUTION

Accepted Solutions
Shmuel
Garnet | Level 18

You asked for: "If "Yes" and "NO" occurs in TOP for an observation, then the most recent dates is choosen".

I assume if an ID has all and only "YES" you still want the recent dates.

If max(END) may be less then recent END date than retain max_end too, and care to update it.

 

data want;

  set have;

   by ID;   /* assuming data sorted by ID BEGIN */

        retain yes_flag min_begin;

        if first.ID then do; yes_flag=0; min_begin=begin; end;

        if upcase(top) = 'YES' then yes_flag=1;

        if last.ID then do;

           if yes_flag=0 then begin=min_begin;

           output;

       end;

run;

 

 

 

 

View solution in original post

5 REPLIES 5
Shmuel
Garnet | Level 18

You asked for: "If "Yes" and "NO" occurs in TOP for an observation, then the most recent dates is choosen".

I assume if an ID has all and only "YES" you still want the recent dates.

If max(END) may be less then recent END date than retain max_end too, and care to update it.

 

data want;

  set have;

   by ID;   /* assuming data sorted by ID BEGIN */

        retain yes_flag min_begin;

        if first.ID then do; yes_flag=0; min_begin=begin; end;

        if upcase(top) = 'YES' then yes_flag=1;

        if last.ID then do;

           if yes_flag=0 then begin=min_begin;

           output;

       end;

run;

 

 

 

 

John4
Obsidian | Level 7

date value are numeric, how put them as SAS DATE format ?

Shmuel
Garnet | Level 18

@John4 wrote:

date value are numeric, how put them as SAS DATE format ?


What do you mean by "numeric" ? When you browse your data what do see in the date columns?

If date is July 1st, 2016 do you see 01072016 or other number?

if positive then you can convert it to date formet by:

    date = input(put(date,z8.),ddmmyy8.);

 

Beyond, consider having TOP values per ID, whose dates to output:

1) NO, YES            (mixed ended by YES)

2) NO, YES, NO    (mixed ended by NO)

3) YES, NO           (mixed ended by NO) 

4) NO, NO             (only NO)

5) YES, YES          (only YES)

mkeintz
PROC Star

A variant on @Shmuel's suggestion:

 

data want;
  retain minbeg;

  merge have (keep=id top where=(top="YES"))
        have;
  by ID;

  if first.id then minbeg=(ifn(top="YES",begin,.);
  if last.id;
  begin=coalesce(minbeg,begin);
run;

 

If there are zero YES values, then it takes the most recent date range (i.e. last in a BY group), otherwise begin is set to its first value, but END is unmodified.  

 

The MERGE statement allows you to determine, at the beginning of each BY group, whether any YESes will be present, but it does not change the sequence of BEGIN/END.  If no YESes are present then minbeg is set to missing.

 

The subsetting if ("if last.id;") can be your friend, as in this case it avoids the need for an explicit output statement - just a syle preference really.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
KevinViel
Pyrite | Level 9

It appears that you need to capture the first (least) BEGIN and note whether TOP = "Yes" occurs.  I assume that if TOP = "Yes" occurs, but is followed by an observation with TOP = "NO", then we are interested in the observation with TOP = "Yes".  Here first., last., and retain are useful:

 

 

data top

        ( drop = __:

                 top

        )

     ;

  set top ;

  by id ;

  retain __begin

         __end .

         __top "0"

         ;

  if first.id

  then

    do ;

       __begin = begin ;

       __top = "0" ;

    end ;

  if top = "Yes"

  then

    do ;

       __top = "1" ;

       __begin = begin ;

       __end = end ;

    end ;

 

  if last.id

  then

    do ;

       if __top = "1"

       then

         do ;

            begin = __begin ;

            end = __end ;

            output ;

         end ;

        else

          do ;

             begin = __begin ;

             output ;

          end ;

    end ;

run ;

 

HTH,

 

Kevin

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 733 views
  • 4 likes
  • 4 in conversation