BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
ScotchCat
Obsidian | Level 7

Please be gentle - I'm new - using SAS 9.4      I'm trying to find the first date the Pick_Date is Null.   Any assistance is appreciated!  Thanks in advance!! 

Here is what my data table looks like: 

Table Orchard       
Apple_Var Ready_Date Pick_Date Change_Date
Red_Del 8/24/2024 8/24/2024 8/24/2024
Red_Del 8/24/2024 8/25/2024 8/25/2024
Red_Del 8/24/2024 8/26/2024 8/26/2024
Red_Del 8/24/2024 8/27/2027 8/27/2024
Red_Del 8/24/2024   8/28/2024
Granny 8/26/2024 8/26/2024 8/26/2024
Granny 8/26/2024 8/27/2024 8/27/2024
Granny 8/26/2024 8/28/2024 8/28/2024
Granny 8/26/2024   8/29/2024

 

Here is my desired output:  

Table Orchard       
Apple_Var Ready_Date Pick_Date Change_Date
Red_Del 8/24/2024   8/28/2024
Granny 8/26/2024   8/29/2024
1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

First in what sense?

First sorted by what variables?

 

To get what you showed you probably want the data sorted by APPLE_VAR and CHANGE_DATE and just find the first per APPLE_VAR that has missing value of PICK_DATE.

 

First let's convert your LISTING into an actual DATASET so we have something to work from.  Let's display the dates with a format that will not confuse half of our audience.

data HAVE;
  input Apple_Var :$10. (Ready_Date Pick_Date Change_Date) (:mmddyy.);
  format Ready_Date Pick_Date Change_Date date9.;
cards;
Red_Del 8/24/2024 8/24/2024 8/24/2024
Red_Del 8/24/2024 8/25/2024 8/25/2024
Red_Del 8/24/2024 8/26/2024 8/26/2024
Red_Del 8/24/2024 8/27/2027 8/27/2024
Red_Del 8/24/2024 .  8/28/2024
Granny 8/26/2024 8/26/2024 8/26/2024
Granny 8/26/2024 8/27/2024 8/27/2024
Granny 8/26/2024 8/28/2024 8/28/2024
Granny 8/26/2024 .  8/29/2024
;

Will need to make sure the data is sorted properly so we can use BY statement in our data step.

Note that since your example only has TWO apple varieties you could get it to work without the sort by using the DESCENDING keyword in the BY statement to let the data step know that you have the apple varieties in descending alphabetical order.

proc sort data=HAVE;
  by apple_var change_date;
run;

data WANT;
  set have;
  by apple_var change_date;
  where missing(pick_date);
  if first.apple_var;
run;

Results

       Apple_        Ready_                   Change_
Obs      Var           Date    Pick_Date         Date

 1     Granny     26AUG2024            .    29AUG2024
 2     Red_Del    24AUG2024            .    28AUG2024

 

 

 

View solution in original post

4 REPLIES 4
Tom
Super User Tom
Super User

First in what sense?

First sorted by what variables?

 

To get what you showed you probably want the data sorted by APPLE_VAR and CHANGE_DATE and just find the first per APPLE_VAR that has missing value of PICK_DATE.

 

First let's convert your LISTING into an actual DATASET so we have something to work from.  Let's display the dates with a format that will not confuse half of our audience.

data HAVE;
  input Apple_Var :$10. (Ready_Date Pick_Date Change_Date) (:mmddyy.);
  format Ready_Date Pick_Date Change_Date date9.;
cards;
Red_Del 8/24/2024 8/24/2024 8/24/2024
Red_Del 8/24/2024 8/25/2024 8/25/2024
Red_Del 8/24/2024 8/26/2024 8/26/2024
Red_Del 8/24/2024 8/27/2027 8/27/2024
Red_Del 8/24/2024 .  8/28/2024
Granny 8/26/2024 8/26/2024 8/26/2024
Granny 8/26/2024 8/27/2024 8/27/2024
Granny 8/26/2024 8/28/2024 8/28/2024
Granny 8/26/2024 .  8/29/2024
;

Will need to make sure the data is sorted properly so we can use BY statement in our data step.

Note that since your example only has TWO apple varieties you could get it to work without the sort by using the DESCENDING keyword in the BY statement to let the data step know that you have the apple varieties in descending alphabetical order.

proc sort data=HAVE;
  by apple_var change_date;
run;

data WANT;
  set have;
  by apple_var change_date;
  where missing(pick_date);
  if first.apple_var;
run;

Results

       Apple_        Ready_                   Change_
Obs      Var           Date    Pick_Date         Date

 1     Granny     26AUG2024            .    29AUG2024
 2     Red_Del    24AUG2024            .    28AUG2024

 

 

 

ScotchCat
Obsidian | Level 7
Thank you!! I really appreciate - and learning so much from this community!
Tom
Super User Tom
Super User

To do what your subject line literal says you would want something like this:

data want;
  set have;
  if missing(pick_date)
    and (not missing(lag(pick_date)))
    and (apple_var = lag(apple_var))
  ;
run;

The last condition makes sure you don't compare pick_date values across apple varieties.

It also means you cannot select an observation that has missing pick_date that is the first for this apple variety.

And if you have multiple places where PICK_DATE transitioned from being present to being missing for the same apple variety all of those transitions would be selected.

ballardw
Super User

Since you appear to want to process data in the order the data set is currently then that would mean to use a data set in SAS as that processes records in order.

You did not state but seem to imply that you want the result based on the grouping value of a variable named Apple_var.

What you did not say explicitly what to do in the cases of:

1) the only observations for pick_date with missing do not have any "prior" records with none-missing values

2) if there are no missing Pick_date values.

 

You can use a BY statement to process by the values of a variable. This assumes the data is sorted by that variable. If the data is not actually sorted but is grouped by that variable (your example Red_Del would not come before Granny by sort order) you can use the keyword NOTSORTED to prevent an error. Since you are looking for a "first and only" you need something to help identify if it is the first. That would typically call for flag variable that is RETAINED, keeps the same value from observation to observation unless reset to indicate when to output the needed records.

 

BEST is to provide example data in the form of a working data step and paste into a text box opened with the </> icon that appears above the message window.

 

So something like this, creates a data set hopefully similar to yours and then selects the observations.

data have;
    input Apple_Var $ 	Ready_Date :mmddyy10.	Pick_Date :mmddyy10.	Change_Date  :mmddyy10. ;
    format Ready_Date :mmddyy10.	Pick_Date :mmddyy10.	Change_Date  :mmddyy10. ;
datalines;
Red_Del 	8/24/2024 	8/24/2024 	8/24/2024
Red_Del 	8/24/2024 	8/25/2024 	8/25/2024
Red_Del 	8/24/2024 	8/26/2024 	8/26/2024
Red_Del 	8/24/2024 	8/27/2027 	8/27/2024
Red_Del 	8/24/2024 	.        	8/28/2024
Granny 	8/26/2024 	8/26/2024 	8/26/2024
Granny 	8/26/2024 	8/27/2024 	8/27/2024
Granny 	8/26/2024 	8/28/2024 	8/28/2024
Granny 	8/26/2024 	.        	8/29/2024
;

data want;
   set have;
   by apple_var notsorted;
   retain found;
   if first.apple_var then found=0 ;
   if missing(pick_date) and found=0 then do;
      found=1;
      output;
   end;
   drop found;
run;

This doesn't handle the cases where the first observations all have missing pick_dates though.

The BY creates a temporary variable for each variable on the BY statement that indicate first or last value of the group and are 1/0 coded for true/false.

The Output statement makes only explicitly stated output records to write to the output data set.

 

hackathon24-white-horiz.png

The 2025 SAS Hackathon Kicks Off on June 11!

Watch the live Hackathon Kickoff to get all the essential information about the SAS Hackathon—including how to join, how to participate, and expert tips for success.

YouTube LinkedIn

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 4 replies
  • 941 views
  • 0 likes
  • 3 in conversation