BookmarkSubscribeRSS Feed
melc
Calcite | Level 5

I want to select the smallest value of two variables for each row in a data set.

 

The two variables have date values as character strings that follow the order of 4 digit year, 2 digit month, 2 digit day.

 

Some values for these variables have full dates (yyyymmdd) and others have partial dates (yyyymm or yyyy). 

 

How can I select the smallest value between each of the two variables for each row? Will I need to convert the character strings to dates first? However, won't date formats on partial dates return missing value (".")? 

 

For example: 

 

I tried using the MIN function as Var3 = MIN(Var1, Var2)

 

Var1
Var2
Expected Minimum Date
Resulting Minimum Date Based on Date format
Resulting Minimum Date Based on Character String
202108
20211201
202108
20211201 (because partial date of 202108 returns as "." when converting column to a date. Or is there a way to handle column of dates of varying lengths to prevent partial dates from being ".")
202108
202112
20210824
20210824
20210824
202112

 

6 REPLIES 6
PaigeMiller
Diamond | Level 26

Dates should NEVER be character strings if you are going to work with them. They should always be converted to dates.

 

As far as partial dates are concerned, you need to come up with logic on how to handle these, how to turn these into "non-partial" dates. Then once you have such logic, you can turn them into valid SAS dates which contain year/month/day (not year/month) and then program the rest of your operations.

--
Paige Miller
data_null__
Jade | Level 19

You can get good mileage using 8601 date format.  SAS can read it and set missing day to 01 and same for missing month.  But you can do compare with the character variables.  Tones of papers and threads about partial dates you should check it out.  YYYY-MM-DD works well in listings too.

 

 

49         data _null_;
50            v0='2021';
51            v1='2021-08';
52            v2='2021-12-01';
53            t1=v1 min v2;
54            t2=v1 max v2;
55            t3=v0 eq v1;
56            t4=v0 eq: v2;
57            put 'NOTE: ' (_all_)(/=);
58            run;
2 The SAS System                               15:14 Friday, March 29, 2024

NOTE: 
v0=2021
v1=2021-08
v2=2021-12-01
t1=2021-08
t2=2021-12-01
t3=0
t4=1
mkeintz
PROC Star

Given that you have partial dates (of the form YYYY, YYYYMM) it's uncharacteristically lucky that you have kept these as character variables.   You can find the minimum of two character dates (partial or complete) by using a comparison operator (i.e "<", ">"), which will honor lexicographic ordering (i.e. "alphabetic" ordering).

data want;
   set have;
   mindate=ifc(var1<var2,var1,var2);
run;

Now mindate will still be a character variable, which you may need to convert to a regular numeric variable holding date values.  But the initial comparison is easy.

 

CAVEAT: If var1 or var2 is blank that blank value will be less than all YYYY, YYYYMM, and YYYYMMDD character values.  So you may need to protect against blank VAR1 or VAR2 values.

 

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
ballardw
Super User

@melc wrote:

I want to select the smallest value of two variables for each row in a data set.

 

The two variables have date values as character strings that follow the order of 4 digit year, 2 digit month, 2 digit day.

 

Some values for these variables have full dates (yyyymmdd) and others have partial dates (yyyymm or yyyy). 

 

How can I select the smallest value between each of the two variables for each row? Will I need to convert the character strings to dates first? However, won't date formats on partial dates return missing value (".")? 

First worry about INFORMATS to read the values. Pick the desired format after you have a date value to display.

Things that appear as YYMM or YYYYMM can be read into a data value using the YYMMNw. informat. It will assume that the day of the month used for the date is the first day.

 

If you only have a 4-digit year then convert to numeric and use the MDY function to create the day and month in that year that you want to use for comparisons.

 

The real problem can be if you have mixed 6 character values that are in the YYMMDD and YYYYMM as 201212 might be 12 Dec 2020 (or 1920) or Dec 2012.

Tom
Super User Tom
Super User

Please share your code.  Looks like normal comparisons do what you want.

data have;
  input (Var1 Var2 Expected) (:$8.);
cards;
202108 20211201 202108
202112 20210824 20210824
;

data want;
  set have;
  if var1<var2 then min=var1; else min=var2;
run;

Result

Obs     Var1       Var2      Expected      min

 1     202108    20211201    202108      202108
 2     202112    20210824    20210824    20210824

Patrick
Opal | Level 21

As already suggested you could use the appropriate informat to convert the date strings to SAS Date values (count of days since 1/1/1960 stored in a numerical variable).

For your date and partial date strings informat b8601da. appears to do the job. Partial dates will get aligned to the beginning of what's available (=beginning of month or beginning of year).

data demo;
  infile datalines truncover dlm=' ';
  input dt_string_1:$8. dt_string_2:$8.;

  sas_dt_1=input(dt_string_1,b8601da.);
  sas_dt_2=input(dt_string_2,b8601da.);
  min_date=min(sas_dt_1,sas_dt_2);
  format sas_dt_1 sas_dt_2 min_date yymmdd10.;
  datalines;
202108 20211201
202112 20210824
2021 20210824
;

proc print data=demo;
run;

Patrick_0-1711770584049.png

 

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 412 views
  • 1 like
  • 7 in conversation