BookmarkSubscribeRSS Feed
knveraraju91
Barite | Level 11

Dear,

 

I have a quick question about sorting data by dates which are in character form. (eg. sorting data in ex dataset by usubjid and exstdtc ( character date) and finding first.obs or last.obs.)

 

Is it ok to sort data with out converting the exstdtc to numeric form and then find first.obs and last.obs (or) it is required to convert to numeric form and do the data step.

  Please suggest.

 

 

4 REPLIES 4
Reeza
Super User
It depends. What format are the dates in? If it's YYMMDD it'll naturally sort out. If its Date9 it won't because 01Aug is alphabetically before 01Jan.
Kurt_Bremser
Super User

There's no reason for storing dates as character. Numeric dates can be stored in 4 bytes (less than any character date), and they can be used directly in date functions and with date formats. On top of that, they always sort in correct temporal order.

SuryaKiran
Meteorite | Level 14

If you want to use first. and last. and not to create a numeric variable, you can try something like this: 

PROC SQL;
select *,case when INPUT(Date,DATE9.)=min(INPUT(Date,DATE9.)) then "First"
			  When INPUT(Date,DATE9.)=max(INPUT(Date,DATE9.)) then "Last"
			  else " " end as F_L
	from test
	Group by ID
order by ID,INPUT(Date,DATE9.);
quit;
Thanks,
Suryakiran
ballardw
Super User

What appearance do your current character-valued dates have?

 

About the only one that sorts correctly with regularity is a yyyymmdd or one of its variants where mm(month) and dd(day) have leading zeroes. 2017-2-18 would come after 2017-11-15, or 2018-2-9 2018-2-12  because character comparisons for sorting are done left to right

(the -2 is after -1 in the first case and -9 after -1 in the second). So to get those to sort properly you need 2017-02-18 2018-02-09 and 2018-02-12.

Two digits years are also a potential issue if the data crosses the century boundary: 990105  (5Jan1999) would come after 051212 (12Dec2005) as the 99 comes after 05.

 

If the dates are mmddyy then order would basically ignore year as 01012018 would come before 01022017.

if the dates are ddmmyy then you would have the first of all months then the second 01012018 01022018 01032018 then 0201208 (back to January).

Date9 or Date7 (05Jan2017) would have April and August before January, February and March...

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 2800 views
  • 4 likes
  • 5 in conversation