Yoko
Obsidian | Level 7

Hello, 

I have some code that somebody else wrote: 

%let dob = PRXPARSE ("/\d{5} | \d{4} | \d{3} | \d{2} | \d{1} /");

I understand that a SAS date is stored as a number with anywhere from 1 to 5 digits.  

I think I need to add a negative sign for those who were born before 1960.  

Does anybody know how to add it to the code above? 

 

Thank you, 

 

Yoko

Accepted Solutions
ballardw
Super User

@Yoko wrote:
Hello,
This is a part of a large program which identifies valid, invalid, or missing field values for multiple variables. This part is for date of birth (DOB). I found that the program categorises DOB before 1960 as 'INVALID' with %let dob = PRXPARSE ("/\d{5} | \d{4} | \d{3} | \d{2} | \d{1} /"); I suspect that this is because the code categorises any 1-5 positive digits as valid, but categorises negative digits invalid. So, I'm looking for a solution for DOB as SAS date.
Thank you,
Yoko


You need to define what an "invalid" date might look like. If the variable is a SAS date value then it can be anywhere from the year 1582 to 20,000 (yes, a 5-digit year).

Generally I validate date values when the values are read into SAS as text. The informats that read common date structures take care of validating whether the string is a valid date, and they catch things like incorrect leap days and impossible days such as November 31, which code that only looks at digits would not.
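
As a rough illustration of that idea, the sketch below reads a few text values with the MMDDYY10. informat and flags the ones SAS cannot turn into a date. The dataset name `dob_text`, the variable names, and the choice of MMDDYY10. are assumptions made for the example; the original program may read its dates in a different layout.

```sas
data dob_text;
   input raw $10.;
   /* ?? suppresses the invalid-data note and keeps _ERROR_ at 0 */
   dob = input(raw, ?? mmddyy10.);
   /* an impossible calendar date comes back as a missing value */
   valid = not missing(dob);
   format dob date9.;
   datalines;
02/29/2023
02/29/2024
11/31/2000
11/30/2000
;
run;

proc print data=dob_text;
run;
```

Here 02/29/2023 and 11/31/2000 are not real calendar dates, so `dob` should be missing and `valid` should be 0 for those two rows.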

 

I might also look for specific dates, such as 1/1/1900 (a common value for people to use with Microsoft products) or a year of 9999, which are fairly common placeholder codes in data from various sources.

 

If you have a reason to believe that your dates should be in a specific range then something like

 

'01JAN1925'd le DOB le today()

checks for a date of birth between 1 January 1925 and today. Note that your current check will not catch dates of birth in the future. A "valid" number, according to that PRX, of 99999 would correspond to 15 Oct 2233, hardly a likely valid DOB value. In fact, when running that code on 23 Feb 2023, any number greater than 23064 (the SAS date value for that day) would be very suspect for an actual date of birth.
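
A minimal sketch of how that range check might be used to assign the valid / invalid / missing categories, assuming DOB is already a numeric SAS date. The dataset name `dob_check`, the variable `status`, and the 1925 lower bound are illustrative choices, not part of the original program.

```sas
data dob_check;
   input dob;
   length status $7;
   if missing(dob) then status = 'MISSING';
   /* negative values (dates before 1960) pass as long as they fall in range */
   else if '01JAN1925'd le dob le today() then status = 'VALID';
   else status = 'INVALID';
   format dob date9.;
   datalines;
-3652
23064
99999
.
;
run;

proc print data=dob_check;
run;
```

Because the comparison works on the numeric value, a date before 1960 needs no special handling, while 99999 falls after today and is rejected.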

 


ballardw
Super User

If you have a SAS date value then perhaps you should tell us what you are looking for. Searching strings, which is what the PRX functions are for, is likely going to be a headache.

If you have a variable that supposedly contains a date of birth and you are looking for those born before 1960 then use the year function.

if year(dateofbirthvariable) < 1960 then /* do what ever is needed*/

There are many functions for working with date, time and datetime values, and I am fairly certain they are going to be much easier than trying to parse strings.
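
To make the point concrete, here is a small sketch showing that dates before 1960 are simply negative numbers and that the date functions handle them without any string parsing. The variable names `yr` and `pre1960` are made up for the example.

```sas
data _null_;
   /* raw values of these three dates are -1, 0 and 1 */
   do dob = '31DEC1959'd, '01JAN1960'd, '02JAN1960'd;
      yr = year(dob);
      pre1960 = (yr < 1960);   /* 1 if born before 1960, else 0 */
      put dob= best8. dob= date9. yr= pre1960=;
   end;
run;
```

The log shows each date both as its raw number and in DATE9. form, so the negative value for 31 December 1959 is visible next to its year.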

 

https://communities.sas.com/t5/SAS-Communities-Library/Working-with-Dates-and-Times-in-SAS-Tutorial/... has a PDF with much information about dates.

Yoko
Obsidian | Level 7
Hello,
This is a part of a large program which identifies valid, invalid, or missing field values for multiple variables. This part is for date of birth (DOB). I found that the program categorises DOB before 1960 as 'INVALID' with %let dob = PRXPARSE ("/\d{5} | \d{4} | \d{3} | \d{2} | \d{1} /"); I suspect that this is because the code categorises any 1-5 positive digits as valid, but categorises negative digits invalid. So, I'm looking for a solution for DOB as SAS date.
Thank you,
Yoko
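
For the literal question of accepting a minus sign, one possible pattern is sketched below. It assumes the value being tested is text, which is what the PRX functions operate on; the pattern, the variable names `txt` and `ok`, and the sample values are all illustrative and not taken from the original program. Note also that the blanks around each `|` in the original pattern are matched literally, so the sketch drops them and anchors the match to the whole value instead.

```sas
data _null_;
   length txt $8;
   /* optional minus sign, then 1 to 5 digits, allowing surrounding blanks */
   re = prxparse('/^\s*-?\d{1,5}\s*$/');
   do txt = '-3652', '0', '23064', '12A4', '123456';
      ok = prxmatch(re, txt) > 0;   /* 1 when the whole value fits the pattern */
      put txt= ok=;
   end;
run;
```

This only checks the shape of the text. It would still accept 99999, so a numeric range check on the date itself remains the more reliable test.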

Yoko
Obsidian | Level 7
Thank you for sharing your ideas!
