I have a first treatment date in E8601DA10. format. I want to see if an event date occurs before or after the first treatment date, but in some cases I have unknown days. So I want to be able to compare the month and years of the event and first treatment date to better evaluate whether the event started before treatment. How do I get month and year out of a E8601DA10. format date?
If the variable is in a SAS data set and has the E8601DAw. format then the variable is a DATE value.
If you want to compare another date valued variable then just compare the two.
If you need the Year or Month value from any date value then use the Year (variable) and Month(variable) functions.
If you did not have days when the variable with the E8601DAw. format was created then your value is going to be missing. So you may have to provide more details for a full solution.
No, to clarify, event date = character string $25. Some of the event dates have month and year but no day. See screenshot below. If treatment happened in 2023, I could still compare vs these events and know the event happened before treatment. Just need to understand how to extract the year and month out of the E8601DA10. date.
These are positively NOT ISO 8601 dates. ISO dates must be in YMD order.
The month and year can be extracted like this:
month = input(scan(string,1,"/"),2.);
year = input(scan(scan(string,1," "),3,"/"),4.);
Do you have a SAS data set or not?
If you have a variable with $ format then it is not a variable with Format E8601DA10 it is a character variable that appears to hold date info. In SAS FORMAT has a very specific meaning combining data values and how to display them. An actual format of E8601DA is only applicable to numeric values and displays those in a certain appearance matching the E8601DA guidelines. Note: E8601DA has the date portion appear as YYYY-MM-DD and that is not what you show.
This shows how to get a numeric valued month and year from a string constructed that way.
data example; stringvalue = '07/UNK/2022 NUL:NUL:NUL'; month = input(scan(stringvalue,1,'/'),f2.); year = input(scan(stringvalue,3,'/'),f4.); run;
Scan pulls an entire word, in this case with the "word" separated by / characters. Input is used to read the resulting word using the supplied INFORMAT. F2 expects exactly 2 digits and F4 expects 4 (so ignores the remainder of string.
But do you have any actual date values for the other observations? If not I would suggest something more like:
data have; infile datalines truncover; input stringvalue $25. ; datalines; 12/14/2021 NUL:NUL:NUL 07/UNK/2022 NUL:NUL:NUL ; data want; set have; datevalue= input(stringvalue,?? mmddyy10. ) ; if missing (datevalue) then do; month = input(scan(stringvalue,1,'/'),f2.); year = input(scan(stringvalue,3,'/'),f4.); end; format datevalue E8601DA10.; run;
Which shows one way to create an actual SAS date value when all the information is there.
If it is acceptable to assume a day of the month when missing in the original value you could use the MDY function to create a date value such as
if missing (datevalue) then do; month = input(scan(stringvalue,1,'/'),f2.); year = input(scan(stringvalue,3,'/'),f4.); datevalue= mdy(month,15,year); end;
which would treat the UNK days as day 15 of the month. Or 1 would be another likely choice.
And where are the dates that are to be compared? Are they actual SAS date values or more character values?
For future questions, please do not post pictures of data. It is better to open a text box using the </> icon above the message windows and paste the text
Yes, I have a SAS dataset. I think the fact that I posted the picture of the character date is confusing people. because they think I was showing the ISO date. I'll try to explain better:
data want; set have;
*variables existing in the dataset currently. There is another event date variable that is date formatted,
but if the day is unknown that variable is missing. So therefore for missing day of a date like in the
screenshot above I need to work with the character string of the date; format treatdate E8601DA10. eventdate $25;
*want to derive below; treatmonth = monthpart (treatdate); treatday = daypart (treatdate); treatyear = yearpart (treatdate); eventmonth= monthpart (eventdate); eventday = daypart(eventdate); eventyear = yearpart (eventdate); *this facilitates this comparison; if eventday = . and eventmonth = . then do; if eventyear>treatyear then ontreatment='Y'; else
if eventyear<treatyear then ontreatment='N';
end; else
if eventyear=treatyear then do;
if eventmonth>treatmonth then ontreatment='Y'; else
if eventmonth<treatmonth then ontreatment='N';
end; run;
Long story short, I just need to figure out how to get the month and year out of an ISO date.
format treatdate E8601DA10. eventdate $25;
Means that you have zero, nada, zilch "E8601DA10" dates. You have 3 string values. Period.
So the example code that i show for pulling apart one of those not actually even in E8601DA structure strings needs to be done with every single one of those variables using SCAN and INPUT.
3 input string variables, one statement to get "year", another get month and another get day. But you didn't share any example of "treatdate" or "eventdate" that I can tell so I am not sure if the exact same code works for the other variables.
Or create 3 date variables as I suggested.
Never mind... it's not productive going round and round about whether I have a SAS dataset or whether I have a date formatted variable, getting feedback on how to parse character variables, and nothing addressing the actual question of how to extract a month or year from the ISO date. The format statement was to indicate the 2 variables being compared. It was not a statement that I am creating those variables. I don't know how to show you the date field without the screenshot, but also why do you need to see it when it's simply an ISO date? Anyway, this has been super unhelpful so I'll go elsewhere to solve the problem.
Oh, and I figured out the solution.
TSTART = PUT(trtstartdate,E8601DA.); tmonth = substr(tstart,6,2); tyear = substr(tstart,1,4);
If you have a SAS date (count of days starting at 1960-01-01), no matter how it is formatted, the MONTH() and YEAR() functions will extract those values. Use the Z2. format to display leading zeroes for the month.
If you need strings, use PUT:
tmonth = put(month(trtstartdate),z2.);
tyear = put(year(trtstartdate),4.);
But what about the funny strings with "UNK" from your second post?
Glad you found a solution. Re:
@JulesBre wrote:
Never mind... it's not productive going round and round about whether I have a SAS dataset or whether I have a date formatted variable, getting feedback on how to parse character variables, and nothing addressing the actual question of how to extract a month or year from the ISO date.
The questions/confusion here were around trying to understand the problem statement. SAS doesn't have the concept of an "ISO date", so for people who think in SAS terms/constructs, it wasn't clear what you meant.
SAS has a concept of a "date", and what some people call a "date variable" in SAS is actually a numeric variable, storing the number of days since Jan 1, 1960.
Sometimes people will store a date in SAS as a character variable. So you can have a character value "08/16/2004" or "16 Aug 2004" or "UNK Aug 2004". In that case the person thinks of it as date, but to SAS it's just a character string.
A SAS dataset only has two types of variables, numeric or character. It's not like other data storage formats which might have 10 or more types of variables.
Most of the questions people asked were trying to understand whether your starting point was a numeric variable, or character variable, so that they could provide recommendations that would apply to your scenario.
@JulesBre wrote:
Never mind... it's not productive going round and round about whether I have a SAS dataset or whether I have a date formatted variable, getting feedback on how to parse character variables, and nothing addressing the actual question of how to extract a month or year from the ISO date. The format statement was to indicate the 2 variables being compared. It was not a statement that I am creating those variables. I don't know how to show you the date field without the screenshot, but also why do you need to see it when it's simply an ISO date? Anyway, this has been super unhelpful so I'll go elsewhere to solve the problem.
Oh, and I figured out the solution.
TSTART = PUT(trtstartdate,E8601DA.); tmonth = substr(tstart,6,2); tyear = substr(tstart,1,4);
Please show some results where that actually works for a value where the original DAY value was UNK.
I also strongly suggest going back to what ever source this data set was created from and doing some comparisons by something where you can identify the same observation.
Your picture shows a potential problem: An informat of 10. would be used to read a string of digits, no longer than 10 characters. Nothing with any separators like / or -. Unless there was some serious parsing of the resulting numeric value that was read then the date value represented in that format may be incorrect.
Second, any value that had a missing day originally, unless someone imputed a day of the month similar to the manner I showed previously, your "day" is still missing and date value cannot have a missing day component and the result for any function on those values of TRTSDT will be missing. IF there is a value then the Month and Year functions would be more appropriate because the code you show will 1) generate notes like this in the LOG
NOTE: Numeric values have been converted to character values at the places given by: (Line):(Column).
Some organizations spend a lot of time making sure that they have clean logs, as in no notes like that because it often indicates potential problems.
Also the result means Tmonth and tyear are character values and you may find that comparisons with those involving < or > result in oddities. Maybe not so much with values with leading 0 but '7' is greater than '09' .
Plus this picture does not say anything about the other two variables. If you want to share and actual description of the data set variables then run Proc Contents data=whateverthenameofthedatasetis;run; and share the result of that code.
https://communities.sas.com/t5/SAS-Communities-Library/Working-with-Dates-and-Times-in-SAS-Tutorial/... has a PDF with much information about dates.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.