Hi all,
I have some data which includes 4 date variables. They came as string variables in Excel, so in setting up the new date variables, first I get rid of the text stuff. So if the date variable (in an array) is missing, N/A, or other text value, the new date var gets set to a nonsense value:
do i = 1 to dim(_dtvar_sa);
if _dtvar_sa[i] = ' ' then _cleandate_all[i] = '01JAN1900'd;
else if _dtvar_sa[i] in ('N/A', 'NA') then _cleandate_all[i]='01JAN1901'd;
else if anyalpha(compress(_dtvar_sa[i], ' /.'))>0 then _cleandate_all[i]='01JAN1902'd;
This part seems to be working okay from what I can tell. However, the other part is taking the strings that are probably dates and putting them together to form a complete date value for SAS. What I'm trying to do is split the strings into month, day, and year values, which I then use mdy to combine as a final date value.
However, part of my sequence is giving me trouble. After I pull out the day and month values from the strings, I have these commands which theoretically build numeric day and month vars from those values:
then do;
_dayvar_sa[i] = scan(left(strip(_dtvar_sa[i])), 1, ' / .');
end;
_dayvar_sa2[i] = input(_dayvar_sa[i], 3.);
then do;
_mthvar_sa[i] = scan(left(strip(_dtvar_sa[i])), 2, ' / .');
end;
_mthvar_sa2[i] = input(_mthvar_sa[i], 3.);
It seems like the first command is working fine for both day and month, but SAS is completely ignoring the input commands for both day and month. I have a similar one for year which is working just fine.
I am getting notes in the log to the effect of Invalid value for input function, but when I look at the values in question, they don't have letters or anything so I'm not sure what the problem is.
Please do not post disconnected snippets of code. In part8'icular, it is UTTERLY useless to post a THEN branch without the IF and the condition., as that is the most important part of a conditional statement.
So pleasepost:
Please do not post disconnected snippets of code. In part8'icular, it is UTTERLY useless to post a THEN branch without the IF and the condition., as that is the most important part of a conditional statement.
So pleasepost:
Ok, that's the code. You have ignored this request from @Kurt_Bremser
some examples of your data in usable form (data step with datalines, DO NOT SKIP THIS!)
@Walternate wrote:
Raw date var: Desired date var
1/1/1900
N/A 1/1/1901
RANDOM TEXT 1/1/1902
5/3/22 05/03/2022
24-02-2021 24/02/2021
Please look very closely at that example. Note that it is not possible to see the "raw date" for the first line. Paste text data into a TEXT box opened on the forum with the </> icon to prevent the message window from "cleaning up" text by removing leading spaces and such.
Now we need proc contents since all of your arrays are defaulting to the type of the original variables from your source data set. You create a bunch of CHARACTER values for Day, Month and Year. Why? Keeping dates as character just adds complexity for any use such as determining report periods, intervals, comparing dates and such.
The last two values should be read with a DDMMYY10 informat and no pulling apart month and day needed. Assign what ever format you want for appearance.
data example; input date :ddmmyy10.; put date= date9. date= ddmmyy10.; datalines; 5/3/22 24-02-2021 ;
The put is just another example using a desired format at the time of display.
If your "date" is really inconsistent then perhaps you want to investigate the ANYDTDTE informat as that handles a variety of cases that you have not shown.
And I still think that setting date values of 01Jan1900 is very poor.
Strong hint: Show actual values that you are working with that do not show the expected result.
It is very likely that there is a proper Informat to read the dates and your tearing unshown strings apart to get at pieces does not show what is needed.
Stronger hint: Show the entire data step. I can't tell if you expect numeric or character values for some of those arrays and need the array definition to possibly debug your issues.
Suggestion: Do not provide valid date values as "nonsense" values. Somewhere some one is going to use those values and be surprised at a result and then spend time trying to figure out why there is garbage in the data. SAS has a group of "values" for numeric variables that are special missing values. These are a dot followed by a letter or the underscore character, i.e ._ .A to .Z . You can use those to indicate the status of the original improper value such as .B for the 'blank', .N for the 'NA' and so one. These values are missing for use in any function and you can create custom formats to display text indicating the status as needed.
Example dates from my data that are being flagged as not working by the log:
26/2/22
29/12/22
28/1/22
14/4/22
@Walternate wrote:
Example dates from my data that are being flagged as not working by the log:
26/2/22
29/12/22
28/1/22
14/4/22
None of those look like dates to me. There is no month 14, 26, 28, or 29. They do not indicate what century they should be in.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.