Hello,
I need to strip off data before 01-Nov-2019 from all the Raw datasets and I am trying to write a macro for the same. I am calling a macro within a macro as first I need to build up my date from Raw Date variables - 'Day','Month' and 'Year'.
The whole code is working fine except for variable &&out is not resolving to the actual date value - but is instead resolving to AUDAT_.
Any help would much appreciable!
%macro strip_dt(yy, mm, dd, out,ds);
proc sort data=ole.&ds(encoding=asciiany) out=&ds.1; by subject;
data &ds.2 ;
set &ds.1;
if &yy ne . and &mm ne . and &dd ne . then
&out=strip(put(&yy,best.))||'-'||strip(put(&mm,z2.))||'-'||strip(put(&dd,z2.));
else if &yy ne . and &mm ne . and &dd eq . then
&out=strip(put(&yy,best.))||'-'||strip(put(&mm,z2.));
else if &yy ne . and &mm eq . and &dd eq . then
&out=strip(put(&yy,best.));
if &&out <='2019-11-01' then
flag='yes';
else flag='No';
if flag='yes' then output;
run;
%mend strip_dt;
run;
%strip_dt(AUDAT_YYYY, AUDAT_MM, AUDAT_DD,AUDAT_,auxo);
Thank you !
@PA8 wrote:
I tried the suggested code but got the "Missing values were generated error" message :
MPRINT(STRIP_DT): data auxo52 strip.auxo5; SYMBOLGEN: Macro variable DS resolves to auxo5 MPRINT(STRIP_DT): set auxo51; SYMBOLGEN: Macro variable OUT resolves to AUDAT_ SYMBOLGEN: Macro variable MM resolves to AUDAT_MM SYMBOLGEN: Macro variable DD resolves to AUDAT_DD SYMBOLGEN: Macro variable YY resolves to AUDAT_YYYY MPRINT(STRIP_DT): AUDAT_=mdy(max(AUDAT_MM,1),max(AUDAT_DD,1),AUDAT_YYYY); SYMBOLGEN: Macro variable OUT resolves to AUDAT_ MPRINT(STRIP_DT): format AUDAT_ yymmdd10.; SYMBOLGEN: Macro variable OUT resolves to AUDAT_ MPRINT(STRIP_DT): if 0 < AUDAT_ <= '01NOV2019'd ; SYMBOLGEN: Macro variable OUT resolves to AUDAT_ MPRINT(STRIP_DT): drop AUDAT_; MPRINT(STRIP_DT): run; NOTE: Missing values were generated as a result of performing an operation on missing values. Each place is given by: (Number of times) at (Line):(Column). 6 at 23186:116 NOTE: There were 3354 observations read from the data set WORK.AUXO51. NOTE: The data set WORK.AUXO52 has 1986 observations and 44 variables. NOTE: Compressing data set WORK.AUXO52 decreased size by 92.00 percent. Compressed is 4 pages; un-compressed would require 50 pages. NOTE: The data set STRIP.AUXO5 has 1986 observations and 44 variables. NOTE: Compressing data set STRIP.AUXO5 decreased size by 92.00 percent. Compressed is 4 pages; un-compressed would require 50 pages. NOTE: DATA statement used (Total process time): real time 0.09 seconds user cpu time 0.04 seconds system cpu time 0.03 seconds memory 1899.62k OS Memory 54004.00k Timestamp 02/06/2020 07:53:18 PM Step Count 1204 Switch Count 0
6 out of 3354 input records => data values likely the issue.
I would add something like
if missing(Audat_) then put "ERROR: Record=" _n_ +1 &mm= +1 &dd= +1 &yy=;
to the code to get diagnostics which will tell you which observation of the source data set has the data value problem(s).
At the top of your program, insert this command and then run it again
options mprint symbolgen mlogic;
Then show us the log of the part where you run this data step. Please, in order so that the log is formatted properly, which makes it much more readable to us; please copy the appropriate part of the log as text and paste it into the window that appears when you click on the {i}. DO NOT SKIP THIS STEP. If you don't follow this procedure, we will simply ask again that you follow this procedure.
I need to strip off data before 01-Nov-2019 from all the Raw datasets
Also, please give us an example of what you are trying to do here.
thank you so much for the prompt helpI
I had included the options as stated but missed to include the log. Please find the same attached :
17801
17802 dm 'log' clear;
17803 options mprint symbolgen mlogic;
17804
17805 %macro strip_dt(yy, mm, dd, out,ds);
17806
17807 proc sort data=ole.&ds(encoding=asciiany) out=&ds.1; by subject;
17808
17809 data &ds.2 ;
17810 set &ds.1;
17811
17812 if &yy ne . and &mm ne . and &dd ne . then
17813 &out=strip(put(&yy,best.))||'-'||strip(put(&mm,z2.))||'-'||strip(put(&dd,z2.));
17814 else if &yy ne . and &mm ne . and &dd eq . then
17815 &out=strip(put(&yy,best.))||'-'||strip(put(&mm,z2.));
17816 else if &yy ne . and &mm eq . and &dd eq . then
17817 &out=strip(put(&yy,best.));
17818
17819 if &&out <='2019-11-01' then
17820 flag='yes';
17821 else flag='No';
17822 if flag='yes' then output;
17823 run;
17824
17825 %mend strip_dt;
17826 run;
17827
17828 %strip_dt(AUDAT_YYYY, AUDAT_MM, AUDAT_DD,AUDAT_,auxo);
MLOGIC(STRIP_DT): Beginning execution.
MLOGIC(STRIP_DT): Parameter YY has value AUDAT_YYYY
MLOGIC(STRIP_DT): Parameter MM has value AUDAT_MM
MLOGIC(STRIP_DT): Parameter DD has value AUDAT_DD
MLOGIC(STRIP_DT): Parameter OUT has value AUDAT_
MLOGIC(STRIP_DT): Parameter DS has value auxo
SYMBOLGEN: Macro variable DS resolves to auxo
SYMBOLGEN: Macro variable DS resolves to auxo
MPRINT(STRIP_DT): proc sort data=ole.auxo(encoding=asciiany) out=auxo1;
MPRINT(STRIP_DT): by subject;
SYMBOLGEN: Macro variable DS resolves to auxo
NOTE: There were 1272 observations read from the data set OLE.AUXO.
NOTE: The data set WORK.AUXO1 has 1272 observations and 44 variables.
NOTE: Compressing data set WORK.AUXO1 decreased size by 90.63 percent.
Compressed is 3 pages; un-compressed would require 32 pages.
NOTE: PROCEDURE SORT used (Total process time):
real time 0.56 seconds
user cpu time 0.04 seconds
system cpu time 0.04 seconds
memory 8689.59k
OS Memory 54000.00k
Timestamp 02/06/2020 06:44:33 PM
Step Count 1169 Switch Count 0
MPRINT(STRIP_DT): data auxo2 ;
SYMBOLGEN: Macro variable DS resolves to auxo
MPRINT(STRIP_DT): set auxo1;
SYMBOLGEN: Macro variable YY resolves to AUDAT_YYYY
SYMBOLGEN: Macro variable MM resolves to AUDAT_MM
SYMBOLGEN: Macro variable DD resolves to AUDAT_DD
SYMBOLGEN: Macro variable OUT resolves to AUDAT_
SYMBOLGEN: Macro variable YY resolves to AUDAT_YYYY
SYMBOLGEN: Macro variable MM resolves to AUDAT_MM
SYMBOLGEN: Macro variable DD resolves to AUDAT_DD
MPRINT(STRIP_DT): if AUDAT_YYYY ne . and AUDAT_MM ne . and AUDAT_DD ne . then
AUDAT_=strip(put(AUDAT_YYYY,best.))||'-'||strip(put(AUDAT_MM,z2.))||'-'||strip(put(AUDAT_DD,z2.));
SYMBOLGEN: Macro variable YY resolves to AUDAT_YYYY
SYMBOLGEN: Macro variable MM resolves to AUDAT_MM
SYMBOLGEN: Macro variable DD resolves to AUDAT_DD
SYMBOLGEN: Macro variable OUT resolves to AUDAT_
SYMBOLGEN: Macro variable YY resolves to AUDAT_YYYY
SYMBOLGEN: Macro variable MM resolves to AUDAT_MM
MPRINT(STRIP_DT): else if AUDAT_YYYY ne . and AUDAT_MM ne . and AUDAT_DD eq . then
AUDAT_=strip(put(AUDAT_YYYY,best.))||'-'||strip(put(AUDAT_MM,z2.));
SYMBOLGEN: Macro variable YY resolves to AUDAT_YYYY
SYMBOLGEN: Macro variable MM resolves to AUDAT_MM
SYMBOLGEN: Macro variable DD resolves to AUDAT_DD
SYMBOLGEN: Macro variable OUT resolves to AUDAT_
SYMBOLGEN: Macro variable YY resolves to AUDAT_YYYY
MPRINT(STRIP_DT): else if AUDAT_YYYY ne . and AUDAT_MM eq . and AUDAT_DD eq . then
AUDAT_=strip(put(AUDAT_YYYY,best.));
SYMBOLGEN: && resolves to &.
SYMBOLGEN: Macro variable OUT resolves to AUDAT_
MPRINT(STRIP_DT): if AUDAT_ <='2019-11-01' then flag='yes';
MPRINT(STRIP_DT): else flag='No';
MPRINT(STRIP_DT): if flag='yes' then output;
MPRINT(STRIP_DT): run;
NOTE: There were 1272 observations read from the data set WORK.AUXO1.
NOTE: The data set WORK.AUXO2 has 1272 observations and 46 variables.
NOTE: Compressing data set WORK.AUXO2 decreased size by 90.63 percent.
Compressed is 3 pages; un-compressed would require 32 pages.
NOTE: DATA statement used (Total process time):
real time 0.06 seconds
user cpu time 0.03 seconds
system cpu time 0.03 seconds
memory 1441.43k
OS Memory 46672.00k
Timestamp 02/06/2020 06:44:34 PM
Step Count 1170 Switch Count 0
MLOGIC(STRIP_DT): Ending execution.
The variable highlighted in Red color is not resolved to date value.
So, the intent is to consider only data on or prior to 01-Nov-2019 from all the Raw datasets.
Please, in order so that the log is formatted properly, which makes it much more readable to us; please copy the appropriate part of the log as text and paste it into the window that appears when you click on the {i}. DO NOT SKIP THIS STEP. If you don't follow this procedure, we will simply ask again that you follow this procedure.
SYMBOLGEN: Macro variable YY resolves to AUDAT_YYYY SYMBOLGEN: Macro variable MM resolves to AUDAT_MM SYMBOLGEN: Macro variable DD resolves to AUDAT_DD SYMBOLGEN: Macro variable OUT resolves to AUDAT_ SYMBOLGEN: Macro variable YY resolves to AUDAT_YYYY MPRINT(STRIP_DT): else if AUDAT_YYYY ne . and AUDAT_MM eq . and AUDAT_DD eq . then AUDAT_=strip(put(AUDAT_YYYY,best.)); SYMBOLGEN: && resolves to &. SYMBOLGEN: Macro variable OUT resolves to AUDAT_ MPRINT(STRIP_DT): if AUDAT_ <='2019-11-01' then flag='yes'; MPRINT(STRIP_DT): else flag='No'; MPRINT(STRIP_DT): if flag='yes' then output; MPRINT(STRIP_DT): run; NOTE: There were 1272 observations read from the data set WORK.AUXO1. NOTE: The data set WORK.AUXO2 has 1272 observations and 46 variables. NOTE: Compressing data set WORK.AUXO2 decreased size by 90.63 percent. Compressed is 3 pages; un-compressed would require 32 pages. NOTE: DATA statement used (Total process time): real time 0.06 seconds user cpu time 0.03 seconds system cpu time 0.03 seconds memory 1441.43k OS Memory 46672.00k Timestamp 02/06/2020 06:44:34 PM Step Count 1170 Switch Count 0
I believe you should try and compare SAS Date values and not strings. The string comparison as you've coded it won't return the outcome you expect and want.
data &ds.2;
set &ds.1;
&out=mdy(coalesce(&mm,1),coalesce(&dd,1),&yy);
if &out <= '01NOV2019'd then
flag='yes';
else flag='No';
if flag='yes' then
output;
run;
Not sure what this Flag variable is about given that you only output rows with flag='yes'.
Also given that you wrote "I need to strip off data before 01-Nov-2019" shouldn't the condition be:
if &out >= '01NOV2019'd then...
I tried the suggested code but unfortunately I got the error message in log -
Invalid argument to function MDY(2018,10,12).
MPRINT(STRIP_DT): set auxo1; SYMBOLGEN: Macro variable OUT resolves to AUDAT_ SYMBOLGEN: Macro variable YY resolves to AUDAT_YYYY SYMBOLGEN: Macro variable MM resolves to AUDAT_MM SYMBOLGEN: Macro variable DD resolves to AUDAT_DD MPRINT(STRIP_DT): AUDAT_=mdy(AUDAT_YYYY,coalesce(AUDAT_MM,1),coalesce(AUDAT_DD,1)); SYMBOLGEN: Macro variable OUT resolves to AUDAT_ MPRINT(STRIP_DT): if AUDAT_ <= '01NOV2019'd then flag='yes'; MPRINT(STRIP_DT): else flag='No'; MPRINT(STRIP_DT): if flag='yes' then output; MPRINT(STRIP_DT): run; NOTE: Invalid argument to function MDY(2018,10,12) at line 22655 column 107.
Fixed the code in-between in the post. Try again.
The function is MDY() so the Y variable needs to be the 3rd parameter and not the first one.
Thanks so much ! It worked absolutely fine ! But i received Missing values error which I handled by adding the below line :
My only concern is now in case of any incomplete dates - what if Day or Month is missing ?
%macro strip_dt(yy, mm, dd, out,ds); proc sort data=ole.&ds(encoding=asciiany) out=&ds.1; by subject; data &ds.2 strip.&ds; set &ds.1; if &yy ne . and &mm ne . and &dd ne . then &out=mdy(coalesce(&mm,1),coalesce(&dd,1),&yy); if &out <= '01NOV2019'd then flag='yes'; else flag='No'; if flag='yes' then output; drop flag &out; run; %mend strip_dt; run; %strip_dt(AUDAT_YYYY, AUDAT_MM, AUDAT_DD,AUDAT_,auxo); %strip_dt(AUDAT_YYYY, AUDAT_MM, AUDAT_DD,AUDAT_,auxo5);
You would likely be much better off actually using a real date if you are comparing to a single fixed date.
I think that data step code can be pretty much reduced to:
data &ds.2 ; set &ds.1; if 0< mdy(max(&mm,1),max(&dd,1),&yy) le '01NOV2019'd; run;
The function MDY returns a SAS date value from numeric variables of month, day and year. Actually if the values are character but resolve to numbers then SAS will attempt to use the numeric version. The special value of a quoted string in DATE9 format ddMONyyyy with the d at the end tells SAS that this is a date literal and can compare to the date value created by the MDY function.
If any of the month, day and year values are missing or result in an invalid date such as 31 November, the function MDY will return a missing value.
The use of Max with the &mm and &dd variables means that if one of the variables is missing then the month 1 or day 1 will be used to create a valid date for comparison. Which basically replicates the choices you made for the different ways of comparing with missing values.
If you really need that OUT variable to indicate the date compared then use:
data &ds.2 ; set &ds.1; &out = mdy(max(&mm,1),max(&dd,1),&yy); format &out yymmdd10.; if 0< &out le '01NOV2019'd; run;
An IF without any "then" clause is a subsetting if and will only keep the observations where the If is true. Which is why I have the "if 0< " part. Missing values are always less than any given value so you want to exclude those as well which would occur with a missing &yy variable.
Caution: if you have invalid date combinations like &mm > 12 or an &dd value not valid for a given month (and year in the case of February and leap years) then the mdy function will return missing.
I tried the suggested code but got the "Missing values were generated error" message :
MPRINT(STRIP_DT): data auxo52 strip.auxo5; SYMBOLGEN: Macro variable DS resolves to auxo5 MPRINT(STRIP_DT): set auxo51; SYMBOLGEN: Macro variable OUT resolves to AUDAT_ SYMBOLGEN: Macro variable MM resolves to AUDAT_MM SYMBOLGEN: Macro variable DD resolves to AUDAT_DD SYMBOLGEN: Macro variable YY resolves to AUDAT_YYYY MPRINT(STRIP_DT): AUDAT_=mdy(max(AUDAT_MM,1),max(AUDAT_DD,1),AUDAT_YYYY); SYMBOLGEN: Macro variable OUT resolves to AUDAT_ MPRINT(STRIP_DT): format AUDAT_ yymmdd10.; SYMBOLGEN: Macro variable OUT resolves to AUDAT_ MPRINT(STRIP_DT): if 0 < AUDAT_ <= '01NOV2019'd ; SYMBOLGEN: Macro variable OUT resolves to AUDAT_ MPRINT(STRIP_DT): drop AUDAT_; MPRINT(STRIP_DT): run; NOTE: Missing values were generated as a result of performing an operation on missing values. Each place is given by: (Number of times) at (Line):(Column). 6 at 23186:116 NOTE: There were 3354 observations read from the data set WORK.AUXO51. NOTE: The data set WORK.AUXO52 has 1986 observations and 44 variables. NOTE: Compressing data set WORK.AUXO52 decreased size by 92.00 percent. Compressed is 4 pages; un-compressed would require 50 pages. NOTE: The data set STRIP.AUXO5 has 1986 observations and 44 variables. NOTE: Compressing data set STRIP.AUXO5 decreased size by 92.00 percent. Compressed is 4 pages; un-compressed would require 50 pages. NOTE: DATA statement used (Total process time): real time 0.09 seconds user cpu time 0.04 seconds system cpu time 0.03 seconds memory 1899.62k OS Memory 54004.00k Timestamp 02/06/2020 07:53:18 PM Step Count 1204 Switch Count 0
@PA8 wrote:
I tried the suggested code but got the "Missing values were generated error" message :
Then use the macro generated code from the log and analyze your data. In below code analyse the data in table missing_audat_.
Once you understand which values lead to missings you then can decide on the logic to deal with such data.
data selected missing_audat_ discarded;
set auxo51;
AUDAT_=mdy(max(AUDAT_MM,1),max(AUDAT_DD,1),AUDAT_YYYY);
format AUDAT_ yymmdd10.;
if 0 < AUDAT_ <= '01NOV2019'd then output selected;
else if missing(AUDAT_) then output missing_audat_;
else output discarded;
drop AUDAT_;
run;
As a general remark: It's often better to first develop and test working SAS code and only if this does what you need generalize the code and add the macro bits.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.