BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
PA8
Fluorite | Level 6 PA8
Fluorite | Level 6

Hello,

 

I need to strip off data before 01-Nov-2019 from all the Raw datasets and I am trying to write a macro for the same. I am calling a macro within a macro as first I need to build up my date from Raw Date variables - 'Day','Month' and 'Year'.

 

The whole code is working fine except for variable &&out is not resolving to the actual date value - but is instead resolving to AUDAT_.

 

Any help would much appreciable!


%macro strip_dt(yy, mm, dd, out,ds);

proc sort data=ole.&ds(encoding=asciiany) out=&ds.1; by subject;

data &ds.2 ;
set &ds.1;

if &yy ne . and &mm ne . and &dd ne . then
&out=strip(put(&yy,best.))||'-'||strip(put(&mm,z2.))||'-'||strip(put(&dd,z2.));
else if &yy ne . and &mm ne . and &dd eq . then
&out=strip(put(&yy,best.))||'-'||strip(put(&mm,z2.));
else if &yy ne . and &mm eq . and &dd eq . then
&out=strip(put(&yy,best.));

if &&out <='2019-11-01' then
flag='yes';
else flag='No';
if flag='yes' then output;
run;

%mend strip_dt;
run;

%strip_dt(AUDAT_YYYY, AUDAT_MM, AUDAT_DD,AUDAT_,auxo);

 

Thank you !

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

@PA8 wrote:

I tried the suggested code but got the "Missing values were generated error" message :

 

MPRINT(STRIP_DT):   data auxo52 strip.auxo5;
SYMBOLGEN:  Macro variable DS resolves to auxo5
MPRINT(STRIP_DT):   set auxo51;
SYMBOLGEN:  Macro variable OUT resolves to AUDAT_
SYMBOLGEN:  Macro variable MM resolves to AUDAT_MM
SYMBOLGEN:  Macro variable DD resolves to AUDAT_DD
SYMBOLGEN:  Macro variable YY resolves to AUDAT_YYYY
MPRINT(STRIP_DT):   AUDAT_=mdy(max(AUDAT_MM,1),max(AUDAT_DD,1),AUDAT_YYYY);
SYMBOLGEN:  Macro variable OUT resolves to AUDAT_
MPRINT(STRIP_DT):   format AUDAT_ yymmdd10.;
SYMBOLGEN:  Macro variable OUT resolves to AUDAT_
MPRINT(STRIP_DT):   if 0 < AUDAT_ <= '01NOV2019'd ;
SYMBOLGEN:  Macro variable OUT resolves to AUDAT_
MPRINT(STRIP_DT):   drop AUDAT_;
MPRINT(STRIP_DT):   run;

NOTE: Missing values were generated as a result of performing an operation on missing values.
      Each place is given by: (Number of times) at (Line):(Column).
      6 at 23186:116
NOTE: There were 3354 observations read from the data set WORK.AUXO51.
NOTE: The data set WORK.AUXO52 has 1986 observations and 44 variables.
NOTE: Compressing data set WORK.AUXO52 decreased size by 92.00 percent.
      Compressed is 4 pages; un-compressed would require 50 pages.
NOTE: The data set STRIP.AUXO5 has 1986 observations and 44 variables.
NOTE: Compressing data set STRIP.AUXO5 decreased size by 92.00 percent.
      Compressed is 4 pages; un-compressed would require 50 pages.
NOTE: DATA statement used (Total process time):
      real time           0.09 seconds
      user cpu time       0.04 seconds
      system cpu time     0.03 seconds
      memory              1899.62k
      OS Memory           54004.00k
      Timestamp           02/06/2020 07:53:18 PM
      Step Count                        1204  Switch Count  0

6 out of 3354 input records => data values likely the issue.

I would add something like

         if missing(Audat_) then put "ERROR: Record=" _n_ +1 &mm= +1 &dd= +1 &yy=;

to the code to get diagnostics which will tell you which observation of the source data set has the data value problem(s).

 

 

View solution in original post

17 REPLIES 17
PaigeMiller
Diamond | Level 26

At the top of your program, insert this command and then run it again

options mprint symbolgen mlogic;

Then show us the log of the part where you run this data step. Please, in order so that the log is formatted properly, which makes it much more readable to us; please copy the appropriate part of the log as text and paste it into the window that appears when you click on the {i}. DO NOT SKIP THIS STEP. If you don't follow this procedure, we will simply ask again that you follow this procedure.

 

I need to strip off data before 01-Nov-2019 from all the Raw datasets

 

Also, please give us an example of what you are trying to do here.

--
Paige Miller
PA8
Fluorite | Level 6 PA8
Fluorite | Level 6

thank you so much for the prompt helpI

 

I had included the options as stated but missed to include the log. Please find the same attached :

 

17801
17802 dm 'log' clear;
17803 options mprint symbolgen mlogic;

17804
17805 %macro strip_dt(yy, mm, dd, out,ds);
17806
17807 proc sort data=ole.&ds(encoding=asciiany) out=&ds.1; by subject;
17808
17809 data &ds.2 ;
17810 set &ds.1;
17811
17812 if &yy ne . and &mm ne . and &dd ne . then
17813 &out=strip(put(&yy,best.))||'-'||strip(put(&mm,z2.))||'-'||strip(put(&dd,z2.));
17814 else if &yy ne . and &mm ne . and &dd eq . then
17815 &out=strip(put(&yy,best.))||'-'||strip(put(&mm,z2.));
17816 else if &yy ne . and &mm eq . and &dd eq . then
17817 &out=strip(put(&yy,best.));
17818
17819 if &&out <='2019-11-01' then
17820 flag='yes';
17821 else flag='No';
17822 if flag='yes' then output;
17823 run;
17824
17825 %mend strip_dt;
17826 run;
17827
17828 %strip_dt(AUDAT_YYYY, AUDAT_MM, AUDAT_DD,AUDAT_,auxo);
MLOGIC(STRIP_DT): Beginning execution.
MLOGIC(STRIP_DT): Parameter YY has value AUDAT_YYYY
MLOGIC(STRIP_DT): Parameter MM has value AUDAT_MM
MLOGIC(STRIP_DT): Parameter DD has value AUDAT_DD
MLOGIC(STRIP_DT): Parameter OUT has value AUDAT_
MLOGIC(STRIP_DT): Parameter DS has value auxo
SYMBOLGEN: Macro variable DS resolves to auxo
SYMBOLGEN: Macro variable DS resolves to auxo
MPRINT(STRIP_DT): proc sort data=ole.auxo(encoding=asciiany) out=auxo1;
MPRINT(STRIP_DT): by subject;
SYMBOLGEN: Macro variable DS resolves to auxo
NOTE: There were 1272 observations read from the data set OLE.AUXO.
NOTE: The data set WORK.AUXO1 has 1272 observations and 44 variables.
NOTE: Compressing data set WORK.AUXO1 decreased size by 90.63 percent.
Compressed is 3 pages; un-compressed would require 32 pages.
NOTE: PROCEDURE SORT used (Total process time):
real time 0.56 seconds
user cpu time 0.04 seconds
system cpu time 0.04 seconds
memory 8689.59k
OS Memory 54000.00k
Timestamp 02/06/2020 06:44:33 PM
Step Count 1169 Switch Count 0


MPRINT(STRIP_DT): data auxo2 ;
SYMBOLGEN: Macro variable DS resolves to auxo
MPRINT(STRIP_DT): set auxo1;
SYMBOLGEN: Macro variable YY resolves to AUDAT_YYYY
SYMBOLGEN: Macro variable MM resolves to AUDAT_MM
SYMBOLGEN: Macro variable DD resolves to AUDAT_DD
SYMBOLGEN: Macro variable OUT resolves to AUDAT_
SYMBOLGEN: Macro variable YY resolves to AUDAT_YYYY
SYMBOLGEN: Macro variable MM resolves to AUDAT_MM
SYMBOLGEN: Macro variable DD resolves to AUDAT_DD
MPRINT(STRIP_DT): if AUDAT_YYYY ne . and AUDAT_MM ne . and AUDAT_DD ne . then
AUDAT_=strip(put(AUDAT_YYYY,best.))||'-'||strip(put(AUDAT_MM,z2.))||'-'||strip(put(AUDAT_DD,z2.));
SYMBOLGEN: Macro variable YY resolves to AUDAT_YYYY
SYMBOLGEN: Macro variable MM resolves to AUDAT_MM
SYMBOLGEN: Macro variable DD resolves to AUDAT_DD
SYMBOLGEN: Macro variable OUT resolves to AUDAT_
SYMBOLGEN: Macro variable YY resolves to AUDAT_YYYY
SYMBOLGEN: Macro variable MM resolves to AUDAT_MM
MPRINT(STRIP_DT): else if AUDAT_YYYY ne . and AUDAT_MM ne . and AUDAT_DD eq . then
AUDAT_=strip(put(AUDAT_YYYY,best.))||'-'||strip(put(AUDAT_MM,z2.));
SYMBOLGEN: Macro variable YY resolves to AUDAT_YYYY
SYMBOLGEN: Macro variable MM resolves to AUDAT_MM
SYMBOLGEN: Macro variable DD resolves to AUDAT_DD
SYMBOLGEN: Macro variable OUT resolves to AUDAT_
SYMBOLGEN: Macro variable YY resolves to AUDAT_YYYY
MPRINT(STRIP_DT): else if AUDAT_YYYY ne . and AUDAT_MM eq . and AUDAT_DD eq . then
AUDAT_=strip(put(AUDAT_YYYY,best.));
SYMBOLGEN: && resolves to &.
SYMBOLGEN: Macro variable OUT resolves to AUDAT_
MPRINT(STRIP_DT): if AUDAT_ <='2019-11-01' then flag='yes';
MPRINT(STRIP_DT): else flag='No';
MPRINT(STRIP_DT): if flag='yes' then output;
MPRINT(STRIP_DT): run;

NOTE: There were 1272 observations read from the data set WORK.AUXO1.
NOTE: The data set WORK.AUXO2 has 1272 observations and 46 variables.
NOTE: Compressing data set WORK.AUXO2 decreased size by 90.63 percent.
Compressed is 3 pages; un-compressed would require 32 pages.
NOTE: DATA statement used (Total process time):
real time 0.06 seconds
user cpu time 0.03 seconds
system cpu time 0.03 seconds
memory 1441.43k
OS Memory 46672.00k
Timestamp 02/06/2020 06:44:34 PM
Step Count 1170 Switch Count 0


MLOGIC(STRIP_DT): Ending execution.

 

 

PA8
Fluorite | Level 6 PA8
Fluorite | Level 6

The variable highlighted in Red color is not resolved to date value. 

So, the intent is to consider only data on or prior to 01-Nov-2019 from all the Raw datasets.

PaigeMiller
Diamond | Level 26

Please, in order so that the log is formatted properly, which makes it much more readable to us; please copy the appropriate part of the log as text and paste it into the window that appears when you click on the {i}. DO NOT SKIP THIS STEP. If you don't follow this procedure, we will simply ask again that you follow this procedure.

--
Paige Miller
PA8
Fluorite | Level 6 PA8
Fluorite | Level 6
SYMBOLGEN:  Macro variable YY resolves to AUDAT_YYYY
SYMBOLGEN:  Macro variable MM resolves to AUDAT_MM
SYMBOLGEN:  Macro variable DD resolves to AUDAT_DD
SYMBOLGEN:  Macro variable OUT resolves to AUDAT_
SYMBOLGEN:  Macro variable YY resolves to AUDAT_YYYY
MPRINT(STRIP_DT):   else if AUDAT_YYYY ne . and AUDAT_MM eq . and AUDAT_DD eq . then
AUDAT_=strip(put(AUDAT_YYYY,best.));
SYMBOLGEN:  && resolves to &.
SYMBOLGEN:  Macro variable OUT resolves to AUDAT_
MPRINT(STRIP_DT):   if AUDAT_ <='2019-11-01' then flag='yes';
MPRINT(STRIP_DT):   else flag='No';
MPRINT(STRIP_DT):   if flag='yes' then output;
MPRINT(STRIP_DT):   run;

NOTE: There were 1272 observations read from the data set WORK.AUXO1.
NOTE: The data set WORK.AUXO2 has 1272 observations and 46 variables.
NOTE: Compressing data set WORK.AUXO2 decreased size by 90.63 percent.
      Compressed is 3 pages; un-compressed would require 32 pages.
NOTE: DATA statement used (Total process time):
      real time           0.06 seconds
      user cpu time       0.03 seconds
      system cpu time     0.03 seconds
      memory              1441.43k
      OS Memory           46672.00k
      Timestamp           02/06/2020 06:44:34 PM
      Step Count                        1170  Switch Count  0
PA8
Fluorite | Level 6 PA8
Fluorite | Level 6
Apologies for misunderstanding the concept of {i}.
Patrick
Opal | Level 21

I believe you should try and compare SAS Date values and not strings. The string comparison as you've coded it won't return the outcome you expect and want.

  data &ds.2;
    set &ds.1;
    &out=mdy(coalesce(&mm,1),coalesce(&dd,1),&yy);
    if &out <= '01NOV2019'd then
      flag='yes';
    else flag='No';
    if flag='yes' then
      output;
  run;

 

Not sure what this Flag variable is about given that you only output rows with flag='yes'.

 

Also given that you wrote "I need to strip off data before 01-Nov-2019" shouldn't the condition be:

if &out >= '01NOV2019'd then...

 

PA8
Fluorite | Level 6 PA8
Fluorite | Level 6

I tried the suggested code but unfortunately I got the error message in log -

 

Invalid argument to function MDY(2018,10,12).

MPRINT(STRIP_DT):   set auxo1;
SYMBOLGEN:  Macro variable OUT resolves to AUDAT_
SYMBOLGEN:  Macro variable YY resolves to AUDAT_YYYY
SYMBOLGEN:  Macro variable MM resolves to AUDAT_MM
SYMBOLGEN:  Macro variable DD resolves to AUDAT_DD
MPRINT(STRIP_DT):   AUDAT_=mdy(AUDAT_YYYY,coalesce(AUDAT_MM,1),coalesce(AUDAT_DD,1));
SYMBOLGEN:  Macro variable OUT resolves to AUDAT_
MPRINT(STRIP_DT):   if AUDAT_ <= '01NOV2019'd then flag='yes';
MPRINT(STRIP_DT):   else flag='No';
MPRINT(STRIP_DT):   if flag='yes' then output;
MPRINT(STRIP_DT):   run;

NOTE: Invalid argument to function MDY(2018,10,12) at line 22655 column 107.
Patrick
Opal | Level 21

Fixed the code in-between in the post. Try again.

 

The function is MDY() so the Y variable needs to be the 3rd parameter and not the first one.

PA8
Fluorite | Level 6 PA8
Fluorite | Level 6

Thanks so much ! It worked absolutely fine ! But i received Missing values error which I handled by adding the below line :

 

My only concern is now in case of any incomplete dates - what if Day or Month is missing ?

 

%macro strip_dt(yy, mm, dd, out,ds);

proc sort data=ole.&ds(encoding=asciiany) out=&ds.1; by subject;
data &ds.2 strip.&ds;
set &ds.1;
if &yy ne . and &mm ne . and &dd ne . then
&out=mdy(coalesce(&mm,1),coalesce(&dd,1),&yy);
if &out <= '01NOV2019'd then
flag='yes';
else flag='No';
if flag='yes' then
output;
drop flag &out; 
run;

%mend strip_dt;
run;

%strip_dt(AUDAT_YYYY, AUDAT_MM, AUDAT_DD,AUDAT_,auxo);

%strip_dt(AUDAT_YYYY, AUDAT_MM, AUDAT_DD,AUDAT_,auxo5);
ballardw
Super User

You would likely be much better off actually using a real date if you are comparing to a single fixed date.

 

I think that data step code can be pretty much reduced to:

 

data &ds.2 ;
   set &ds.1;
   if 0< mdy(max(&mm,1),max(&dd,1),&yy) le '01NOV2019'd;

run;

The function MDY returns a SAS date value from numeric variables of month, day and year. Actually if the values are character but resolve to numbers then SAS will attempt to use the numeric version. The special value of a quoted string in DATE9 format ddMONyyyy with the d at the end tells SAS that this is a date literal and can compare to the date value created by the MDY function.

If any of the month, day and year values are missing or result in an invalid date such as 31 November, the function MDY will return a missing value.

The use of Max with the &mm and &dd variables means that if one of the variables is missing then the month 1 or day 1 will be used to create a valid date for comparison. Which basically replicates the choices you made for the different ways of comparing with missing values.

If you really need that OUT variable to indicate the date compared then use:

data &ds.2 ;
   set &ds.1;
   &out =  mdy(max(&mm,1),max(&dd,1),&yy);
   format &out yymmdd10.;
   if 0< &out le '01NOV2019'd;

run;

 

An IF without any "then" clause is a subsetting if and will only keep the observations where the If is true. Which is why I have the "if 0< " part. Missing values are always less than any given value so you want to exclude those as well which would occur with a missing &yy variable.

 

Caution: if you have invalid date combinations like &mm > 12 or an &dd value not valid for a given month (and year in the case of February and leap years) then the mdy function will return missing.

 

 

 

PA8
Fluorite | Level 6 PA8
Fluorite | Level 6
Thanks a lot for your help! My next question was actually how to handle missing date parts. Let me try this code. Hopefully should be resolved !
PA8
Fluorite | Level 6 PA8
Fluorite | Level 6

I tried the suggested code but got the "Missing values were generated error" message :

 

MPRINT(STRIP_DT):   data auxo52 strip.auxo5;
SYMBOLGEN:  Macro variable DS resolves to auxo5
MPRINT(STRIP_DT):   set auxo51;
SYMBOLGEN:  Macro variable OUT resolves to AUDAT_
SYMBOLGEN:  Macro variable MM resolves to AUDAT_MM
SYMBOLGEN:  Macro variable DD resolves to AUDAT_DD
SYMBOLGEN:  Macro variable YY resolves to AUDAT_YYYY
MPRINT(STRIP_DT):   AUDAT_=mdy(max(AUDAT_MM,1),max(AUDAT_DD,1),AUDAT_YYYY);
SYMBOLGEN:  Macro variable OUT resolves to AUDAT_
MPRINT(STRIP_DT):   format AUDAT_ yymmdd10.;
SYMBOLGEN:  Macro variable OUT resolves to AUDAT_
MPRINT(STRIP_DT):   if 0 < AUDAT_ <= '01NOV2019'd ;
SYMBOLGEN:  Macro variable OUT resolves to AUDAT_
MPRINT(STRIP_DT):   drop AUDAT_;
MPRINT(STRIP_DT):   run;

NOTE: Missing values were generated as a result of performing an operation on missing values.
      Each place is given by: (Number of times) at (Line):(Column).
      6 at 23186:116
NOTE: There were 3354 observations read from the data set WORK.AUXO51.
NOTE: The data set WORK.AUXO52 has 1986 observations and 44 variables.
NOTE: Compressing data set WORK.AUXO52 decreased size by 92.00 percent.
      Compressed is 4 pages; un-compressed would require 50 pages.
NOTE: The data set STRIP.AUXO5 has 1986 observations and 44 variables.
NOTE: Compressing data set STRIP.AUXO5 decreased size by 92.00 percent.
      Compressed is 4 pages; un-compressed would require 50 pages.
NOTE: DATA statement used (Total process time):
      real time           0.09 seconds
      user cpu time       0.04 seconds
      system cpu time     0.03 seconds
      memory              1899.62k
      OS Memory           54004.00k
      Timestamp           02/06/2020 07:53:18 PM
      Step Count                        1204  Switch Count  0
Patrick
Opal | Level 21

@PA8 wrote:

I tried the suggested code but got the "Missing values were generated error" message :

 

Then use the macro generated code from the log and analyze your data. In below code analyse the data in table missing_audat_.

Once you understand which values lead to missings you then can decide on the logic to deal with such data.

 

data selected missing_audat_ discarded;
  set auxo51;
  AUDAT_=mdy(max(AUDAT_MM,1),max(AUDAT_DD,1),AUDAT_YYYY);
  format AUDAT_ yymmdd10.;
  if 0 < AUDAT_ <= '01NOV2019'd then output selected;
  else if missing(AUDAT_) then output missing_audat_;
  else output discarded;
  drop AUDAT_;
run;

As a general remark: It's often better to first develop and test working SAS code and only if this does what you need generalize the code and add the macro bits.

 

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 17 replies
  • 1303 views
  • 0 likes
  • 4 in conversation