BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Crystal_F
Quartz | Level 8

I was trying to create some macro to select the first non missing value for every variable of interest per subject.

 

After sorting the data...

%fstnmissvalue(dsn,var);
   data temp..&var ;
      set &dsn (keep=bene_id A0310F_ENTRY_DSCHRG_CD TRGT_DT rename=(TRGT_DT=&VAR._DT  A0310F_ENTRY_DSCHRG_CD=&VAR._ENTRY_DSCHRG_CD))  ;
	  where &var ne . and &var ne "";
	  by bene_id;
	  if first.bene_id;
   run;
%mend;


%macro fstnmissvalue_loop(varlist);
    %let n1=%sysfunc(countw(&varlist));
	%let dsn1=temp.mds_asmt11_sum;
 	   %do i=1 %to &n1;
	      %let var1=%scan(&varlist, &i);
	         %fstnmissvalue(&dsn1,&var1);
	   %end;
%mend;

%fstnmissvalue_loop(G0110H1_EATG_SELF_CD G0110B1_TRNSFR_SELF_CD G0110I1_TOILTG_SELF_CD G0120A_BATHG_SELF_CD );

 

Notes from the log:

 

785  %fstnmissvalue(dsn,var);
     -
     180
WARNING: Apparent invocation of macro FSTNMISSVALUE not resolved.

ERROR 180-322: Statement is not valid or it is used out of proper order.

786     data temp..&var ;
             -----
             22
             201
WARNING: Apparent symbolic reference VAR not resolved.
787        set &dsn (keep=bene_id A0310F_ENTRY_DSCHRG_CD TRGT_DT rename=(TRGT_DT=&VAR._DT
                                                                                 -         ----------------------
                                                                                 214       79
                                                                                 23
ERROR 22-322: Syntax error, expecting one of the following: a name, a quoted string, /, ;, _DATA_, _LAST_, _NULL_.

ERROR 201-322: The option is not recognized and will be ignored.

ERROR 214-322: Variable name & is not valid.

ERROR 79-322: Expecting a =.

ERROR 23-7: Invalid value for the RENAME option.

                                                                                  -------
                                                                                  23
787!       set &dsn (keep=bene_id A0310F_ENTRY_DSCHRG_CD TRGT_DT rename=(TRGT_DT=&VAR._DT
                                                                                  -------
                                                                                  214
787! A0310F_ENTRY_DSCHRG_CD=&VAR._ENTRY_DSCHRG_CD))  ;
WARNING: Apparent symbolic reference VAR not resolved.
WARNING: Apparent symbolic reference VAR not resolved.
ERROR 214-322: Variable name VAR._DT is not valid.

788        where &var ne . and &var ne "";
WARNING: Apparent symbolic reference VAR not resolved.
788        where &var ne . and &var ne "";
                 -
                 180
ERROR 180-322: Statement is not valid or it is used out of proper order.

WARNING: Apparent symbolic reference VAR not resolved.
WARNING: Apparent symbolic reference VAR not resolved.
ERROR: Syntax error while parsing WHERE clause.
789        by bene_id;
790        if first.bene_id;
791     run;

NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.VAR may be incomplete.  When this step was stopped there were 0 observations and 0 variables.
WARNING: Data set WORK.VAR was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
      real time           0.09 seconds
      cpu time            0.03 seconds


ERROR: No matching %MACRO statement for this %MEND statement.
792  %mend;
793
794
795  %macro fstnmissvalue_loop(varlist);
796      %let n1=%sysfunc(countw(&varlist));
797      %let dsn1=temp.mds_asmt11_sum;
798         %do i=1 %to &n1;
799            %let var1=%scan(&varlist, &i);
800               %fstnmissvalue(&dsn1,&var1);
801         %end;
802  %mend;
803
804  /*%let dsn=temp.mds_asmt11_sum;*/
805  %fstnmissvalue_loop(G0110H1_EATG_SELF_CD G0110B1_TRNSFR_SELF_CD G0110I1_TOILTG_SELF_CD G0120A_BATHG_SELF_CD );
NOTE: Line generated by the invoked macro "FSTNMISSVALUE_LOOP".
1       %fstnmissvalue(&dsn1,&var1);
        -
        180
WARNING: Apparent invocation of macro FSTNMISSVALUE not resolved.

ERROR 180-322: Statement is not valid or it is used out of proper order.

WARNING: Apparent invocation of macro FSTNMISSVALUE not resolved.
NOTE: Line generated by the invoked macro "FSTNMISSVALUE_LOOP".
3                  %fstnmissvalue(&dsn1,&var1);
                   -
                   180

ERROR 180-322: Statement is not valid or it is used out of proper order.

WARNING: Apparent invocation of macro FSTNMISSVALUE not resolved.
NOTE: Line generated by the invoked macro "FSTNMISSVALUE_LOOP".
5                  %fstnmissvalue(&dsn1,&var1);
                   -
                   180

ERROR 180-322: Statement is not valid or it is used out of proper order.

WARNING: Apparent invocation of macro FSTNMISSVALUE not resolved.
NOTE: Line generated by the invoked macro "FSTNMISSVALUE_LOOP".
7                  %fstnmissvalue(&dsn1,&var1);
                   -
                   180

ERROR 180-322: Statement is not valid or it is used out of proper order.


I have no idea where went wrong with my code. Is there any one that could help with my problem? I'll be very appreicated about your help. Thanks!!!

 

1 ACCEPTED SOLUTION

Accepted Solutions
Crystal_F
Quartz | Level 8

Now it works with the modified code as below

 

%macro fstnmissvalue(dsn,var);

data temp.&var ;

set &dsn (keep=bene_id A0310F_ENTRY_DSCHRG_CD TRGT_DT &var

rename=(TRGT_DT=&VAR._DT A0310F_ENTRY_DSCHRG_CD=&VAR._ENDS_CD)) ;

where &var ne "-" and &var ne "";

by bene_id;

if first.bene_id;

run;

%mend;

 

%macro fstnmissvalue_loop(varlist);

%let n1=%sysfunc(countw(&varlist));

%let dsn1=TEMP.MDS_ASMT11_SUM;

%do i=1 %to &n1;

%let var1=%scan(&varlist, &i);

%fstnmissvalue(&dsn1,&var1);

%end;

%mend;

/*%let dsn=temp.mds_asmt11_sum;*/

%fstnmissvalue_loop(G0110H1_EATG_SELF_CD G0110B1_TRNSFR_SELF_CD G0110I1_TOILTG_SELF_CD G0120A_BATHG_SELF_CD );

 

View solution in original post

17 REPLIES 17
ballardw
Super User

Macro definition starts with %macro . The first time you used %fstnmissvalue(dsn,var); was apparently in the definition and should be

 

%macro fstnmissvalue(dsn,var);

 

No comments about the rest of the code as I stopped with the first error. fixing that will likely clear up many of the error messages.

Crystal_F
Quartz | Level 8

Oooooops, I did that again! Thank you so much for pointing it out! I fixed some other issue in my code but still have error.

 

%macro fstnmissvalue(dsn,var);
   data temp.&var ;
      set &dsn (keep=bene_id A0310F_ENTRY_DSCHRG_CD TRGT_DT rename=(TRGT_DT=&VAR._DT  A0310F_ENTRY_DSCHRG_CD=&VAR._ENDS_CD))  ;
	  where &var ne . and &var ne "";
	  by bene_id;
	  if first.bene_id;
   run;
%mend;


%macro fstnmissvalue_loop(varlist);
    %let n1=%sysfunc(countw(&varlist));
	%let dsn1=TEMP.MDS_ASMT11_SUM;
 	   %do i=1 %to &n1;
	      %let var1=%scan(&varlist, &i);
	         %fstnmissvalue(&dsn1,&var1);
	   %end;
%mend;

%fstnmissvalue_loop(G0110H1_EATG_SELF_CD G0110B1_TRNSFR_SELF_CD G0110I1_TOILTG_SELF_CD G0120A_BATHG_SELF_CD );

 

 

ERROR: Variable G0110H1_EATG_SELF_CD is not on file TEMP.MDS_ASMT11_SUM.

NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set TEMP.G0110H1_EATG_SELF_CD may be incomplete.  When this step was stopped there were 0 observations and 3
         variables.
WARNING: Data set TEMP.G0110H1_EATG_SELF_CD was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
      real time           0.10 seconds
      cpu time            0.03 seconds


ERROR: Variable G0110B1_TRNSFR_SELF_CD is not on file TEMP.MDS_ASMT11_SUM.

NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set TEMP.G0110B1_TRNSFR_SELF_CD may be incomplete.  When this step was stopped there were 0 observations and
         3 variables.
WARNING: Data set TEMP.G0110B1_TRNSFR_SELF_CD was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
      real time           0.07 seconds
      cpu time            0.01 seconds


ERROR: Variable G0110I1_TOILTG_SELF_CD is not on file TEMP.MDS_ASMT11_SUM.

NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set TEMP.G0110I1_TOILTG_SELF_CD may be incomplete.  When this step was stopped there were 0 observations and
         3 variables.
WARNING: Data set TEMP.G0110I1_TOILTG_SELF_CD was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
      real time           0.07 seconds
      cpu time            0.01 seconds


ERROR: Variable G0120A_BATHG_SELF_CD is not on file TEMP.MDS_ASMT11_SUM.

NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set TEMP.G0120A_BATHG_SELF_CD may be incomplete.  When this step was stopped there were 0 observations and 3
         variables.
WARNING: Data set TEMP.G0120A_BATHG_SELF_CD was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
      real time           0.07 seconds
      cpu time            0.00 seconds

I double checked with those variables in my dataset, they do exist with such variable names.

 

 

878  proc print data= TEMP.MDS_ASMT11_SUM (obs=1);
879  var bene_id G0110H1_EATG_SELF_CD;
880  run;

NOTE: There were 1 observations read from the data set TEMP.MDS_ASMT11_SUM.
NOTE: PROCEDURE PRINT used (Total process time):
      real time           0.06 seconds
      cpu time            0.00 seconds

Crystal_F
Quartz | Level 8

Now it works with the modified code as below

 

%macro fstnmissvalue(dsn,var);

data temp.&var ;

set &dsn (keep=bene_id A0310F_ENTRY_DSCHRG_CD TRGT_DT &var

rename=(TRGT_DT=&VAR._DT A0310F_ENTRY_DSCHRG_CD=&VAR._ENDS_CD)) ;

where &var ne "-" and &var ne "";

by bene_id;

if first.bene_id;

run;

%mend;

 

%macro fstnmissvalue_loop(varlist);

%let n1=%sysfunc(countw(&varlist));

%let dsn1=TEMP.MDS_ASMT11_SUM;

%do i=1 %to &n1;

%let var1=%scan(&varlist, &i);

%fstnmissvalue(&dsn1,&var1);

%end;

%mend;

/*%let dsn=temp.mds_asmt11_sum;*/

%fstnmissvalue_loop(G0110H1_EATG_SELF_CD G0110B1_TRNSFR_SELF_CD G0110I1_TOILTG_SELF_CD G0120A_BATHG_SELF_CD );

 

Crystal_F
Quartz | Level 8

Now my problem is sovled. Thanks again!

Astounding
PROC Star

Alternatively, you could scrap all the macro language.  Here's a program to get the first nonmissing value of every variable by BENE_ID:

 

data temp;

set have;

obsno=_n_;

run;

proc sort data=temp;

by bene_id descending obsno;

run;

data want;

update temp (obs=0 drop=obsno) temp (drop=obsno);

by bene_id;

run;

 

You don't need to know the variable names, which ones are character and which are numeric ... just let UPDATE handle it.

Crystal_F
Quartz | Level 8

That looks like a briliant idea to me. Is there any way to also catch the date when the value of each target variable became non missing with your code? My original thinking was to rename the datetime variable and merge them together to achieve one record for each id.  I'd like to know about your way.

Astounding
PROC Star

You would need to jump through quite a few hoops to make this happen.  But it could be done since a DATA step (including one that uses UPDATE) permits data manipulation statements.  Here's the idea of what you would need to do.

 

For each target variable ...

 

  • Create a new variable to hold a copy of the target variable, and retain it
  • Create a new variable to hold the date being assigned to that target variable, and retain it
  • Reset these new variables to missing at the beginning of each BY group

On each observation:

 

  • Check if the target variable is nonmissing.  For a nonmissing value, copy it to the "copy of the target variable", and copy the date variable to the "date being assigned to that target variable".
  • Set the target variable to missing.  This is the key step, overriding the fact that UPDATE holds on to the nonmissing values and ignores the missing values.

Other hoops:

 

Drop the target variable(s).  Rename the "copy of the target variable" to be the "target variable".

 

So it can be done within the context of a single DATA step.  But it might be easier to create the DATE variables in a separate DATA step that processes the observations in the original order, then merge with the results from this DATA step.

 

mkeintz
PROC Star

@Astounding

 

I think you can take advantage of a preceding SQL, and a two-line modification to your "data temp" step.  Everything else in your program could be left unchanged:

 

Assuming you have a variable named daterec in each record, then this should work:

 

proc sql noprint;
  select distinct cats('if not(missing(',name,')) then firstdate_',name,'=daterec;')
  into :myrecode separated by ' '
  from dictionary.columns where libname='WORK' and memname='HAVE';
quit;

data temp;
  set have;
  obsno=_n_;
  &myrecode ;
  format firstdate_: yymmddn8. ;
run;

** rest of @astounding's code here **;

 

So for variables X Y Z in dataset have, three new variables firstdata_X firstdate_Y firstdate_z are created.   If you are worried about the lengh of the varnames change "firstdate_" in the above to "fd_".

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Astounding
PROC Star

Excellent idea.  So the plan is to create a "copy of the date variable", populated only when the "variable of interest" is nonmissing.  Then let the sorting and UPDATE approach pick the earliest populated "copy of the date variable".  I like it.

Crystal_F
Quartz | Level 8

I have several extended questions @ballardwon the codes shared by you and @mkeintz.

 

1. If I want to select the first non-missing and not equal to ''-'' value (or any other specified value) for every target vaiable and the related dates, is there any way to make it work with your plan?

2. What if i want to select the last non-missing values?

3. Or select the last non-missing and not equal to "-" value?

4. I'm wondering if the data step would take more time for processing compared to my original plan, if I'm working on some huge dataset, let's say population size is 1 million.

 

Thanks everyone, for your help and sharing your valuable thoughts!

Crystal_F
Quartz | Level 8

I guess one way to select non-missing and not equal certain spcified value is to recode the specified value to missing. Any other thoughts?

 

I

mkeintz
PROC Star

Question 1:  By asking for first/last non-missing that is also not a particular value, you will defeat the update statement, which provides the last non-missing value in this program.  Of course, it's actually the earliest non-missing value, since the dataset is sorted by descending obsn.  Your speculation of recoding offending values to missing would solve the problem.

 

Question 2: Answer is embedded in answer above.

Question 3: Answer is embedded in answer above.

 

Question 4:  Your original plan was to read the dataset once per variable, so 10 vars means reading 10 times.  @Astounding,s program reads it once, although there is an expensive sort to conduct.  But then your home free.

 

As an efficiency suggestion.

 

Instead of

   DATE TEMP;

Do

   DATA VTEMP/view=VTEMP;

 

And then modify the proc sort statement to

  PROC SORT DATA=VTEMP OUT=TEMP;

 

The "data vtemp/view=vtemp" construct a data set VIEW rather than a data set FILE.  As such, the view is not implemented until called later (in this case by proc sort).  And when it is called, it outputs to the calling proc, NOT to the DISK.  As a result you save (1) writing VTEMP to disk, and (2) re-reading it from disk into proc sort.  That should help a lot.

 

Also do a test on restricted subset of your data.

 

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Crystal_F
Quartz | Level 8
Oh, I didn't see your comments, which sounds great, before I posted the solution. I'll try your code combined with Astounding's for sure. Thank you for your valuable inputs!
Crystal_F
Quartz | Level 8

Sorry, I have to ask again. I tried your code by replacing daterec with my date variable

TRGT_DT

 

proc sql noprint;
  select distinct cats('if not(missing(',name,')) then fd_',name,'=TRGT_DT;')
  into :myrecode separated by ' '
  from dictionary.columns where libname='WORK' and memname='have';
quit;

But in the log, it says

 

 

1081 proc sql noprint;

1082 select distinct cats('if not(missing(',name,')) then fd_',name,'=TRGT_DT;')

1083 into :myrecode separated by ' '

1084 from dictionary.columns where libname='WORK' and memname='have';

NOTE: No rows were selected.

1085 quit;

NOTE: PROCEDURE SQL used (Total process time):

real time 0.01 seconds

cpu time 0.00 seconds

 

1086 data temp;

1087 set have;

1088 obsno=_n_;

1089 &myrecode ;

-

180

WARNING: Apparent symbolic reference MYRECODE not resolved.

ERROR 180-322: Statement is not valid or it is used out of proper order.

 

Could you please help me out with this? Thank you.

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 17 replies
  • 3995 views
  • 5 likes
  • 4 in conversation