BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Solph
Pyrite | Level 9

I've a large data set (Data A, easily > 500K rows) with a long list of conditions  (A01...A38, B02... B75, C04..... S99). I want to add variable description using an external data set (Data B), with a corresponding list of variable names and description.

 

I found a SAS code but it works on occasions, e.g. on A01-B20, but not when I tried to include all variables. I don't know why. Also I'm hoping the code can work efficiently as my data can be large.

 

The sample code is as below. The actual label data and the data to be applied (w/ 100 cases) are attached so that you can test the macro %varlabel on them, and it may help to see why the macro works on the sample data but not the actual data. Thanks. Or any other SAS codes that would work for my purpose would do too. Thanks in advance.

 

data labeldata; input var $ 1-3 labels $ 5-17;
datalines;
A01	Diebetes
A02 Heart disease
B01 Cancer
B02 Renal failure
;
data mydata; input ID 1 A01 3 A02 5 B01 7 B02 9;
datalines;
1 1 1 1 0
2 0 0 0 1
3 1 0 1 1
4 1 0 0 1
;
run;
%macro varlabel(indata=, lib=, labeldata=, labelvar=, labeldesc=);
data _null_; 
     set &labeldata; 
     call symput("var" || trim(left(_N_)), trim(left(&labelvar))); 
     call symput("label" || trim(left(_N_)), trim(left(&labeldesc))); 
     call symput("nobs", trim(left(_N_))); 
run;
proc datasets library = &lib memtype = data nolist; 
     modify &indata; 
     label 
          %do i = 1 %to &nobs; 
               &&var&i = &&label&i 
          %end; ; 
     quit;
run;
%mend;
%varlabel(indata=mydata, lib=work, labeldata=labeldata, labelvar=var, labeldesc=labels);
proc contents data=mydata order=varnum; run;

 

 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Solph
Pyrite | Level 9

Thanks a lot. Indeed that's the reason. The code works now, with the fixes.

%macro varlabel(indata=, lib=, labeldata=, labelvar=, labeldesc=);
* create distinct macro variables for each variable name and label; 
data _null_; 
     set &labeldata; 
     call symput("var" || trim(left(_N_)), trim(left(&labelvar))); 
     call symput("label" || trim(left(_N_)), trim(left(&labeldesc))); 
     call symput("nobs", trim(left(_N_))); 
run;
proc datasets library = &lib memtype = data nolist; 
     modify &indata; 
     label 
          %do i = 1 %to &nobs; 
               &&var&i = "&&label&i"

          %end; ; 
     quit;
run;
%mend;
%varlabel(indata=mydata, lib=work, labeldata=labeldata, labelvar=var, labeldesc=labels);

 

View solution in original post

5 REPLIES 5
ChrisNZ
Tourmaline | Level 20

Shouldn't it be

&&var&i = "&&label&i"

?

 

 

 

 

Solph
Pyrite | Level 9

Thanks a lot. Indeed that's the reason. The code works now, with the fixes.

%macro varlabel(indata=, lib=, labeldata=, labelvar=, labeldesc=);
* create distinct macro variables for each variable name and label; 
data _null_; 
     set &labeldata; 
     call symput("var" || trim(left(_N_)), trim(left(&labelvar))); 
     call symput("label" || trim(left(_N_)), trim(left(&labeldesc))); 
     call symput("nobs", trim(left(_N_))); 
run;
proc datasets library = &lib memtype = data nolist; 
     modify &indata; 
     label 
          %do i = 1 %to &nobs; 
               &&var&i = "&&label&i"

          %end; ; 
     quit;
run;
%mend;
%varlabel(indata=mydata, lib=work, labeldata=labeldata, labelvar=var, labeldesc=labels);

 

ballardw
Super User

Instead of creating a bunch of macro variables and referencing issues you might consider CALL EXECUTE statements instead.

Something like:

data _null_;
   set &labeldata end=last;
   if _n_= 1 then do;
      call execute( "proc datasets library=&lib. memtype=data nolist;");
      call execute( "modify &indata.;");
      call execute( "label");
   end;
   call execute (var||'=' quote(strip(label)));
   if last call execute("quit;");
end;

And if your datalabel set actually had separate datasets then sorting the data by the dataset names, using BY  in the data _null_ with first.datasetname could be used to insert the modify statement for each data set, though you would use the variable name instead of &indata macro variable and the statement would be more like call execute ("modify "|| datasetnamevar ||";");

ChrisNZ
Tourmaline | Level 20

@ballardw I'd use call execute as well for his task. 

 

Tom
Super User Tom
Super User

Putting the metadata into macro variables is just going to make it harder to deal.  Leave data in datasets.

To add labels you want to generate label statement(s). 

For that you just need your metadata file:

data labeldata;
  infile cards truncover ;
   input  name :$32. label $255. ;
cards;
A01 Diabetes
A02 Heart disease
B01 Cancer
B02 Renal failure
;

It is probably easiest to write the label statement o a file than try to generate strings to push onto the program stack with CALL EXECUTE().

filename labels temp;
data _null_;
  set labeldata end=eof ;
  file labels;
  if _n_=1 then put 'label';
  put @3 name '=' label :$quote. ;
  if eof then put ';' ;
run;

Then you just need to use %INCLUDE to use that code to modify the dataset.

proc datasets nolist libname=work;
modify mydata;
%include labels ;
run;
quit;

 

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 1810 views
  • 1 like
  • 4 in conversation