BookmarkSubscribeRSS Feed
Ninna1
Fluorite | Level 6

Hello fellow programmers!

 

I have tried to figure this out myself, but I guess I need help:

 

I have a large amount of survey data, and all the answer-variables have labels that are the survey questions - naturally. Not all of them are correct from the beginning, so I put them in a csv file and send them to be reviewed. Usually there is not anything to correct, so I just leave it there. If there is a single mistake i just correct it manually in my programming.

 

But now, there is many variables that needs new labels, so I would like to be able to do it more clever than copy/paste.

 

The csv file is just two collums, one with the variable names and one with the labels.

 

Any ideas or tips?

 

Looking forward to hearing from you!

 

Regards

10 REPLIES 10
JosvanderVelden
SAS Super FREQ
/* 
Post_title:
	Labelling my data with labels corrected in a CSV file.
Link to post: 
	https://communities.sas.com/t5/SAS-Programming/Labelling-my-data-with-labels-corrected-in-a-CSV-file/m-p/775060
Important post details:	
	The csv file is just two collums, one with the variable names and one with the labels.

*/

*if you are using SAS OnDemand for Academics (ODA) 
    change your-userid-on-oda for your userid
 if you are using another option then change
    /home/your-userid-on-oda/ for a path/folder 
    you can use;
%let mypath = /home/your-userid-on-oda/;

data work.dataset_with_labels;
   variable_name = "abc";
   variable_label = "label-for-abc";
run;

proc export data = work.dataset_with_labels
            outfile = "&mypath.dataset_with_labels.csv"
            replace;
run;

proc import datafile = "&mypath.dataset_with_labels.csv"
            out = work.imported_dataset_with_labels
            replace;
run;

Does this make sense for you?

 

Best regards, Jos

Ninna1
Fluorite | Level 6
Yes it makes perfect sense but i can't really see how this brings me any closer to replacing be wrong labels with the correct ones?
FreelanceReinh
Jade | Level 19

Hello @Ninna1,

 

So, you have

  1. a dataset with variables containing the answers to survey questions and some of these variables have wrong labels
  2. a CSV file with pairs of variable names from the above dataset and correct labels.

Let me create sample data for demonstration:

data have;
array s[*] $ Q1a Q1b Q2 Q3;
label Q1a='First question, part a)'
      Q1b='Wrong label'
      Q2 ='Second question'
      Q3 ='Again, incorrect label';
input s[*];
cards;
Yes No Maybe Never
; 

filename namlab 'C:\Temp\correct_labels.csv';

data _null_;
file namlab;
put 'Q1b,"First question, part b)"' /
    'Q3,Third question';
run;

To correct the labels in dataset HAVE you can create PROC DATASETS code (in a temporary file) using the information from the CSV file and then run it per %INCLUDE statement:

filename labels temp;

data _null_;
file labels;
infile namlab dlm=',' dsd end=last;
input name :$32. label :$256.;
if _n_=1 then put 'proc datasets lib=work nolist;' /
                  'modify have;' /
                  'label ';
put name '=' label;
if last then put '; quit;';
run;

%inc labels;

 

Ninna1
Fluorite | Level 6

Thank you so much for your reply @FreelanceReinh ! I seems like it could work!

 

However I get an error. It doesn't seem to accept the:

 

put name '=' label;

 

I get this :

 

ERROR 73-322: Expecting an =.

ERROR 76-322: Syntax error, statement will be ignored.

Kurt_Bremser
Super User

We need to see an example of your csv file with names and labels to provide tested code that works with it.

Copy/paste this text into a box opened with this button:

Bildschirmfoto 2020-04-07 um 08.32.59.jpg

FreelanceReinh
Jade | Level 19

@Ninna1 wrote:

Thank you so much for your reply @FreelanceReinh ! I seems like it could work!

 

However I get an error. It doesn't seem to accept the:

 

put name '=' label;

 

I get this :

 

ERROR 73-322: Expecting an =.

ERROR 76-322: Syntax error, statement will be ignored.


I suspect that you are using non-standard variable names, e.g. containing blanks such as Question 1a. In this case you will have

options validvarname=any;

in effect and you need to modify the data _null_ step as follows in order to handle name literals:

data _null_;
file labels;
infile namlab dlm=',' dsd end=last;
input name :$32. label :$256.;
length nl $35; /* safer: $51 */
nl=nliteral(name);
if _n_=1 then put 'proc datasets lib=work nolist;' /
                  'modify have;' /
                  'label ';
put nl '=' label;
if last then put '; quit;';
run;

Edit: Here is a complete example using name literals:

Spoiler
options validvarname=any;

data have;
array s[*] $ 'First Q a'n 'First Q b'n 'Second Q'n '3rd Q.'n; /* name literals -- not recommended */
label 'First Q a'n = 'First question, part a)'
      'First Q b'n = 'Wrong label'
      'Second Q'n  = 'Second question'
      '3rd Q.'n    = 'Again, incorrect label';
input s[*];
cards;
Yes No Maybe Never
; 

filename namlab 'C:\Temp\correct_labels.csv';

data _null_;
file namlab;
put 'First Q b,"First question, part b)"' /
    '3rd Q.,Third question';
run;

filename labels temp;

data _null_;
file labels;
infile namlab dlm=',' dsd end=last;
input name :$32. label :$256.;
length nl $35;
nl=nliteral(name);
if _n_=1 then put 'proc datasets lib=work nolist;' /
                  'modify have;' /
                  'label ';
put nl '=' label;
if last then put '; quit;';
run;

%inc labels;

 

Ninna1
Fluorite | Level 6
Thank you very much for your effort @FreelanceReinhard. It almost works.

I get a ; instead of the =

I have no idea why. However, I have allready spent too much time on this today and I have a deadline. So I will have to work on it some other time.

Thank you all.
FreelanceReinh
Jade | Level 19

@Ninna1 wrote:
I get a ; instead of the =

I have no idea why. However, I have allready spent too much time on this today and I have a deadline. So I will have to work on it some other time.

No problem. When you get back to this, please post the complete log of the step that doesn't work into a code box opened with the </> ("Insert Code") button.

Tom
Super User Tom
Super User

Make sure to define nl long enough for any possible nliteral.  You need 32 characters for the actual name plus three for the quotes and N suffix.  But you also need up to 16 more for doubling up any included quotes.  So use

length nl $51;
FreelanceReinh
Jade | Level 19

Thanks, @Tom, good point. Indeed, the "worst case" regarding variable names that I had envisioned was not bad enough.

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 10 replies
  • 2563 views
  • 3 likes
  • 5 in conversation