Hello fellow programmers!
I have tried to figure this out myself, but I guess I need help:
I have a large amount of survey data, and all the answer-variables have labels that are the survey questions - naturally. Not all of them are correct from the beginning, so I put them in a csv file and send them to be reviewed. Usually there is not anything to correct, so I just leave it there. If there is a single mistake i just correct it manually in my programming.
But now, there is many variables that needs new labels, so I would like to be able to do it more clever than copy/paste.
The csv file is just two collums, one with the variable names and one with the labels.
Any ideas or tips?
Looking forward to hearing from you!
Regards
/*
Post_title:
Labelling my data with labels corrected in a CSV file.
Link to post:
https://communities.sas.com/t5/SAS-Programming/Labelling-my-data-with-labels-corrected-in-a-CSV-file/m-p/775060
Important post details:
The csv file is just two collums, one with the variable names and one with the labels.
*/
*if you are using SAS OnDemand for Academics (ODA)
change your-userid-on-oda for your userid
if you are using another option then change
/home/your-userid-on-oda/ for a path/folder
you can use;
%let mypath = /home/your-userid-on-oda/;
data work.dataset_with_labels;
variable_name = "abc";
variable_label = "label-for-abc";
run;
proc export data = work.dataset_with_labels
outfile = "&mypath.dataset_with_labels.csv"
replace;
run;
proc import datafile = "&mypath.dataset_with_labels.csv"
out = work.imported_dataset_with_labels
replace;
run;
Does this make sense for you?
Best regards, Jos
Hello @Ninna1,
So, you have
Let me create sample data for demonstration:
data have;
array s[*] $ Q1a Q1b Q2 Q3;
label Q1a='First question, part a)'
Q1b='Wrong label'
Q2 ='Second question'
Q3 ='Again, incorrect label';
input s[*];
cards;
Yes No Maybe Never
;
filename namlab 'C:\Temp\correct_labels.csv';
data _null_;
file namlab;
put 'Q1b,"First question, part b)"' /
'Q3,Third question';
run;
To correct the labels in dataset HAVE you can create PROC DATASETS code (in a temporary file) using the information from the CSV file and then run it per %INCLUDE statement:
filename labels temp;
data _null_;
file labels;
infile namlab dlm=',' dsd end=last;
input name :$32. label :$256.;
if _n_=1 then put 'proc datasets lib=work nolist;' /
'modify have;' /
'label ';
put name '=' label;
if last then put '; quit;';
run;
%inc labels;
Thank you so much for your reply @FreelanceReinh ! I seems like it could work!
However I get an error. It doesn't seem to accept the:
put name '=' label;
I get this :
ERROR 73-322: Expecting an =.
ERROR 76-322: Syntax error, statement will be ignored.
We need to see an example of your csv file with names and labels to provide tested code that works with it.
Copy/paste this text into a box opened with this button:
@Ninna1 wrote:
Thank you so much for your reply @FreelanceReinh ! I seems like it could work!
However I get an error. It doesn't seem to accept the:
put name '=' label;
I get this :
ERROR 73-322: Expecting an =.
ERROR 76-322: Syntax error, statement will be ignored.
I suspect that you are using non-standard variable names, e.g. containing blanks such as Question 1a. In this case you will have
options validvarname=any;
in effect and you need to modify the data _null_ step as follows in order to handle name literals:
data _null_; file labels; infile namlab dlm=',' dsd end=last; input name :$32. label :$256.; length nl $35; /* safer: $51 */ nl=nliteral(name); if _n_=1 then put 'proc datasets lib=work nolist;' / 'modify have;' / 'label '; put nl '=' label; if last then put '; quit;'; run;
Edit: Here is a complete example using name literals:
options validvarname=any;
data have;
array s[*] $ 'First Q a'n 'First Q b'n 'Second Q'n '3rd Q.'n; /* name literals -- not recommended */
label 'First Q a'n = 'First question, part a)'
'First Q b'n = 'Wrong label'
'Second Q'n = 'Second question'
'3rd Q.'n = 'Again, incorrect label';
input s[*];
cards;
Yes No Maybe Never
;
filename namlab 'C:\Temp\correct_labels.csv';
data _null_;
file namlab;
put 'First Q b,"First question, part b)"' /
'3rd Q.,Third question';
run;
filename labels temp;
data _null_;
file labels;
infile namlab dlm=',' dsd end=last;
input name :$32. label :$256.;
length nl $35;
nl=nliteral(name);
if _n_=1 then put 'proc datasets lib=work nolist;' /
'modify have;' /
'label ';
put nl '=' label;
if last then put '; quit;';
run;
%inc labels;
@Ninna1 wrote:
I get a ; instead of the =
I have no idea why. However, I have allready spent too much time on this today and I have a deadline. So I will have to work on it some other time.
No problem. When you get back to this, please post the complete log of the step that doesn't work into a code box opened with the </> ("Insert Code") button.
Make sure to define nl long enough for any possible nliteral. You need 32 characters for the actual name plus three for the quotes and N suffix. But you also need up to 16 more for doubling up any included quotes. So use
length nl $51;
Thanks, @Tom, good point. Indeed, the "worst case" regarding variable names that I had envisioned was not bad enough.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.