Hello fellow programmers!
I have tried to figure this out myself, but I guess I need help:
I have a large amount of survey data, and all the answer-variables have labels that are the survey questions - naturally. Not all of them are correct from the beginning, so I put them in a csv file and send them to be reviewed. Usually there is not anything to correct, so I just leave it there. If there is a single mistake i just correct it manually in my programming.
But now, there is many variables that needs new labels, so I would like to be able to do it more clever than copy/paste.
The csv file is just two collums, one with the variable names and one with the labels.
Any ideas or tips?
Looking forward to hearing from you!
Regards
/*
Post_title:
Labelling my data with labels corrected in a CSV file.
Link to post:
https://communities.sas.com/t5/SAS-Programming/Labelling-my-data-with-labels-corrected-in-a-CSV-file/m-p/775060
Important post details:
The csv file is just two collums, one with the variable names and one with the labels.
*/
*if you are using SAS OnDemand for Academics (ODA)
change your-userid-on-oda for your userid
if you are using another option then change
/home/your-userid-on-oda/ for a path/folder
you can use;
%let mypath = /home/your-userid-on-oda/;
data work.dataset_with_labels;
variable_name = "abc";
variable_label = "label-for-abc";
run;
proc export data = work.dataset_with_labels
outfile = "&mypath.dataset_with_labels.csv"
replace;
run;
proc import datafile = "&mypath.dataset_with_labels.csv"
out = work.imported_dataset_with_labels
replace;
run;
Does this make sense for you?
Best regards, Jos
Hello @Ninna1,
So, you have
Let me create sample data for demonstration:
data have;
array s[*] $ Q1a Q1b Q2 Q3;
label Q1a='First question, part a)'
Q1b='Wrong label'
Q2 ='Second question'
Q3 ='Again, incorrect label';
input s[*];
cards;
Yes No Maybe Never
;
filename namlab 'C:\Temp\correct_labels.csv';
data _null_;
file namlab;
put 'Q1b,"First question, part b)"' /
'Q3,Third question';
run;
To correct the labels in dataset HAVE you can create PROC DATASETS code (in a temporary file) using the information from the CSV file and then run it per %INCLUDE statement:
filename labels temp;
data _null_;
file labels;
infile namlab dlm=',' dsd end=last;
input name :$32. label :$256.;
if _n_=1 then put 'proc datasets lib=work nolist;' /
'modify have;' /
'label ';
put name '=' label;
if last then put '; quit;';
run;
%inc labels;
Thank you so much for your reply @FreelanceReinh ! I seems like it could work!
However I get an error. It doesn't seem to accept the:
put name '=' label;
I get this :
ERROR 73-322: Expecting an =.
ERROR 76-322: Syntax error, statement will be ignored.
We need to see an example of your csv file with names and labels to provide tested code that works with it.
Copy/paste this text into a box opened with this button:
@Ninna1 wrote:
Thank you so much for your reply @FreelanceReinh ! I seems like it could work!
However I get an error. It doesn't seem to accept the:
put name '=' label;
I get this :
ERROR 73-322: Expecting an =.
ERROR 76-322: Syntax error, statement will be ignored.
I suspect that you are using non-standard variable names, e.g. containing blanks such as Question 1a. In this case you will have
options validvarname=any;
in effect and you need to modify the data _null_ step as follows in order to handle name literals:
data _null_; file labels; infile namlab dlm=',' dsd end=last; input name :$32. label :$256.; length nl $35; /* safer: $51 */ nl=nliteral(name); if _n_=1 then put 'proc datasets lib=work nolist;' / 'modify have;' / 'label '; put nl '=' label; if last then put '; quit;'; run;
Edit: Here is a complete example using name literals:
options validvarname=any;
data have;
array s[*] $ 'First Q a'n 'First Q b'n 'Second Q'n '3rd Q.'n; /* name literals -- not recommended */
label 'First Q a'n = 'First question, part a)'
'First Q b'n = 'Wrong label'
'Second Q'n = 'Second question'
'3rd Q.'n = 'Again, incorrect label';
input s[*];
cards;
Yes No Maybe Never
;
filename namlab 'C:\Temp\correct_labels.csv';
data _null_;
file namlab;
put 'First Q b,"First question, part b)"' /
'3rd Q.,Third question';
run;
filename labels temp;
data _null_;
file labels;
infile namlab dlm=',' dsd end=last;
input name :$32. label :$256.;
length nl $35;
nl=nliteral(name);
if _n_=1 then put 'proc datasets lib=work nolist;' /
'modify have;' /
'label ';
put nl '=' label;
if last then put '; quit;';
run;
%inc labels;
@Ninna1 wrote:
I get a ; instead of the =
I have no idea why. However, I have allready spent too much time on this today and I have a deadline. So I will have to work on it some other time.
No problem. When you get back to this, please post the complete log of the step that doesn't work into a code box opened with the </> ("Insert Code") button.
Make sure to define nl long enough for any possible nliteral. You need 32 characters for the actual name plus three for the quotes and N suffix. But you also need up to 16 more for doubling up any included quotes. So use
length nl $51;
Thanks, @Tom, good point. Indeed, the "worst case" regarding variable names that I had envisioned was not bad enough.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.