I imported two csv files with survey response data into SAS 9.4 M8. The files are identical with one exception: one has the responses coded, and one has the actual formats. I need assistance with creating a format library. I currently have 359 (inefficiently created) data sets, one for each question, and all sets have been named DATA001, DATA002 ... DATA359. I dedup’d them, so they all look like the one below (except with no blank rows). If it matters, the numeric responses are not always sequential, nor do they begin at 1, nor are they intuitively assigned. By the way, this was messy without the Access to PC Files.
gain | gainf | |||
1 | Yes | |||
2 | No | |||
3 | I don't know | |||
gender | Genderf | |||
1 | Male | |||
2 | Female | |||
3 | Prefer not to answer | |||
marital status | marital statusf | |||
2 | Divorced | |||
3 | Married | |||
4 | Single | |||
5 | Unknown | |||
6 | Widowed |
Sticking with what you have this should work.
libname projfmt "/home/fkhurshed/CLASS1";
options obs=max fmtsearch=(projfmt);
data data001;
infile cards dsd truncover;
input gain gainf $20.;
cards;
1, Yes
2, No
3, I Don't Know
;
data data002;
infile cards dsd truncover;
input gender genderf $25.;
cards;
1, Female
2, Male
3, Prefer not to answer
;
%macro uploadfmt();
%do i=1 %to 2;
%let dsnin= data%sysfunc(putn(&i, z3));
*get list of variable names;
proc contents data=&dsnin out=varlist noprint;
run;
*rename variables;
data _null_;
set varlist end=eof;
if _n_=1 then
do;
call execute("data _&dsnin.; set &dsnin;");
call execute(cat("rename ", name, '=start;'));
end;
if _n_=2 then
do;
call execute(cat("rename ", name, '=label;'));
call execute(catt("fmtname='", name, "';"));
end;
if eof then
call execute('run;quit;');
run;
*create format;
proc format cntlin=_&dsnin. lib=projfmt;
run;
proc sql;
drop table varlist;
drop table _&dsnin.;
quit;
%end;
%mend;
%uploadfmt();
proc format cntlout=want lib=projfmt;
run;
Hi, Reeza. Yes to the CNTLIN. As for the format name, they provided me one data set with 1 2 3 and one data set with Yes No IDK. I merged them to get what we see there. There are a lot of questions (some with the same question numbers) so I'll just number the questions sequentially (except for YN since that's repeated). No extra lines.
Your description is unclear but I think you are saying that you have two CSV files with the same number of columns and the same number of lines and the same header row (variable names). But in one of the files the values are the CODES and the other file the values are the DECODES. And you want to use this to derive the formats that were used to convert the CDOES into the DECODES.
It might be simplest to ignore the variable names right now.
But it does helps to know how many columns (variables) there are.
So let's first make a simple example pair of CSV files with 3 variables.
filename codes temp;
filename decodes temp;
options parmcards=codes;
parmcards4;
name,sex,age
Alfred,M,14
Alice,F,13
Barbara,F,13
;;;;
options parmcards=decodes;
parmcards4;
name,sex,age
Alfred,Male,Older
Alice,Female,Younger
Barbara,Female,Younger
;;;;
Now we just need to read in both files at once and make some VALUE/LABEL pairs and then summarize.
data code_decode ;
row +1 ;
do col=1 to 3;
infile codes dsd firstobs=2 truncover ;
input value :$100. @ ;
infile decodes dsd firstobs=2 truncover ;
input label :$100. @ ;
output;
end;
run;
proc freq data=code_decode ;
tables col*value*label / missing list noprint out=formats ;
run;
Results
Obs col value label COUNT PERCENT 1 1 Alfred Alfred 1 11.1111 2 1 Alice Alice 1 11.1111 3 1 Barbara Barbara 1 11.1111 4 2 F Female 2 22.2222 5 2 M Male 1 11.1111 6 3 13 Younger 2 22.2222 7 3 14 Older 1 11.1111
At this point you will want to make some decisions about which variables need to have formats defined.
You might also need to decide whether you need to make FORMATs so you can display the CODE values as the DECODE values.
Or if you want to create INFORMATs do you can read the CODE values and generate the DECODE values into the dataset. (Or perhaps the reverse to read the DECODE values and generate the CODE values into the dataset.)
Sticking with what you have this should work.
libname projfmt "/home/fkhurshed/CLASS1";
options obs=max fmtsearch=(projfmt);
data data001;
infile cards dsd truncover;
input gain gainf $20.;
cards;
1, Yes
2, No
3, I Don't Know
;
data data002;
infile cards dsd truncover;
input gender genderf $25.;
cards;
1, Female
2, Male
3, Prefer not to answer
;
%macro uploadfmt();
%do i=1 %to 2;
%let dsnin= data%sysfunc(putn(&i, z3));
*get list of variable names;
proc contents data=&dsnin out=varlist noprint;
run;
*rename variables;
data _null_;
set varlist end=eof;
if _n_=1 then
do;
call execute("data _&dsnin.; set &dsnin;");
call execute(cat("rename ", name, '=start;'));
end;
if _n_=2 then
do;
call execute(cat("rename ", name, '=label;'));
call execute(catt("fmtname='", name, "';"));
end;
if eof then
call execute('run;quit;');
run;
*create format;
proc format cntlin=_&dsnin. lib=projfmt;
run;
proc sql;
drop table varlist;
drop table _&dsnin.;
quit;
%end;
%mend;
%uploadfmt();
proc format cntlout=want lib=projfmt;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.