Hi there,
I had this same issue with a public data set. The instructions to the data set labeled the values of -996 as a V, for a valid skip. When you try to filter for -996 or V, it doesn't work.
You have to harness the VVALUEX formatting of the column. Below is my solution. I am changing the format for 'valid skips', missing values, 'errors', 'refused', and 'don't know'.
1. I first create a table that has all of the column variable names, their order and their data type.
2. I determine the max size of the table.
3. I throw that table reference table into a macro and sequence them to the maximum record value.
4. The macro cycles through each column name and data type associated with the reference table Ith value
5. Logic is applied.
6. Lastly, a proc freq of the new table outputs the distribution of new codes. I recommend taking a before and after snapchat to validate the code is correct.
*** OUTPUT ALL TABLE CONTENTS TO A NEW TABLE ***;
PROC CONTENTS DATA= OUTFILE.AGES5_17 OUT= COLUMN_NAMES NOPRINT;
RUN;
*** FORMAT THE COLUMN NAMES TO BE INCLUDED IN THE MACRO FOR VARIABLE AND DATA TYPE COLUMNS ***;
PROC SQL;
CREATE TABLE COLUMN_NAMES2 AS
SELECT VARNUM AS POSITION, NAME
,CATS("'",NAME,"'") AS VARIABLE
,TYPE
,CASE WHEN TYPE EQ 1 THEN "'NUMERIC'" ELSE "'CHARACTER'" END AS VARIABLE_TYPE FORMAT = $15.
FROM COLUMN_NAMES
ORDER BY VARNUM
;QUIT;
*** DETERMINE THE MAXIMUM NUMBER OF RECORDS IN THE TABLE ***;
PROC SQL NOPRINT;
SELECT MAX INTO: MAX
FROM (SELECT MAX(POSITION) AS MAX FROM COLUMN_NAMES2)
;QUIT;
%PUT MAX NUMBER OF COLUMNS === &MAX.;
*** TEST FILE ***;
DATA TEST ;
SET OUTFILE.AGES5_17 (OBS = 1000);
RUN;
*** THESE OPTIONS MUTES THE SAS LOG ***;
OPTIONS NOPRINTMSGLIST NOMPRINT NONOTES NOSOURCE NOSOURCE2 ERROR = 0;
*** THIS MACRO ASSIGNS THE TARGET FILE, ITERATES THROUGH EACH REFERENCE VARIABLE AND DATA TYPE UNTIL THE MAX RECORD IS REACH AND APPLIES THE LOGIC TO THE VVALUEX FORMAT OF THE COLUMN ***;
%MACRO VARS(FILE);
%DO I = 1 %TO &MAX.;
PROC SQL NOPRINT;
SELECT NAME INTO: VARLIST_&I.
FROM COLUMN_NAMES2
WHERE POSITION EQ &I.
;QUIT;
PROC SQL NOPRINT;
SELECT VARIABLE_TYPE INTO: TYPELIST_&I.
FROM COLUMN_NAMES2
WHERE POSITION EQ &I.
;QUIT;
%PUT VARIABLE NAME === &&&VARLIST_&I. ||| VARIABLE TYPE === &&&TYPELIST_&I.;
DATA TEST /*(KEEP = &&&VARLIST_&I.)*/;
SET TEST;
IF COMPRESS(VVALUE(&&&VARLIST_&I.)) = 'V' THEN
DO;
&&&VARLIST_&I. = 996;
OUTPUT;
END;
ELSE IF COMPRESS(VVALUE(&&&VARLIST_&I.)) = 'R' OR COMPRESS(VVALUE(&&&VARLIST_&I.)) = 'D'
OR COMPRESS(VVALUE(&&&VARLIST_&I.)) = .
THEN
DO;
&&&VARLIST_&I. = 999;
OUTPUT;
END;
ELSE IF COMPRESS(VVALUE(&&&VARLIST_&I.)) = 'E' THEN
DO;
&&&VARLIST_&I. = 994;
OUTPUT;
END;
ELSE
DO;
&&&VARLIST_&I. = &&&VARLIST_&I.;
OUTPUT;
END;
%END;
%MEND VARS;
%VARS(FILE = OUTFILE.AGES5_17);
PROC FREQ DATA = TEST;
TABLES MARITAL BARCODE_FAHITEMS / MISSING;
RUN;
HOPE THIS HELPS!!!
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.
Find more tutorials on the SAS Users YouTube channel.