I have the following data set, and I want to remove the rows that Q1_1345 to Q14b_6061 are all blanks. In this case, I want my output excludes the entire row of Program 302 and 402.
State | City | District | Program | Q1_1345 | Q2_1112 | Q3_1236 | Q4_8626 | Q5_1457 | Q6_5656 | Q7_8989 | Q8_1568 | Q9_8521 | Q10_0123 | Q11_2456 | Q12_9630 | Q13_2121 | Q14a_6060 | Q14b_6061 |
AK | Anchorage | A | 301 | 1 | Yes | Yes | Yes | 2 | 50 | Yes | A1 | 86 | 54 | 405 | Yes | Yes | Yes | 1 |
AK | Sitka | A | 302 | |||||||||||||||
IL | Chicago | A | 401 | 5 | No | No | No | 2 | 4 | No | C2 | 20 | 74 | 54 | Yes | Yes | Yes | 1 |
AL | Mobile | A | 262 | 54 | Yes | Yes | Yes | 21 | 4 | No | C2 | 40 | 65 | 40 | No | No | Yes | 1 |
IL | Chicago | B | 402 | |||||||||||||||
NY | New York | B | 998 | 1 | No | No | No | 2 | 44 | No | B3 | 56 | 10 | 40 | No | Yes | Yes | 1 |
ID | Nampa | A | 744 | 574 | 2 | 878 | No | D5 | 586 | 40 | 6 | Yes | No | No | 2 | |||
NC | Charlotte | A | 601 | 77 | Yes | Yes | Yes | 2 | 5852 | No | E1 | 748 | 0 | 74 | Yes | Yes | Yes | 1 |
HI | Honolulu | A | 101 | 74 | No | No | No | 2 | 42 | No | D4 | No | No | No | 2 |
Function cmiss accepts both numeric and character variables. So you can use
if cmiss(of Q1_1345 -- Q14b_6061) = <number of variables> then delete;
Note the double dash between the first and last variables in the list.
Are the variables concerned numeric, character or a mix?
I ask because if they are all numeric then
if max(<put your variable list here>) = . then delete;
if mixed character then use
if nmiss(<list of the numeric variables>)= <the number of numeric variables> and cmiss(<list of character variables>) = <number of character variables> then delete.
Thank you for the reply. It is a mix of numeric and character. Is there a better way of listing variables out? Because the dataset has about 200 variables, the table i posted is a shorten version.
Function cmiss accepts both numeric and character variables. So you can use
if cmiss(of Q1_1345 -- Q14b_6061) = <number of variables> then delete;
Note the double dash between the first and last variables in the list.
There are a number of ways of listing varaibles:
Q: (note the colon) would all variables that start with Q
Q13: all variables that start with Q13
if the Names are sequential and of the same type (all numeric or all character)
Q13245 - Q14456 would work,
two -- can be used to signify those that are adjacent left to right such as
Q15_23 -- Q27_abc
You can use mulitple groups of these in a single function call sucha as
Max(q13: , Q14_1 - Q15_6, Q27_abc -- Q39end)
When I looked at your Q1_1345 Q2_1112 Q3_1236 I did wonder what the _xxxx added to the variable name instead of Q1 Q2 Q3.
And of the variable you shows the only one that I would possibly have had as character was Q8. And if the range of values was small I may have had a custom informat/format pair to have that as numeric as well. I would say that for 99% of variable coded as Yes/No they should be read with an informat to assign 1/0.
Hi,
it seems you have more Q variable to check than 'KeyVariables'
You could use this:
%MACRO DeleteEmptyObs(inputLib=,inputDs=,outputDs=,keyVars=); %let keyVars=%upcase(&keyVars.); PROC SQL noprint; SELECT 'missing('||strip(name)||')' into :MissingNames separated by ' AND ' FROM sashelp.vcolumn WHERE libname eq upcase("&inputLib.") AND memname eq upcase("&inputDS.") AND upcase(name) not in ("&keyVars.") ; QUIT; DATA &outputDs.; set &inputLib..&inputDS.; if &MissingNames. then delete; RUN; %MEND DeleteEmptyObs; %DeleteEmptyObs(inputLib=sashelp,inputDs=class,outputDs=work.test,keyVars=%str(name,age,sex));
- Cheers -
CODE NOT TESTED. options missing=' '; data want; set have; if missing(cats(of Q1_1345 -- Q14b_6061 ) ) then delete; run;
One more tip I didn't see listed in comments:
You need to know the order of variables in order to use the convenient
varnam1--varnamex method.
You can get that via
proc contents data =[dataset] position;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.