I am new to SAS so forgive me if this is basic. I have a data set with 150,000+ rows, each with a unique ID, and 250 columns, all columns must be retained. However for this specific inquiry, I am only interested in 13 columns; each column is a variable with a valueof "0" "1" "7" or "9". I would like to make a new column that counts the number of times a "1" populates a column cell across the row/observation.
For example:
count of "1"
ID Col1 Col2 Col3 Col4 Col5 Col6 Col7 Col8................WANT
152 1 0 1 0 1 9 1 7 4
153 0 0 7 7 1 9 0 9 1
154 1 9 1 9 9 0 0 9 2
155 7 1 9 1 9 1 1 9 4
156 9 0 0 9 0 0 0 0 0
157 9 7 9 0 7 1 0 0 1
I have tried solutions from other forums with similar problems haven't found anything that has worked for me. I appreciate the help!!
A "variable" only occurs once per observation. So no need to count.
A VALUE on the other hand is a different matter.
Are your variables actually numeric or character? This would work for either though if the variables ar
data want; set have; countval = countc(cats(of col1-col13),'1'); run;
If your actual variables are not consecutively numbered then list them instead of the "of col1 - col13" I used above: countc (cats(col1,col4,col11,<finish the list>),'1');
The countc function counts characters, so would not work if you have values like 11 (which would be two 1's), the Cats function concatenates all of the variables into a single string that countc wants as the "search" string. The second parameter for the countc function is the specific characters to search for. So if you place '79' in the second position you would get count of 7 and 9.
Would I be correct guessing that you have survey data with question answers of Yes (1) , No(0), Don't know(7) and Refused (9) ?
This looks like a classic application of the array statement, followed by a loop over that array, as in:
data want;
set have;
array testvals col1 col2 ... col8 .. ;
do over testvals;
want=sum(want,testvals in (0,1,7,9));
end;
run;
The ARRAY statement names an array whose elements are col1 col2 ... col8 ....
the DO OVER arrayname loops over those elements. The "testvals in (0,1,7,9) returns a 1 or 0 depending no whether the IN condition is satisfied.
The more modern use of array in sas wold use array names with subscripts:
data want;
set have;
array testvals {*} col1 col2 ... col8 .. ;
do i=1 to dim(testvals);
want=sum(want,testvals{i} in (0,1,7,9));
end;
drop i;
run;
But for this task, I like the old way - no need to create, and then drop the i variable.
BTW, you are counting the occurrence of a value across columns, not occurrence of a variable.
data have;
input ID Col1 Col2 Col3 Col4 Col5 Col6 Col7 Col8;*................WANT ;
cards;
152 1 0 1 0 1 9 1 7 4
153 0 0 7 7 1 9 0 9 1
154 1 9 1 9 9 0 0 9 2
155 7 1 9 1 9 1 1 9 4
156 9 0 0 9 0 0 0 0 0
157 9 7 9 0 7 1 0 0 1
;
data want;
set have;
want=countc(cats(of col1-col8),'1');
run;
Hello,
I am having a similar situation where I need to count another value as well. In the example you have, I would like to count either 1 or 9, which would be 5 instead of 4 in the variable "want".
I would appreciate it if you could help me on this.
Thank you!
@SAS_SB wrote:
Hello,
I am having a similar situation where I need to count another value as well. In the example you have, I would like to count either 1 or 9, which would be 5 instead of 4 in the variable "want".
I would appreciate it if you could help me on this.
Thank you!
Re-read the marked answer as that case was already covered. Instead of passing the one character string '1' to the COUNTC() function you want to pass the two character string '19'.
For the array based solutions instead of test if the value is equal to 1 you can test if the value is in the list 1 9.
Got it, thanks @Tom !
A "variable" only occurs once per observation. So no need to count.
A VALUE on the other hand is a different matter.
Are your variables actually numeric or character? This would work for either though if the variables ar
data want; set have; countval = countc(cats(of col1-col13),'1'); run;
If your actual variables are not consecutively numbered then list them instead of the "of col1 - col13" I used above: countc (cats(col1,col4,col11,<finish the list>),'1');
The countc function counts characters, so would not work if you have values like 11 (which would be two 1's), the Cats function concatenates all of the variables into a single string that countc wants as the "search" string. The second parameter for the countc function is the specific characters to search for. So if you place '79' in the second position you would get count of 7 and 9.
Would I be correct guessing that you have survey data with question answers of Yes (1) , No(0), Don't know(7) and Refused (9) ?
@kirkends wrote:
This is exactly what I was looking for. Sorry about the term mix up I am still learning! This does explains why me searches were fruitless.... I am working on Survey Data and 1=yes, 0=No, 7=unknown, 9=missing, I just converted characters to numerics for ease of uses in other inquiries.
Thank you
There is absolutely no reason to apologize for converting the codes. I would myself. And could very well create re-coded variables setting the 7 and 9 to missing to report on prevalence among respondents that answered the question yes or no.
Arrays are a perfect approach to do that.
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.