BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
turcay
Lapis Lazuli | Level 10

I take a data with 2000 columns. Names of the columns are like q1,q2,q3.... I want to create a new table with the ones that have the  count of non missing(+zero) rows more than 70% in the column. I also want to join the some column whose percentage is less than 70 to the table. My question is , how do i get 2000 output for 2000 columns with a single program and how do i specify and add the column to be created to the data.

The table that I want to create should look like the one below.

 

Q1 resultPercantage

Q2 resultPercantage

Q3 resultPercantage

There are also some codes that I tried to create below.

%LET VARNAME =

Column1/*Q1*/

Column2/*Q2*/

Column3/*Q3*/

;

%percentage(0.70)

%macro procfreq(Var);

%macro percentage(deger);

PROC SQL;

Create Table new as

SELECT (tt.nonMissingCount/tt.toplamCount) as result format=percent10.2,

case

         when calculated result>&deger then out=yeni /*????*/

         else 'false /*????*/'

         end as yeniSonuc

from(

SELECT DISTINCT

(SELECT COUNT(&Var) as NonMissing from work.exampledata WHERE &Var ne 0) as nonMissingCount

,(SELECT COUNT(&Var) as SumCnt from work.exampedata) as sumCount from work.exampledata)as tt;

QUIT;

%mend;

%mend;

%macro loop;

%let i=1;

%do %while(%scan(&VARNAME.,&i.,%STR( ))~=);

%procfreq(%scan(&VARNAME.,&i.,%STR( )));

%let i=&i.+1;

%end;

%mend;

%loop;

 

I will be very greatful if I hear from you as soon as possible.

1 ACCEPTED SOLUTION

Accepted Solutions
ChrisWard
SAS Employee

This code will create a table with one row for each column and the %age of records with a >0 value;  any splitting of records with >70% etc can then easily be accomplished.

 

 

%let colCnt=2000;
data out_data(keep=var pcntage);
	format pcntage percent7.2;
    array t t1-t&colCnt.;
	retain t1-t&colCnt. 0;  * retain some total counts;
	set x end = eof;
		array q q1-q&colCnt.;
		do i=1 to &colcnt;
		    t(i)=t(i)+(q(i)>0);  * add each non-missing to the total count ;
		end;
		if eof then do i=1 to &colcnt.;
			var='q'||strip(put(i,best.));  * create column name ;
			pcntage=t(i)/_n_; * calculate %age ;
			output; * output one row for each of the analyzed columns;
		end;
	run;

 

 

View solution in original post

11 REPLIES 11
Reeza
Super User
I doubt this is efficient but here's my macro:
https://gist.github.com/statgeek/c3a9ddcb002c469e9d61

Another way might be to transpose your data so you have a column identifying the variable and then the value and run a single proc sql or proc freq.
turcay
Lapis Lazuli | Level 10

Where this count value comes from ?

 

Number_Missing=&obs_count-count;

Reeza
Super User
from the proc freq output, the macro generates it.

At the bottom of the code is an example of how to execute the macro, but I'll admit the documentation is non existent.
data_null__
Jade | Level 19
Show example data. What you have followed by what you want to see.
turcay
Lapis Lazuli | Level 10

 

There are 4000 numbers of Q columns

 

sample.png

 

I want to see all of them percentage like below then i need to transpose it

 

Q1.png

 

 

Q1 87.50%;

Q2 50.50%;

Q3 40.50%;

Q4 90.50%;

Q5 65.50%;

Q6 70.50%;

Q7 60.50%;

Q8 80.50%;

Reeza
Super User

It looks like you're in EG?

 

Use Tasks>Transpose and switch your data structure such that it becomes

 

ID Question Value

1 Q1 1

1 Q2 0

1 Q3 1

....

 

 

1 Q5000 0

2 Q1 0

 

 

Then use Query Builder with an N() and NMISS() function to determine the number of missing with the Question in the GROUP By Column.

turcay
Lapis Lazuli | Level 10

Thanks Reeza, i'll check. On the other hand your code involves a lot of things. I just a little bit confused( https://gist.github.com/statgeek/c3a9ddcb002c469e9d61  ) Can you show me a simpler method to reach my loop and nonmissing aim,please .

 

Thank  you.

Reeza
Super User

Here's a worked example using both a proc transpose and the macro. The macro is inefficient but works quickly and across all types of variables, numeric/character. FYI You should show data in the text, rather than images, so we can copy and paste it into our code, it helps us to help you.  

 

*Generate sample data;
data have;
array q(20) q1-q20;
do id=1 to 100;
do i=1 to 20;
q(i)=floor(rand('uniform')*100+1);
x=rand('bernoulli', 0.1);
if x=1 then q(i)=.;
end;
output;
end;

drop i x;
run;

*check format;
proc print data=have(obs=5);
run;

*Flip dataset;
proc transpose data=have out=flipped(rename=(_name_=Question Col1=Value));
by id;
run;

*check format again;
proc print data=flipped(obs=10);
run;

*Count number missing using SQL step;
proc sql;
create table want as
select question, nmiss(value) as NMISS, count(question) as N,
	nmiss(value)/count(question) as PCT_MISS format=percent8.1
from flipped
group by question;
quit;

proc print data=want;
run;

*Sample output if used macro, make sure to run the macro first;
%sum_missing(work, have, want_macro);
proc print data=want_macro;
run;
Tom
Super User Tom
Super User

You do not need macro code, much less a maco loop.

Look at this question that just used PROC FREQ to generate the counts for ALL variables.

https://communities.sas.com/t5/Base-SAS-Programming/Missing-Values-for-character-variables/m-p/20924...

 

ChrisWard
SAS Employee

This code will create a table with one row for each column and the %age of records with a >0 value;  any splitting of records with >70% etc can then easily be accomplished.

 

 

%let colCnt=2000;
data out_data(keep=var pcntage);
	format pcntage percent7.2;
    array t t1-t&colCnt.;
	retain t1-t&colCnt. 0;  * retain some total counts;
	set x end = eof;
		array q q1-q&colCnt.;
		do i=1 to &colcnt;
		    t(i)=t(i)+(q(i)>0);  * add each non-missing to the total count ;
		end;
		if eof then do i=1 to &colcnt.;
			var='q'||strip(put(i,best.));  * create column name ;
			pcntage=t(i)/_n_; * calculate %age ;
			output; * output one row for each of the analyzed columns;
		end;
	run;

 

 

turcay
Lapis Lazuli | Level 10

Thanks a lot. I was very important for me thanks again 🙂

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 11 replies
  • 1201 views
  • 0 likes
  • 5 in conversation