BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Franck1048
Fluorite | Level 6

My data has columns with values that are either missing or 1 :

 

data test;
input id error1 error2;
cards;
23	1	.
2	.	1
345	1	1
3	1	.
45	1	.
67	.	1
789	.	1
89	.	1
34	.	1
21	1	1
;

I want to output a report that counts the number of error1 occurences and its percentage of the total observations. I also need the report to output the same information for error2, error3, error4.. and so on if they exist .. basically for error* regardless of the number of error columns.

 

Example of result I'm looking for :

 

All

error1

 

error2

 

error3..

 

Count

Count

Percent

Count

Percent

Count

Percent

10

5

50%

7

70%

 

 

 

 

 Thank you

 

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

Can you replace missing with 0?

If so, then you can run a proc means. You want the sum and mean of the variables and you get them in an output data set called want. 

 

data test;
input id error1 error2;
if error1=. then error1=0;
if error2=. then error2=0;
cards;
23	1	.
2	.	1
345	1	1
3	1	.
45	1	.
67	.	1
789	.	1
89	.	1
34	.	1
21	1	1
;
run;

proc means data=test n sum mean stackods;
var error1-error2; *list error variables here;
ods output summary=want;
run;

proc print data=want;
format mean percent12.1;
run;

@Franck1048 wrote:

My data has columns with values that are either missing or 1 :

 

data test;
input id error1 error2;
cards;
23	1	.
2	.	1
345	1	1
3	1	.
45	1	.
67	.	1
789	.	1
89	.	1
34	.	1
21	1	1
;

I want to output a report that counts the number of error1 occurences and its percentage of the total observations. I also need the report to output the same information for error2, error3, error4.. and so on if they exist .. basically for error* regardless of the number of error columns.

 

Example of result I'm looking for :

 

All

error1

 

error2

 

error3..

 

Count

Count

Percent

Count

Percent

Count

Percent

10

5

50%

7

70%

 

 

 

 

 Thank you

 


 

View solution in original post

4 REPLIES 4
Reeza
Super User

Can you replace missing with 0?

If so, then you can run a proc means. You want the sum and mean of the variables and you get them in an output data set called want. 

 

data test;
input id error1 error2;
if error1=. then error1=0;
if error2=. then error2=0;
cards;
23	1	.
2	.	1
345	1	1
3	1	.
45	1	.
67	.	1
789	.	1
89	.	1
34	.	1
21	1	1
;
run;

proc means data=test n sum mean stackods;
var error1-error2; *list error variables here;
ods output summary=want;
run;

proc print data=want;
format mean percent12.1;
run;

@Franck1048 wrote:

My data has columns with values that are either missing or 1 :

 

data test;
input id error1 error2;
cards;
23	1	.
2	.	1
345	1	1
3	1	.
45	1	.
67	.	1
789	.	1
89	.	1
34	.	1
21	1	1
;

I want to output a report that counts the number of error1 occurences and its percentage of the total observations. I also need the report to output the same information for error2, error3, error4.. and so on if they exist .. basically for error* regardless of the number of error columns.

 

Example of result I'm looking for :

 

All

error1

 

error2

 

error3..

 

Count

Count

Percent

Count

Percent

Count

Percent

10

5

50%

7

70%

 

 

 

 

 Thank you

 


 

Franck1048
Fluorite | Level 6

I can indeed replace the missing with zeros, as I have control over the input data.

 

Using proc means I get this result :

  

Obs.VariableNSumMean
1error110550,00%
2error210770,00%

 

It gives me the results I want, except in a row format, instead of columns. I tried different methods to get the output in columns but I can't wrap my head around it.

 
Reeza
Super User

lol, I explictly tried to get that format cause it's cleaner. If you want the observations in columns remove the STACKODS option from the PROC MEANS. 

 


@Franck1048 wrote:

I can indeed replace the missing with zeros, as I have control over the input data.

 

Using proc means I get this result :

  

Obs. Variable N Sum Mean
1 error1 10 5 50,00%
2 error2 10 7 70,00%

 

It gives me the results I want, except in a row format, instead of columns. I tried different methods to get the output in columns but I can't wrap my head around it.

 

 

Franck1048
Fluorite | Level 6

Yeah I can see it's a lot cleaner. The _N columns clutter the second option way too much. Thank you for your help

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 789 views
  • 2 likes
  • 2 in conversation