Solved: PROC MEANS DISTINCT VALUES

jacobfitz · Posted 04-18-2017 03:36 PM

Hello,

I have the following data set and need to summarize in the manner of the following code. However, instead of outputting to total for the variable PS, I need the distinct total only.

proc means data=WORK.TEST noprint;

by R TC RN SS SH;

var PS;

output out= TEST1 sum (E)=E_F n(PS)=NB_PS;

run;

RN	E	TC	R	SS	SH	PS
1	707	5003	A02	SS		2028106601
1	698	5003	A02	SS		2028106601
1	1119	5003	A02	SS		2028106601
1	2099	5003	A02			2028612941
1	1370	5003	A02			2024667991
1	865	5003	A02			2028221751
1	3026	5003	A02			2028221751
1	1008	5003	A02			2025744721
1	604	6003	A09		SH	2027282411
1	1285	6003	A09		SH	2027282411
1	2107	6003	A09			2025877621
1	854	6003	A09			2025250411
1	955	6003	A09			2024224891
1	502	6003	A09			2022288221

So instead of getting the output:

R	TC	RN	SS	SH	E_F	NB_PS
A02	5003	1	SS		8368	3
A02	5003	1			2524	5
A09	6003	1		SH	1889	2
A09	6003	1			4418	4

I need the output:

R	TC	RN	SS	SH	E_F	NB_PS
A02	5003	1	SS		8368	1
A02	5003	1			2524	4
A09	6003	1		SH	1889	1
A09	6003	1			4418	4

Thank you for any help!

Reeza · Posted 04-18-2017 04:20 PM

Here's the structure for a PROC SQL solution.

Proc sql;

create table want as

select R, TC, RN, SS, SH, /*grouping variables*/

/*aggregate functions*/

sum(e) as e_f,

count(distinct PS) as N_PS

from test

group by R, TC, RN, SS, SH

quit;

View solution in original post

Reeza · Posted 04-18-2017 03:41 PM

Explain your logic please?

Where does RN come from?

Do you need the SUM for all records or only the distinct records and how do you identify the distinct records if this is the case?

Proc means cannot do a distinct count, you can try SQL or a double PROC FREQ instead.

jacobfitz · Posted 04-18-2017 03:46 PM

RN is a number assigned to each PS. It can be 1,51,55 etc. In this specific example they are all 1.

I need E_F to be the sum of all E but NB_PS needs to be the count of distinct PS. I am not sure how to get the output I need using SQL or PROC FREQ.

Reeza · Posted 04-18-2017 04:20 PM

Here's the structure for a PROC SQL solution.

Proc sql;

create table want as

select R, TC, RN, SS, SH, /*grouping variables*/

/*aggregate functions*/

sum(e) as e_f,

count(distinct PS) as N_PS

from test

group by R, TC, RN, SS, SH

quit;

jacobfitz · Posted 04-18-2017 04:51 PM

Thank you! This is exactly what I needed.

PaigeMiller · Posted 04-18-2017 03:43 PM

This is a job for PROC FREQ, with a BY statement.

The first run of PROC FREQ will tell you, for example, that 2028106601 appears 3 times in it's BY group. The second run of PROC FREQ will tell you that there is only one value of NB_PS in that BY group.

--
Paige Miller

PROC MEANS DISTINCT VALUES

Re: PROC MEANS DISTINCT VALUES

Re: PROC MEANS DISTINCT VALUES

Re: PROC MEANS DISTINCT VALUES

Re: PROC MEANS DISTINCT VALUES

Re: PROC MEANS DISTINCT VALUES

Re: PROC MEANS DISTINCT VALUES

SAS Innovate 2025: Call for Content

Classroom Training Available!