BookmarkSubscribeRSS Feed
monday89
Fluorite | Level 6

I have the following table:

 

NameDATEactual
11111/2/20191
11111/2/20191
11111/4/20190
11111/5/20190
11111/6/20190
22221/4/20191
22221/5/20191
22221/6/20191
33331/6/20191
33331/10/20190
33331/10/20190
33331/13/20191
33331/14/20191
33331/15/20191

 

I want total counts per Name (if two dates are the same twice, count as twice) and then only 1 for actual (sum) 

 

NAMETOTAL_COUNTSACTUAL_COUNTS
111152
222233
333364

 

I tried the nested loop but no luck

 


proc sql;

select DISTINCT
a.NAME,
count(a.DATE) as Total_counts,
a.actual_counts
from (select distinct a.NAME,
count(a.actual) as actual_counts
from sql.p_IE_ELIGIBILITY a
where a.actual="1"
group by a.NAME)
group by a.NAME as a full join
(select distinct a.NAME, count(*) as total from TEST) as b
on a.NAME = b.NAME;
quit;

13 REPLIES 13
PaigeMiller
Diamond | Level 26

No looping is needed here. You don't have to write your own code to perform simple operations like counting. SAS has many many built-in procedures that will do what you want, which is counting. It would certainly be a good idea for you to familarize yourself with many of the basic procedures, such as PROC MEANS/PROC SUMMARY and PROC FREQ, as this will enable you to work a lot faster than writing your own looping code.

 

PROC SUMMARY solution

 

proc summary data=have nway;
    class name;
    var actual;
    output out=want n=total_counts sum=actual_counts;
run;

 

 

--
Paige Miller
novinosrin
Tourmaline | Level 20

data have;
input Name	DATE :mmddyy10.	actual;
cards;
1111	1/2/2019	1
1111	1/2/2019	1
1111	1/4/2019	0
1111	1/5/2019	0
1111	1/6/2019	0
2222	1/4/2019	1
2222	1/5/2019	1
2222	1/6/2019	1
3333	1/6/2019	1
3333	1/10/2019	0
3333	1/10/2019	0
3333	1/13/2019	1
3333	1/14/2019	1
3333	1/15/2019	1
;

proc sql;
create table want as
select name,count(name) as TOTAL_COUNTS,sum(actual) as ACTUAL_COUNTS
from have
group by name;
quit;
JeanDo
Obsidian | Level 7

Hello,

Did you try this:

 

proc sql;

select DISTINCT
NAME,
count(DATE) as Total_counts,
sum(actual_counts) as actual_counts
from sql.p_IE_ELIGIBILITY
group by name; quit;

 I think it will be easier!

 

Regards,

 

JD

yabwon
Onyx | Level 15

Hi,

 

do you mean something like this:

 

data have;
input
Name : $ DATE : mmddyy.	actual
;
format date yymmdd10;
cards;
1111 1/2/2019 1
1111 1/2/2019 1
1111 1/4/2019 0
1111 1/5/2019 0
1111 1/6/2019 0
2222 1/4/2019 1
2222 1/5/2019 1
2222 1/6/2019 1
3333 1/6/2019 1
3333 1/10/2019 0
3333 1/10/2019 0
3333 1/13/2019 1
3333 1/14/2019 1
3333 1/15/2019 1
;
run;

proc sql;
select 
  name
, count(date) as TOTAL_COUNTS
, count(distinct case when actual = 1 then date else . end) as ACT_COUNTS
from 
have
group by name
;
quit;

output:

Name TOTAL_COUNTS ACT_COUNTS 
1111 5 1 
2222 3 3 
3333 6 4 
_______________
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug

"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings

SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation



ed_sas_member
Meteorite | Level 14

Hi @monday89 

 

You can do it either through a proc means or a proc sql:

proc means data=have noprint;
	var actual;
	class Name;
	ways 1;
	output out=want (drop =_type_ _freq_) n=TOTAL_COUNTS sum=ACTUAL_COUNTS;
run;
proc sql;
	create table want as
	select name,
		   count(actual) as TOTAL_COUNTS,
		   sum(actual) as ACTUAL_COUNTS
	from have
	group by name;
quit;

 Capture d’écran 2020-02-05 à 15.37.28.png

 

mkeintz
PROC Star

Allow me to offer some unsolicited advice, whose bottom line is learn to use the proc summary (identical to "proc means") solution offered by @PaigeMiller and @ed_sas_member .

 

Sure SQL can be used to do this task, but analysis and tabulation of this sort are the raison d'etre of SAS.  In  particular proc summary/means can do lots of aggregation and many statistics that would require twisting yourself into knots when using SQL.

 

Using SQL to solve many common (and not-so-common) tabulation tasks is like treating all problems as nails suitable for the SQL hammer.  Yes, it's a nice hammer, but the other tools are there for good reason.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
PaigeMiller
Diamond | Level 26

@mkeintz wrote:

Allow me to offer some unsolicited advice, whose bottom line is learn to use the proc summary (identical to "proc means") solution offered by @PaigeMiller and @ed_sas_member .

 

Sure SQL can be used to do this task, but analysis and tabulation of this sort are the raison d'etre of SAS.  In  particular proc summary/means can do lots of aggregation and many statistics that would require twisting yourself into knots when using SQL.

 

Using SQL to solve many common (and not-so-common) tabulation tasks is like treating all problems as nails suitable for the SQL hammer.  Yes, it's a nice hammer, but the other tools are there for good reason.


In addition, if you have a case where you need to slice the data two (or more) different ways — for example, slice the data by state to compute state-level statistics; and also slice the data by county to produce county level statistics — SQL requires two PROC SQL blocks and two passes through the data, while PROC SUMMARY requires one PROC SUMMARY block and one pass through the data. Not only that, but SUMMARY handles missing values and computes weighted statistics properly while SQL doesn't always do that; and SUMMARY has many built in statistics that I don't think SQL has built-in (although since I don't use SQL for this purpose, maybe I'm wrong, can SQL compute the 99th percentile statistic?).

--
Paige Miller
yabwon
Onyx | Level 15

Hi @PaigeMiller ,

 

According to the SAS 9.4M6 doc this are SQL's aggregation functions:

FunctionDefinition
AVG, MEAN
mean or average of values
COUNT, FREQ, N
number of nonmissing values
CSS
corrected sum of squares
CV
coefficient of variation (percent)
MAX
largest value
MIN
smallest value
NMISS
number of missing values
PRT
probability of a greater absolute value of Student's t
RANGE
range of values
STD
standard deviation
STDERR
standard error of the mean
SUM
sum of values
SUMWGT
sum of the WEIGHT variable values
T
Student's t value for testing the hypothesis that the population mean is zero
USS
uncorrected sum of squares
VAR
variance

so no quantiles 🙂

 

_______________
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug

"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings

SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation



PaigeMiller
Diamond | Level 26

The function MEDIAN() works in SQL, but not P99 and not MODE. Also, PROC SUMMARY contains ability to compute confidence intervals and t-statistics and their probability.

--
Paige Miller
yabwon
Onyx | Level 15

Hi @monday89 ,

 

you wrote:

I want total counts per Name (if two dates are the same twice, count as twice) and then only 1 for actual (sum) 

in such case should the value of ACTUAL_COUNTS for 1111 be 2? since the first date (1/2/2019) is doubled shouldn't it be 1?

 

 

_______________
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug

"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings

SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation



novinosrin
Tourmaline | Level 20

Hi @yabwon  I like your diligence and wasn't sure of that part too. The WANT shown by OP seems to suggest a straight forward GROUP BY sum/count respectively. 

yabwon
Onyx | Level 15

@novinosrin 

 

That line of "specification" was the reason why I went into:

count(distinct case when actual = 1 then date else . end) as ACT_COUNTS

instead classic aggregation procedure like summary/means or tabulate. Lack of handling "twisted" aggregation logic is the reason why sometimes SQL "outperforms" good classic procedures. I wish I could use something like `variable * n(distinct)` in tabulate.

 

_______________
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug

"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings

SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation



novinosrin
Tourmaline | Level 20

Fully agree. I might place my bet on your understanding to be the right comprehension of the requirement. After all, looking at OP's use of SQL syntax, it appears OP does seems know SQL and may not be the straight forward summary stat assumption of mine and others. Hmm time for a coffee break? Oh rather a corona beer to beat the corona virus. 🙂

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 13 replies
  • 1366 views
  • 6 likes
  • 7 in conversation