BookmarkSubscribeRSS Feed
wlierman
Lapis Lazuli | Level 10

I have a quick sql question that involves counting up several values by individual dates.

Contact_ID	     NAICS_Sector	     Sector_Type	      Date	    N
1056200	              GOODS	      AGRICULTURE FORESTRY	10-Jun-20	1
1069490		          GOODS	      AGRICULTURE FORESTRY	13-Jun-20	1
1012517		          GOODS	        MANUFACTURING	    13-Jun-20	1
1074445			      GOODS	        MANUFACTURING	    14-Jun-20	1
1072985		          GOODS	        MANUFACTURING	    14-Jun-20	1
951206			      GOODS	        MANUFACTURING	    14-Jun-20	1
1074884		          GOODS	        MANUFACTURING	    14-Jun-20	1
1074374			      GOODS	        MANUFACTURING	    14-Jun-20	1
1074312			      GOODS	        MANUFACTURING	    14-Jun-20	1
1077220			      GOODS	        MANUFACTURING	    15-Jun-20	1
1079736		          GOODS	        MANUFACTURING	    16-Jun-20	1
1083450		          GOODS	        CONSTRUCTION	    17-Jun-20	1
1087981			      GOODS	        MANUFACTURING	    18-Jun-20	1
1090459			      GOODS	      AGRICULTURE FORESTRY	18-Jun-20	1
1090413		          GOODS	      AGRICULTURE FORESTRY	18-Jun-20	1
9991088408			  GOODS	        MANUFACTURING	    18-Jun-20	1
1089620	              GOODS	        MANUFACTURING	    18-Jun-20	1
1092420	      		  GOODS	        MANUFACTURING	    19-Jun-20	1
9991088337			  GOODS	        MANUFACTURING	    19-Jun-20	1
1081349		          GOODS	        MANUFACTURING	    20-Jun-20	1
1098376			      GOODS	        MANUFACTURING	    20-Jun-20	1
1097263			      GOODS	        MANUFACTURING	    20-Jun-20	1
1098412		          GOODS	        CONSTRUCTION	    20-Jun-20	1
1080505		          GOODS	        MANUFACTURING	    20-Jun-20	1
1098407			      GOODS	        MANUFACTURING	    20-Jun-20	1
1098432			      GOODS	        CONSTRUCTION	    20-Jun-20	1

The above rows show what the data basically looks like.  What I wanted is for all the same dates the total of the count. For example,

                       19-Jun-20      2

                       20-Jun-20      6

 

and so on.  My sql code looks like

Proc sql noprint;
CREATE Table SAS_EMPL.POE_Goods_Trend AS
   SELECT Contact_Person_ID,
          County_1,
		  Race_Ethnicity,
		  NAICS_Sector,
		  Sector_Type,
		  datepart(Created_On)AS Date label = "Date" format=date9.,
		  count(distinct Contact_Person_ID) AS N
   FROM SAS_EMPL.POE_GOODS_FINI
   Where '01APR20'd <= Calculated Date <= '31JAN22'd
   And NAICS_Sector = "GOODS"
   Group by date
   Order by 1;
   quit;

But doesn't add up properly.

 

I appreciate your assistance on this question.

 

TRhank you.

 

wlierman

7 REPLIES 7
Reeza
Super User
Do you want that total added in as a new column to the table above or summarized?
Your query has a note about re-merging statistics I assume?
wlierman
Lapis Lazuli | Level 10

Yes I did want a separate column.  And my query did have the remerging note.

 

Thank you.

wlierman
Lapis Lazuli | Level 10
Yes I did want a total column.  The remerging note also appeared in the log.
Thank you.
wlierman

wlierman
Lapis Lazuli | Level 10

I do some other analytics after the total count is calculated.  Also wanted to

to give more detail of the xlsx spreadsheet.

 

Thanks.

 

wlierman

 

 

Sajid01
Meteorite | Level 14

Hello @wlierman 

You have not specified what is the error you are having OR how you want the error to be.
From the SQL code you have given,  I see that you have included the aggregate column (Contact_ID) in the select statement and your group statement does not include the non-aggregate columns. The result that comes in this scenario would be confusing or not what one would expect. I am giving a specimen code without the above two factors and you can see the difference.
Experiment a bit and I am sure you would get what you want.

Lastly, a very important fact, please provide the source data as a datastep or atleast properly formatted. I have taken only three rows, try using the complete data.

data have;
Retain Contact_ID NAICS_Sector Sector_Type Date N;
informat Date ANYDTDTE9.;
Format date date9.;
length Sector_Type $ 20;
input Contact_ID NAICS_Sector $ Sector_Type $ Date N;
datalines;
1056200 GOODS AGRICULTURE_FORESTRY 10-Jun-20 1
1069490 GOODS AGRICULTURE_FORESTRY 13-Jun-20 1
1012517 GOODS MANUFACTURING 13-Jun-20 1
;
run;
Proc sql noprint;
CREATE Table want AS
   SELECT Date, Count(Contact_ID) as Number_of_contacts,NAICS_Sector, N
   FROM have
   Where '01APR20'd <= Date <= '31JAN22'd
   And NAICS_Sector = "GOODS"
   Group by date ,NAICS_Sector,N 
   Order by 1;
   quit;

The output would be like this

Sajid01_0-1651159806871.png

 

GraphGuy
Meteorite | Level 14

If you only want the date, and the count of the number of obsns for each date, you might try something like ...

 

proc sql noprint;

create table foo as

select unique date, count(*) as count

from my_data

group by date;

quit; run;

 

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 535 views
  • 0 likes
  • 5 in conversation