BookmarkSubscribeRSS Feed
wlierman
Lapis Lazuli | Level 10

I have a quick sql question that involves counting up several values by individual dates.

Contact_ID	     NAICS_Sector	     Sector_Type	      Date	    N
1056200	              GOODS	      AGRICULTURE FORESTRY	10-Jun-20	1
1069490		          GOODS	      AGRICULTURE FORESTRY	13-Jun-20	1
1012517		          GOODS	        MANUFACTURING	    13-Jun-20	1
1074445			      GOODS	        MANUFACTURING	    14-Jun-20	1
1072985		          GOODS	        MANUFACTURING	    14-Jun-20	1
951206			      GOODS	        MANUFACTURING	    14-Jun-20	1
1074884		          GOODS	        MANUFACTURING	    14-Jun-20	1
1074374			      GOODS	        MANUFACTURING	    14-Jun-20	1
1074312			      GOODS	        MANUFACTURING	    14-Jun-20	1
1077220			      GOODS	        MANUFACTURING	    15-Jun-20	1
1079736		          GOODS	        MANUFACTURING	    16-Jun-20	1
1083450		          GOODS	        CONSTRUCTION	    17-Jun-20	1
1087981			      GOODS	        MANUFACTURING	    18-Jun-20	1
1090459			      GOODS	      AGRICULTURE FORESTRY	18-Jun-20	1
1090413		          GOODS	      AGRICULTURE FORESTRY	18-Jun-20	1
9991088408			  GOODS	        MANUFACTURING	    18-Jun-20	1
1089620	              GOODS	        MANUFACTURING	    18-Jun-20	1
1092420	      		  GOODS	        MANUFACTURING	    19-Jun-20	1
9991088337			  GOODS	        MANUFACTURING	    19-Jun-20	1
1081349		          GOODS	        MANUFACTURING	    20-Jun-20	1
1098376			      GOODS	        MANUFACTURING	    20-Jun-20	1
1097263			      GOODS	        MANUFACTURING	    20-Jun-20	1
1098412		          GOODS	        CONSTRUCTION	    20-Jun-20	1
1080505		          GOODS	        MANUFACTURING	    20-Jun-20	1
1098407			      GOODS	        MANUFACTURING	    20-Jun-20	1
1098432			      GOODS	        CONSTRUCTION	    20-Jun-20	1

The above rows show what the data basically looks like.  What I wanted is for all the same dates the total of the count. For example,

                       19-Jun-20      2

                       20-Jun-20      6

 

and so on.  My sql code looks like

Proc sql noprint;
CREATE Table SAS_EMPL.POE_Goods_Trend AS
   SELECT Contact_Person_ID,
          County_1,
		  Race_Ethnicity,
		  NAICS_Sector,
		  Sector_Type,
		  datepart(Created_On)AS Date label = "Date" format=date9.,
		  count(distinct Contact_Person_ID) AS N
   FROM SAS_EMPL.POE_GOODS_FINI
   Where '01APR20'd <= Calculated Date <= '31JAN22'd
   And NAICS_Sector = "GOODS"
   Group by date
   Order by 1;
   quit;

But doesn't add up properly.

 

I appreciate your assistance on this question.

 

TRhank you.

 

wlierman

7 REPLIES 7
Reeza
Super User
Do you want that total added in as a new column to the table above or summarized?
Your query has a note about re-merging statistics I assume?
wlierman
Lapis Lazuli | Level 10

Yes I did want a separate column.  And my query did have the remerging note.

 

Thank you.

wlierman
Lapis Lazuli | Level 10
Yes I did want a total column.  The remerging note also appeared in the log.
Thank you.
wlierman

wlierman
Lapis Lazuli | Level 10

I do some other analytics after the total count is calculated.  Also wanted to

to give more detail of the xlsx spreadsheet.

 

Thanks.

 

wlierman

 

 

Sajid01
Meteorite | Level 14

Hello @wlierman 

You have not specified what is the error you are having OR how you want the error to be.
From the SQL code you have given,  I see that you have included the aggregate column (Contact_ID) in the select statement and your group statement does not include the non-aggregate columns. The result that comes in this scenario would be confusing or not what one would expect. I am giving a specimen code without the above two factors and you can see the difference.
Experiment a bit and I am sure you would get what you want.

Lastly, a very important fact, please provide the source data as a datastep or atleast properly formatted. I have taken only three rows, try using the complete data.

data have;
Retain Contact_ID NAICS_Sector Sector_Type Date N;
informat Date ANYDTDTE9.;
Format date date9.;
length Sector_Type $ 20;
input Contact_ID NAICS_Sector $ Sector_Type $ Date N;
datalines;
1056200 GOODS AGRICULTURE_FORESTRY 10-Jun-20 1
1069490 GOODS AGRICULTURE_FORESTRY 13-Jun-20 1
1012517 GOODS MANUFACTURING 13-Jun-20 1
;
run;
Proc sql noprint;
CREATE Table want AS
   SELECT Date, Count(Contact_ID) as Number_of_contacts,NAICS_Sector, N
   FROM have
   Where '01APR20'd <= Date <= '31JAN22'd
   And NAICS_Sector = "GOODS"
   Group by date ,NAICS_Sector,N 
   Order by 1;
   quit;

The output would be like this

Sajid01_0-1651159806871.png

 

GraphGuy
Meteorite | Level 14

If you only want the date, and the count of the number of obsns for each date, you might try something like ...

 

proc sql noprint;

create table foo as

select unique date, count(*) as count

from my_data

group by date;

quit; run;

 

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 916 views
  • 0 likes
  • 5 in conversation