BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
vallsas
Pyrite | Level 9

Hi All,

I have a datetime variable and would like to use datepart(datetime) as monthyear and applied as monyy7. format in proc sql. 

in the second step I was doing grouping and where clause on monthyear >012022

but not working.  It works only >'01jan2022'd 

also my grouping giving by date wise when i check the data so what i want by group by monthyear .

I thought it would group by monthyear but internally it store full value of date(01jan2022) . so when we do group by monthyear then my count values coming by date not by monthyear.

note:  I tried like month(datepart(date)) as month and year(datepart(date)) as year

and combined as cat(month,year) as monthyear but it will be converted to character, cant use in where statement > < symbols (where monthyear>012022 not working as well.

 

I would like to do it in proc sql only this grouping. appreciate your thoughts pls.

 

1 ACCEPTED SOLUTION

Accepted Solutions
FreelanceReinh
Jade | Level 19

As shown in my first reply (just for a different summary function) you can get the result from HAVE also in one step.

 

Here is a variant of that using a WHERE clause instead of a HAVING clause:

proc sql;
select intnx('month',datepart(datetime),0) format=monyy7. as monthyear, count(distinct x) as s
from have
where datetime>='01FEB2022:0:0'dt
group by monthyear;
quit;

View solution in original post

13 REPLIES 13
tarheel13
Rhodochrosite | Level 12

show your sample data please and code. 

FreelanceReinh
Jade | Level 19

Hi @vallsas,

 

Let me first create sample data to have something to work with:

data have;
input datetime :datetime. x;
format datetime datetime19.;
cards;
05JAN2022:12:34:56 8
31JAN2022:13:45:00 2
05FEB2022:12:34:56 7
28FEB2022:13:45:00 5
05MAR2022:12:34:56 3
28MAR2022:13:45:00 4
05APR2022:12:34:56 6
28APR2022:13:45:00 9
;

From this HAVE dataset you want to create an aggregated table with a grouping variable MONTHYEAR by means of PROC SQL? Then MONTHYEAR must have the same value for all datetimes in a month. Hence datepart(datetime) is not suitable, even if formatted with the MONYY7. format, because the internal values, not the formatted values would be used to group observations.

 

But you could map all datetimes to the first day of the month:

proc sql;
select intnx('month',datepart(datetime),0) format=monyy7. as monthyear, sum(x) as s
from have
group by monthyear
having monthyear>'01JAN2022'd;
quit;

Result:

monthyear         s
-------------------
  FEB2022        12
  MAR2022         7
  APR2022        15

Note that the rows are sorted in chronological order, as they should, because the internal values of MONTHYEAR are SAS date values. This is also the reason why the HAVING clause uses a date literal.

vallsas
Pyrite | Level 9

 

proc sql;

create table new as select a.id, datepart(datetime) as monthyear format=monyy7.,

b.id,b.age,b.cti_id

from T1TAB LEFT JOIN T2TAB

ON a.id=b.id

quit;

proc sql;

create table new1 as select monthyear,cti_id,count(distinct id) as countid, count(distinct age) as coutage from new

where monthyear >'01jan2022'd

group by monthyear,cti_id;

quit;

here results coming 

jan2022 but it wasnot grouping by monthyear when i see the data i was grouping by date .

so i get more number of observation as summary of count.

FreelanceReinh
Jade | Level 19

So in your second PROC SQL step you would select intnx('month',monthyear,0) format=monyy7. either as a new variable (e.g. monthyear1) or again as monthyear. In the latter case you would need to refer to calculated monthyear in the GROUP BY clause. The HAVING clause would replace the WHERE clause:

group by calculated monthyear, cti_id
having monthyear>'01JAN2022'd;

Edit: Or adapt the WHERE clause to the fact that it uses the internal values of the original monthyear variable:

where monthyear>'31JAN2022'd
group by calculated monthyear, cti_id;
vallsas
Pyrite | Level 9
thank you so very much, i wll test it on monday and reply .
vallsas
Pyrite | Level 9
Hi 

please check the error.

 proc sql;
 select intnx('month',datepart(datetime),0) format=monyy7. as monthyear,count(distinct x) as s
 from have
 where monthyear>'31JAN2022'd
group by calculated monthyear, x;
ERROR: The following columns were not found in the contributing tables: monthyear.
FreelanceReinh
Jade | Level 19

If variable MONTHYEAR is not contained in the input dataset HAVE, but just being created in the current SELECT statement, you need to write calculated monthyear in a WHERE clause (as you did redundantly in the GROUP BY clause). Alternatively, you can use the HAVING clause (but the WHERE clause should be more efficient). My suggested code with "where monthyear ..." referred to your second PROC SQL step where MONTHYEAR had already been created in the preceding step.

 

Also, are you sure that you want to include x in the GROUP BY clause when your summary statistic is count(distinct x)?

vallsas
Pyrite | Level 9
HI Thankyou.
here is the code i executed two steps separtely but in the second step got error as I would like to check count by distinct x with monthyear >jan2022 (here we cant pass as monthyear as jan2022 so we put 31jan2022,
but data should full by monthyear all the feb count should be 2, march 2,april 2 and by monthyear.

proc sql;
create table tt as
select intnx('month',datepart(datetime),0) format=monyy7. as monthyear, x
from have;

quit;
MONTHYEAR X
1 JAN2022 8
2 JAN2022 2
3 FEB2022 7
4 FEB2022 5
5 MAR2022 3
6 MAR2022 4
7 APR2022 6
8 APR2022 9

69 proc sql;
70 select intnx('month',monthyear,0) format=monyy7. as monthyear,count(distinct x) as s
71 from tt
72 where monthyear>'31JAN2022'd
73 group by calculated monthyear, S;
ERROR: Summary functions are restricted to the SELECT and HAVING clauses only.
Kurt_Bremser
Super User

Since the group determines the result of a summary function, you cannot use this result to determine the group. You created a circular definition.

Please provide example data (as shown, in a data step with datalines), and what you want to get out of it.

FreelanceReinh
Jade | Level 19

Just omit S in the GROUP BY clause.

 

Using the HAVE dataset from my earlier post I assume you have created something like:

proc sql;
create table tt as
select datepart(datetime) format=monyy7. as monthyear, x
from have;
quit;

Now you can count the distinct values of x per month:

proc sql;
select intnx('month',monthyear,0) format=monyy7. as monthyear, count(distinct x) as s
from tt
where monthyear>'31JAN2022'd
group by calculated monthyear;
quit;

Result:

monthyear         s
-------------------
  FEB2022         2
  MAR2022         2
  APR2022         2
vallsas
Pyrite | Level 9
Yes, now got the expected output. only thing is here we have to do in two steps on month ..
you have give genius solution as most of them think to get on the same step of first but here by creating the second step we have got expected output. even group by gives unformatted values but now we overcome that and got the result by using this method.
excellent well appreciated your effort.
thank you.
FreelanceReinh
Jade | Level 19

As shown in my first reply (just for a different summary function) you can get the result from HAVE also in one step.

 

Here is a variant of that using a WHERE clause instead of a HAVING clause:

proc sql;
select intnx('month',datepart(datetime),0) format=monyy7. as monthyear, count(distinct x) as s
from have
where datetime>='01FEB2022:0:0'dt
group by monthyear;
quit;
Kurt_Bremser
Super User

Other than the statistical procedures like FREQ or MEANS, SQL GROUP BY will use the raw, unformatted values. So you need to create the new variable either as character with the formatted values, or use INTNX to align the dates to the first of the month.

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 13 replies
  • 5341 views
  • 2 likes
  • 4 in conversation