Solved: proc sql datepart(date) as monthyear error

vallsas · Posted 04-01-2022 06:02 AM

Hi All,

I have a datetime variable and would like to use datepart(datetime) as monthyear and applied as monyy7. format in proc sql.

in the second step I was doing grouping and where clause on monthyear >012022

but not working. It works only >'01jan2022'd

also my grouping giving by date wise when i check the data so what i want by group by monthyear .

I thought it would group by monthyear but internally it store full value of date(01jan2022) . so when we do group by monthyear then my count values coming by date not by monthyear.

note: I tried like month(datepart(date)) as month and year(datepart(date)) as year

and combined as cat(month,year) as monthyear but it will be converted to character, cant use in where statement > < symbols (where monthyear>012022 not working as well.

I would like to do it in proc sql only this grouping. appreciate your thoughts pls.

FreelanceReinh · Posted 04-02-2022 02:14 PM

As shown in my first reply (just for a different summary function) you can get the result from HAVE also in one step.

Here is a variant of that using a WHERE clause instead of a HAVING clause:

proc sql;
select intnx('month',datepart(datetime),0) format=monyy7. as monthyear, count(distinct x) as s
from have
where datetime>='01FEB2022:0:0'dt
group by monthyear;
quit;

View solution in original post

tarheel13 · Posted 04-01-2022 06:47 AM

show your sample data please and code.

FreelanceReinh · Posted 04-01-2022 07:10 AM

Hi @vallsas,

Let me first create sample data to have something to work with:

data have;
input datetime :datetime. x;
format datetime datetime19.;
cards;
05JAN2022:12:34:56 8
31JAN2022:13:45:00 2
05FEB2022:12:34:56 7
28FEB2022:13:45:00 5
05MAR2022:12:34:56 3
28MAR2022:13:45:00 4
05APR2022:12:34:56 6
28APR2022:13:45:00 9
;

From this HAVE dataset you want to create an aggregated table with a grouping variable MONTHYEAR by means of PROC SQL? Then MONTHYEAR must have the same value for all datetimes in a month. Hence datepart(datetime) is not suitable, even if formatted with the MONYY7. format, because the internal values, not the formatted values would be used to group observations.

But you could map all datetimes to the first day of the month:

proc sql;
select intnx('month',datepart(datetime),0) format=monyy7. as monthyear, sum(x) as s
from have
group by monthyear
having monthyear>'01JAN2022'd;
quit;

Result:

monthyear         s
-------------------
  FEB2022        12
  MAR2022         7
  APR2022        15

Note that the rows are sorted in chronological order, as they should, because the internal values of MONTHYEAR are SAS date values. This is also the reason why the HAVING clause uses a date literal.

vallsas · Posted 04-01-2022 07:26 AM

proc sql;

create table new as select a.id, datepart(datetime) as monthyear format=monyy7.,

b.id,b.age,b.cti_id

from T1TAB LEFT JOIN T2TAB

ON a.id=b.id

quit;

proc sql;

create table new1 as select monthyear,cti_id,count(distinct id) as countid, count(distinct age) as coutage from new

where monthyear >'01jan2022'd

group by monthyear,cti_id;

quit;

here results coming

jan2022 but it wasnot grouping by monthyear when i see the data i was grouping by date .

so i get more number of observation as summary of count.

FreelanceReinh · Posted 04-01-2022 07:50 AM

So in your second PROC SQL step you would select intnx('month',monthyear,0) format=monyy7. either as a new variable (e.g. monthyear1) or again as monthyear. In the latter case you would need to refer to calculated monthyear in the GROUP BY clause. The HAVING clause would replace the WHERE clause:

group by calculated monthyear, cti_id
having monthyear>'01JAN2022'd;

Edit: Or adapt the WHERE clause to the fact that it uses the internal values of the original monthyear variable:

where monthyear>'31JAN2022'd
group by calculated monthyear, cti_id;

vallsas · Posted 04-01-2022 08:34 AM

thank you so very much, i wll test it on monday and reply .

vallsas · Posted 04-02-2022 08:33 AM

Hi

please check the error.

proc sql;
select intnx('month',datepart(datetime),0) format=monyy7. as monthyear,count(distinct x) as s
from have
where monthyear>'31JAN2022'd
group by calculated monthyear, x;
ERROR: The following columns were not found in the contributing tables: monthyear.

FreelanceReinh · Posted 04-02-2022 09:02 AM

If variable MONTHYEAR is not contained in the input dataset HAVE, but just being created in the current SELECT statement, you need to write calculated monthyear in a WHERE clause (as you did redundantly in the GROUP BY clause). Alternatively, you can use the HAVING clause (but the WHERE clause should be more efficient). My suggested code with "where monthyear ..." referred to your second PROC SQL step where MONTHYEAR had already been created in the preceding step.

Also, are you sure that you want to include x in the GROUP BY clause when your summary statistic is count(distinct x)?

vallsas · Posted 04-02-2022 12:10 PM

HI Thankyou.
here is the code i executed two steps separtely but in the second step got error as I would like to check count by distinct x with monthyear >jan2022 (here we cant pass as monthyear as jan2022 so we put 31jan2022,
but data should full by monthyear all the feb count should be 2, march 2,april 2 and by monthyear.

proc sql;
create table tt as
select intnx('month',datepart(datetime),0) format=monyy7. as monthyear, x
from have;

quit;
MONTHYEAR X
1 JAN2022 8
2 JAN2022 2
3 FEB2022 7
4 FEB2022 5
5 MAR2022 3
6 MAR2022 4
7 APR2022 6
8 APR2022 9

69 proc sql;
70 select intnx('month',monthyear,0) format=monyy7. as monthyear,count(distinct x) as s
71 from tt
72 where monthyear>'31JAN2022'd
73 group by calculated monthyear, S;
ERROR: Summary functions are restricted to the SELECT and HAVING clauses only.

Kurt_Bremser · Posted 04-02-2022 12:22 PM

Since the group determines the result of a summary function, you cannot use this result to determine the group. You created a circular definition.

Please provide example data (as shown, in a data step with datalines), and what you want to get out of it.

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

FreelanceReinh · Posted 04-02-2022 01:19 PM

Just omit S in the GROUP BY clause.

Using the HAVE dataset from my earlier post I assume you have created something like:

proc sql;
create table tt as
select datepart(datetime) format=monyy7. as monthyear, x
from have;
quit;

Now you can count the distinct values of x per month:

proc sql;
select intnx('month',monthyear,0) format=monyy7. as monthyear, count(distinct x) as s
from tt
where monthyear>'31JAN2022'd
group by calculated monthyear;
quit;

Result:

monthyear         s
-------------------
  FEB2022         2
  MAR2022         2
  APR2022         2

vallsas · Posted 04-02-2022 02:01 PM

Yes, now got the expected output. only thing is here we have to do in two steps on month ..
you have give genius solution as most of them think to get on the same step of first but here by creating the second step we have got expected output. even group by gives unformatted values but now we overcome that and got the result by using this method.
excellent well appreciated your effort.
thank you.

FreelanceReinh · Posted 04-02-2022 02:14 PM

As shown in my first reply (just for a different summary function) you can get the result from HAVE also in one step.

Here is a variant of that using a WHERE clause instead of a HAVING clause:

proc sql;
select intnx('month',datepart(datetime),0) format=monyy7. as monthyear, count(distinct x) as s
from have
where datetime>='01FEB2022:0:0'dt
group by monthyear;
quit;

Kurt_Bremser · Posted 04-01-2022 08:39 AM

Other than the statistical procedures like FREQ or MEANS, SQL GROUP BY will use the raw, unformatted values. So you need to create the new variable either as character with the formatted values, or use INTNX to align the dates to the first of the month.

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

proc sql datepart(date) as monthyear error

Re: proc sql datepart(date) as monthyear error

Re: proc sql datepart(date) as monthyear error

Re: proc sql datepart(date) as monthyear error

Re: proc sql datepart(date) as monthyear error

Re: proc sql datepart(date) as monthyear error

Re: proc sql datepart(date) as monthyear error

Re: proc sql datepart(date) as monthyear error

Re: proc sql datepart(date) as monthyear error

Re: proc sql datepart(date) as monthyear error

Re: proc sql datepart(date) as monthyear error

Re: proc sql datepart(date) as monthyear error

Re: proc sql datepart(date) as monthyear error

Re: proc sql datepart(date) as monthyear error

Re: proc sql datepart(date) as monthyear error

SAS Innovate 2025: Save the Date

SAS Training: Just a Click Away