Yes. here it is
Id | Name | RecordTypeId | CreatedDate | First_Issued_Policy__c |
OzQAO | John | hKQCQ | 08APR2019:16:32:27 | . |
P4QAO | Michael | hJQCQ | 08APR2019:19:14:37 | . |
PEQA4 | Sam | fgQCA | 09APR2019:10:28:26 | . |
PFQA4 | Elia | fgQCA | 09APR2019:10:28:26 | . |
PGQA4 | Julie | fgQCA | 09APR2019:10:28:26 | 3-Jun-19 |
PHQA4 | Chris | fgQCA | 29OCT2016:17:28:31 | 8-Jul-21 |
PIQA4 | Tyler | fgQCA | 09APR2019:10:28:26 | 21-Sep-22 |
PJQA4 | Susan | fgQCA | 09APR2019:10:28:26 | . |
PKQA4 | Laura | fgQCA | 09APR2019:10:28:26 | 24-Jan-20 |
PLQA4 | Mike | fgQCA | 09APR2019:10:28:26 | 11-Aug-22 |
Please post the input data as CARDS data steps, like:
data Contact ;
input ID : $5. Name : $8. RecordTypeID : $5. CreatedDate : datetime. FirstIssued : date9.;
format FirstIssued date9. CreatedDate datetime. ;
cards ;
OzQAO John hKQCQ 08APR2019:16:32:27 .
P4QAO Michael hJQCQ 08APR2019:19:14:37 .
PEQA4 Sam fgQCA 09APR2019:10:28:26 .
PFQA4 Elia fgQCA 09APR2019:10:28:26 .
PGQA4 Julie fgQCA 09APR2019:10:28:26 3-Jun-19
PHQA4 Chris fgQCA 29OCT2016:17:28:31 8-Jul-21
PIQA4 Tyler fgQCA 09APR2019:10:28:26 21-Sep-22
PJQA4 Susan fgQCA 09APR2019:10:28:26 .
PKQA4 Laura fgQCA 09APR2019:10:28:26 24-Jan-20
PLQA4 Mike fgQCA 09APR2019:10:28:26 11-Aug-22
;
run ;
It looks like you have two other input datasets as well, Policy_Std and Policy_Agent_Std. Can you post them (as CARDS data steps)?
Then after you have posted the input datasets, please describe the output you want.
data policy_std; input Record_Date__c : Date9. Id : $6.; format Record_Date__c date9. Id $6.; cards; 24JAN2023 BzuMBBT 24JAN2023 BzuPBBT 20FEB2023 BzuWBBT 15MAR2022 tmJQBQ 07Apr2023 tmLQBQ 12Aug2021 EtGBBU 27Oct2022 pAYBBY ; run;
and policy_agent_std
data policy_agent_std; input Oppotunity__c $6. Contact__c $6. Amount 6.; format Oppotunity__c $6. Contact $6. ; cards; BzuWBB P4QAO 138308 tmJQBQ PHQA4 100000 tmLQBQ PKQA4 61900 EtGBBU CkRE9 135292 nGxQBI OzQAO 40645 zbvQBC mGGAH 13786 ; run;
For each month between Jul2021 and Jun2023, I need sum(Amount), count(distinct Contact), count(Id) on some WHERE conditions
Date | Sum(A.Amount) – a field from dataset A | Count (A.Contact) - this depends on table CONTACT so that’s why I want a whole table for 23 months or 23 separate tables | Count (Id) for each of the months |
Jul2021 |
| 10,000 | 100,000 |
… |
| … | … |
June2023 |
| 15,000 | 120,000 |
Thanks!
For the sample data you've posted, can you show the results you would want?
That is, fill in the table you want as output with a few rows (which I guess means a few months).
With the 3 datasets having only a few rows and the createddate column in the first dataset does not have any date after Jul2021, it would no return anything. However, assume all of the dates in Contact are after Jul2021, see below. I've posted my original code that did not work as expected in a different reply. Thanks!
Date | Sum(A.Amount) – a field from dataset A | Count (A.Contact) - this depends on table CONTACT so that’s why I want a whole table for 23 months or 23 separate tables | Count (Id) for each of the months |
Jul2021 | 0 | 0 | 0 |
Aug2021 | 0 | 0 | 0 |
... | ... | ... | ... |
Mar2023 | 61900 | 1 | 15 (as an example) |
... | ... | ... | ... |
Jan2023 | 238308 | 2 | 50
|
@HG66 wrote:
With the 3 datasets having only a few rows and the createddate column in the first dataset does not have any date after Jul2021, it would no return anything.
The goal here is for you to make a small, workable example, that illustrates the input data you have, and the output you would want.
Can you post three input datasets (as CARDS data steps) that would work together, and post the output you would want from that input? I would think it could be done with 10-20 records, covering say 3 months.
A complete example with input data and output data will help people understand what you're trying to do. It takes work to make workable example, but doing so will help people help you.
So let's use simpler names for you datasets and variables and eliminate that 7th letter from some of the id's that you aren't reading so the example is clearer.
So you have POLICY dataset that has the DATE variable. And a CONTRACT dataset that has the AMOUNT.
data policy;
length id $6 ;
input date :date. id;
format date date9. ;
cards;
24JAN2023 BzuMBB
24JAN2023 BzuPBB
20FEB2023 BzuWBB
15MAR2022 tmJQBQ
07Apr2023 tmLQBQ
12Aug2021 EtGBBU
27Oct2022 pAYBBY
;
data contract;
length id $6 contract $6. ;
input id contract amount;
cards;
BzuWBB P4QAO 138308
tmJQBQ PHQA4 100000
tmLQBQ PKQA4 61900
EtGBBU CkRE9 135292
nGxQBI OzQAO 40645
zbvQBC mGGAH 13786
;
You can put them together an group by the MONTH.
proc sql;
create table want as
select put(date,yymm7.) as Month
, count(distinct a.id) as n_ids
, count(distinct b.contract) as n_contracts
, sum(amount) as total
from policy a
left join contract b
on a.id = b.id
where date between '01JAN2021'd and '31DEC2023'd
group by 1
;
quit;
Results
Obs Month n_ids n_contracts total 1 2021M08 1 1 135292 2 2022M03 1 1 100000 3 2022M10 1 0 . 4 2023M01 2 0 . 5 2023M02 1 1 138308 6 2023M04 1 1 61900
This is great and I was able to run it. However, I'm curious to know if there's a way I don't have to create a table and insert each row to it?
Since it looks like you are just counting OBSERVATIONS (you use ID in the COUNT() function, but I doubt you have any observations that have a missing value of ID) you can probably just use a FORMAT.
If you just want creation counts by month you can just use:
proc freq data=contact;
where &startdate <= intnx('month',createddate,0,'b') <= &enddate ;
tables createddate;
format createddate monyy7.;
run;
But since you appear to want OVERLAPPING intervals you will need to use a multilevel format.
%LET StartDate = '01Jul2021'd;
%Let EndDate = '01Jun2023'd;
data format;
fmtname='ASOF';
hlo='SML';
start=.;
do offset=0 to intck('month',&StartDate., &EndDate.);
end=intnx('month',&StartDate.,offset,'E');
label=put(end,yymm7.);
output;
end;
run;
Now you can get your counts using a procedure like TABULATE, REPORT or MEANS that can use multilevel formats.
proc means data=contact nway noprint;
class createddate / mlf ;
format createddate asof.;
output out=want n=Members;
run;
proc print;
var createdate members;
run;
First, you have not explained what calculations you are trying to do. So explain what you are trying to do, without talking about SAS code, in words. In your table above, you don't say how to compute w x y and z. Are they simply counts? You haven't stated that, and we shouldn't have to guess.
Macros are not a good choice to replace BY group processing, which seems like what you are trying to do (again, I'm guessing) and PROC SQL is not a good choice in loops to replace PROC FREQ.
It seems as if you are counting the number of ID values, PROC FREQ would be the way to go. But it's really not clear what else you are doing here, as we don't have data set CONTACT and you are not counting BY month in any way in your code. Maybe something like this (but this is a guess until you provide a portion of your data set named CONTACT):
%let startdate = '01jul2021'd;
%let enddate = '01jun2023'd;
proc freq data=contact(where=(&startdate<=date<=&enddate));
tables date;
format date monyy7.;
run;
So, no macros needed, and no looping needed, and no SQL needed either.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.