Hi,
I have a data set with BRANCH, OPEN_DATE (start date for the Branches operation), products SHOES & CLOTHES and Payments & Expenses (PMTS , EXPNS)
Time is measured in QTR i.e. 200503 the 3rd QTR of 2005.
I would like to choose a BRANCH say at a past OPEN_DATE and based on the following checks "looking forward in the future TRANDATE QTR up to 201104" :
1.- both PRODUCTS show both PMTS & EXPNS at each "future QTR"
2.- Transaction are occuring at EACH and E VERY subsequent TRANDATE QTR up to 201104
to output in a different dataset.
In the following dataset only BRANCH - GTR satisfies all checks
OPEN_DATE BRANCH TRANDATE SHOES CLOTHES PMTS EXPNS
200503 LON 200503 S C 1000 200
200503 LON 200602 S C 900 100
200503 LON 200603 S C 1600 230
200503 LON 200604 S C 3100 290
200503 LON 200701 S C 880 170
200503 LON 200702 S C 1000 200
200503 LON 200904 S C 1000 200
200503 LON 201002 S C 1000 200
200503 LON 201103 S C 1000 200
200604 FTH 200604 S C 1000 200
200604 FTH 200703 C 1000 200
200604 FTH 200704 S C 1000
200604 FTH 200803 S 1000 200
200604 FTH 200901 S C 200
200604 FTH 200902 S C 1000 200
200604 FTH 200903 S 1000 200
200604 FTH 201004 C 200
200604 FTH 201102 S C 1000 200
200701 GTR 200701 S C 1000 200
200701 GTR 200702 S C 900 100
200701 GTR 200703 S C 1600 230
200701 GTR 200704 S C 3100 290
200701 GTR 200801 S C 880 170
200701 GTR 200802 S C 1000 200
200701 GTR 200803 S C 1000 200
200701 GTR 200804 S C 1000 200
200701 GTR 200701 S C 1000 200
200701 GTR 200702 S C 900 100
200701 GTR 200703 S C 1600 230
200701 GTR 200704 S C 3100 290
200701 GTR 200801 S C 880 170
200701 GTR 200802 S C 1000 200
200701 GTR 200803 S C 1000 200
200701 GTR 200804 S C 1000 200
200701 GTR 200701 S C 1000 200
200701 GTR 200702 S C 900 100
200701 GTR 200703 S C 1600 230
200701 GTR 200704 S C 3100 290
200701 GTR 200801 S C 880 170
200701 GTR 200802 S C 1000 200
200701 GTR 200803 S C 1000 200
200701 GTR 200804 S C 1000 200
200701 GTR 200901 S C 1000 200
200701 GTR 200902 S C 900 100
200701 GTR 200903 S C 1600 230
200701 GTR 200904 S C 3100 290
200701 GTR 201001 S C 880 170
200701 GTR 201002 S C 1000 200
200701 GTR 201003 S C 1000 200
200701 GTR 201004 S C 1000 200
200701 GTR 201101 S C 1000 200
200701 GTR 201102 S C 900 100
200701 GTR 201103 S C 1600 230
200701 GTR 201104 S C 3100 290
Any suggestuions would be more than welcomed.
I would like to thank you in advance
Kind regards
Nikos
.......................................................................
Karl,
If trandate does not always equal open_date at the first.open_date .
data foo; infile cards; input @; substr(_infile_,5,1)='Q'; substr(_infile_,36,1)='Q'; input @1 open_date yyq6. @18 branch $3. @32 trandate yyq6. @49 shoes $1. @61 clothes $1. @74 pmts 4. @84 expns 3.; format open_date trandate yyqn6. pmts expns dollar8.; cards; 200503 LON 200503 S C 1000 200 200503 LON 200602 S C 900 100 200503 LON 200603 S C 1600 230 200503 LON 200604 S C 3100 290 200503 LON 200701 S C 880 170 200503 LON 200702 S C 1000 200 200503 LON 200904 S C 1000 200 200503 LON 201002 S C 1000 200 200503 LON 201103 S C 1000 200 200604 FTH 200604 S C 1000 200 200604 FTH 200703 C 1000 200 200604 FTH 200704 S C 1000 200604 FTH 200803 S 1000 200 200604 FTH 200901 S C 200 200604 FTH 200902 S C 1000 200 200604 FTH 200903 S 1000 200 200604 FTH 201004 C 200 200604 FTH 201102 S C 1000 200 200701 GTR 200701 S C 1000 200 200701 GTR 200702 S C 900 100 200701 GTR 200703 S C 1600 230 200701 GTR 200704 S C 3100 290 200701 GTR 200801 S C 880 170 200701 GTR 200802 S C 1000 200 200701 GTR 200803 S C 1000 200 200701 GTR 200804 S C 1000 200 200701 GTR 200701 S C 1000 200 200701 GTR 200702 S C 900 100 200701 GTR 200703 S C 1600 230 200701 GTR 200704 S C 3100 290 200701 GTR 200801 S C 880 170 200701 GTR 200802 S C 1000 200 200701 GTR 200803 S C 1000 200 200701 GTR 200804 S C 1000 200 200701 GTR 200701 S C 1000 200 200701 GTR 200702 S C 900 100 200701 GTR 200703 S C 1600 230 200701 GTR 200704 S C 3100 290 200701 GTR 200801 S C 880 170 200701 GTR 200802 S C 1000 200 200701 GTR 200803 S C 1000 200 200701 GTR 200804 S C 1000 200 200701 GTR 200901 S C 1000 200 200701 GTR 200902 S C 900 100 200701 GTR 200903 S C 1600 230 200701 GTR 200904 S C 3100 290 200701 GTR 201001 S C 880 170 200701 GTR 201002 S C 1000 200 200701 GTR 201003 S C 1000 200 200701 GTR 201004 S C 1000 200 200701 GTR 201101 S C 1000 200 200701 GTR 201102 S C 900 100 200701 GTR 201103 S C 1600 230 200701 GTR 201104 S C 3100 290 ; run; proc sort data=foo(where=(cmiss(shoes,clothes,pmts,expns)=0)) out=temp nodupkey; by open_date branch trandate; run; data _want(keep=open_date branch ); count=0; do until(last.open_date); set temp; by open_date branch trandate groupformat; if first.open_date then n=intck('qtr',trandate,'01oct2011'd)+1; count+1; end; do until(last.open_date); set temp; by open_date branch trandate groupformat; if n eq count and last.open_date then output; end; run; data want; merge foo(in=ina) _want(in=inb) ; by open_date branch; if ina and inb; run;
Ksharp
I think you can get pretty close with a SQL query. But let me make a couple assumptions.
First, [both PRODUCTS show both PMTS & EXPNS at each "future QTR"]
means: shoes='S' and clothes='S' and pmts>0 and expns>0.
Second, [Transaction are occuring at EACH and EVERY subsequent TRANDATE QTR up to 201104]
means at least 1 row which meets criterion #1 for each quarter between opendate and 201104. (You said "trandate" but that doesn't make sense to me.")
Third, you can convert opendate from a year+qtr to an actual SAS date, because my solution uses the SAS intck() function, which needs dates. You can use the SAS yyq() function for this.
My solution gets at criterion #2 by counting the unique rows which meet criterion #1 and comparing that to the total number of actual calendar quarters between opendate and 10/01/2011 (the SAS representation of QTR 4, 2011). If they are equal, then criterion #2 is met. So take a look and see if you think this is what you want. "Have" is your dataset above, except, as I said, opendate has been converted to a SAS date, not what appears to be YYYYQQ text string in your dataset.
proc sql;
select opendate, branch, intck('QTR',opendate,'01OCT2011'd)+1 as nquarters, count(*) as ntrans
from (select distinct opendate,branch,trandate
from have
where shoes='S' and clothes='C' and pmts>0 and expns>0)
group by opendate,branch
having nquarters = ntrans
;
quit;
Karl
Karl,
If trandate does not always equal open_date at the first.open_date .
data foo; infile cards; input @; substr(_infile_,5,1)='Q'; substr(_infile_,36,1)='Q'; input @1 open_date yyq6. @18 branch $3. @32 trandate yyq6. @49 shoes $1. @61 clothes $1. @74 pmts 4. @84 expns 3.; format open_date trandate yyqn6. pmts expns dollar8.; cards; 200503 LON 200503 S C 1000 200 200503 LON 200602 S C 900 100 200503 LON 200603 S C 1600 230 200503 LON 200604 S C 3100 290 200503 LON 200701 S C 880 170 200503 LON 200702 S C 1000 200 200503 LON 200904 S C 1000 200 200503 LON 201002 S C 1000 200 200503 LON 201103 S C 1000 200 200604 FTH 200604 S C 1000 200 200604 FTH 200703 C 1000 200 200604 FTH 200704 S C 1000 200604 FTH 200803 S 1000 200 200604 FTH 200901 S C 200 200604 FTH 200902 S C 1000 200 200604 FTH 200903 S 1000 200 200604 FTH 201004 C 200 200604 FTH 201102 S C 1000 200 200701 GTR 200701 S C 1000 200 200701 GTR 200702 S C 900 100 200701 GTR 200703 S C 1600 230 200701 GTR 200704 S C 3100 290 200701 GTR 200801 S C 880 170 200701 GTR 200802 S C 1000 200 200701 GTR 200803 S C 1000 200 200701 GTR 200804 S C 1000 200 200701 GTR 200701 S C 1000 200 200701 GTR 200702 S C 900 100 200701 GTR 200703 S C 1600 230 200701 GTR 200704 S C 3100 290 200701 GTR 200801 S C 880 170 200701 GTR 200802 S C 1000 200 200701 GTR 200803 S C 1000 200 200701 GTR 200804 S C 1000 200 200701 GTR 200701 S C 1000 200 200701 GTR 200702 S C 900 100 200701 GTR 200703 S C 1600 230 200701 GTR 200704 S C 3100 290 200701 GTR 200801 S C 880 170 200701 GTR 200802 S C 1000 200 200701 GTR 200803 S C 1000 200 200701 GTR 200804 S C 1000 200 200701 GTR 200901 S C 1000 200 200701 GTR 200902 S C 900 100 200701 GTR 200903 S C 1600 230 200701 GTR 200904 S C 3100 290 200701 GTR 201001 S C 880 170 200701 GTR 201002 S C 1000 200 200701 GTR 201003 S C 1000 200 200701 GTR 201004 S C 1000 200 200701 GTR 201101 S C 1000 200 200701 GTR 201102 S C 900 100 200701 GTR 201103 S C 1600 230 200701 GTR 201104 S C 3100 290 ; run; proc sort data=foo(where=(cmiss(shoes,clothes,pmts,expns)=0)) out=temp nodupkey; by open_date branch trandate; run; data _want(keep=open_date branch ); count=0; do until(last.open_date); set temp; by open_date branch trandate groupformat; if first.open_date then n=intck('qtr',trandate,'01oct2011'd)+1; count+1; end; do until(last.open_date); set temp; by open_date branch trandate groupformat; if n eq count and last.open_date then output; end; run; data want; merge foo(in=ina) _want(in=inb) ; by open_date branch; if ina and inb; run;
Ksharp
I'm sure you're right, Ksharp. I saw a couple ambiguities in the original specs that I just chose to ignore. I couldn't decide if "each and every transaction" in criterion #2 meant "starting with the very 1st" or "starting with the 1st that meets criterion #1". And, if the latter, does that that mean that a branch with, say, 19 transactions that fail criterion #1 followed by 1 transaction that meets it, should be output, but a branch with 19 that do followed by 1 that doesn't gets flushed. That didn't make sense to me, so I just used opendate to get the discussion started.
I also wondered if the target date of 201104 was a constant or was intended to be a stand-in for max(trandate). And, if the latter, is that by branch or for the table as a whole. So again, I just took the easy way out. If, in fact, there are any transactions after that date, my algorithm fails (as, I think, does yours.)
Anyway, if the OP wanted to clarify the specs, this has been a good discussion, I think.
And hey, my SQL solution was kind of elegant, no? Wrong, but elegant.
Karl
data foo;
infile cards;
input @;
substr(_infile_,5,1)='Q'; substr(_infile_,36,1)='Q';
input @1 open_date yyq6. @18 branch $3. @32 trandate yyq6. @49 shoes $1. @61 clothes $1. @74 pmts 4. @84 expns 3.;
format open_date trandate yyq6. pmts expns dollar8.;
cards;
200503 LON 200503 S C 1000 200
200503 LON 200602 S C 900 100
200503 LON 200603 S C 1600 230
200503 LON 200604 S C 3100 290
200503 LON 200701 S C 880 170
200503 LON 200702 S C 1000 200
200503 LON 200904 S C 1000 200
200503 LON 201002 S C 1000 200
200503 LON 201103 S C 1000 200
200604 FTH 200604 S C 1000 200
200604 FTH 200703 C 1000 200
200604 FTH 200704 S C 1000
200604 FTH 200803 S 1000 200
200604 FTH 200901 S C 200
200604 FTH 200902 S C 1000 200
200604 FTH 200903 S 1000 200
200604 FTH 201004 C 200
200604 FTH 201102 S C 1000 200
200701 GTR 200701 S C 1000 200
200701 GTR 200702 S C 900 100
200701 GTR 200703 S C 1600 230
200701 GTR 200704 S C 3100 290
200701 GTR 200801 S C 880 170
200701 GTR 200802 S C 1000 200
200701 GTR 200803 S C 1000 200
200701 GTR 200804 S C 1000 200
200701 GTR 200701 S C 1000 200
200701 GTR 200702 S C 900 100
200701 GTR 200703 S C 1600 230
200701 GTR 200704 S C 3100 290
200701 GTR 200801 S C 880 170
200701 GTR 200802 S C 1000 200
200701 GTR 200803 S C 1000 200
200701 GTR 200804 S C 1000 200
200701 GTR 200701 S C 1000 200
200701 GTR 200702 S C 900 100
200701 GTR 200703 S C 1600 230
200701 GTR 200704 S C 3100 290
200701 GTR 200801 S C 880 170
200701 GTR 200802 S C 1000 200
200701 GTR 200803 S C 1000 200
200701 GTR 200804 S C 1000 200
200701 GTR 200901 S C 1000 200
200701 GTR 200902 S C 900 100
200701 GTR 200903 S C 1600 230
200701 GTR 200904 S C 3100 290
200701 GTR 201001 S C 880 170
200701 GTR 201002 S C 1000 200
200701 GTR 201003 S C 1000 200
200701 GTR 201004 S C 1000 200
200701 GTR 201101 S C 1000 200
200701 GTR 201102 S C 900 100
200701 GTR 201103 S C 1600 230
200701 GTR 201104 S C 3100 290
;
run;
proc sort data=foo noduprec; by branch trandate; run;
proc sql;
create table bar as
select *
from _foo
where open_date ge input('2005Q3',yyq6.)
and pmts is not null
and expns is not null
group by branch
having max(trandate) ge input('2011Q4',yyq6.)
and intck('qtr',min(trandate),max(trandate))+1=count(*)
order by branch,trandate;
quit;
open_date branch trandate shoes clothes pmts expns
2007Q1 GTR 2007Q1 S C $1,000 $200
2007Q1 GTR 2007Q2 S C $900 $100
2007Q1 GTR 2007Q3 S C $1,600 $230
2007Q1 GTR 2007Q4 S C $3,100 $290
2007Q1 GTR 2008Q1 S C $880 $170
2007Q1 GTR 2008Q2 S C $1,000 $200
2007Q1 GTR 2008Q3 S C $1,000 $200
2007Q1 GTR 2008Q4 S C $1,000 $200
2007Q1 GTR 2009Q1 S C $1,000 $200
2007Q1 GTR 2009Q2 S C $900 $100
2007Q1 GTR 2009Q3 S C $1,600 $230
2007Q1 GTR 2009Q4 S C $3,100 $290
2007Q1 GTR 2010Q1 S C $880 $170
2007Q1 GTR 2010Q2 S C $1,000 $200
2007Q1 GTR 2010Q3 S C $1,000 $200
2007Q1 GTR 2010Q4 S C $1,000 $200
2007Q1 GTR 2011Q1 S C $1,000 $200
2007Q1 GTR 2011Q2 S C $900 $100
2007Q1 GTR 2011Q3 S C $1,600 $230
2007Q1 GTR 2011Q4 S C $3,100 $290
I would like to thank you all.
Kind regards
Nikos
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.