Creating tables with proc sql based on subqueries?

Accepted Solution Solved
Reply
Frequent Contributor
Posts: 138
Accepted Solution

Creating tables with proc sql based on subqueries?

Hi, I am trying to create a new table or data set differently based on logic that looks for specific records in a data set.  For example, I want to create a new data set with a WTD, MTD, YTD, and LTD column from the original data set if all of those exist.  If they don't (in my case there is no WTD in the original data set) then I want to insert a dummy WTD record and union it to the MTD, YTD, and LTD records from the original data set for the new data set.  The way I have written the code, I get the following error:

 

ERROR: A Composite expression (usually a subquery) is used incorrectly in an expression.

 

Here is the code I have written.  Is there a way to acomplish what I am tring to do?

 

proc sql;
	create table FLOWFRED.FMFLOW_LOANSFUNDED_20YR_RPT_APP as
		SELECT DISTINCT
			CASE 
				WHEN EXISTS
					(
					  select * 
					  from FLOWFRED.FMFLOW_LOANSFUNDED_20YR_RPT a3
					  where TIMEFRAME = 'WTD'
					)
				THEN 
					(
						select 
					   		b.TIMEFRAME,
							b._TYPE_ as _TYPE_,
							b._PAGE_ as _PAGE_,
							b._TABLE_ as _TABLE_,
							b.INVESTOR_LOAN_ID_N,
							b.NOTE_BALANCE_Sum,
							b.NOTE_BALANCE_Mean,
							b.WGT_AVG_20YR_Mean,
							b.LTV_Mean,
							b.FICO_Mean,
							b.SRP_RATE_Mean,
							b.SRP_AMT_Mean,
							b.SRP_AMT_Sum,
							b.PCT_TOTAL_UPB_PURCHASE_Mean,
							b.PCT_TOTAL_UPB_REFI_Mean,
							b.PCT_TOTAL_CNT_ESCROW_Mean,
							b.PCT_TOTAL_CNT_NONESCROW_Mean,
							CASE
								WHEN b.TIMEFRAME = 'WTD' then 1
								WHEN b.TIMEFRAME = 'MTD' then 2
								WHEN b.TIMEFRAME = 'YTD' then 3
								WHEN b.TIMEFRAME = 'LTD' then 4
							END as TIMEFRAME_NO
						from FLOWFRED.FMFLOW_LOANSFUNDED_20YR_RPT b
						/*order by TIMEFRAME_NO*/
					)
				else 
					(
						SELECT 
							'WTD' as TIMEFRAME,
							'1' as _TYPE_,
							1 as _PAGE_,
							1 as _TABLE_,
							0 as INVESTOR_LOAN_ID_N,
							0 as NOTE_BALANCE_Sum,
							0 as NOTE_BALANCE_Mean,
							0 as WGT_AVG_20YR_Mean,
							0 as LTV_Mean,
							0 as FICO_Mean,
							0 as SRP_RATE_Mean,
							0 as SRP_AMT_Mean,
							0 as SRP_AMT_Sum,
							0 as PCT_TOTAL_UPB_PURCHASE_Mean,
							0 as PCT_TOTAL_UPB_REFI_Mean,
							0 as PCT_TOTAL_CNT_ESCROW_Mean,
							0 as PCT_TOTAL_CNT_NONESCROW_Mean,
							1 as TIMEFRAME_NO
					    from FLOWFRED.FMFLOW_LOANSFUNDED_20YR_RPT a	
					    union 
					    select 
					   		b.TIMEFRAME,
							b._TYPE_ as _TYPE_,
							b._PAGE_ as _PAGE_,
							b._TABLE_ as _TABLE_,
							b.INVESTOR_LOAN_ID_N,
							b.NOTE_BALANCE_Sum,
							b.NOTE_BALANCE_Mean,
							b.WGT_AVG_20YR_Mean,
							b.LTV_Mean,
							b.FICO_Mean,
							b.SRP_RATE_Mean,
							b.SRP_AMT_Mean,
							b.SRP_AMT_Sum,
							b.PCT_TOTAL_UPB_PURCHASE_Mean,
							b.PCT_TOTAL_UPB_REFI_Mean,
							b.PCT_TOTAL_CNT_ESCROW_Mean,
							b.PCT_TOTAL_CNT_NONESCROW_Mean,
							CASE
								WHEN b.TIMEFRAME = 'WTD' then 1
								WHEN b.TIMEFRAME = 'MTD' then 2
								WHEN b.TIMEFRAME = 'YTD' then 3
								WHEN b.TIMEFRAME = 'LTD' then 4
							END as TIMEFRAME_NO
						from FLOWFRED.FMFLOW_LOANSFUNDED_20YR_RPT b
						/*order by TIMEFRAME_NO*/
					) END as x
		FROM FLOWFRED.FMFLOW_LOANSFUNDED_20YR_RPT a;
quit;


Accepted Solutions
Solution
‎01-13-2016 02:50 PM
Trusted Advisor
Posts: 1,114

Re: Creating tables with proc sql based on subqueries?

Thanks @elwayfan446 for the clarification. Then I would agree with those who have suggested to use a "master list".

/* Create a dataset containing all time frame codes */

data timeframes;
length timeframe $3; /* Please adapt $3 to the length of TIMEFRAME in HAVE! */
input timeframe;
cards;
LTD
MTD
WTD
YTD
;

/* Prepare the HAVE dataset, if it is not sorted by TIMEFRAME */

proc sort data=have;
by timeframe;
run;

/* Insert dummy observations as desired */

data want;
merge have
      timeframes;
by timeframe;
run;

 

View solution in original post


All Replies
Esteemed Advisor
Esteemed Advisor
Posts: 7,253

Re: Creating tables with proc sql based on subqueries?

Hi,

 

Sorry, thats a bit TL;DR for me.  I can tell at a brief glance that that wont work or wont be the best way of doing it.  I would recommend posting a small datastep with some test data in, just a few lines, and what you want the output to look like.  At a guess on the logic, first create a distinct table of the data you want out, then just left join onto that the data from your main table where the condition is true.

proc sql;
  create table WANT as
  select  distinct ...
  from    HAVE A
  left join (select ... where condition is true) B
  on ...;
quit;

Its vague as I have no test data.

Frequent Contributor
Posts: 138

Re: Creating tables with proc sql based on subqueries?

Thanks RW9.  I was not sure if a left join would work since this would be acting like a merge or union.  Maybe if I lay it out like this:

 

Original Table has the following observations:

 

MTD  field 2, field 3, etc.

YTD  field 2, field 3, etc.

LTD  field 2, field 3, etc.

 

In that table, there is no WTD.  In the new table I need a WTD so I would create a "dummy" record with zeros for the other fields.  I need to check to see if that WTD exists initially to determine if I need a dummy record or not.  If it exists, then I simply need to take all data from the original table.

 

Final table would look like

 

WTD, field 2, field 3, etc. (whether this is from dummy data or the data in the original table)

MTD  field 2, field 3, etc.

YTD  field 2, field 3, etc.

LTD  field 2, field 3, etc.

Grand Advisor
Posts: 17,444

Re: Creating tables with proc sql based on subqueries?

Missing level problem. Create a table that has all the levels you need.  Then you merge them together and if a value doesn't exist the join will add it in, but if all records are present it will work as intended.

Frequent Contributor
Posts: 138

Re: Creating tables with proc sql based on subqueries?

Reeza, can you point me to an example of this?  I am new to this one.

Grand Advisor
Posts: 17,444

Re: Creating tables with proc sql based on subqueries?

@RW9 solution is the method I was suggesting. 

 

Basically, you require a master list. Then merge your data into the master table, and any missing records are added in via the join.

Esteemed Advisor
Esteemed Advisor
Posts: 7,253

Re: Creating tables with proc sql based on subqueries?

Data is key here to understanding, post some test data.  Here is a an example (verbose coding but I am just leaving for the day).  Create a template dataset, then merge back to it:

data have;
  id=1; a="MTD"; field2=1; output;
  id=1; a="YTD"; field2=1; output;
  id=1; a="LTD"; field2=1; output;
  id=2; a="MTD"; field2=1; output;
  id=2; a="YTD"; field2=1; output;
  id=2; a="LTD"; field2=1; output;
  id=2; a="WTD"; field2=1; output;
run;

proc sort data=have out=tmp nodupkey;
  by id;
run;
data tmp;
  set tmp (keep=id);
  do a="MTD","YTD","LTD","WTD";
    output;
  end;
run;

proc sql;
  create table WANT as
  select  A.*,
          B.FIELD2
  from    TMP A
  left join HAVE B
  on      A.ID=B.ID
  and     A.A=B.A;
quit;
Frequent Contributor
Posts: 138

Re: Creating tables with proc sql based on subqueries?

Thanks, I will give this a try too.

Frequent Contributor
Posts: 138

Re: Creating tables with proc sql based on subqueries?

[ Edited ]

I have taken this code and modified it for my data.

 

 

data FLOWFRED.FMFLOW_LOANSFUNDED_20YR_RPT;
	id=1; a="MTD"; _TYPE_=1; output;
	id=1; a="YTD"; _TYPE_=1; output;
	id=1; a="LTD"; _TYPE_=1; output;
	id=2; a="WTD"; _TYPE_=1; output;
	id=2; a="MTD"; _TYPE_=1; output;
	id=2; a="YTD"; _TYPE_=1; output;
	id=2; a="LTD"; _TYPE_=1; output;
run ;

proc sort data=flowfred.fmflow_loansfunded_20yr_rpt out=tmp nodupkey;
	by id;
run;

data tmp;
	set tmp (keep=id _TYPE_);
	do a="WTD","MTD","YTD","LTD";
		output;
	end;
run;

proc sql;
	create table FLOWFRED.FMFLOW_LOANSFUNDED_20YR_RPT_APP as
	select A.*,
		b._TYPE_
	from tmp A
	left join flowfred.fmflow_loansfunded_20yr_rpt b on a.id=b.id
		and a.a=b.a;
quit;

Here are the results I am getting from this.  Note that currently the original data set currently has an observation that includes WTD in the variable TIMEFRAME.  How should I interpret the results I am getting?  Is this what would be expected?

 

id   _TYPE_     a

1       1            WTD
1       1            MTD
1       1            YTD
1       1            LTD
2       1           WTD
2       1           MTD
2       1          YTD
2       1          LTD

Esteemed Advisor
Esteemed Advisor
Posts: 7,253

Re: Creating tables with proc sql based on subqueries?

Hi,

 

Let me try again.  Post some example test data - in the form of a datastep, and what you want the output to look like.  Each of the replies given are of the same form - create the elements you need in the final dataset, then merge this back to original data.  The merge takes care of missing records.  However we can't provide exact scenarios without seeing what you have and want.

Respected Advisor
Posts: 5,001

Re: Creating tables with proc sql based on subqueries?

This sounds like a good example of when to use macro language.  To prepare, get two working versions of  your program.  One should work when WTD does exist, and the other should work when WTD doesn't exist.

 

When those are ready, it is likely a straightforward macro language application.  Determine whether WTD exists, and have the macro implement one version of the program or the other. 

 

Once we're that far, we can talk about the best implementation that allows for sets of variables to exist or not.  (You won't need dozens of versions of a working program.  Macro language will take care of that.)

Trusted Advisor
Posts: 1,114

Re: Creating tables with proc sql based on subqueries?

It looks like you try to use a CASE structure to create a new table conditionally. However, to my knowledge, you can only use it to create a new variable. Also, in the beginning you talk about a WTD column, but later about a WTD record (i.e row) and in your code 'WTD' occurs only as a value of a character variable TIMEFRAME.

 

In your second post you show a dataset with only three observations. Is this true also for your real data or do you have more records there, maybe with a key variable (not shown so far; see RW9's variable ID) and blocks of 3 or 4 records with TIMEFRAME values 'MTD', 'YTD', 'LTD' and possibly 'WTD'? If this is the case, I think you could achieve what you want even without merging, PROC SQL, auxiliary datasets or macros, in a single data step.

Frequent Contributor
Posts: 138

Re: Creating tables with proc sql based on subqueries?

Sorry for the confusion about this.  Yes, WTD, MTD, YTD, and LTD would be values of the TIMEFRAME variable in the data set.  The data set has several variables other than TIMEFRAME for each observation.  My goal is to create a dummy observation for WTD if it doesn't exist in the original data set, otherwise, just pull the observations for WTD, MTD, YTD, and LTD that will be in the original data set.

 

I hope that is a better explanation.  Again, sorry for the confusion.

Trusted Advisor
Posts: 1,114

Re: Creating tables with proc sql based on subqueries?

So, assuming your existing dataset (call it HAVE) has only 3 or 4 observations (depending on whether an observation with TIMEFRAME='WTD' exists), you could create the new dataset (call it WANT) as shown below:

 

/* Scenario 1: WTD record exists */

data have;
input timeframe $ field2 field3;
cards;
MTD 1 11
YTD 2 22
LTD 3 33
WTD 4 44
;

data want;
set have end=last;
wtdex+(timeframe='WTD');
output;
if last & not wtdex then do;
  call missing(of _all_);
  timeframe='WTD';
  output;
end;
drop wtdex;
run;

/* Scenario 2: WTD record does not exist */

data have;
set have(obs=3);
run;

/* data want; ... same solution as above */
Grand Advisor
Posts: 17,444

Re: Creating tables with proc sql based on subqueries?

Does that solution extend to past WTD? What if LTD is missing instead?

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 22 replies
  • 735 views
  • 3 likes
  • 5 in conversation