Hello all. I have some code that I run on a monthly basis, I am using indsname to identify sources and initialize new variables.
The code is here:
data fyf_output_graph_append;
set
average_lt700 : indsname = set
average_700_800 : indsname = set
average_gt800 : indsname = set
average_bus : indsname = set;
source1 = set;
length Fico $7.;
if source1 = 'WORK.AVERAGE_LT700' then do;
asset_class = 'Consumer';
Fico = '< 700';
end;
else if source1 = 'WORK.AVERAGE_700_800' then do;
asset_class = 'Consumer';
Fico = '700-800';
end;
else if source1 = 'WORK.AVERAGE_GT800' then do;
asset_class = 'Consumer';
Fico = '> 800';
end;
else if source1 = 'WORK.AVERAGE_BUS' then do;
asset_class = 'Business';
Fico = '';
end;
else delete;
drop source1;
Date = intnx('month',input(month,yymmn6.),0,'e');
format date mmddyy10.;
drop month;
run;
The reason I put the 'else delete' is because there are temporary datasets generated through this process, here is the log:
NOTE: There were 456 observations read from the data set WORK.AVERAGE_LT700.
NOTE: There were 480 observations read from the data set WORK.AVERAGE_LT700_PREP.
NOTE: There were 480 observations read from the data set WORK.AVERAGE_LT700_PREP_TWO.
NOTE: There were 456 observations read from the data set WORK.AVERAGE_700_800.
NOTE: There were 480 observations read from the data set WORK.AVERAGE_700_800_PREP.
NOTE: There were 480 observations read from the data set WORK.AVERAGE_700_800_PREP_TWO.
NOTE: There were 456 observations read from the data set WORK.AVERAGE_GT800.
NOTE: There were 480 observations read from the data set WORK.AVERAGE_GT800_PREP.
NOTE: There were 480 observations read from the data set WORK.AVERAGE_GT800_PREP_TWO.
NOTE: There were 456 observations read from the data set WORK.AVERAGE_BUS.
NOTE: There were 480 observations read from the data set WORK.AVERAGE_BUS_PREP.
NOTE: There were 480 observations read from the data set WORK.AVERAGE_BUS_PREP_TWO.
I do not understand why the prep and prep_two datasets are there, but I don't delete them my dataset grows in triplicate.
I tried to duplicate the issue and I am not able, I am pasting the log for a similar piece of code below, where the temp datasets are not generated:
24 GOPTIONS ACCESSIBLE;
25 data join_all;
26 set fico_lt700 : indsname = source
27 fico_700_800 : indsname = source
28 fico_800_plus : indsname = source
29 business : indsname = source;
30 name=source;
31 length fico $7.;
32 if name = 'WORK.FICO_700_800' then do;
33 fico = '700-800';
34 asset_class = 'Consumer';
35 end;
36 else if name = 'WORK.FICO_800_PLUS' then do;
37 fico = '800';
38 asset_class = 'Consumer';
39 end;
40 else if name = 'WORK.FICO_LT700' then do;
41 fico = '< 700';
42 asset_class = 'Consumer';
43 end;
44 else if name = 'WORK.BUSINESS' then do;
45 fico = '';
46 asset_class = 'Business';
47 end;
48 Date = load_date;
49 drop name;
50 run;
NOTE: There were 3080798 observations read from the data set WORK.FICO_LT700.
NOTE: There were 2914217 observations read from the data set WORK.FICO_700_800.
NOTE: There were 1716779 observations read from the data set WORK.FICO_800_PLUS.
NOTE: There were 1643479 observations read from the data set WORK.BUSINESS.
NOTE: The data set WORK.JOIN_ALL has 9355273 observations and 6 variables.
Can someone explain why the temp datasets are generated with one and not the other?
Thank You in advance.
What are you expecting the colon to do in your code?
It's likely being interpreted as being part of the data set name and means include all datasets that start with that prefix - which is hwy you're getting the extra data sets.
You also don't need multiple INDSNAME options, a single one is all that's required.
I think this may be what you want?
set
average_lt700
average_700_800
average_gt800
average_bus indsname = set;
What are you expecting the colon to do in your code?
It's likely being interpreted as being part of the data set name and means include all datasets that start with that prefix - which is hwy you're getting the extra data sets.
You also don't need multiple INDSNAME options, a single one is all that's required.
I think this may be what you want?
set
average_lt700
average_700_800
average_gt800
average_bus indsname = set;
I thought the : was part of the syntax, I saw it somewhere and thought it was necessary.
Now that you say that it makes so much sense. I'll update my code accordingly.
Thank You,
Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.
Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.