proc sql;
create table step_1 as select *
from master
where date = '200101';
run;
proc sql;
create table step_2 as select *
from time_table
where date = 200102;
run;
proc sql;
create table step_3 as select a.*, b.*
from step_1 as a
left join step_2 as b
on a.date = b.time;
quit;
data step_5;
set step_4_1;
array Month (6) date_200101
date_200102
date_200103
date_200104
date_201510
date_201510
date_201510
do i = 1 to 7;
if Month{i} in ('90','120','150','180') then do;
bad = 1;
mnth = i;
i = 7;
end;
else bad = 0;
end;
if month{7} in ('90','120','150','180') then do;
Bad_1 = 1;
end;
else bad_1 = 0 ;
count_1 = 1;
drop i;
run;
Well, lets back up here for a minute. First, it is quite hard to refactor anything without seeing a) some test data - in the form of a datastep, see this post if you need help:
and b) required output.
A brief glance at your code provided shows a fair few areas for change, for instance just the first three steps could be:
proc sql; create table STEP_3 as select A.*, B.* from (select * from MASTER where DATE='200101') A left join (select * from TIME_TABLE where DATE='200102') B on A.DATE=B.TIME; quit;
I assume date is a character string? If not then you need to put a date value in there - i.e. a numeric or date literal. this is where seeing actual data helps. Also, use of select * shows that you haven't thought about the data coming in (and can cause problems further on.
Thirdly, never start a problem by saying I need to use <xyz> technology, now how can I squeeze my problem into that as your sure to end up with a mess. Just from what you post there, it seems you have the "excel" way of working going on, i.e. data in column headings going across the page which isn't the best way to work. Put your data in the dataset, and name variables consistently and simple so that your code can access them without vast amounts of code needed to access the information. Also put date into the dataset as a proper date variable, this way you will be, by easy use of maths or date functions, be able to pull out any selection of data for processing, i.e. you can do 7 month windows on the data and then process using those groups.
So for me if I started with:
date_201401 date_201402...
123 567
First step would be to remodel the data for programming activities:
DATE RESULT
JAN2014 123
FEB2014 567
I thne know for my programming that there will always be two variables DATE and RESULT, and I can process dates using simple maths and date functions e.g.
DATE RESULT WINDOW
JAN2014 123 First 7 Months
FEB2014 567 First 7 Months
...
I can then group results, sum them etc. without loops and extra code, and if the output needs to look as it was, then a simple proc transpose call can achieve this, hence output as required, process data as required for programming - best of both worlds.
Well, lets back up here for a minute. First, it is quite hard to refactor anything without seeing a) some test data - in the form of a datastep, see this post if you need help:
and b) required output.
A brief glance at your code provided shows a fair few areas for change, for instance just the first three steps could be:
proc sql; create table STEP_3 as select A.*, B.* from (select * from MASTER where DATE='200101') A left join (select * from TIME_TABLE where DATE='200102') B on A.DATE=B.TIME; quit;
I assume date is a character string? If not then you need to put a date value in there - i.e. a numeric or date literal. this is where seeing actual data helps. Also, use of select * shows that you haven't thought about the data coming in (and can cause problems further on.
Thirdly, never start a problem by saying I need to use <xyz> technology, now how can I squeeze my problem into that as your sure to end up with a mess. Just from what you post there, it seems you have the "excel" way of working going on, i.e. data in column headings going across the page which isn't the best way to work. Put your data in the dataset, and name variables consistently and simple so that your code can access them without vast amounts of code needed to access the information. Also put date into the dataset as a proper date variable, this way you will be, by easy use of maths or date functions, be able to pull out any selection of data for processing, i.e. you can do 7 month windows on the data and then process using those groups.
So for me if I started with:
date_201401 date_201402...
123 567
First step would be to remodel the data for programming activities:
DATE RESULT
JAN2014 123
FEB2014 567
I thne know for my programming that there will always be two variables DATE and RESULT, and I can process dates using simple maths and date functions e.g.
DATE RESULT WINDOW
JAN2014 123 First 7 Months
FEB2014 567 First 7 Months
...
I can then group results, sum them etc. without loops and extra code, and if the output needs to look as it was, then a simple proc transpose call can achieve this, hence output as required, process data as required for programming - best of both worlds.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.