Hello, I am first time user of this board.
I have database that contains the 4 types of materials used over the last 100 years.
I like to make a table that shows how many of those 4 materials were used in 5 yrs time segment.
For example, I like to say between 1958-1962, material A was used 100 times, B 50 times, C 10 times, and D 40 times etc.
I am using proc sql to count observation, but it's so hard as I have to repeat the same command over 100 times.
My data look like this;
Address Year_of_service Material_used
113 cascade 2001 A
11 stonehill 1963 B
10 bushes 1953 C
114 oat mil 1993 A
I have about 10,000 observations.
I am currently using this:
proc sql;
select count(*) as N_Obs
from sort1
where Material_used="A" and Year_of_service>1962 and Year_of_service <1968
;
quit;
Do your intervals overlap? This is what PROC MEANS is designed for essentially, but there are ways to do this within a SQL query as well. In a perfect world you can show an example input, expected output and what you've tried. Please included as much of that as possible. This can definitely be done in one or two steps in SAS.
Concrete examples of data usually help. Especially how to show something like "material A was used".
Best is to provide an example in the form of a data step that we can run to create example data to work with as that answers lots of questions about such things as data variable names and types.
Or at least show one example of your SQL code that works.
It's a good case for applying a format, and letting PROC FREQ do the counting For example:
data groups;
do start = 1919 to 2014 by 5;
end = start + 4;
label = cats (start, '-', end);
output;
end;
retain fmtname 'group_yr';
end;
proc format cntlin = groups;
run;
That gives you a format capable of grouping into 5-year increments. Then apply the format while counting:
proc freq data=have;
tables year_of_service * material_used / norow nocol nopercent;
format year_of_service group_yr.;
run;
Or if you have un-even intervals of interest you can specify them as well such as:
proc format ; value someinterval 1900 - 1929 = '1900-1929' 1930 - 1955 = '1930-1955' 1956 - 1999 = '1940-1999' 2000 - high = '2000 and later' ; run;
The only change in the Proc freq code would be the format assigned to the YEAR variable.
You could also create related groups of the material.
LOTS of things can be done with formats and some procedures: Tabulate Summary and Means support a more complex format type called MULTILABEL that can create subgroups as well such as a total for Years 1900-1950, 1951-1975, 1976-2005 then by subgroups within each range.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.