Please look over the following:
Indicator  Range                      Number
i_20304    >= 1.64214 AND < 2.03922     14
i_20304    >= 1.64214 AND < 2.43629     19
i_20304    >= 1.64214 AND < 2.83337     11
i_20304    >= 1.64214 AND < 3.23044     17
i_20304    >= 1.64214 AND < 3.62751     13
Range refers to the particular Indicator, and Number is the count of those.
The data set contains over 5 million such rows: different Indicators, different Ranges, different Numbers of each.
I'm trying to summarize all this data.
Proc Freq works great as far as Indicator and Number.
But incorporating Range (a categorical variable) into the summary is a mystery.
One possibility would be to concatenate Indicator and Range. Then use Proc Freq.
Any thoughts on this are greatly appreciated. Particularly any other procedure that can handle "ranges" of data.
Nicholas Kormanik
What is the summary you want supposed to look like?
How about ANY summary anyone can think of, for starters?
Split your "range" into an upper and lower value so that you can summarize it in a different manner. You may need 4 variables if you need to capture the < or <= as well.
Willing to add as many more variables as it takes.
What do you have in mind?
Split your range variable into 4 variables: upper_bound, lower_bound, upper_inequality, lower_inequality.
Then your range of
>= 1.64214 AND < 2.03922 becomes
upper_bound = 2.03922
lower_bound = 1.64214
upper_inequality = LT
lower_inequality = GE
Then you can use that to graph the ranges as a band or area in SGPLOT.
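The split described above can be sketched outside SAS to make the logic concrete. This is only an illustration in Python, not SAS code: the function name split_range and the regular expression are assumptions, and it presumes every Range value follows the "operator number AND operator number" layout shown in the sample rows.

```python
import re

# Maps comparison symbols to the mnemonics used in the post (LT, GE, ...).
OPERATORS = {">=": "GE", ">": "GT", "<=": "LE", "<": "LT"}

# Assumed layout: "<lower operator> <number> AND <upper operator> <number>".
RANGE_PATTERN = re.compile(
    r"^\s*(>=|>)\s*(-?\d+(?:\.\d+)?)\s+AND\s+(<=|<)\s*(-?\d+(?:\.\d+)?)\s*$",
    re.IGNORECASE,
)


def split_range(range_text):
    """Split one Range string into the four variables suggested above."""
    match = RANGE_PATTERN.match(range_text)
    if match is None:
        # Fail loudly so rows that break the assumed layout get noticed.
        raise ValueError(f"unrecognized range: {range_text!r}")
    lower_op, lower, upper_op, upper = match.groups()
    return {
        "lower_bound": float(lower),
        "upper_bound": float(upper),
        "lower_inequality": OPERATORS[lower_op],
        "upper_inequality": OPERATORS[upper_op],
    }


parts = split_range(">= 1.64214 AND < 2.03922")
print(parts)
```

Once the bounds are numeric, they can be sorted, binned, or plotted, which the original text string does not allow.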
I know we can graphically plot POINTS. Nicely.
Is it possible in SAS to plot RANGES?
You can do that in Mathematica.
A graph would make a terrific summary....
Particularly, in the present case, if there's a little number next to each Range showing the total number of such ranges for that Indicator.
Like, for instance:
(54,321)
i_20304    >= 1.64214 AND < 2.03922
Remember, there are over 5 million such lines in the data set. So, the above says there are 54,321 cases of this particular Indicator, and this particular Range.
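A count like the one described above is simply the number of rows sharing the same Indicator and Range pair. The sketch below shows that logic in Python rather than SAS, purely as an illustration. The sample rows are made up, and i_99999 is a hypothetical second indicator. In SAS, a two-way table request in Proc Freq on Indicator and Range should give the same kind of count without any concatenation.

```python
from collections import Counter

# Made-up sample rows: (Indicator, Range). The real data has over 5 million.
rows = [
    ("i_20304", ">= 1.64214 AND < 2.03922"),
    ("i_20304", ">= 1.64214 AND < 2.03922"),
    ("i_20304", ">= 1.64214 AND < 2.43629"),
    ("i_99999", ">= 1.64214 AND < 2.03922"),  # hypothetical second indicator
]

# Treat each (Indicator, Range) pair as one category and count its rows.
counts = Counter(rows)

# Print one summary line per pair, with the count in front as in the post.
for (indicator, range_text), n in sorted(counts.items()):
    print(f"({n:,}) {indicator}    {range_text}")
```

Counting on the pair directly avoids building a concatenated key, although a concatenated Indicator plus Range string would produce the same groups.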
> But incorporating Range (a categorical variable) into the summary is a mystery.
Why? What's different from the categorical variable INDICATOR?
Both are categorical variables, yes. I haven't yet tried using Range in Proc Freq, thinking that it's so atypical, so weird, that there is probably some other way of handling it.
@ChrisNZ wrote:
I have no idea what issue you want to solve.
No worries, Chris. Maybe next time.
What I'm attempting to do is to SUMMARIZE over 5 million rows of data, as given in part above.
It is nearly impossible to get a true sense of that much data simply by paging down through it. Somehow it all has to be summarized. A set of graphs? Using Proc Freq? Some other....
Hint: Provide a small example of the data and what the "summarized" version looks like.
Make sure the example data matches the rules you posted.
The instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... show how to turn an existing SAS data set into data step code. That code can be pasted into a forum code box using the </> icon, or attached as text, to show exactly what you have and give us something to test code against.
