BookmarkSubscribeRSS Feed
lcsitl
Fluorite | Level 6

I am trying to create a table from a VERY large dataset.  I have several variables that need to be included (city, lat, long, year, month) and some I want to calculate. I have daily data with an avg, min and max for each day
ie:

Name           Date                 Lat       Long       Month     Year      Max      Min     Average
Shaoguan    01/01/2000    24.667   113.6       Jan        2000        59          21            45

Shaoguan    01/02/2000    24.667   113.6       Jan        2000        63          20            49

 

I need to make a table that sorts by name (and keeps the corresponding lat & long), then by year, then by month, and that does away with the daily figures and just shows the average of max, min, and average for each month.  I get this information via proc means, but the table format is not what I need.  I want to add the average of each of the min, max and avg variables as new columns, rather than the way they display as rows in proc means. 

I need a final result like this:

 

Name             Lat       Long       Month     Year      AvgMax      AvgMin     AvgAverage
Shaoguan      24.667   113.6       Jan        2000             59              21                    45

Shaoguan      24.667   113.6       Feb       2000             54               18                    37

5 REPLIES 5
ballardw
Super User

@lcsitl wrote:

I am trying to create a table from a VERY large dataset.  I have several variables that need to be included (city, lat, long, year, month) and some I want to calculate. I have daily data with an avg, min and max for each day
ie:

Name           Date                 Lat       Long       Month     Year      Max      Min     Average
Shaoguan    01/01/2000    24.667   113.6       Jan        2000        59          21            45

Shaoguan    01/02/2000    24.667   113.6       Jan        2000        63          20            49

 

I need to make a table that sorts by name (and keeps the corresponding lat & long), then by year, then by month, and that does away with the daily figures and just shows the average of max, min, and average for each month.  I get this information via proc means, but the table format is not what I need.  I want to add the average of each of the min, max and avg variables as new columns, rather than the way they display as rows in proc means. 

I need a final result like this:

 

Name             Lat       Long       Month     Year      AvgMax      AvgMin     AvgAverage
Shaoguan      24.667   113.6       Jan        2000             59              21                    45

Shaoguan      24.667   113.6       Feb       2000             54               18                    37


Naming your variables max and min makes for ugly variable names in combinations.

If you showed the proc means code you used it likely only needs a small change. I would start here:

proc summary data=have nway;
  class name lat long  month year;
  var max min average;
  output out=want mean= / autoname;
run;

The output set has two variables _freq_ and _type_ that you might drop. The _freq_ is how many records were used for calculating stats on that record and _type_ indicates combinations of class variables. The NWAY  on the Proc statement says to only have the output for combinations containing values for all of the class variables.

 

lcsitl
Fluorite | Level 6

My code is basically that:

 

 

proc means data=Daily nonobs ;
class Name Latitude Longitude Elevation Year Month;
var TAvg TMax TMin;
output out=Monthly mean(TAvg) = MeanAvg mean(TMax)=MeanMax mean(TMin)=MeanMin;
run;

sas output.jpg
But it gives me this output ^^, and for what I need moving forward, I need this output to have been 3 rows, not 9, with the added columns of TAvg, TMax, and TMin. I can't figure out how to get there. It seems like it should be easier than it is proving. 

 

ballardw
Super User

@lcsitl wrote:

My code is basically that:

 

 

proc means data=Daily nonobs ;
class Name Latitude Longitude Elevation Year Month;
var TAvg TMax TMin;
output out=Monthly mean(TAvg) = MeanAvg mean(TMax)=MeanMax mean(TMin)=MeanMin;
run;

sas output.jpg
But it gives me this output ^^, and for what I need moving forward, I need this output to have been 3 rows, not 9, with the added columns of TAvg, TMax, and TMin. I can't figure out how to get there. It seems like it should be easier than it is proving. 

 


Notice that I used PROC SUMMARY. While the syntax is similar the output dataset structure by default is different. Try it and see.

lcsitl
Fluorite | Level 6

Oh, jeeze, yes, I see. Thank you!
That is so close to helping, except the table output is 128 rows, only the last 3 are what I want. And that's with the sample I'm running with only 63 rows to start (what I will ultimately be processing is about a million rows.. It's like it's spitting out all possible combinations of the class variables. I have tried moving them around or eliminating them, but I still don't get what I need, without losing things I do.

p1.pngp2.png

The 3 rows of data in the red square is all I want.

I have been fooling around with proc tabulate as an alternative, but that gives a way more complicated result than I want.

ballardw
Super User

Details. Please look at the code I showed as an example and see if something is different in yours:

proc summary data=have nway;
  class name lat long  month year;
  var max min average;
  output out=want mean= / autoname;
run;

NWAY says to only take the records with all of the combinations present.

 

 

You have apparently dropped the _TYPE_ variable that would have provided a clue as to which records you wanted for further processing.

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 5 replies
  • 1192 views
  • 0 likes
  • 2 in conversation