BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
brucehughw
Obsidian | Level 7

Hello,

 

I have a dataset with the following three columns: site, dateTime, precipitationType. To keep it simple, let's assume I have two sites: KBOS and KACY. Datetime is a variable for the date and time of a particular observation. Precipitation type (again, keeping it simple) includes -RA, RA, +RA (for light, moderate, and heavy rain) and -SN, SN, +SN (for snow).

 

I'd like a report that contains columns for site, date, minutes of precipitation, minutes of RA, minutes of SN. I'd also like to, say, filter for dates with at least X minutes of precipitation and maybe include columns for -RA, RA, and +RA. 

 

Any/all suggestions are welcome.

 

Thanks, Bruce

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User
data have;
infile cards dsd;
informat site $4. datetime datetime. precipitationtype $3.;
format site $4. datetime datetime. precipitationtype $3.;
input site datetime precipitationtype;
cards;
kbos,01JAN2016:05:00:00,-RA
kbos,01JAN2016:05:01:00,-RA
kbos,01JAN2016:05:10:00,+RA
kacy,03JAN2016:07:20:00,-SN
kacy,03JAN2016:07:51:00,SN
kacy,03JAN2016:07:52:00,+SN
kbos,03JAN2016:16:21:00,-RA
kbos,03JAN2016:16:22:00,-RA
kbos,03JAN2016:16:26:00,+RA
kacy,04JAN2016:05:20:00,-SN
kacy,04JAN2016:07:51:00,SN
kacy,04JAN2016:07:52:00,+SN
;
run;
data temp;
 set have;
 date=datepart(datetime);
 type=compress(precipitationtype,,'ka');
 format date date9.;
run;
proc sql;
create table want as
 select site,date,sum(type='RA') as RA,sum(type='SN') as SN,count(*) as total
  from temp
   group by site,date;
quit;

View solution in original post

14 REPLIES 14
Steelers_In_DC
Barite | Level 11

Is there a column to show participation start / stop?  With these columns how do you see how many minutes of participation?  Provide something like this to show the dataset you currently have:

 

data have;
infile cards dsd;
informat site $4. datetime datetime. precipitationtype $3.;
format site $4. datetime datetime. precipitationtype $3.;
input site datetime precipitationtype;
cards;
kbos,01JAN2016:05:00:00,-RA
kbos,02JAN2016:05:00:00,RA
kbos,03JAN2016:05:00:00,+RA
kacy,01JAN2016:05:00:00,-SN
kacy,02JAN2016:05:00:00,SN
kacy,03JAN2016:05:00:00,+SN
;
run;

brucehughw
Obsidian | Level 7

@Steelers_In_DC wrote:

Is there a column to show participation start / stop?  With these columns how do you see how many minutes of participation?  Provide something like this to show the dataset you currently have:

 

data have;
infile cards dsd;
informat site $4. datetime datetime. precipitationtype $3.;
format site $4. datetime datetime. precipitationtype $3.;
input site datetime precipitationtype;
cards;
kbos,01JAN2016:05:00:00,-RA
kbos,02JAN2016:05:01:00,-RA
kbos,03JAN2016:05:10:00,+RA
kacy,01JAN2016:05:20:00,-SN
kacy,02JAN2016:07:51:00,SN
kacy,03JAN2016:07:52:00,+SN
;
run;

 

-- this means there was -RA at 0500 and 0501, +RA at 0510, -SN at 0520, SN at 0751 and +SN at 0752 -- otherwise, no precipitation.


Hi,

 

Please see updates to the data you included. Sorry for not being clearer in my initial submission. Each site can have only one type of weather at a given time, as I've shown. Also, I do not have start/end times. When there is precipitation (recorded once per minute), there is an observation. When there is no precipitation, there is no observation. 

 

thanks, Bruce

ballardw
Super User

You have enough going on that you should provide some example data and what the final result for that data would be.

 

Also, are your data point datetimes the begining, end or middle of a period? Are the reported times fixed interval (every 5 minutes) or variable interval?

If they are variable how are going to caluclate "minutes of precipitation"? I would expect a calcultion like that to come from a start/end time. Are you working from someone else's summarized data?

brucehughw
Obsidian | Level 7

@ballardw wrote:

You have enough going on that you should provide some example data and what the final result for that data would be.

 

Also, are your data point datetimes the begining, end or middle of a period? Are the reported times fixed interval (every 5 minutes) or variable interval?

If they are variable how are going to caluclate "minutes of precipitation"? I would expect a calcultion like that to come from a start/end time. Are you working from someone else's summarized data?


 

Hello. In the previous reply I updated a sample dataset (yes, sorry I did not provide) and answered some of your questions. Reported times are every minute -- but only when there is precipitation. I'm working from my own data. The original data does have start/end times, but I thought it would be cleaner just to note what was happening every minute. Also, I will typically have a numerous entries in a row (0501, 0502, 0503, ...) for, say, -RA, but not always. Here's a fuller sample dataset (again, building on the sample provided earlier)

 

data have;
infile cards dsd;
informat site $4. datetime datetime. precipitationtype $3.;
format site $4. datetime datetime. precipitationtype $3.;
input site datetime precipitationtype;
cards;
kbos,01JAN2016:05:00:00,-RA
kbos,01JAN2016:05:01:00,-RA
kbos,01JAN2016:05:10:00,+RA
kacy,03JAN2016:07:20:00,-SN
kacy,03JAN2016:07:51:00,SN
kacy,03JAN2016:07:52:00,+SN

kbos,03JAN2016:16:21:00,-RA
kbos,03JAN2016:16:22:00,-RA
kbos,03JAN2016:16:26:00,+RA
kacy,04JAN2016:05:20:00,-SN
kacy,04JAN2016:07:51:00,SN
kacy,04JAN2016:07:52:00,+SN
;
run;

 

NOTE: I did not see that the earlier reply (from Steelers_In_DC) had different dates, so my short description of what is happening at 0500, 0501, etc. is correct, but the observations are occuring on different dates. I changed some of these and added more dates. This is more typical: a few observations (or many, actually, up to 1440) on a given day.

 

Thanks, Bruce

Reeza
Super User
And what do you want the output to look like, assuming this would be your data?
brucehughw
Obsidian | Level 7

My output would look like this:

Site     Date           RA SN Total

KBOS 01Jan2016 3    0    3

KBOS 03Jan2016 3    0    3

KACY 03Jan2016 0    3    3

KACY 04Jan2016 0    3    3

 

Thanks, Bruce

PGStats
Opal | Level 21

You could start with something like this:

 

data precipMinutes;
length broadPrecipitationType $2;
set have; by site;
lastTime = lag(dateTime);
if first.site then do;
	call missing(lastTime);
	end;
else do;
	minutes = intck("minute", lastTime, dateTime, "continuous");
	broadPrecipitationType = compress(precipitationType, "+-");
	date = datepart(dateTime);
	if not missing(precipitationType) then output;
	end;
format date yymmdd10.;
keep site dateTime date precipitationType broadPrecipitationType minutes;
run;

proc tabulate data=precipMinutes format=4.0;
class site date broadPrecipitationType;
var minutes;
table site*date, (broadPrecipitationType all)*minutes*sum="";
run;
PG
brucehughw
Obsidian | Level 7

Hello,

 

Given the fuller description of my data:

data have;
infile cards dsd;
informat site $4. datetime datetime. precipitationtype $3.;
format site $4. datetime datetime. precipitationtype $3.;
input site datetime precipitationtype;
cards;
kbos,01JAN2016:05:00:00,-RA
kbos,01JAN2016:05:01:00,-RA
kbos,01JAN2016:05:10:00,+RA
kacy,03JAN2016:07:20:00,-SN
kacy,03JAN2016:07:51:00,SN
kacy,03JAN2016:07:52:00,+SN
kbos,03JAN2016:16:21:00,-RA
kbos,03JAN2016:16:22:00,-RA
kbos,03JAN2016:16:26:00,+RA
kacy,04JAN2016:05:20:00,-SN
kacy,04JAN2016:07:51:00,SN
kacy,04JAN2016:07:52:00,+SN
;
run;

does your solution still apply? Also, I want to report precipitation for each day -- the idea being we will see which days had a lot of precipitation and what types fell.  Finally, in your code, what does this do:

if first. site then do;
	call missing(lastTime);
	end;

Thanks! Bruce

Ksharp
Super User
data have;
infile cards dsd;
informat site $4. datetime datetime. precipitationtype $3.;
format site $4. datetime datetime. precipitationtype $3.;
input site datetime precipitationtype;
cards;
kbos,01JAN2016:05:00:00,-RA
kbos,01JAN2016:05:01:00,-RA
kbos,01JAN2016:05:10:00,+RA
kacy,03JAN2016:07:20:00,-SN
kacy,03JAN2016:07:51:00,SN
kacy,03JAN2016:07:52:00,+SN
kbos,03JAN2016:16:21:00,-RA
kbos,03JAN2016:16:22:00,-RA
kbos,03JAN2016:16:26:00,+RA
kacy,04JAN2016:05:20:00,-SN
kacy,04JAN2016:07:51:00,SN
kacy,04JAN2016:07:52:00,+SN
;
run;
data temp;
 set have;
 date=datepart(datetime);
 type=compress(precipitationtype,,'ka');
 format date date9.;
run;
proc sql;
create table want as
 select site,date,sum(type='RA') as RA,sum(type='SN') as SN,count(*) as total
  from temp
   group by site,date;
quit;
brucehughw
Obsidian | Level 7

@Ksharp wrote:
data have;
infile cards dsd;
informat site $4. datetime datetime. precipitationtype $3.;
format site $4. datetime datetime. precipitationtype $3.;
input site datetime precipitationtype;
cards;
kbos,01JAN2016:05:00:00,-RA
kbos,01JAN2016:05:01:00,-RA
kbos,01JAN2016:05:10:00,+RA
kacy,03JAN2016:07:20:00,-SN
kacy,03JAN2016:07:51:00,SN
kacy,03JAN2016:07:52:00,+SN
kbos,03JAN2016:16:21:00,-RA
kbos,03JAN2016:16:22:00,-RA
kbos,03JAN2016:16:26:00,+RA
kacy,04JAN2016:05:20:00,-SN
kacy,04JAN2016:07:51:00,SN
kacy,04JAN2016:07:52:00,+SN
;
run;
data temp;
 set have;
 date=datepart(datetime);
 type=compress(precipitationtype,,'ka');
 format date date9.;
run;
proc sql;
create table want as
 select site,date,sum(type='RA') as RA,sum(type='SN') as SN,count(*) as total
  from temp
   group by site,date;
quit;

Thanks very much! I was not aware that proc sql allowed conditions (type='SN') in the select statement. That's very handy.

Thanks again, Bruce

PGStats
Opal | Level 21

My code reflected the initial description of your data, or at least my understanding of it. It doesn't take advantage of the fact that every observation represents a single minute.

PG
brucehughw
Obsidian | Level 7

PG,

 

Yes, I apologize for not indicating in my initial post that each observation corresponded to 1 minute of precipitation. Thanks for your suggestion - and also for your suggestion on another post of mine, which I will use/get back to when I next receive data that needs processing and uploading, which will be soon.

 

Bruce

Reeza
Super User

Sine you're looking for a "report" you may want to consider proc tabulate instead of proc sql.

Building on @Ksharp's code, you can use Proc Tabulate to generate your report.

 

proc tabulate data=temp;
class site date type;
table site*date, (type all='Total')*n=''*f=8./misstext='0';
run;
brucehughw
Obsidian | Level 7
Thanks, Reeza. Good suggestion. I will be sending out this and similar reports a lot, so I will try this proc tabulate code. Bruce

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 14 replies
  • 1382 views
  • 1 like
  • 6 in conversation