turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- SAS Programming
- /
- General Programming
- /
- Advanced Analysis on Data Set

Topic Options

- RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

07-06-2017 02:46 PM

I have a dataset where the first column is a datetime, and the other columns contain integers. Each row represents one second with corresponding values.

How would I find out how many times a value in the integer column was under 50 and then went over 50 for a minimum of n seconds?

I realize this may be a complex solution, so please let me know if you need more info!

Thanks!

Accepted Solutions

Solution

07-12-2017
08:57 AM

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to BC33

07-06-2017 04:48 PM

I would add variables to your data set, to give you flexibility in the later analysis. It's a bit of a headache, but here is the idea. Let's assume one of your existing variables is named ABC:

data want;

set have;

prior_abc = lag(abc);

if _n_=1 and abc < 50 then abc_group=1;

if (abc >= 50 and prior_abc < 50) or (abc < 50 and prior_abc >= 50) then do;

abc_group + 1;

n_seconds_abc = 1;

end;

else n_seconds_abc + 1;

drop prior_abc;

run;

Note that I included 50 exactly as part of the 50+ group.

You should be able to run this on your data and get an idea of what the new variables look like. I tested at least some variations of the code. Note that the number of seconds in the first grouping may be off, but that won't matter since the first grouping is never used for analysis.

Then consider an analysis question:

For ABC, how many times was the value under 50, then rose to 50+? How long did the 50+ measurement last at least N seconds?

proc freq data=want;

where abc_group > 1 and abc >= 50;

tables n_seconds_abc;

run;

Test it with a relatively small amount of data to get a feel for what the new variables and the tabular results represent.

All Replies

Solution

07-12-2017
08:57 AM

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to BC33

07-06-2017 04:48 PM

I would add variables to your data set, to give you flexibility in the later analysis. It's a bit of a headache, but here is the idea. Let's assume one of your existing variables is named ABC:

data want;

set have;

prior_abc = lag(abc);

if _n_=1 and abc < 50 then abc_group=1;

if (abc >= 50 and prior_abc < 50) or (abc < 50 and prior_abc >= 50) then do;

abc_group + 1;

n_seconds_abc = 1;

end;

else n_seconds_abc + 1;

drop prior_abc;

run;

Note that I included 50 exactly as part of the 50+ group.

You should be able to run this on your data and get an idea of what the new variables look like. I tested at least some variations of the code. Note that the number of seconds in the first grouping may be off, but that won't matter since the first grouping is never used for analysis.

Then consider an analysis question:

For ABC, how many times was the value under 50, then rose to 50+? How long did the 50+ measurement last at least N seconds?

proc freq data=want;

where abc_group > 1 and abc >= 50;

tables n_seconds_abc;

run;

Test it with a relatively small amount of data to get a feel for what the new variables and the tabular results represent.

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Astounding

07-07-2017 09:08 AM

I appreciate your response. This is definitely getting me on the right track! However, I am getting an error. It is saying the new variables are "**not on file**" for the dataset. I did use your code and switched abc for the real variable name.

I imported the data from excel and it lives in a table. Does this change anything?

Thanks!

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to BC33

07-10-2017 01:40 PM

Post the actual log with error messages here using the code box opened with the forum {i} menu icon.

Importing from Excel may have a number of issues such as you thinking you know the name of the variable(column heading) but the imported result having additional characters such as _ from spaces or other invalid characters in the variable name or requiring use of "name with space"n syntax.

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to ballardw

07-11-2017 11:56 AM

982 DATA WORK; 983 set data_full; 984 prior_data = lag(_variable); 985 if _n_=1 and variable < 50 then variable_group=1; 986 if(variable >= 50 and prior_data < 50) or (variable < 50 and prior_data >= 50) then do; 987 variable_group + 1; 988 n_seconds_data = 1; 989 end; 990 else n_seconds_data + 1; 991 drop prior_data; 992 run; NOTE: There were 1200 observations read from the data set WORK.DATA_FULL. NOTE: The data set WORK.WORK has 1200 observations and 4 variables. NOTE: DATA statement used (Total process time): real time 0.03 seconds cpu time 0.01 seconds 993 994 proc freq data=WORK.data_full; 995 where variable_group > 1 and variable >= 50; ERROR: Variable variable_group is not on file WORK.DATA_FULL. 996 tables n_seconds_data; ERROR: Variable N_SECONDS_DATA not found. 997 run;

Yes, when importing from Excel I have noticed changes in variable names; however, the issue looks like it is related to the variables being created in the first step you posted. They are not being recognized. Do I need to add these variables to the dataset separately?

Thanks!

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to BC33

07-11-2017 02:44 PM

You've called the new data set "work" (probably not a good name)

982 DATA WORK;

so you need to use that name in PROC FREQ:

994 proc freq data=WORK;

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Rick_SAS

07-12-2017 08:53 AM

You are correct. I am now getting the expected results. Thanks for the tip!

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Astounding

07-12-2017 03:28 PM

Thanks for your answer. I realize this next solution may be similar in manner, but **how would you best find how many events there are where x > 50 for 10 consecutive seconds (or rows)**? As you may have guessed, I am a beginner at SAS programming. Your help is greatly appreciated!

Thanks!

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to BC33

07-12-2017 06:29 PM

?? That's what my code does.

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to ChrisNZ

07-14-2017 10:16 AM

Can you please explain what this does? I run your code and get a GROUP= output, but the output value doesn't quite seem to match what I am looking for, and I suspect it could be this code.

VAL=ranuni(0)*200;

Thanks!

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to ChrisNZ

07-14-2017 11:10 AM

No worries. I have refined the code to use values from the dataset instead of random numbers and your code works great. Although the original question had a slightly different context, I still appreciate your solution.

Thanks!

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to BC33

07-11-2017 09:43 PM - edited 07-11-2017 09:43 PM

Like this?

```
data HAVE;
do DT=1 to 1000;
VAL=ranuni(0)*200;
output;
end;
run;
%let nb_seconds=10;
data COUNT;
set HAVE end=LASTOBS;
if VAL > 50 then SEQ+1;
else SEQ=0;
if SEQ = &nb_seconds. then GROUP+1;
if LASTOBS then putlog GROUP=;
run;
```

GROUP=10