BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
ciaranhillery
Calcite | Level 5

Hi,

 

I've got a sample of data which has separate records for each day of a month for the same ID, containing a daily value. I'm trying to identify the number of days since the value was below a certain value as at the latest date.

 

 

data have;
   infile datalines dsd truncover;
   input ID $8. value 8. date $10.;
 datalines;
 1 1200 20211001
 1 500 20211002
 1 50 20211003
 1 25 20211004
 1 10 20211005
 2 200 20211001
 2 10 20211002
 2 5 20211003
 2 25 20211004
 2 100 20211005
 ;;;

 

 

e.g. calculate the number of days where the value is less than 75 as at 20211005 (latest date), per ID. Calculating this as follows against the above code:

 

  if value < 75 then OD = 1; else OD = 0;

 

the results would show as below. What I'm trying to achieve is count the number of days since the value was last below 75 at the latest date per ID, so the count for ID 1 would equal 3, and ID 2 would equal 0. Any assistance would be greatly appreciated!

 

*edit* - the count would reset to 0 if the value is 75 or greater.

 

1 1200 20211001 0
1 500 20211002 0
1 50 20211003 1
1 25 20211004 1
1 10 20211005 1
2 200 20211001 0
2 100 20211002 0
2 5 20211003 1
2 25 20211004 1
2 100 20211005 0

1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User
data have;
infile datalines delimiter = " " dsd truncover;
input ID :$1. value :4. date :yymmdd10.;
format date yymmdd10.;
datalines;
1 1200 20211001
1 500 20211002
1 50 20211003
1 25 20211004
1 10 20211005
1 20 20211006
1 500 20211007
1 88 20211008
1 25 20211009
1 36 20211010
1 42 20211011
2 200 20211001
2 10 20211002
2 5 20211003
2 25 20211004
2 100 20211005
;

data want;
set have;
by id;
if first.id or value ge 75 then count = 0;
if value lt 75 then count + 1;
run;

View solution in original post

7 REPLIES 7
maguiremq
SAS Super FREQ

I modified your DATALINES a bit since it wasn't formatted correctly.

data have;
   infile datalines delimiter = " " dsd truncover;
   input ID :$1. value :4. date :$10.;
 datalines;
1 1200 20211001
1 500 20211002
1 50 20211003
1 25 20211004
1 10 20211005
2 200 20211001
2 10 20211002
2 5 20211003
2 25 20211004
2 100 20211005
;

data want;
 	set have;
	by id;
	retain od;
		if value > 75 then od = 0;
		if value < 75 then od + 1;
run;
ID value date od 
1 1200 20211001 0 
1 500 20211002 0 
1 50 20211003 1 
1 25 20211004 2 
1 10 20211005 3 
2 200 20211001 0 
2 10 20211002 1 
2 5 20211003 2 
2 25 20211004 3 
2 100 20211005 0 

You can then use `drop` or `keep` and `if last.id then output` to just get those values and variables.

ballardw
Super User

@ciaranhillery wrote:

Hi,

 

I've got a sample of data which has separate records for each day of a month for the same ID, containing a daily value. I'm trying to identify the number of days since the value was below a certain value as at the latest date.

 

 

data have;
   infile datalines dsd truncover;
   input ID $8. value 8. date $10.;
 datalines;
 1 1200 20211001
 1 500 20211002
 1 50 20211003
 1 25 20211004
 1 10 20211005
 2 200 20211001
 2 10 20211002
 2 5 20211003
 2 25 20211004
 2 100 20211005
 ;;;

 

 

e.g. calculate the number of days where the value is less than 75 as at 20211005 (latest date), per ID. Calculating this as follows against the above code:

 

  if value < 75 then OD = 1; else OD = 0;

 

the results would show as below. What I'm trying to achieve is count the number of days since the value was last below 75 at the latest date per ID, so the count for ID 1 would equal 3, and ID 2 would equal 0. Any assistance would be greatly appreciated!

 

*edit* - the count would reset to 0 if the value is 75 or greater.

 

1 1200 20211001 0
1 500 20211002 0
1 50 20211003 1
1 25 20211004 1
1 10 20211005 1
2 200 20211001 0
2 100 20211002 0
2 5 20211003 1
2 25 20211004 1
2 100 20211005 0


I'm not sure what role OD plays but is this the sort of count you are looking for?

data want;
   set have;
   by id;
   retain lowcount 0;
   if first.id then lowcount=0;
   if value < 75 then do;
      OD = 1; 
      lowcount+1;
   end;
   else do;
      OD = 0;
      lowcount=0;
   end;

run;

If you data is not sorted but only grouped by your ID you would need to use BY NOTSORTED ID;

The BY group lets you test if a record is the first or last of a group of values with the FIRST. or LAST. (note the dot there) automatic variables. These are 1/0 (true/false to SAS) values and can be used with IF as shown to reset or set things at the beginning/end of each group.

ciaranhillery
Calcite | Level 5

Thank you both for your responses. Apologies, but the sample has been expanded to include up to 12 months worth of daily values per ID, where some IDs will have 12 months worth, and others won't (only 3 months worth for e.g.).

 

I'm looking to calculate the counts by month i.e. the count will reset after each month ends. I've calculated the month number using the intck function, so have values from 0 - 12. I need the count to reset for the first monthly value, which can be 0 or 1 depending on the value - so I'll now need it to be based on both the ID, and the month.

andreas_lds
Jade | Level 19

Thanks for posting data, please add the expected result for that dataset. This helps us to better understand what you actually need.

What do you expect with data like

  datalines;
1 1200 20211001
1 500 20211002
1 50 20211003
1 25 20211004
1 10 20211005
1 20 20211006
1 500 20211007
1 88 20211008
1 25 20211009
1 36 20211010
1 42 20211011
;

as result?

 

The function month() can extract the month from a sas-date, right now you have only char-dates. If you are not 105% sure that there will be data for one year only, you should use substr(date, 1, 6) to get both month and year.

ciaranhillery
Calcite | Level 5

Hi, I've added a few records at the beginning to show values from a previous month to hopefully show what I'm after - while the count need to be reset at the beginning of each month, if the value of the first. record for the month is less than 75, the count will start as 1. If the value is 75 or greater, then the count will start as 0.

 

*edited with month end and beginning lines in red

 

  datalines;
1 1200 20210928 0
1 500 20210929 0
1 50 20210930 1
1 60 20211001 1
1 500 20211002 0
1 50 20211003 1
1 25 20211004 2
1 10 20211005 3
1 20 20211006 4
1 500 20211007 0
1 88 20211008 0
1 25 20211009 1
1 36 20211010 2
1 42 20211011 3
1 200 20211030 0
1 250 20211101 0
1 50 20211102 1
;

 

Kurt_Bremser
Super User
data have;
infile datalines delimiter = " " dsd truncover;
input ID :$1. value :4. date :yymmdd10.;
format date yymmdd10.;
datalines;
1 1200 20211001
1 500 20211002
1 50 20211003
1 25 20211004
1 10 20211005
1 20 20211006
1 500 20211007
1 88 20211008
1 25 20211009
1 36 20211010
1 42 20211011
2 200 20211001
2 10 20211002
2 5 20211003
2 25 20211004
2 100 20211005
;

data want;
set have;
by id;
if first.id or value ge 75 then count = 0;
if value lt 75 then count + 1;
run;
ciaranhillery
Calcite | Level 5

Thanks Kurt, this worked in appending what I need.

 

Thank you all for your responses on this, greatly appreciated!

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 1355 views
  • 0 likes
  • 5 in conversation