BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
mmea
Quartz | Level 8

Hi I have this example dataframe;

PLACEDATENUMBER
Toronto

12JAN20201

0
Toronto13JAN202010
Toronto14JAN2020199
Toronto15JAN2020199

 

i want a new column in my data called DAYS - is should basically be based on if NUMBER is more than 10 then is shoul count from when it was less until when it became more than 10 from the earliest date - so in this case at TORONTO, 14JAN2021, 99 it would say in the new column 2 days

 

hope it makes senes

1 ACCEPTED SOLUTION

Accepted Solutions
PeterClemmensen
Tourmaline | Level 20

Ok. This gives you what you want.

 

data have;
input PLACE $ 1-7 Method $ 10-11 DATE :ddmmyy10. NUMBER;
format date ddmmyy10.;
datalines;
PLACE A  A 12/01/2021 712
PLACE A  B 12/01/2021 712  
PLACE A  A 13/01/2021 713  
PLACE A  B 13/01/2021 73 
PLACE A  A 14/01/2021 75  
PLACE A  B 14/01/2021 7 
PLACE A  A 15/01/2021 75  
PLACE A  B 15/01/2021 112 
PLACE B  A 12/01/2021 8  
PLACE B  B 12/01/2021 16 
PLACE B  A 13/01/2021 85 
PLACE B  B 13/01/2021 15 
PLACE B  A 14/01/2021 85 
PLACE B  B 14/01/2021 1 
PLACE B  A 15/01/2021 86  
PLACE B  B 15/01/2021 16 
PLACE C  A 12/01/2021 188
PLACE C  B 12/01/2021 345
PLACE C  A 13/01/2021 188
PLACE C  B 13/01/2021 67 
PLACE C  A 14/01/2021 188
PLACE C  B 14/01/2021 67 
PLACE C  A 15/01/2021 188
PLACE C  B 15/01/2021 67
;

data want;

   if _N_ = 1 then do;
      dcl hash h();
      h.definekey('place', 'method');
      h.definedata('number', 'date');
      h.definedone();
   end;

   set have;
   by place;

   if first.place then do;
      a = 0; b = 0;
   end;

   n = number;
   d = date;

   if h.find() ne 0 then new = 0;

   else do;
      if n > 10 & number <= 10 then new = d - date;
      date = d;
      number = n;
   end;

   h.replace();

   number = n;
   date = d;

   if method = "A" then a = max(a, new);
   if method = "B" then b = max(b, new);

   if last.place;

   retain a b;

   keep place a b;
run;

 

Result:

 

PLACE    a  b 
PLACE A  0  1 
PLACE B  1  1 
PLACE C  0  0 

View solution in original post

31 REPLIES 31
ballardw
Super User

You may need to provide a more complex example. There are some questions that you need to answer or demonstrate in your data.

1) Do you have more than one value for place and should the process start over with the "earliest date" when a new Place is encountered

2) is your data sorted by Place and date?

3) Does the first record/"earliest date" for Place always have a 0? Always have a value less than 10?

4) Does the value of number decrease and then increase? If the value of number decreases to below 10 might that become an "earliest date"

5) Is your date variable a SAS date value, i.e. a numeric variable with a Date format assigned?

6) What value should days have, and on which records, if Number never goes over 10?

7) Is the value of Days only to appear on one record? More than 1? If more than 1 which records?

PeterClemmensen
Tourmaline | Level 20

Do you have more than one plane? Can there be gaps in your data? 

 

Please be more specific in your requirements and provide exactly what your resulting data should look like

mmea
Quartz | Level 8

This is a part of my original data

The numbers is the capacity of available time of test

in this case all numbers are over 10 then from theres a waiting time of test at 0 days

PLACEMethodDATENUMBERnew column 
T Covid-19 Test placeA12/01/2021870 days
T Covid-19 Test placeB12/01/202162 
T Covid-19 Test placeA13/01/2021100 
T Covid-19 Test placeB13/01/202177 
T Covid-19 Test placeA14/01/2021188 
T Covid-19 Test placeB14/01/2021345 
T Covid-19 Test placeA15/01/2021188 
T Covid-19 Test placeB15/01/202167 

if the case is like this:

PLACEMethodDATENUMBERnew column 
T Covid-19 Test placeA12/01/20217 
T Covid-19 Test placeB12/01/20217 
T Covid-19 Test placeA13/01/20218 
T Covid-19 Test placeB13/01/20211 
T Covid-19 Test placeA14/01/20211882 days
T Covid-19 Test placeB14/01/2021345 
T Covid-19 Test placeA15/01/2021188 
T Covid-19 Test placeB15/01/202167 

 

ithe a case could also be like this:

PLACEMethodDATENUMBERnew column 
T Covid-19 Test placeA12/01/20211 
T Covid-19 Test placeB12/01/20213 
T Covid-19 Test placeA13/01/2021781 day
T Covid-19 Test placeB13/01/202177 
T Covid-19 Test placeA14/01/2021188 
T Covid-19 Test placeB14/01/202111 day
T Covid-19 Test placeA15/01/20211119 
T Covid-19 Test placeB15/01/202167 

 

 

 

 

 

 

1) There is always one value for each place and it should always count from the earliest date.

2) my data is sorted by date

3)The first record do not always have the value 0  - it was just to illustrate that if theres is a values less than 10 from the earliest date until the 3rd date then the new column will say 2 days - as it took to days for that place to have a value over 10

4)the counting should always start from the earliest date - if the earliest date is over 10 from the beginning then the new column should say 0 days etc.

5) my date format is actually DDMMYY10. 

6) for a place that have under 10 all days then the column should says "NO DAYS AVAILABLE" - if the place has over 10 all days then it will be "0 days"

7) the days should appear every new case as shown in table 3

 

And if there are gaps or no values then "No days available"

 

Hope it is understandable

 

ballardw
Super User

It appears that METHOD is involved in your calculation. Can you state the rule(s) involved?

mmea
Quartz | Level 8

There can be different methods for th same place - is should actually also consider that in the calculations - i forgot to mention that.

 

so the "days" label should be stated for each method 

 

so if:

PLACEMETHODDATEnumberDAYS
T covid-19 test placeA12/01/2021120 days
T covid-19 test placeB12/01/2021670 days

 

i

PeterClemmensen
Tourmaline | Level 20

I'm still not quite sure here. You say: "1) There is always one value for each place and it should always count from the earliest date.".

 

Does that mean you have multiple different values for Place? I.e. Could place both be "T covid-19 test place" and "S covid-19 test place" ?

mmea
Quartz | Level 8

I have multiple places - and different places - them same place has a method A and B

each place goes from current date and 2 weeks ahead

when a specific place for method A has a number over 10 for all dates  then it means this place has 0 days waiting time

if it has for method B for all dates a number below 10 then this place for method B has no available time.

If the cases is that method A starts with a number under 10 on th 12th and 13th, but then becomes over 10 at the 14th then the places has a waiting time at 2 days

 

 

but there are multiple places where the earliest date is the current date for each new place

mmea
Quartz | Level 8

An no each places has different names 

PeterClemmensen
Tourmaline | Level 20

Ok. And why is the new column = 1 in the third-last row of your last example?

 

I thought number should be > 10 for the new column to have a value?

mmea
Quartz | Level 8

Sorry typing mistake.. the 1 day should be on the next day when its over 10 🙂 so at number = 67

mmea
Quartz | Level 8

If a place has over 10 at the 12th, then under 10 at the 13 th, and then over 10 at the 14th and then under 10 for the rest of the days - then the waiting time will 1 day.

 

The definition is: waitingtime until next day where the number is at least 10

 

Kurt_Bremser
Super User

Since you seem to want to treat the methods separately, you need to sort by method before date. Try this:

data have;
infile datalines dlm="," dsd truncover;
input place :$30. method  :$1. date :ddmmyy10. number;
format date yymmdd10.;
datalines;
T Covid-19 Test place,A,12/01/2021,7  
T Covid-19 Test place,B,12/01/2021,7  
T Covid-19 Test place,A,13/01/2021,8  
T Covid-19 Test place,B,13/01/2021,1  
T Covid-19 Test place,A,14/01/2021,188
T Covid-19 Test place,B,14/01/2021,345  
T Covid-19 Test place,A,15/01/2021,188  
T Covid-19 Test place,B,15/01/2021,67
;

proc sort data=have;
by place method date;
run;

data want;
set have;
by place method;
retain s_date;
if first.method then s_date = .;
if s_date = . and number lt 10 then s_date = date;
if s_date ne . and number ge 10
then do;
  days = date - s_date;
  s_date = .;
end;
drop s_date;
run;

Note how source data is supplied in a data step with datalines, so everyone can create the dataset exactly as is with a simple copy/paste and submit. It also leaves no doubts about dataset structure and contents.

PeterClemmensen
Tourmaline | Level 20

See if this helps you

 

data have;
input PLACE $ 1-21 Method $ 23-24 DATE :ddmmyy10. NUMBER;
format date ddmmyy10.;
datalines;
T Covid-19 Test place A 12/01/2021 7  
T Covid-19 Test place B 12/01/2021 7  
T Covid-19 Test place A 13/01/2021 8  
T Covid-19 Test place B 13/01/2021 1  
T Covid-19 Test place A 14/01/2021 188
T Covid-19 Test place B 14/01/2021 345
T Covid-19 Test place A 15/01/2021 188
T Covid-19 Test place B 15/01/2021 67 
;

data want;

   if _N_ = 1 then do;
      dcl hash h();
      h.definekey('place', 'method');
      h.definedata('number', 'date');
      h.definedone();
   end;

   set have;

   n = number;
   d = date;

   if h.find() ne 0 then do;
      new = 0;
      h.replace();
   end;

   else do;
      if n > 10 & number <= 10 then do;
         new = d - date;
         date = d;
         number = n;
         h.replace();
      end;
   end;

   number = n;
   date = d;

   keep PLACE Method DATE NUMBER new;
run;
PeterClemmensen
Tourmaline | Level 20

Just made a small correction.

 

data have;
input PLACE $ 1-21 Method $ 23-24 DATE :ddmmyy10. NUMBER;
format date ddmmyy10.;
datalines;
T Covid-19 Test place A 12/01/2021 7  
T Covid-19 Test place B 12/01/2021 7  
T Covid-19 Test place A 13/01/2021 8  
T Covid-19 Test place B 13/01/2021 1  
T Covid-19 Test place A 14/01/2021 188
T Covid-19 Test place B 14/01/2021 345
T Covid-19 Test place A 15/01/2021 188
T Covid-19 Test place B 15/01/2021 67 
;

data want;

   if _N_ = 1 then do;
      dcl hash h();
      h.definekey('place', 'method');
      h.definedata('number', 'date');
      h.definedone();
   end;

   set have;

   n = number;
   d = date;

   if h.find() ne 0 then new = 0;

   else do;
      if n > 10 & number <= 10 then new = d - date;
      date = d;
      number = n;
   end;

   h.replace();

   number = n;
   date = d;

   keep PLACE Method DATE NUMBER new;
run;

 

Result:

 

PLACE                  Method  DATE        NUMBER  new 
T Covid-19 Test place  A       12/01/2021  7       0 
T Covid-19 Test place  B       12/01/2021  7       0 
T Covid-19 Test place  A       13/01/2021  8       . 
T Covid-19 Test place  B       13/01/2021  1       . 
T Covid-19 Test place  A       14/01/2021  188     1 
T Covid-19 Test place  B       14/01/2021  345     1 
T Covid-19 Test place  A       15/01/2021  188     . 
T Covid-19 Test place  B       15/01/2021  67      . 

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 31 replies
  • 1065 views
  • 1 like
  • 4 in conversation