BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
Sri_devi
Obsidian | Level 7

i have raw data as below:

Subject_ID Date Time_administered Dose_administered __of_capsules_administered Study_day_IP
001-001 4-Mar-21 45:00.0 25 mg 1 TP1-Day1
001-001 5-Mar-21 50:00.0 25 mg 1 TP1-Day2
001-001 6-Mar-21        
001-001 7-Mar-21        
001-001 8-Mar-21        
001-001 9-Mar-21        
001-001 10-Mar-21        
001-001 11-Mar-21        
001-001 12-Mar-21        
001-001 13-Mar-21        
001-001 14-Mar-21        
001-001 15-Mar-21        
001-001 16-Mar-21        
001-001 17-Mar-21        
001-001 18-Mar-21 45:00.0 50 mg 2 TP1-Day15
001-001 19-Mar-21        
001-001 20-Mar-21        
001-001 21-Mar-21        
001-001 22-Mar-21        
001-001 23-Mar-21        
001-001 24-Mar-21        
001-001 25-Mar-21 00:00.0 50 mg 2 TP1-Day22
001-001 26-Mar-21        
001-001 27-Mar-21        
001-001 28-Mar-21        
001-001 29-Mar-21        
001-001 30-Mar-21        
001-001 31-Mar-21 30:00.0 50 mg 2 TP1-Day28
001-001 1-Apr-21 25:00.0 50 mg 2 TP1Washoutday29
001-001 2-Apr-21        
001-001 3-Apr-21        
001-001 4-Apr-21        
001-001 14-Apr-21 00:00.0 25 mg 1 TP2-Day1
001-001 15-Apr-21        
001-001 16-Apr-21        
001-001 17-Apr-21        
001-001 18-Apr-21        
001-001 19-Apr-21        
001-001 20-Apr-21        
001-001 21-Apr-21        
001-001 22-Apr-21 26:00.0 25 mg 1 TP2-Day8
001-001 23-Apr-21        
001-001 24-Apr-21        
001-001 25-Apr-21        
001-001 26-Apr-21        
001-001 27-Apr-21        
001-001 28-Apr-21        
001-001 29-Apr-21 15:00.0 50 mg 2 TP2-Day15
001-002 9-Mar-21 00:00.0 25 mg 1 TP1-Day1
001-002 10-Mar-21 40:00.0 25 mg 1 TP1-Day2
001-002 11-Mar-21        
001-002 12-Mar-21        
001-002 13-Mar-21        
001-002 14-Mar-21        
001-002 15-Mar-21        
001-002 16-Mar-21 10:00.0 25 mg 1 TP1-Day8
001-002 17-Mar-21        
001-002 18-Mar-21        
001-002 19-Mar-21        
001-002 20-Mar-21        
001-002 21-Mar-21        
001-002 22-Mar-21        
001-002 23-Mar-21 00:00.0 50 mg 2 TP1-Day15
001-002 24-Mar-21        
001-002 25-Mar-21        
001-002 26-Mar-21        
001-002 27-Mar-21        
001-002 28-Mar-21        
001-002 29-Mar-21        
001-002 30-Mar-21 50:00.0 75 mg 3 TP1-Day22
001-002 31-Mar-21        
001-002 1-Apr-21        
001-002 2-Apr-21        
001-002 3-Apr-21        
001-002 4-Apr-21        
001-002 5-Apr-21        
001-002 6-Apr-21        
001-002 20-Apr-21 30:00.0 25 mg 1 TP2-Day1
001-002 21-Apr-21 00:00.0 25 mg 1 TP2-Day2
001-002 22-Apr-21        
001-002 23-Apr-21        
001-002 24-Apr-21        
001-002 25-Apr-21        
001-002 26-Apr-21        
001-002 27-Apr-21 00:00.0 25 mg 1 TP2-Day8
001-002 28-Apr-21        
001-002 29-Apr-21        
001-002 30-Apr-21        
001-002 1-May-21        
001-002 2-May-21        
001-002 3-May-21        
001-002 4-May-21        
001-002 5-May-21 00:00.0 50 mg 2 TP2-Day15
001-002 6-May-21        
001-002 7-May-21        
001-002 8-May-21        
001-002 9-May-21        
001-002 10-May-21        
001-002 11-May-21 03:00.0 75 mg 3 TP2-Day22
001-002 12-May-21        
001-002 13-May-21        
001-002 14-May-21        
001-002 15-May-21        
001-002 16-May-21        
001-002 17-May-21        
001-002 18-May-21 30:00.0 75 mg 3 TP2-Day28
001-002 19-May-21 45:00.0 75 mg 3 TP2Washoutday71
001-002 20-May-21        

 

 

My requirement, I want to fill the blank record in column Study_day_IP in sequential order of day numbers for the data until the data next available data in the same column…(want as like column X).
Other condition is for some rows there is as repeated data coming for those i want the copy the previous data(want like column Z). for example : subject 001-001 for column 21 Apr21 and 22 Apr 21 the X column value got TP2-Day8 so i want the 21 Apr 21 record Study_day_IP as TP2-Day7 as in the previous record.

 

Output a sin below table

 

Please can someone help me...its tricky to explain but i have explained enough if some one still have any further question please ask i will clearly explain.

 

 

I want the below output :

 

Subject_ID Date Time_administered Dose_administered __of_capsules_administered Study_day_IP X Z
001-001 4-Mar-21 45:00.0 25 mg 1 TP1-Day1 TP1-Day1 TP1-Day1
001-001 5-Mar-21 50:00.0 25 mg 1 TP1-Day2 TP1-Day2 TP1-Day2
001-001 6-Mar-21         TP1-Day3 TP1-Day3
001-001 7-Mar-21         TP1-Day4 TP1-Day4
001-001 8-Mar-21         TP1-Day5 TP1-Day5
001-001 9-Mar-21         TP1-Day6 TP1-Day6
001-001 10-Mar-21         TP1-Day7 TP1-Day7
001-001 11-Mar-21         TP1-Day8 TP1-Day8
001-001 12-Mar-21         TP1-Day9 TP1-Day9
001-001 13-Mar-21         TP1-Day10 TP1-Day10
001-001 14-Mar-21         TP1-Day11 TP1-Day11
001-001 15-Mar-21         TP1-Day12 TP1-Day12
001-001 16-Mar-21         TP1-Day13 TP1-Day13
001-001 17-Mar-21         TP1-Day14 TP1-Day14
001-001 18-Mar-21 45:00.0 50 mg 2 TP1-Day15 TP1-Day15 TP1-Day15
001-001 19-Mar-21         TP1-Day16 TP1-Day16
001-001 20-Mar-21         TP1-Day17 TP1-Day17
001-001 21-Mar-21         TP1-Day18 TP1-Day18
001-001 22-Mar-21         TP1-Day19 TP1-Day19
001-001 23-Mar-21         TP1-Day20 TP1-Day20
001-001 24-Mar-21         TP1-Day21 TP1-Day21
001-001 25-Mar-21 00:00.0 50 mg 2 TP1-Day22 TP1-Day22 TP1-Day22
001-001 26-Mar-21         TP1-Day23 TP1-Day23
001-001 27-Mar-21         TP1-Day24 TP1-Day24
001-001 28-Mar-21         TP1-Day25 TP1-Day25
001-001 29-Mar-21         TP1-Day26 TP1-Day26
001-001 30-Mar-21         TP1-Day27 TP1-Day27
001-001 31-Mar-21 30:00.0 50 mg 2 TP1-Day28 TP1-Day28 TP1-Day28
001-001 1-Apr-21 25:00.0 50 mg 2 TP1Washoutday29 TP1Washoutday29 TP1Washoutday29
001-001 2-Apr-21         TP1Washoutday30 TP1Washoutday30
001-001 3-Apr-21         TP1Washoutday31 TP1Washoutday31
001-001 4-Apr-21         TP1Washoutday32 TP1Washoutday32
001-001 14-Apr-21 00:00.0 25 mg 1 TP2-Day1 TP2-Day1 TP2-Day1
001-001 15-Apr-21         TP2-Day2 TP2-Day2
001-001 16-Apr-21         TP2-Day3 TP2-Day3
001-001 17-Apr-21         TP2-Day4 TP2-Day4
001-001 18-Apr-21         TP2-Day5 TP2-Day5
001-001 19-Apr-21         TP2-Day6 TP2-Day6
001-001 20-Apr-21         TP2-Day7 TP2-Day7
001-001 21-Apr-21         TP2-Day8 TP2-Day7
001-001 22-Apr-21 26:00.0 25 mg 1 TP2-Day8 TP2-Day8 TP2-Day8
001-001 23-Apr-21         TP2-Day9 TP2-Day9
001-001 24-Apr-21         TP2-Day10 TP2-Day10
001-001 25-Apr-21         TP2-Day11 TP2-Day11
001-001 26-Apr-21         TP2-Day12 TP2-Day12
001-001 27-Apr-21         TP2-Day13 TP2-Day13
001-001 28-Apr-21         TP2-Day14 TP2-Day14
001-001 29-Apr-21 15:00.0 50 mg 2 TP2-Day15 TP2-Day15 TP2-Day15
001-002 9-Mar-21 00:00.0 25 mg 1 TP1-Day1 TP1-Day1 TP1-Day1
001-002 10-Mar-21 40:00.0 25 mg 1 TP1-Day2 TP1-Day2 TP1-Day2
001-002 11-Mar-21         TP1-Day3 TP1-Day3
001-002 12-Mar-21         TP1-Day4 TP1-Day4
001-002 13-Mar-21         TP1-Day5 TP1-Day5
001-002 14-Mar-21         TP1-Day6 TP1-Day6
001-002 15-Mar-21         TP1-Day7 TP1-Day7
001-002 16-Mar-21 10:00.0 25 mg 1 TP1-Day8 TP1-Day8 TP1-Day8
001-002 17-Mar-21         TP1-Day9 TP1-Day9
001-002 18-Mar-21         TP1-Day10 TP1-Day10
001-002 19-Mar-21         TP1-Day11 TP1-Day11
001-002 20-Mar-21         TP1-Day12 TP1-Day12
001-002 21-Mar-21         TP1-Day13 TP1-Day13
001-002 22-Mar-21         TP1-Day14 TP1-Day14
001-002 23-Mar-21 00:00.0 50 mg 2 TP1-Day15 TP1-Day15 TP1-Day15
001-002 24-Mar-21         TP1-Day16 TP1-Day16
001-002 25-Mar-21         TP1-Day17 TP1-Day17
001-002 26-Mar-21         TP1-Day18 TP1-Day18
001-002 27-Mar-21         TP1-Day19 TP1-Day19
001-002 28-Mar-21         TP1-Day20 TP1-Day20
001-002 29-Mar-21         TP1-Day21 TP1-Day21
001-002 30-Mar-21 50:00.0 75 mg 3 TP1-Day22 TP1-Day22 TP1-Day22
001-002 31-Mar-21         TP1-Day23 TP1-Day23
001-002 1-Apr-21         TP1-Day24 TP1-Day24
001-002 2-Apr-21         TP1-Day25 TP1-Day25
001-002 3-Apr-21         TP1-Day26 TP1-Day26
001-002 4-Apr-21         TP1-Day27 TP1-Day27
001-002 5-Apr-21         TP1-Day28 TP1-Day28
001-002 6-Apr-21         TP1-Day29 TP1-Day29
001-002 20-Apr-21 30:00.0 25 mg 1 TP2-Day1 TP2-Day1 TP2-Day1
001-002 21-Apr-21 00:00.0 25 mg 1 TP2-Day2 TP2-Day2 TP2-Day2
001-002 22-Apr-21         TP2-Day3 TP2-Day3
001-002 23-Apr-21         TP2-Day4 TP2-Day4
001-002 24-Apr-21         TP2-Day5 TP2-Day5
001-002 25-Apr-21         TP2-Day6 TP2-Day6
001-002 26-Apr-21         TP2-Day7 TP2-Day7
001-002 27-Apr-21 00:00.0 25 mg 1 TP2-Day8 TP2-Day8 TP2-Day8
001-002 28-Apr-21         TP2-Day9 TP2-Day9
001-002 29-Apr-21         TP2-Day10 TP2-Day10
001-002 30-Apr-21         TP2-Day11 TP2-Day11
001-002 1-May-21         TP2-Day12 TP2-Day12
001-002 2-May-21         TP2-Day13 TP2-Day13
001-002 3-May-21         TP2-Day14 TP2-Day14
001-002 4-May-21         TP2-Day15 TP2-Day14
001-002 5-May-21 00:00.0 50 mg 2 TP2-Day15 TP2-Day15 TP2-Day15
001-002 6-May-21         TP2-Day16 TP2-Day16
001-002 7-May-21         TP2-Day17 TP2-Day17
001-002 8-May-21         TP2-Day18 TP2-Day18
001-002 9-May-21         TP2-Day19 TP2-Day19
001-002 10-May-21         TP2-Day20 TP2-Day20
001-002 11-May-21 03:00.0 75 mg 3 TP2-Day22 TP2-Day22 TP2-Day22
001-002 12-May-21         TP2-Day23 TP2-Day23
001-002 13-May-21         TP2-Day24 TP2-Day24
001-002 14-May-21         TP2-Day25 TP2-Day25
001-002 15-May-21         TP2-Day26 TP2-Day26
001-002 16-May-21         TP2-Day27 TP2-Day27
001-002 17-May-21         TP2-Day28 TP2-Day27
001-002 18-May-21 30:00.0 75 mg 3 TP2-Day28 TP2-Day28 TP2-Day28
001-002 19-May-21 45:00.0 75 mg 3 TP2Washoutday71 TP2Washoutday71 TP2Washoutday71
001-002 20-May-21         TP2Washoutday72 TP2Washoutday72

 

 

Thanks you in advance. Any help is much appreciated.

1 ACCEPTED SOLUTION

Accepted Solutions
SubbuPaz
SAS Employee

Hi, Try this code. This is based on your requirements from your initial post. Note that this is based on the following assumption: 

- A 'Subject_ID' should have its 'Study_day_IP' column populated for their first day in the data.

 

/* extracting the numeric portion out of study_day_ip */
proc sql;
create table tmp_want1 as
select a.*,
case 
	when Study_day_IP eq '' then ''
	when anyalpha(substr(Study_day_IP,length(Study_day_IP)-2, 3)) = 0 then compress(substr(Study_day_IP,length(Study_day_IP)-2, 3))
	when anyalpha(substr(Study_day_IP,length(Study_day_IP)-1, 2)) = 0 then compress(substr(Study_day_IP,length(Study_day_IP)-1, 2))
	when anyalpha(substr(Study_day_IP,length(Study_day_IP), 1)) = 0 then compress(substr(Study_day_IP,length(Study_day_IP), 1))
	else '1'
end as Study_day_IP_Num
from have a 
;
quit;

/* extracting the string portion out of study_day_ip */
data tmp_want2;
format Study_day_IP_Str $32.; length Study_day_IP_Str $32.;
set tmp_want1;
if 
	Study_day_IP eq '' then Study_day_IP_Str = '';
else 
	Study_day_IP_Str = substr(Study_day_IP, 1, length(Study_day_IP) - length(Study_day_IP_Num));
run;

/* COl X */
proc sort data = tmp_want2;
by Subject_ID Date;
run;

data tmp_want3;
set tmp_want2;
retain Study_day_IP_Num1;
retain Study_day_IP_Str1;
if Study_day_IP_num ne '' then do;
	Study_day_IP_Num1 = input(Study_day_IP_Num,best.);
	Study_day_IP_Str1 = Study_day_IP_Str;
end;
else do; 
	Study_day_IP_Num1 = Study_day_IP_Num1 + 1;
	Study_day_IP_Str1 = Study_day_IP_Str1;
end;
by Subject_ID Date;
Col_X = trim(Study_day_IP_Str1) || compress(Study_day_IP_Num1) ;
run;

/* COl Y - COmputing Lag and Lead */
proc sort data = tmp_want3;
by Subject_ID Date;
run;

data tmp_want4;
set tmp_want3;
by Subject_ID Date;
Lag_Study_day_IP_Str1 = lag(Study_day_IP_Num1);
run;


proc sort data = tmp_want4;
by Subject_ID descending Date;
run;

data tmp_want4;
set tmp_want4;
by Subject_ID descending Date;
Lead_Study_day_IP_Str1 = lag(Study_day_IP_Num1);
run;

proc sort data = tmp_want4;
by Subject_ID Date;
run;

/* Final tabl with COl Z */
data want;
set tmp_want4;
if 
	Study_day_IP_Num1 = Lead_Study_day_IP_Str1 then Col_Z = trim(Study_day_IP_Str1) || compress(Lag_Study_day_IP_Str1) ;
else  
	Col_Z = Col_X;
drop Study_day_IP_Num1 Lead_Study_day_IP_Str1 Lag_Study_day_IP_Str1 Study_day_IP_Str1 Study_day_IP_Num Study_day_IP_Str;
run;

Thanks.

View solution in original post

7 REPLIES 7
mkeintz
PROC Star

I haven't converted your sample data to a working sas data step, so the following code is untested.

 

But this is a case when you encounter a non-blank study_data _ip, you want to extract the "root" (everything through the character string "day"), and the sequence number (the numeric value to the right of the root).  Then just retain the root and sequence for use in subsequent observations.  When a blank study_data_ip is encountered, increment the sequence, and concatenate it with the root to form the new value of X:

 

data want (drop=_:) ;
  set have;

  if study_day_ip^=' ' then do;
    x=study_day_ip;
    _root=substr(study_day_ip,1,find(lowcase(study_day_ip),'day')+2);
    _seq=input(substr(study_day_ip,length(_root)+1),3.);
  end;
  else do;
    _seq+1;
    x=cats(_root,_seq);
  end;
  retain _root _seq ;
run;

 

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Sri_devi
Obsidian | Level 7
Thank you for quick reply! Then how about getting the Column Z values...as i mentioned for some rows there is as repeated data coming for those i want the copy the previous data(want like column Z).
for example : subject 001-001 for column 21 Apr21 and 22 Apr 21 the X column value got TP2-Day8 so i want the 21 Apr 21 record Study_day_IP as TP2-Day7 as in the previous record.

Thank you in advance.
Tom
Super User Tom
Super User

Wouldn't it be a LOT easier to use numbers instead of those strings?

First please post the data in a usable (and readable!) format.

data have;
  infile cards dsd truncover dlm='|';
  input Subject_ID $ Date :date. Time :stimer. Dose :$20. capsules Study_Day_IP :$20. ;
  format date date9. time tod8. ;
cards;
001-001|4-Mar-21|45:00.0|25 mg|1|TP1-Day1
001-001|5-Mar-21|50:00.0|25 mg|1|TP1-Day2
001-001|6-Mar-21| | | | 
001-001|7-Mar-21| | | | 
001-001|8-Mar-21| | | | 
001-001|9-Mar-21| | | | 
001-001|10-Mar-21| | | | 
001-001|11-Mar-21| | | | 
001-001|12-Mar-21| | | | 
001-001|13-Mar-21| | | | 
001-001|14-Mar-21| | | | 
001-001|15-Mar-21| | | | 
001-001|16-Mar-21| | | | 
001-001|17-Mar-21| | | | 
001-001|18-Mar-21|45:00.0|50 mg|2|TP1-Day15
001-001|19-Mar-21| | | | 
001-001|20-Mar-21| | | | 
001-001|21-Mar-21| | | | 
001-001|22-Mar-21| | | | 
001-001|23-Mar-21| | | | 
001-001|24-Mar-21| | | | 
;

Looks like the new column is just the difference between the current date and the first date for this subject.

data want ;
  set have ;
  by subject_id date ;
  if first.subject_id then day1 = date ;
  retain day1 ;
  format day1 date9.;
  study_day = date - day1 +1 ;
run;
       Subject_                                                   Study_                   study_
Obs       ID            Date        Time    Dose     capsules     Day_IP           day1      day

  1    001-001     04MAR2021    00:45:00    25 mg        1       TP1-Day1     04MAR2021       1
  2    001-001     05MAR2021    00:50:00    25 mg        1       TP1-Day2     04MAR2021       2
  3    001-001     06MAR2021           .                 .                    04MAR2021       3
  4    001-001     07MAR2021           .                 .                    04MAR2021       4
  5    001-001     08MAR2021           .                 .                    04MAR2021       5
  6    001-001     09MAR2021           .                 .                    04MAR2021       6
  7    001-001     10MAR2021           .                 .                    04MAR2021       7
  8    001-001     11MAR2021           .                 .                    04MAR2021       8
  9    001-001     12MAR2021           .                 .                    04MAR2021       9
 10    001-001     13MAR2021           .                 .                    04MAR2021      10
 11    001-001     14MAR2021           .                 .                    04MAR2021      11
 12    001-001     15MAR2021           .                 .                    04MAR2021      12
 13    001-001     16MAR2021           .                 .                    04MAR2021      13
 14    001-001     17MAR2021           .                 .                    04MAR2021      14
 15    001-001     18MAR2021    00:45:00    50 mg        2       TP1-Day15    04MAR2021      15
 16    001-001     19MAR2021           .                 .                    04MAR2021      16
 17    001-001     20MAR2021           .                 .                    04MAR2021      17
 18    001-001     21MAR2021           .                 .                    04MAR2021      18
 19    001-001     22MAR2021           .                 .                    04MAR2021      19
 20    001-001     23MAR2021           .                 .                    04MAR2021      20
 21    001-001     24MAR2021           .                 .                    04MAR2021      21
Sri_devi
Obsidian | Level 7
Hi Tom, thank you! ok will consider numbers for study_day instead of string as per your SAS code. My other question is if the study_day has the same number for example 8 and 8 continiously after one another then i want the first Day 8 as the 7 as previous one.
For example :

I have data :


Subject_ID Date Time_administered Dose_administered __of_capsules_administered Study_day_IP study_day
001-001 4-Mar-21 45:00.0 25 mg 1 TP1-Day1 1
001-001 5-Mar-21 50:00.0 25 mg 1 TP1-Day2 2
001-001 6-Mar-21 3
001-001 7-Mar-21 4
001-001 8-Mar-21 5
001-001 9-Mar-21 6
001-001 10-Mar-21 7
001-001 11-Mar-21 8
001-001 12-Mar-21 9
001-001 13-Mar-21 10
001-001 14-Mar-21 11
001-001 15-Mar-21 12
001-001 16-Mar-21 13
17-Mar-21 14
001-001 18-Mar-21 15
001-001 19-Mar-21 45:00.0 50 mg 2 TP1-Day15 15
001-001 20-Mar-21 16
001-001 21-Mar-21 17
001-001 22-Mar-21 18
001-001 23-Mar-21 19
001-001 24-Mar-21 20
001-001 25-Mar-21 21
001-001 26-Mar-21 00:00.0 50 mg 2 TP1-Day22 22
001-001 27-Mar-21 23
001-001 28-Mar-21 24
001-001 29-Mar-21 25
001-001 30-Mar-21 26
001-001 31-Mar-21 27
1-Apr-21 28
001-001 2-Apr-21 30:00.0 50 mg 2 TP1-Day28 28
001-001 3-Apr-21 25:00.0 50 mg 2 TP1Washoutday29 29
001-001 4-Apr-21 30
001-001 5-Apr-21 30
001-001 6-Apr-21 30
001-001 7-Apr-21 00:00.0 25 mg 1 TP2-Day1 1
001-001 8-Apr-21 2
001-001 9-Apr-21 3
001-001 10-Apr-21 4
001-001 11-Apr-21 5
001-001 12-Apr-21 6
001-001 13-Apr-21 7
001-001 14-Apr-21 8
001-001 15-Apr-21 26:00.0 25 mg 1 TP2-Day8 9
001-001 16-Apr-21 10
001-001 17-Apr-21 11
001-001 18-Apr-21 12
001-001 19-Apr-21 13
001-001 20-Apr-21 14
001-001 21-Apr-21 15
001-001 22-Apr-21 15:00.0 50 mg 2 TP2-Day15 15
001-002 23-Apr-21 00:00.0 25 mg 1 TP1-Day1 1
001-002 24-Apr-21 40:00.0 25 mg 1 TP1-Day2 2
001-002 25-Apr-21 3
001-002 26-Apr-21 4
001-002 27-Apr-21 5
001-002 28-Apr-21 6
001-002 29-Apr-21 7
001-002 30-Apr-21 8
001-002 1-May-21 10:00.0 25 mg 1 TP1-Day8 8
001-002 17-Mar-21 9
001-002 18-Mar-21 10
001-002 19-Mar-21 11
001-002 20-Mar-21 12
001-002 21-Mar-21 13
001-002 22-Mar-21 14
001-002 23-Mar-21 00:00.0 50 mg 2 TP1-Day15 15
001-002 24-Mar-21 16
001-002 25-Mar-21 17
001-002 26-Mar-21 18
001-002 27-Mar-21 19
001-002 28-Mar-21 20
001-002 29-Mar-21 21
001-002 30-Mar-21 50:00.0 75 mg 3 TP1-Day22 22
001-002 31-Mar-21 23
001-002 1-Apr-21 24
001-002 2-Apr-21 25
001-002 3-Apr-21 26
001-002 4-Apr-21 27
001-002 5-Apr-21 28
001-002 6-Apr-21 29
001-002 20-Apr-21 30:00.0 25 mg 1 TP2-Day1 1
001-002 21-Apr-21 00:00.0 25 mg 1 TP2-Day2 2
001-002 22-Apr-21 3
001-002 23-Apr-21 4
001-002 24-Apr-21 5
001-002 25-Apr-21 6
001-002 26-Apr-21 7
001-002 27-Apr-21 00:00.0 25 mg 1 TP2-Day8 8
001-002 28-Apr-21 9
001-002 29-Apr-21 10
001-002 30-Apr-21 11
001-002 1-May-21 12
001-002 2-May-21 13
001-002 3-May-21 14
001-002 4-May-21 15
001-002 5-May-21 16
001-002 6-May-21 00:00.0 50 mg 2 TP2-Day16 16

Output condition as below:

Please observe the column Z, for subject 001-001 date 18-Mar-21 & 19-Mar-21 Study_day is 15 however the first 15 has the Z column as 14 which is same as previous record and second 15 has the Z column as 15(which is ideally Day 15 as per Study_day_IP TP1-Day15).

Subject_ID Date Study_day_IP study_day Z
001-001 18-Mar-21 15 14
001-001 19-Mar-21 15 14

another same example for log line for subject 001-001 date : 1-Apr-21 and date: 2-Apr-21 have Study_day 28 in Z column it should be 27 for first 28 and Study_day be 28 for second 28
Subject_ID Date Study_day_IP study_day Z
001-001 1-Apr-21 28 27
001-001 2-Apr-21 TP1-Day28 28 28



final output:


Subject_ID Date Time_administered Dose_administered __of_capsules_administered Study_day_IP study_day Z
001-001 4-Mar-21 45:00.0 25 mg 1 TP1-Day1 1 1
001-001 5-Mar-21 50:00.0 25 mg 1 TP1-Day2 2 2
001-001 6-Mar-21 3 3
001-001 7-Mar-21 4 4
001-001 8-Mar-21 5 5
001-001 9-Mar-21 6 6
001-001 10-Mar-21 7 7
001-001 11-Mar-21 8 8
001-001 12-Mar-21 9 9
001-001 13-Mar-21 10 10
001-001 14-Mar-21 11 11
001-001 15-Mar-21 12 12
001-001 16-Mar-21 13 13
001-001 17-Mar-21 14 14
001-001 18-Mar-21 15 14
001-001 19-Mar-21 45:00.0 50 mg 2 TP1-Day15 15 15
001-001 20-Mar-21 16 16
001-001 21-Mar-21 17 17
001-001 22-Mar-21 18 18
001-001 23-Mar-21 19 19
001-001 24-Mar-21 20 20
001-001 25-Mar-21 21 21
001-001 26-Mar-21 00:00.0 50 mg 2 TP1-Day22 22 22
001-001 27-Mar-21 23 23
001-001 28-Mar-21 24 24
001-001 29-Mar-21 25 25
001-001 30-Mar-21 26 26
001-001 31-Mar-21 27 27
001-001 1-Apr-21 28 27
001-001 2-Apr-21 30:00.0 50 mg 2 TP1-Day28 28 28
001-001 3-Apr-21 25:00.0 50 mg 2 TP1Washoutday29 29 29
001-001 4-Apr-21 30 30
001-001 5-Apr-21 30 31
001-001 6-Apr-21 30 32
001-001 7-Apr-21 00:00.0 25 mg 1 TP2-Day1 1 1
001-001 8-Apr-21 2 2
001-001 9-Apr-21 3 3
001-001 10-Apr-21 4 4
001-001 11-Apr-21 5 5
001-001 12-Apr-21 6 6
001-001 13-Apr-21 7 7
001-001 14-Apr-21 8 8
001-001 15-Apr-21 26:00.0 25 mg 1 TP2-Day8 9 9
001-001 16-Apr-21 10 10
001-001 17-Apr-21 11 11
001-001 18-Apr-21 12 12
001-001 19-Apr-21 13 13
001-001 20-Apr-21 14 14
001-001 21-Apr-21 15 14
001-001 22-Apr-21 15:00.0 50 mg 2 TP2-Day15 15 15
001-002 23-Apr-21 00:00.0 25 mg 1 TP1-Day1 1 1
001-002 24-Apr-21 40:00.0 25 mg 1 TP1-Day2 2 2
001-002 25-Apr-21 3 3
001-002 26-Apr-21 4 4
001-002 27-Apr-21 5 5
001-002 28-Apr-21 6 6
001-002 29-Apr-21 7 7
001-002 30-Apr-21 8 7
001-002 1-May-21 10:00.0 25 mg 1 TP1-Day8 8 8
001-002 17-Mar-21 9 9
001-002 18-Mar-21 10 10
001-002 19-Mar-21 11 11
001-002 20-Mar-21 12 12
001-002 21-Mar-21 13 13
001-002 22-Mar-21 14 14
001-002 23-Mar-21 00:00.0 50 mg 2 TP1-Day15 15 15
001-002 24-Mar-21 16 16
001-002 25-Mar-21 17 17
001-002 26-Mar-21 18 18
001-002 27-Mar-21 19 19
001-002 28-Mar-21 20 20
001-002 29-Mar-21 21 21
001-002 30-Mar-21 50:00.0 75 mg 3 TP1-Day22 22 22
001-002 31-Mar-21 23 23
001-002 1-Apr-21 24 24
001-002 2-Apr-21 25 25
001-002 3-Apr-21 26 26
001-002 4-Apr-21 27 27
001-002 5-Apr-21 28 28
001-002 6-Apr-21 29 29
001-002 20-Apr-21 30:00.0 25 mg 1 TP2-Day1 1 1
001-002 21-Apr-21 00:00.0 25 mg 1 TP2-Day2 2 2
001-002 22-Apr-21 3 3
001-002 23-Apr-21 4 4
001-002 24-Apr-21 5 5
001-002 25-Apr-21 6 6
001-002 26-Apr-21 7 7
001-002 27-Apr-21 00:00.0 25 mg 1 TP2-Day8 8 8
001-002 28-Apr-21 9 9
001-002 29-Apr-21 10 10
001-002 30-Apr-21 11 11
001-002 1-May-21 12 12
001-002 2-May-21 13 13
001-002 3-May-21 14 14
001-002 4-May-21 15 15
001-002 5-May-21 16 15
001-002 6-May-21 00:00.0 50 mg 2 TP2-Day16 16 16



Please let me know if this is still unclear.

thanks you for your help!

Tom
Super User Tom
Super User

I do not understand what you are trying to do.

Are you trying to label the different treatment periods of the study?

What is it that determines the start and stop of the treatment periods?  Is it the value of one those variables? Which one?  Is it some GAP in the date (or datetime) values?  How large a gap?

 

You will get your point across much better using LESS data.  Only include enough observations and variables to demonstrate the pattern.

SubbuPaz
SAS Employee

Hi, Try this code. This is based on your requirements from your initial post. Note that this is based on the following assumption: 

- A 'Subject_ID' should have its 'Study_day_IP' column populated for their first day in the data.

 

/* extracting the numeric portion out of study_day_ip */
proc sql;
create table tmp_want1 as
select a.*,
case 
	when Study_day_IP eq '' then ''
	when anyalpha(substr(Study_day_IP,length(Study_day_IP)-2, 3)) = 0 then compress(substr(Study_day_IP,length(Study_day_IP)-2, 3))
	when anyalpha(substr(Study_day_IP,length(Study_day_IP)-1, 2)) = 0 then compress(substr(Study_day_IP,length(Study_day_IP)-1, 2))
	when anyalpha(substr(Study_day_IP,length(Study_day_IP), 1)) = 0 then compress(substr(Study_day_IP,length(Study_day_IP), 1))
	else '1'
end as Study_day_IP_Num
from have a 
;
quit;

/* extracting the string portion out of study_day_ip */
data tmp_want2;
format Study_day_IP_Str $32.; length Study_day_IP_Str $32.;
set tmp_want1;
if 
	Study_day_IP eq '' then Study_day_IP_Str = '';
else 
	Study_day_IP_Str = substr(Study_day_IP, 1, length(Study_day_IP) - length(Study_day_IP_Num));
run;

/* COl X */
proc sort data = tmp_want2;
by Subject_ID Date;
run;

data tmp_want3;
set tmp_want2;
retain Study_day_IP_Num1;
retain Study_day_IP_Str1;
if Study_day_IP_num ne '' then do;
	Study_day_IP_Num1 = input(Study_day_IP_Num,best.);
	Study_day_IP_Str1 = Study_day_IP_Str;
end;
else do; 
	Study_day_IP_Num1 = Study_day_IP_Num1 + 1;
	Study_day_IP_Str1 = Study_day_IP_Str1;
end;
by Subject_ID Date;
Col_X = trim(Study_day_IP_Str1) || compress(Study_day_IP_Num1) ;
run;

/* COl Y - COmputing Lag and Lead */
proc sort data = tmp_want3;
by Subject_ID Date;
run;

data tmp_want4;
set tmp_want3;
by Subject_ID Date;
Lag_Study_day_IP_Str1 = lag(Study_day_IP_Num1);
run;


proc sort data = tmp_want4;
by Subject_ID descending Date;
run;

data tmp_want4;
set tmp_want4;
by Subject_ID descending Date;
Lead_Study_day_IP_Str1 = lag(Study_day_IP_Num1);
run;

proc sort data = tmp_want4;
by Subject_ID Date;
run;

/* Final tabl with COl Z */
data want;
set tmp_want4;
if 
	Study_day_IP_Num1 = Lead_Study_day_IP_Str1 then Col_Z = trim(Study_day_IP_Str1) || compress(Lag_Study_day_IP_Str1) ;
else  
	Col_Z = Col_X;
drop Study_day_IP_Num1 Lead_Study_day_IP_Str1 Lag_Study_day_IP_Str1 Study_day_IP_Str1 Study_day_IP_Num Study_day_IP_Str;
run;

Thanks.

Sri_devi
Obsidian | Level 7
Thanks alot! this what my requirement is. Thank you so much for understanding my question and for helping me with the code. I really appreciate your efforts!

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 7 replies
  • 1155 views
  • 1 like
  • 4 in conversation