i have raw data as below:
Subject_ID | Date | Time_administered | Dose_administered | __of_capsules_administered | Study_day_IP |
001-001 | 4-Mar-21 | 45:00.0 | 25 mg | 1 | TP1-Day1 |
001-001 | 5-Mar-21 | 50:00.0 | 25 mg | 1 | TP1-Day2 |
001-001 | 6-Mar-21 | ||||
001-001 | 7-Mar-21 | ||||
001-001 | 8-Mar-21 | ||||
001-001 | 9-Mar-21 | ||||
001-001 | 10-Mar-21 | ||||
001-001 | 11-Mar-21 | ||||
001-001 | 12-Mar-21 | ||||
001-001 | 13-Mar-21 | ||||
001-001 | 14-Mar-21 | ||||
001-001 | 15-Mar-21 | ||||
001-001 | 16-Mar-21 | ||||
001-001 | 17-Mar-21 | ||||
001-001 | 18-Mar-21 | 45:00.0 | 50 mg | 2 | TP1-Day15 |
001-001 | 19-Mar-21 | ||||
001-001 | 20-Mar-21 | ||||
001-001 | 21-Mar-21 | ||||
001-001 | 22-Mar-21 | ||||
001-001 | 23-Mar-21 | ||||
001-001 | 24-Mar-21 | ||||
001-001 | 25-Mar-21 | 00:00.0 | 50 mg | 2 | TP1-Day22 |
001-001 | 26-Mar-21 | ||||
001-001 | 27-Mar-21 | ||||
001-001 | 28-Mar-21 | ||||
001-001 | 29-Mar-21 | ||||
001-001 | 30-Mar-21 | ||||
001-001 | 31-Mar-21 | 30:00.0 | 50 mg | 2 | TP1-Day28 |
001-001 | 1-Apr-21 | 25:00.0 | 50 mg | 2 | TP1Washoutday29 |
001-001 | 2-Apr-21 | ||||
001-001 | 3-Apr-21 | ||||
001-001 | 4-Apr-21 | ||||
001-001 | 14-Apr-21 | 00:00.0 | 25 mg | 1 | TP2-Day1 |
001-001 | 15-Apr-21 | ||||
001-001 | 16-Apr-21 | ||||
001-001 | 17-Apr-21 | ||||
001-001 | 18-Apr-21 | ||||
001-001 | 19-Apr-21 | ||||
001-001 | 20-Apr-21 | ||||
001-001 | 21-Apr-21 | ||||
001-001 | 22-Apr-21 | 26:00.0 | 25 mg | 1 | TP2-Day8 |
001-001 | 23-Apr-21 | ||||
001-001 | 24-Apr-21 | ||||
001-001 | 25-Apr-21 | ||||
001-001 | 26-Apr-21 | ||||
001-001 | 27-Apr-21 | ||||
001-001 | 28-Apr-21 | ||||
001-001 | 29-Apr-21 | 15:00.0 | 50 mg | 2 | TP2-Day15 |
001-002 | 9-Mar-21 | 00:00.0 | 25 mg | 1 | TP1-Day1 |
001-002 | 10-Mar-21 | 40:00.0 | 25 mg | 1 | TP1-Day2 |
001-002 | 11-Mar-21 | ||||
001-002 | 12-Mar-21 | ||||
001-002 | 13-Mar-21 | ||||
001-002 | 14-Mar-21 | ||||
001-002 | 15-Mar-21 | ||||
001-002 | 16-Mar-21 | 10:00.0 | 25 mg | 1 | TP1-Day8 |
001-002 | 17-Mar-21 | ||||
001-002 | 18-Mar-21 | ||||
001-002 | 19-Mar-21 | ||||
001-002 | 20-Mar-21 | ||||
001-002 | 21-Mar-21 | ||||
001-002 | 22-Mar-21 | ||||
001-002 | 23-Mar-21 | 00:00.0 | 50 mg | 2 | TP1-Day15 |
001-002 | 24-Mar-21 | ||||
001-002 | 25-Mar-21 | ||||
001-002 | 26-Mar-21 | ||||
001-002 | 27-Mar-21 | ||||
001-002 | 28-Mar-21 | ||||
001-002 | 29-Mar-21 | ||||
001-002 | 30-Mar-21 | 50:00.0 | 75 mg | 3 | TP1-Day22 |
001-002 | 31-Mar-21 | ||||
001-002 | 1-Apr-21 | ||||
001-002 | 2-Apr-21 | ||||
001-002 | 3-Apr-21 | ||||
001-002 | 4-Apr-21 | ||||
001-002 | 5-Apr-21 | ||||
001-002 | 6-Apr-21 | ||||
001-002 | 20-Apr-21 | 30:00.0 | 25 mg | 1 | TP2-Day1 |
001-002 | 21-Apr-21 | 00:00.0 | 25 mg | 1 | TP2-Day2 |
001-002 | 22-Apr-21 | ||||
001-002 | 23-Apr-21 | ||||
001-002 | 24-Apr-21 | ||||
001-002 | 25-Apr-21 | ||||
001-002 | 26-Apr-21 | ||||
001-002 | 27-Apr-21 | 00:00.0 | 25 mg | 1 | TP2-Day8 |
001-002 | 28-Apr-21 | ||||
001-002 | 29-Apr-21 | ||||
001-002 | 30-Apr-21 | ||||
001-002 | 1-May-21 | ||||
001-002 | 2-May-21 | ||||
001-002 | 3-May-21 | ||||
001-002 | 4-May-21 | ||||
001-002 | 5-May-21 | 00:00.0 | 50 mg | 2 | TP2-Day15 |
001-002 | 6-May-21 | ||||
001-002 | 7-May-21 | ||||
001-002 | 8-May-21 | ||||
001-002 | 9-May-21 | ||||
001-002 | 10-May-21 | ||||
001-002 | 11-May-21 | 03:00.0 | 75 mg | 3 | TP2-Day22 |
001-002 | 12-May-21 | ||||
001-002 | 13-May-21 | ||||
001-002 | 14-May-21 | ||||
001-002 | 15-May-21 | ||||
001-002 | 16-May-21 | ||||
001-002 | 17-May-21 | ||||
001-002 | 18-May-21 | 30:00.0 | 75 mg | 3 | TP2-Day28 |
001-002 | 19-May-21 | 45:00.0 | 75 mg | 3 | TP2Washoutday71 |
001-002 | 20-May-21 |
My requirement, I want to fill the blank record in column Study_day_IP in sequential order of day numbers for the data until the data next available data in the same column…(want as like column X).
Other condition is for some rows there is as repeated data coming for those i want the copy the previous data(want like column Z). for example : subject 001-001 for column 21 Apr21 and 22 Apr 21 the X column value got TP2-Day8 so i want the 21 Apr 21 record Study_day_IP as TP2-Day7 as in the previous record.
Output a sin below table
Please can someone help me...its tricky to explain but i have explained enough if some one still have any further question please ask i will clearly explain.
I want the below output :
Subject_ID | Date | Time_administered | Dose_administered | __of_capsules_administered | Study_day_IP | X | Z |
001-001 | 4-Mar-21 | 45:00.0 | 25 mg | 1 | TP1-Day1 | TP1-Day1 | TP1-Day1 |
001-001 | 5-Mar-21 | 50:00.0 | 25 mg | 1 | TP1-Day2 | TP1-Day2 | TP1-Day2 |
001-001 | 6-Mar-21 | TP1-Day3 | TP1-Day3 | ||||
001-001 | 7-Mar-21 | TP1-Day4 | TP1-Day4 | ||||
001-001 | 8-Mar-21 | TP1-Day5 | TP1-Day5 | ||||
001-001 | 9-Mar-21 | TP1-Day6 | TP1-Day6 | ||||
001-001 | 10-Mar-21 | TP1-Day7 | TP1-Day7 | ||||
001-001 | 11-Mar-21 | TP1-Day8 | TP1-Day8 | ||||
001-001 | 12-Mar-21 | TP1-Day9 | TP1-Day9 | ||||
001-001 | 13-Mar-21 | TP1-Day10 | TP1-Day10 | ||||
001-001 | 14-Mar-21 | TP1-Day11 | TP1-Day11 | ||||
001-001 | 15-Mar-21 | TP1-Day12 | TP1-Day12 | ||||
001-001 | 16-Mar-21 | TP1-Day13 | TP1-Day13 | ||||
001-001 | 17-Mar-21 | TP1-Day14 | TP1-Day14 | ||||
001-001 | 18-Mar-21 | 45:00.0 | 50 mg | 2 | TP1-Day15 | TP1-Day15 | TP1-Day15 |
001-001 | 19-Mar-21 | TP1-Day16 | TP1-Day16 | ||||
001-001 | 20-Mar-21 | TP1-Day17 | TP1-Day17 | ||||
001-001 | 21-Mar-21 | TP1-Day18 | TP1-Day18 | ||||
001-001 | 22-Mar-21 | TP1-Day19 | TP1-Day19 | ||||
001-001 | 23-Mar-21 | TP1-Day20 | TP1-Day20 | ||||
001-001 | 24-Mar-21 | TP1-Day21 | TP1-Day21 | ||||
001-001 | 25-Mar-21 | 00:00.0 | 50 mg | 2 | TP1-Day22 | TP1-Day22 | TP1-Day22 |
001-001 | 26-Mar-21 | TP1-Day23 | TP1-Day23 | ||||
001-001 | 27-Mar-21 | TP1-Day24 | TP1-Day24 | ||||
001-001 | 28-Mar-21 | TP1-Day25 | TP1-Day25 | ||||
001-001 | 29-Mar-21 | TP1-Day26 | TP1-Day26 | ||||
001-001 | 30-Mar-21 | TP1-Day27 | TP1-Day27 | ||||
001-001 | 31-Mar-21 | 30:00.0 | 50 mg | 2 | TP1-Day28 | TP1-Day28 | TP1-Day28 |
001-001 | 1-Apr-21 | 25:00.0 | 50 mg | 2 | TP1Washoutday29 | TP1Washoutday29 | TP1Washoutday29 |
001-001 | 2-Apr-21 | TP1Washoutday30 | TP1Washoutday30 | ||||
001-001 | 3-Apr-21 | TP1Washoutday31 | TP1Washoutday31 | ||||
001-001 | 4-Apr-21 | TP1Washoutday32 | TP1Washoutday32 | ||||
001-001 | 14-Apr-21 | 00:00.0 | 25 mg | 1 | TP2-Day1 | TP2-Day1 | TP2-Day1 |
001-001 | 15-Apr-21 | TP2-Day2 | TP2-Day2 | ||||
001-001 | 16-Apr-21 | TP2-Day3 | TP2-Day3 | ||||
001-001 | 17-Apr-21 | TP2-Day4 | TP2-Day4 | ||||
001-001 | 18-Apr-21 | TP2-Day5 | TP2-Day5 | ||||
001-001 | 19-Apr-21 | TP2-Day6 | TP2-Day6 | ||||
001-001 | 20-Apr-21 | TP2-Day7 | TP2-Day7 | ||||
001-001 | 21-Apr-21 | TP2-Day8 | TP2-Day7 | ||||
001-001 | 22-Apr-21 | 26:00.0 | 25 mg | 1 | TP2-Day8 | TP2-Day8 | TP2-Day8 |
001-001 | 23-Apr-21 | TP2-Day9 | TP2-Day9 | ||||
001-001 | 24-Apr-21 | TP2-Day10 | TP2-Day10 | ||||
001-001 | 25-Apr-21 | TP2-Day11 | TP2-Day11 | ||||
001-001 | 26-Apr-21 | TP2-Day12 | TP2-Day12 | ||||
001-001 | 27-Apr-21 | TP2-Day13 | TP2-Day13 | ||||
001-001 | 28-Apr-21 | TP2-Day14 | TP2-Day14 | ||||
001-001 | 29-Apr-21 | 15:00.0 | 50 mg | 2 | TP2-Day15 | TP2-Day15 | TP2-Day15 |
001-002 | 9-Mar-21 | 00:00.0 | 25 mg | 1 | TP1-Day1 | TP1-Day1 | TP1-Day1 |
001-002 | 10-Mar-21 | 40:00.0 | 25 mg | 1 | TP1-Day2 | TP1-Day2 | TP1-Day2 |
001-002 | 11-Mar-21 | TP1-Day3 | TP1-Day3 | ||||
001-002 | 12-Mar-21 | TP1-Day4 | TP1-Day4 | ||||
001-002 | 13-Mar-21 | TP1-Day5 | TP1-Day5 | ||||
001-002 | 14-Mar-21 | TP1-Day6 | TP1-Day6 | ||||
001-002 | 15-Mar-21 | TP1-Day7 | TP1-Day7 | ||||
001-002 | 16-Mar-21 | 10:00.0 | 25 mg | 1 | TP1-Day8 | TP1-Day8 | TP1-Day8 |
001-002 | 17-Mar-21 | TP1-Day9 | TP1-Day9 | ||||
001-002 | 18-Mar-21 | TP1-Day10 | TP1-Day10 | ||||
001-002 | 19-Mar-21 | TP1-Day11 | TP1-Day11 | ||||
001-002 | 20-Mar-21 | TP1-Day12 | TP1-Day12 | ||||
001-002 | 21-Mar-21 | TP1-Day13 | TP1-Day13 | ||||
001-002 | 22-Mar-21 | TP1-Day14 | TP1-Day14 | ||||
001-002 | 23-Mar-21 | 00:00.0 | 50 mg | 2 | TP1-Day15 | TP1-Day15 | TP1-Day15 |
001-002 | 24-Mar-21 | TP1-Day16 | TP1-Day16 | ||||
001-002 | 25-Mar-21 | TP1-Day17 | TP1-Day17 | ||||
001-002 | 26-Mar-21 | TP1-Day18 | TP1-Day18 | ||||
001-002 | 27-Mar-21 | TP1-Day19 | TP1-Day19 | ||||
001-002 | 28-Mar-21 | TP1-Day20 | TP1-Day20 | ||||
001-002 | 29-Mar-21 | TP1-Day21 | TP1-Day21 | ||||
001-002 | 30-Mar-21 | 50:00.0 | 75 mg | 3 | TP1-Day22 | TP1-Day22 | TP1-Day22 |
001-002 | 31-Mar-21 | TP1-Day23 | TP1-Day23 | ||||
001-002 | 1-Apr-21 | TP1-Day24 | TP1-Day24 | ||||
001-002 | 2-Apr-21 | TP1-Day25 | TP1-Day25 | ||||
001-002 | 3-Apr-21 | TP1-Day26 | TP1-Day26 | ||||
001-002 | 4-Apr-21 | TP1-Day27 | TP1-Day27 | ||||
001-002 | 5-Apr-21 | TP1-Day28 | TP1-Day28 | ||||
001-002 | 6-Apr-21 | TP1-Day29 | TP1-Day29 | ||||
001-002 | 20-Apr-21 | 30:00.0 | 25 mg | 1 | TP2-Day1 | TP2-Day1 | TP2-Day1 |
001-002 | 21-Apr-21 | 00:00.0 | 25 mg | 1 | TP2-Day2 | TP2-Day2 | TP2-Day2 |
001-002 | 22-Apr-21 | TP2-Day3 | TP2-Day3 | ||||
001-002 | 23-Apr-21 | TP2-Day4 | TP2-Day4 | ||||
001-002 | 24-Apr-21 | TP2-Day5 | TP2-Day5 | ||||
001-002 | 25-Apr-21 | TP2-Day6 | TP2-Day6 | ||||
001-002 | 26-Apr-21 | TP2-Day7 | TP2-Day7 | ||||
001-002 | 27-Apr-21 | 00:00.0 | 25 mg | 1 | TP2-Day8 | TP2-Day8 | TP2-Day8 |
001-002 | 28-Apr-21 | TP2-Day9 | TP2-Day9 | ||||
001-002 | 29-Apr-21 | TP2-Day10 | TP2-Day10 | ||||
001-002 | 30-Apr-21 | TP2-Day11 | TP2-Day11 | ||||
001-002 | 1-May-21 | TP2-Day12 | TP2-Day12 | ||||
001-002 | 2-May-21 | TP2-Day13 | TP2-Day13 | ||||
001-002 | 3-May-21 | TP2-Day14 | TP2-Day14 | ||||
001-002 | 4-May-21 | TP2-Day15 | TP2-Day14 | ||||
001-002 | 5-May-21 | 00:00.0 | 50 mg | 2 | TP2-Day15 | TP2-Day15 | TP2-Day15 |
001-002 | 6-May-21 | TP2-Day16 | TP2-Day16 | ||||
001-002 | 7-May-21 | TP2-Day17 | TP2-Day17 | ||||
001-002 | 8-May-21 | TP2-Day18 | TP2-Day18 | ||||
001-002 | 9-May-21 | TP2-Day19 | TP2-Day19 | ||||
001-002 | 10-May-21 | TP2-Day20 | TP2-Day20 | ||||
001-002 | 11-May-21 | 03:00.0 | 75 mg | 3 | TP2-Day22 | TP2-Day22 | TP2-Day22 |
001-002 | 12-May-21 | TP2-Day23 | TP2-Day23 | ||||
001-002 | 13-May-21 | TP2-Day24 | TP2-Day24 | ||||
001-002 | 14-May-21 | TP2-Day25 | TP2-Day25 | ||||
001-002 | 15-May-21 | TP2-Day26 | TP2-Day26 | ||||
001-002 | 16-May-21 | TP2-Day27 | TP2-Day27 | ||||
001-002 | 17-May-21 | TP2-Day28 | TP2-Day27 | ||||
001-002 | 18-May-21 | 30:00.0 | 75 mg | 3 | TP2-Day28 | TP2-Day28 | TP2-Day28 |
001-002 | 19-May-21 | 45:00.0 | 75 mg | 3 | TP2Washoutday71 | TP2Washoutday71 | TP2Washoutday71 |
001-002 | 20-May-21 | TP2Washoutday72 | TP2Washoutday72 |
Thanks you in advance. Any help is much appreciated.
Hi, Try this code. This is based on your requirements from your initial post. Note that this is based on the following assumption:
- A 'Subject_ID' should have its 'Study_day_IP' column populated for their first day in the data.
/* extracting the numeric portion out of study_day_ip */ proc sql; create table tmp_want1 as select a.*, case when Study_day_IP eq '' then '' when anyalpha(substr(Study_day_IP,length(Study_day_IP)-2, 3)) = 0 then compress(substr(Study_day_IP,length(Study_day_IP)-2, 3)) when anyalpha(substr(Study_day_IP,length(Study_day_IP)-1, 2)) = 0 then compress(substr(Study_day_IP,length(Study_day_IP)-1, 2)) when anyalpha(substr(Study_day_IP,length(Study_day_IP), 1)) = 0 then compress(substr(Study_day_IP,length(Study_day_IP), 1)) else '1' end as Study_day_IP_Num from have a ; quit; /* extracting the string portion out of study_day_ip */ data tmp_want2; format Study_day_IP_Str $32.; length Study_day_IP_Str $32.; set tmp_want1; if Study_day_IP eq '' then Study_day_IP_Str = ''; else Study_day_IP_Str = substr(Study_day_IP, 1, length(Study_day_IP) - length(Study_day_IP_Num)); run; /* COl X */ proc sort data = tmp_want2; by Subject_ID Date; run; data tmp_want3; set tmp_want2; retain Study_day_IP_Num1; retain Study_day_IP_Str1; if Study_day_IP_num ne '' then do; Study_day_IP_Num1 = input(Study_day_IP_Num,best.); Study_day_IP_Str1 = Study_day_IP_Str; end; else do; Study_day_IP_Num1 = Study_day_IP_Num1 + 1; Study_day_IP_Str1 = Study_day_IP_Str1; end; by Subject_ID Date; Col_X = trim(Study_day_IP_Str1) || compress(Study_day_IP_Num1) ; run; /* COl Y - COmputing Lag and Lead */ proc sort data = tmp_want3; by Subject_ID Date; run; data tmp_want4; set tmp_want3; by Subject_ID Date; Lag_Study_day_IP_Str1 = lag(Study_day_IP_Num1); run; proc sort data = tmp_want4; by Subject_ID descending Date; run; data tmp_want4; set tmp_want4; by Subject_ID descending Date; Lead_Study_day_IP_Str1 = lag(Study_day_IP_Num1); run; proc sort data = tmp_want4; by Subject_ID Date; run; /* Final tabl with COl Z */ data want; set tmp_want4; if Study_day_IP_Num1 = Lead_Study_day_IP_Str1 then Col_Z = trim(Study_day_IP_Str1) || compress(Lag_Study_day_IP_Str1) ; else Col_Z = Col_X; drop Study_day_IP_Num1 Lead_Study_day_IP_Str1 Lag_Study_day_IP_Str1 Study_day_IP_Str1 Study_day_IP_Num Study_day_IP_Str; run;
Thanks.
I haven't converted your sample data to a working sas data step, so the following code is untested.
But this is a case when you encounter a non-blank study_data _ip, you want to extract the "root" (everything through the character string "day"), and the sequence number (the numeric value to the right of the root). Then just retain the root and sequence for use in subsequent observations. When a blank study_data_ip is encountered, increment the sequence, and concatenate it with the root to form the new value of X:
data want (drop=_:) ;
set have;
if study_day_ip^=' ' then do;
x=study_day_ip;
_root=substr(study_day_ip,1,find(lowcase(study_day_ip),'day')+2);
_seq=input(substr(study_day_ip,length(_root)+1),3.);
end;
else do;
_seq+1;
x=cats(_root,_seq);
end;
retain _root _seq ;
run;
Wouldn't it be a LOT easier to use numbers instead of those strings?
First please post the data in a usable (and readable!) format.
data have;
infile cards dsd truncover dlm='|';
input Subject_ID $ Date :date. Time :stimer. Dose :$20. capsules Study_Day_IP :$20. ;
format date date9. time tod8. ;
cards;
001-001|4-Mar-21|45:00.0|25 mg|1|TP1-Day1
001-001|5-Mar-21|50:00.0|25 mg|1|TP1-Day2
001-001|6-Mar-21| | | |
001-001|7-Mar-21| | | |
001-001|8-Mar-21| | | |
001-001|9-Mar-21| | | |
001-001|10-Mar-21| | | |
001-001|11-Mar-21| | | |
001-001|12-Mar-21| | | |
001-001|13-Mar-21| | | |
001-001|14-Mar-21| | | |
001-001|15-Mar-21| | | |
001-001|16-Mar-21| | | |
001-001|17-Mar-21| | | |
001-001|18-Mar-21|45:00.0|50 mg|2|TP1-Day15
001-001|19-Mar-21| | | |
001-001|20-Mar-21| | | |
001-001|21-Mar-21| | | |
001-001|22-Mar-21| | | |
001-001|23-Mar-21| | | |
001-001|24-Mar-21| | | |
;
Looks like the new column is just the difference between the current date and the first date for this subject.
data want ;
set have ;
by subject_id date ;
if first.subject_id then day1 = date ;
retain day1 ;
format day1 date9.;
study_day = date - day1 +1 ;
run;
Subject_ Study_ study_ Obs ID Date Time Dose capsules Day_IP day1 day 1 001-001 04MAR2021 00:45:00 25 mg 1 TP1-Day1 04MAR2021 1 2 001-001 05MAR2021 00:50:00 25 mg 1 TP1-Day2 04MAR2021 2 3 001-001 06MAR2021 . . 04MAR2021 3 4 001-001 07MAR2021 . . 04MAR2021 4 5 001-001 08MAR2021 . . 04MAR2021 5 6 001-001 09MAR2021 . . 04MAR2021 6 7 001-001 10MAR2021 . . 04MAR2021 7 8 001-001 11MAR2021 . . 04MAR2021 8 9 001-001 12MAR2021 . . 04MAR2021 9 10 001-001 13MAR2021 . . 04MAR2021 10 11 001-001 14MAR2021 . . 04MAR2021 11 12 001-001 15MAR2021 . . 04MAR2021 12 13 001-001 16MAR2021 . . 04MAR2021 13 14 001-001 17MAR2021 . . 04MAR2021 14 15 001-001 18MAR2021 00:45:00 50 mg 2 TP1-Day15 04MAR2021 15 16 001-001 19MAR2021 . . 04MAR2021 16 17 001-001 20MAR2021 . . 04MAR2021 17 18 001-001 21MAR2021 . . 04MAR2021 18 19 001-001 22MAR2021 . . 04MAR2021 19 20 001-001 23MAR2021 . . 04MAR2021 20 21 001-001 24MAR2021 . . 04MAR2021 21
I do not understand what you are trying to do.
Are you trying to label the different treatment periods of the study?
What is it that determines the start and stop of the treatment periods? Is it the value of one those variables? Which one? Is it some GAP in the date (or datetime) values? How large a gap?
You will get your point across much better using LESS data. Only include enough observations and variables to demonstrate the pattern.
Hi, Try this code. This is based on your requirements from your initial post. Note that this is based on the following assumption:
- A 'Subject_ID' should have its 'Study_day_IP' column populated for their first day in the data.
/* extracting the numeric portion out of study_day_ip */ proc sql; create table tmp_want1 as select a.*, case when Study_day_IP eq '' then '' when anyalpha(substr(Study_day_IP,length(Study_day_IP)-2, 3)) = 0 then compress(substr(Study_day_IP,length(Study_day_IP)-2, 3)) when anyalpha(substr(Study_day_IP,length(Study_day_IP)-1, 2)) = 0 then compress(substr(Study_day_IP,length(Study_day_IP)-1, 2)) when anyalpha(substr(Study_day_IP,length(Study_day_IP), 1)) = 0 then compress(substr(Study_day_IP,length(Study_day_IP), 1)) else '1' end as Study_day_IP_Num from have a ; quit; /* extracting the string portion out of study_day_ip */ data tmp_want2; format Study_day_IP_Str $32.; length Study_day_IP_Str $32.; set tmp_want1; if Study_day_IP eq '' then Study_day_IP_Str = ''; else Study_day_IP_Str = substr(Study_day_IP, 1, length(Study_day_IP) - length(Study_day_IP_Num)); run; /* COl X */ proc sort data = tmp_want2; by Subject_ID Date; run; data tmp_want3; set tmp_want2; retain Study_day_IP_Num1; retain Study_day_IP_Str1; if Study_day_IP_num ne '' then do; Study_day_IP_Num1 = input(Study_day_IP_Num,best.); Study_day_IP_Str1 = Study_day_IP_Str; end; else do; Study_day_IP_Num1 = Study_day_IP_Num1 + 1; Study_day_IP_Str1 = Study_day_IP_Str1; end; by Subject_ID Date; Col_X = trim(Study_day_IP_Str1) || compress(Study_day_IP_Num1) ; run; /* COl Y - COmputing Lag and Lead */ proc sort data = tmp_want3; by Subject_ID Date; run; data tmp_want4; set tmp_want3; by Subject_ID Date; Lag_Study_day_IP_Str1 = lag(Study_day_IP_Num1); run; proc sort data = tmp_want4; by Subject_ID descending Date; run; data tmp_want4; set tmp_want4; by Subject_ID descending Date; Lead_Study_day_IP_Str1 = lag(Study_day_IP_Num1); run; proc sort data = tmp_want4; by Subject_ID Date; run; /* Final tabl with COl Z */ data want; set tmp_want4; if Study_day_IP_Num1 = Lead_Study_day_IP_Str1 then Col_Z = trim(Study_day_IP_Str1) || compress(Lag_Study_day_IP_Str1) ; else Col_Z = Col_X; drop Study_day_IP_Num1 Lead_Study_day_IP_Str1 Lag_Study_day_IP_Str1 Study_day_IP_Str1 Study_day_IP_Num Study_day_IP_Str; run;
Thanks.
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.