Hello there,
I have been trying to work on this problem with data manipulation using SQL. Here is my program and my data set:
ID | Cycle | Delay | Reduction | DrugName | Regimen | Dose |
1 | 1 | None | None | Carb | x,y | 750 |
1 | 1 | None | yes | Pem | x,y | 1015 |
1 | 2 | None | None | Pem | x,y | 1015 |
1 | 2 | None | None | Carb | x,y | 705 |
1 | 1 | yes | None | Durv | x | 940 |
1 | 2 | None | None | Durv | x | 940 |
1 | 3 | None | None | Durv | x | 930 |
1 | 4 | yes | None | Durv | x | 910 |
1 | 5 | None | None | Durv | x | 910 |
1 | 6 | None | None | Durv | x | 910 |
1 | 7 | None | None | Durv | x | 940 |
1 | 8 | None | None | Durv | x | 945 |
1 | 9 | None | yes | Durv | x | 920 |
1 | 10 | None | None | Durv | x | 915 |
1 | 11 | None | None | Durv | x | 915 |
1 | 12 | Unknown | None | Durv | x | 920 |
1 | 13 | None | None | Durv | x | 920 |
1 | 14 | None | None | Durv | x | 920 |
1 | 15 | None | yes | Durv | x | 925 |
1 | 16 | None | None | Durv | x | 910 |
1 | 17 | None | None | Durv | x | 890 |
1 | 18 | None | None | Durv | x | 875 |
1 | 19 | None | None | Durv | x | 870 |
1 | 20 | None | None | Durv | x | 850 |
1 | 21 | None | None | Durv | x | 850 |
1 | 22 | None | None | Durv | x | 845 |
1 | 23 | None | None | Durv | x | 860 |
1 | 24 | None | None | Durv | x | 855 |
2 | 1 | None | None | Pacl | x,z | 95 |
2 | 1 | None | None | Carb | x,z | 245 |
2 | 2 | None | None | Pacl | x,z | 95 |
2 | 2 | None | None | Carb | x,z | 225 |
2 | 3 | None | None | Pacl | x,z | 95 |
2 | 3 | None | Other | Carb | x,z | 225 |
Here is my program:
Proc sql;
Create table Drug_Clean AS
select distinct (DrugName),
ID as ID, max(cycle) as maxcycle, min(cycle) as mincycle, Count(Dose) as numberdoses, First(dose) as Firstdose, mean(Dose) as MeanDose, last(Dose) as LastDose
from L_line1
Group by SubjectID;
quit;
Here is the final dataset that i would like to have the following:
It should have all the drugnames listed by id: For exampl ID #1 has carb, pem and durv. I want to display as follows:
ID | Drug | MaxCycle | MinCycle | countdose | Avgdose | FirstDose | LastDose | Reduction | Delay |
1 | Durv | 24 | 1 | 24 | 902.7 | 940 | 855 | yes | yes |
Any help is greatly appreciated!!!
Like this?
proc sql;
create table WANT as
select unique ID
, DRUGNAME
, MAXCYCLE
, MINCYCLE
, NUMBERDOSES
, MEANDOSE
, max(FD) as FIRSTDOSE
, max(LD) as LASTDOSE
from (
select unique ID
, DRUGNAME
, max (CYCLE) as MAXCYCLE
, min (CYCLE) as MINCYCLE
, count(DOSE) as NUMBERDOSES
, mean(DOSE) as MEANDOSE
, DOSE*(CYCLE=calculated MINCYCLE) as FD
, DOSE*(CYCLE=calculated MAXCYCLE) as LD
from HAVE
group by ID, DRUGNAME
)
group by ID, DRUGNAME
order by ID, DRUGNAME
;
quit;
Why is there no Carb in the output?
Sorry, Yes I want it to display all the carb and pem as well. Basically all the drugnames by ID.
Thanks
Like this?
proc sql;
create table WANT as
select unique ID
, DRUGNAME
, MAXCYCLE
, MINCYCLE
, NUMBERDOSES
, MEANDOSE
, max(FD) as FIRSTDOSE
, max(LD) as LASTDOSE
from (
select unique ID
, DRUGNAME
, max (CYCLE) as MAXCYCLE
, min (CYCLE) as MINCYCLE
, count(DOSE) as NUMBERDOSES
, mean(DOSE) as MEANDOSE
, DOSE*(CYCLE=calculated MINCYCLE) as FD
, DOSE*(CYCLE=calculated MAXCYCLE) as LD
from HAVE
group by ID, DRUGNAME
)
group by ID, DRUGNAME
order by ID, DRUGNAME
;
quit;
Hi @rajd1
Here is a way to achieve this.
Could you please clarify the rule to compute delay and reduction?
In the following program, I have set reduction and delay to "yes" if there was at least one record equal to "yes", and "none" in other cases.
However, I can see that there is a value "Other" for patient 2 at cycle 3. How would you like to handle this case?
Best,
proc sql;
select distinct ID, DrugName, MaxCycle, MinCycle, countdose, Avgdose, max(FirstDose) as FirstDose, max(LastDose) as LastDose, Delay, Reduction
from (select ID, DrugName,
max(cycle) as MaxCycle,
min(cycle) as MinCycle,
count(dose) as countdose,
mean(dose) as Avgdose format=8.1,
dose*(cycle=calculated MinCycle) as FirstDose,
dose*(cycle=calculated MaxCycle) as LastDose,
case when sum(lowcase(delay)="yes") > 1 then "yes" else "none" end as Delay,
case when sum(lowcase(reduction)="yes") > 1 then "yes" else "none" end as Reduction /*what if "Other" ?*/
from L_line1
group by ID, DrugName)
group by ID, DrugName
order by ID, DrugName;
quit;
This worked GREAT! The delay variable has 3 categories: none, yes, and unknown. I would like to make these as categories for the delay variable as well.
basically i want to be able to add the 3rd category to the delay variable.
Further i have some dates and was trying to find the difference between dates.
i was working with and here is the code:
proc sql;
create table clean_Drug as
select unique SubjectID
, DRUGNAME
, REGIMEN
, STARTCYCLE
, ENDCYCLE
, CYCLEDAYS
, STARTLOT
, ENDLOT
, MAXCYCLE
, MINCYCLE
, AVGCYCLE
, MEDIANCYCLE
, NUMBERDOSES
, MEANDOSE
, max(FIRSTDOSE) as FIRSTDOSE
, max(LASTDOSE) as LASTDOSE
from (
select unique SubjectID
, DRUGNAME
, REGIMEN
, min (LotStart) as STARTLOT format MMDDYY10.
, max (LotEnd) as ENDLOT format MMDDYY10.
, min (AdminDate) as STARTCYCLE format MMDDYY10.
, max (AdminDate) as ENDCYCLE format MMDDYY10.
, max (CYCLE) as MAXCYCLE
, sum (ENDCYCLE,-STARTCYCLE) as CYCLEDAYS
, min (CYCLE) as MINCYCLE
, mean (CYCLE) as AVGCYCLE
, median(CYCLE) as MEDIANCYCLE
, count(DOSE) as NUMBERDOSES
, mean(DOSE) as MEANDOSE
, DOSE*(CYCLE=calculated MINCYCLE) as FIRSTDOSE
, DOSE*(CYCLE=calculated MAXCYCLE) as LASTDOSE
from L_Line1
group by SubjectID, DRUGNAME)
group by SubjectID, DRUGNAME
order by SubjectID, DRUGNAME
;
quit;
I got the following error:
ERROR: Function SUM requires a numeric expression as argument 1.
ERROR: The following columns were not found in the contributing tables: ENDCYCLE, STARTCYCLE.
Thanks so much!
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.