Hi All,
please can this programme below be converted to Proc SQL
Data A;
Set B;
retain EXPC1FL EXPC2FL EXPC3FL TRTPFSDT TRTPLSDT EXPDOSE EXPCYCN;
format EXPC1FL EXPC2FL EXPC3FL $1. TRTPFSDT TRTPLSDT DATE9. EXPDUR EXPCYCN EXPDOSE 8. EXPCOMP 4.1;
by usubjid visitnum;
if first.usubjid then do;
EXPC1FL=' '; EXPC2FL=' '; EXPC3FL=' ';
TRTPFSDT=.; TRTPLSDT=.; EXPDOSE=0; EXPCYCN=0;
end;
EXPDOSE=EXPDOSE+ECDOSE;
EXPCYCN=EXPCYCN+1;
if visitnum=10 then EXPC1FL='Y';
if visitnum=20 then EXPC2FL='Y';
if visitnum=30 then EXPC3FL='Y';
if visitnum=10 then TRTPFSDT=TRTPSDT;
if last.usubjid then TRTPLSDT=TRTPEDT;
if last.usubjid;
EXPCOMP=(EXPDOSE/250)*40;
EXPDUR=(TRTPLSDT - TRTPFSDT)+1;
regards
Queen
Hi,
it's possible, but it wouldn't be so readable and easy to validate as the program you already have. So I can't see why it would make sense right now.
Could you please explain why you would like to do this?
It would also be great if you could provide some data to play with.
- Cheers -
Please use the "Insert SAS Code" button to insert formatted sas code.
The concept of first/last observation in a group does not exist in sql.
SQL has the "GROUP BY" statement that allows to use aggregation functions on subgroups of the dataset.
Here is a simplified exemple based on your data step (not tested as you did not provide sample data in
the form of a datastep) :
proc sql noprint;
CREATE TABLE want AS
AS SELECT DISTINCT count(*) AS EXPCYCN, sum(ECDOSE) AS EXPDOSE,
EXPC1FL=ifc(visitnum=10,'Y',' ') format=$1.
FROM have
GROUP BY usubjid
;
quit;
EDIT : Added "DISTINCT" to keep only one row by aggregate (assuming visitnum is constant for any given usubjid).
This line seems strange to me.
EXPC1FL=ifc(visitnum=10,'Y',' ') format=$1.
Why are trying to test if the existing variable EXPC1FL is equal to either the letter Y or a blank string? Why didn't you give the newly calculated variable a name? Why did you try to attach a character format to a numeric (boolean) value?
Also if you did mean to make a character variable why do you feel a need to attach the $1. format to limit the number of characters displayed when the variable is printed? Why not just set the length of the variable to 1?
With that code neither is needed.
But attaching $ formats to characters is a dangerous practice.
Example: What output do you expect from this program?
data one ;
input study :$10. @@;
cards;
ABCD1234 ABCD1234B ABCD1234C
;
data two ;
format study $8. ;
input study @@;
cards;
WXYZ0001 WXYZ0002
;
data both;
set one two;
run;
proc freq ;
tables study;
run;
Results:
The FREQ Procedure Cumulative Cumulative study Frequency Percent Frequency Percent ------------------------------------------------------------- ABCD1234 3 60.00 3 60.00 WXYZ0001 1 20.00 4 80.00 WXYZ0002 1 20.00 5 100.00
Note this is simplified version of a real problem that I have seen.
SQL and "sequence" of data do not play well.
So anything that uses Retain statements or Lag and Dif functions is very difficult to translate to SQL. It also typically becomes much more convoluted code that may be a tad fragile.
proc sql;
create table A as
select
usubjid,
sum(ecdose) as expdose,
count(*) as expcycn,
calculated expdose / 250 * 40 as expcomp,
case when visitnum=10 then "Y" else " " end as expc1fl,
case when visitnum=20 then "Y" else " " end as expc2fl,
case when visitnum=30 then "Y" else " " end as expc3fl,
case when visitnum=10 then trtpsdt else . end as trtpfsdt,
trtpedt as trtplsdt,
calculated trtplsdt - calculated trtpfsdt + 1 as expdur
from B
group by usubjid
having visitnum=max(visitnum);
quit;
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.