Hi,
I have a dataset that needs to be modified to show trends over time.
I think I need to transpose but I'm unsure on the best way to code it.
Have:
ID | Date | Date1 | FB | B |
1 | 30/09/2018 | 30/06/2019 | 2.34 | 2.47 |
1 | 30/09/2018 | 31/03/2019 | 2.34 | 2.48 |
1 | 30/09/2018 | 31/12/2018 | 2.34 | 2.49 |
1 | 30/09/2018 | 30/09/2018 | 2.34 | 2.50 |
2 | 31/03/2019 | 30/06/2019 | 2.82 | 2.74 |
2 | 31/03/2019 | 31/03/2019 | 2.82 | 2.75 |
3 | 30/06/2018 | 31/03/2019 | 3.14 | 2.69 |
3 | 30/06/2018 | 30/06/2018 | 3.14 | 2.70 |
3 | 30/06/2018 | 30/06/2019 | 3.14 | 2.71 |
3 | 30/06/2018 | 31/12/2018 | 3.14 | 2.72 |
3 | 30/06/2018 | 30/09/2018 | 3.14 | 2.73 |
Need:
ID | q0 | q1 | q2 | q3 | q4 | q5 |
1 | 2.34 | 2.48 | 2.49 | 2.5 | ||
2 | 2.82 | 2.75 | ||||
3 | 3.14 | 2.7 | 2.71 | 2.72 | 2.73 |
q0 always takes the first row of the FB field
q1-qx always takes the second row of the B field onwards (transposed)
Can you please suggest ways to code this?
Thanks in advance
data pretrans;
set have;
by id;
if first.id
then do;
b = fb;
count = 1;
end;
else count + 1;
_name_ = cats("q",count);
keep id _name_ b;
run;
proc transpose
data=pretrans
out=want (drop=_name_)
;
by id;
var b;
id _name_;
run;
For tested code, please supply data in usable form (data step with datalines);
You can use:
proc transpose data=have out=want;
var fb b;
by id;
run;
to create below dataset.
Next you can split this into two datasets FB (keep=ID COL1) and B (keep=ID COL2 COL3 COL4 COL5), based on _NAME_, and merge them and rename variables to Q0, Q1 etc.
data pretrans;
set have;
by id;
if first.id
then do;
b = fb;
count = 1;
end;
else count + 1;
_name_ = cats("q",count);
keep id _name_ b;
run;
proc transpose
data=pretrans
out=want (drop=_name_)
;
by id;
var b;
id _name_;
run;
For tested code, please supply data in usable form (data step with datalines);
There is no need to RETAIN count, as it is implicitly retained by using the Sum Statement .
PS good catch on that zero as a starting value.
That's a bit quirky to have Q1 come from a different variable than Q2 - Q5, but here you go, results, with code below the results.
Jim
Code:
DATA Have;
INFILE DATALINES DSD DLM='09'x;
INPUT ID
_Date : ANYDTDTE10.
_Date1 : ANYDTDTE10.
_FB
_B
;
FORMAT _Date DDMMYY10.;
FORMAT _Date1 DDMMYY10.;
DATALINES;
1 30/09/2018 30/06/2019 2.34 2.47
1 30/09/2018 31/03/2019 2.34 2.48
1 30/09/2018 31/12/2018 2.34 2.49
1 30/09/2018 30/09/2018 2.34 2.50
2 31/03/2019 30/06/2019 2.82 2.74
2 31/03/2019 31/03/2019 2.82 2.75
3 30/06/2018 31/03/2019 3.14 2.69
3 30/06/2018 30/06/2018 3.14 2.70
3 30/06/2018 30/06/2019 3.14 2.71
3 30/06/2018 31/12/2018 3.14 2.72
3 30/06/2018 30/09/2018 3.14 2.73
;
Run;
PROC SQL NOPRINT;
SELECT LEFT(PUT(MAX(ID_Count), 8.))
INTO : Max
FROM (
SELECT COUNT(1) AS ID_Count
FROM Have
GROUP BY ID
);
RUN;
%PUT NOTE: &=Max;
DATA Need;
DROP _:;
SET Have;
BY ID;
RETAIN _i;
RETAIN Q1 - Q&Max;
ARRAY QQ [*] Q1 - Q&Max;
IF FIRST.ID THEN
DO;
Q1 = _FB;
_i = 1;
END;
ELSE
DO;
_i + 1;
QQ[_i] = _B;
END;
IF LAST.ID THEN
DO;
OUTPUT;
CALL MISSING(of QQ[*]);
END;
RUN;
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.