Solved: Re: Transpose

PetePatel · Posted 10-13-2020 03:44 AM

Hi,

I have a dataset that needs to be modified to show trends over time.

I think I need to transpose but I'm unsure on the best way to code it.

Have:

ID	Date	Date1	FB	B
1	30/09/2018	30/06/2019	2.34	2.47
1	30/09/2018	31/03/2019	2.34	2.48
1	30/09/2018	31/12/2018	2.34	2.49
1	30/09/2018	30/09/2018	2.34	2.50
2	31/03/2019	30/06/2019	2.82	2.74
2	31/03/2019	31/03/2019	2.82	2.75
3	30/06/2018	31/03/2019	3.14	2.69
3	30/06/2018	30/06/2018	3.14	2.70
3	30/06/2018	30/06/2019	3.14	2.71
3	30/06/2018	31/12/2018	3.14	2.72
3	30/06/2018	30/09/2018	3.14	2.73

Need:

ID	q0	q1	q2	q3	q4	q5
1	2.34	2.48	2.49	2.5
2	2.82	2.75
3	3.14	2.7	2.71	2.72	2.73

q0 always takes the first row of the FB field

q1-qx always takes the second row of the B field onwards (transposed)

Can you please suggest ways to code this?

Thanks in advance

Kurt_Bremser · Posted 10-13-2020 04:43 AM

data pretrans;
set have;
by id;
if first.id
then do;
  b = fb;
  count = 1;
end;
else count + 1;
_name_ = cats("q",count);
keep id _name_ b;
run;

proc transpose
  data=pretrans
  out=want (drop=_name_)
;
by id;
var b;
id _name_;
run;

For tested code, please supply data in usable form (data step with datalines);

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

View solution in original post

vellad · Posted 10-13-2020 04:08 AM

You can use:

proc transpose data=have out=want;
    var fb b;
    by id;
run;

to create below dataset.

Next you can split this into two datasets FB (keep=ID COL1) and B (keep=ID COL2 COL3 COL4 COL5), based on _NAME_, and merge them and rename variables to Q0, Q1 etc.

Kurt_Bremser · Posted 10-13-2020 04:43 AM

data pretrans;
set have;
by id;
if first.id
then do;
  b = fb;
  count = 1;
end;
else count + 1;
_name_ = cats("q",count);
keep id _name_ b;
run;

proc transpose
  data=pretrans
  out=want (drop=_name_)
;
by id;
var b;
id _name_;
run;

For tested code, please supply data in usable form (data step with datalines);

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

vellad · Posted 10-13-2020 04:50 AM

This would be neat. Minor points: need to retain count and initialize count=0 and not 1 inside the if block.

Kurt_Bremser · Posted 10-13-2020 05:03 AM

There is no need to RETAIN count, as it is implicitly retained by using the Sum Statement .

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

vellad · Posted 10-13-2020 09:52 AM

ah, yes.

Kurt_Bremser · Posted 10-13-2020 09:57 AM

PS good catch on that zero as a starting value.

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

jimbarbour · Posted 10-13-2020 04:52 AM

That's a bit quirky to have Q1 come from a different variable than Q2 - Q5, but here you go, results, with code below the results.

Jim

Code:

DATA	Have;
	INFILE	DATALINES	DSD	DLM='09'x;
	INPUT	ID	
			_Date	:	ANYDTDTE10.
			_Date1	:	ANYDTDTE10.
			_FB	
			_B
			;
	FORMAT	_Date	DDMMYY10.;
	FORMAT	_Date1	DDMMYY10.;
DATALINES;
1	30/09/2018	30/06/2019	2.34	2.47
1	30/09/2018	31/03/2019	2.34	2.48
1	30/09/2018	31/12/2018	2.34	2.49
1	30/09/2018	30/09/2018	2.34	2.50
2	31/03/2019	30/06/2019	2.82	2.74
2	31/03/2019	31/03/2019	2.82	2.75
3	30/06/2018	31/03/2019	3.14	2.69
3	30/06/2018	30/06/2018	3.14	2.70
3	30/06/2018	30/06/2019	3.14	2.71
3	30/06/2018	31/12/2018	3.14	2.72
3	30/06/2018	30/09/2018	3.14	2.73
;
Run;

PROC	SQL	NOPRINT;
	SELECT	LEFT(PUT(MAX(ID_Count), 8.))	
		INTO	:	Max
		FROM	(
			SELECT	COUNT(1)	AS	ID_Count
				FROM	Have
					GROUP	BY	ID
					);
RUN;

%PUT	NOTE:  &=Max;

DATA	Need;
	DROP	_:;

	SET	Have;
		BY	ID;
	RETAIN	_i;
	RETAIN	Q1 - Q&Max;
	ARRAY	QQ	[*]	Q1 - Q&Max;

	IF	FIRST.ID		THEN
		DO;
			Q1			=	_FB;
			_i			=	1;
		END;
	ELSE
		DO;
			_i			+	1;
			QQ[_i]		=	_B;
		END;

	IF	LAST.ID			THEN
		DO;
			OUTPUT;
			CALL	MISSING(of	QQ[*]);
		END;
RUN;

Register Today!

SAS Training: Just a Click Away