SAS Programming

kt_080517 · Posted 11-02-2018 01:40 PM

This is my dataset:

Client	Gender	Test_ID	Test_Date
A	F	123	5/1/2015
A	F	567	11/1/2015
B	M	789	1/1/2017
B	M	548	6/1/2017
C	M	121	8/1/2016
C	M	540	2/1/2017

I want this dataset:

Client	Gender	Test_ID_one	Test_ID_two	Test_Date_one	Test_date_two
A	F	123	567	5/1/2015	11/1/2015
B	M	789	548	1/1/2017	6/1/2017
C	M	121	540	8/1/2016	2/1/2017

I think this is a transpose? Any help would be great!

PeterClemmensen · Posted 11-02-2018 01:46 PM

Does each client level always only contain two observations?

The DATA to DATA Step Macro
Blog: SASnrd

kt_080517 · Posted 11-02-2018 01:48 PM

in my current dataset, yes - there are only 2 observations per client. But, I may expand that to multiple obs per client (not exceeding 25 obs)

Doc_Duke · Posted 11-02-2018 01:46 PM

Yes, PROC TRANSPOSE is the tool. I did a simple google search

transpose site:sas.com

and the first hit had an example of what you are looking for

https://support.sas.com/resources/papers/proceedings09/060-2009.pdf

novinosrin · Posted 11-02-2018 02:20 PM

May require a double transpose and the transpose will likely convert the dates to character

If you are OK with this, I can try

novinosrin · Posted 11-02-2018 03:04 PM

data have;
input Client $	Gender $ test_id $	Test_Date :mmddyy10.;
format test_date mmddyy10.;
cards;
A	F	123	5/1/2015
A	F	567	11/1/2015
B	M	789	1/1/2017
B	M	548	6/1/2017
C	M	121	8/1/2016
C	M	540	2/1/2017
;
data want;
do _n_=1 by 1 until(last.client);
set have;
by Client ;
array j(*)$ test_id1-test_id2 ;
array k(*) Test_Date1-Test_Date2;
if _n_=1 then call missing(of j(*),of  k(*));
j(_n_)=test_id;
k(_n_)=Test_Date;
end;
format test_Date: mmddyy10.;
drop test_id test_date;
run;

VDD · Posted 11-02-2018 03:40 PM

So lets look at what @kt_080517 has where they may have multipole test for a client how would they make test_id2 and test_date2 dynamic so they would not have to edit the program and replace the test_id2 with test_id3?

data have;
input Client $	Gender $ test_id $	Test_Date :mmddyy10.;
format test_date mmddyy10.;
cards;
A	F	123	5/1/2015
A	F	124	5/4/2015
A	F	567	11/1/2015
B	M	789	1/1/2017
B	M	790	2/21/2017
B	M	548	6/1/2017
C	M	121	8/1/2016
C	M	540	2/1/2017
;
data want;
do _n_=1 by 1 until(last.client);
set have;
by Client ;
array j(*)$ test_id1-test_id3 ;
array k(*) Test_Date1-Test_Date3;
if _n_=1 then call missing(of j(*),of  k(*));
j(_n_)=test_id;
k(_n_)=Test_Date;
end;
format test_Date: mmddyy10.;
drop test_id test_date;
run;

novinosrin · Posted 11-02-2018 03:49 PM

Sorry, what do you mean by this So lets look at what kt_080517 has where??

I am not catching your suggestion

kt_080517 · Posted 11-02-2018 04:08 PM

Can Proc SQL be used here? to bring a value from obs other than the first into a column of its own so all data is in one row?

novinosrin · Posted 11-02-2018 04:23 PM

why sql?

Sql can help in getting the array subscript(number of elements) as a macro to make the code i shared dynamic. however, for transpose either data-step or proc transpose is best

So this should handle multiple id requirement:

data have;
input Client $	Gender $ test_id $	Test_Date :mmddyy10.;
format test_date mmddyy10.;
cards;
A	F	123	5/1/2015
A	F	567	11/1/2015
B	M	789	1/1/2017
B	M	548	6/1/2017
C	M	121	8/1/2016
C	M	540	2/1/2017
;

proc sql;
select max(cnt) into : m trimmed 
from (select count(client) as cnt from have group by client);
quit;


data want;
do _n_=1 by 1 until(last.client);
set have(rename=(test_id=_test_id Test_Date=_Test_Date));
by Client ;
array test_id(&m)$ ;
array Test_Date(&m);
if _n_=1 then call missing(of test_id(*),of  Test_Date(*));
test_id(_n_)=_test_id;
Test_Date(_n_)=_Test_Date;
end;
format test_Date: mmddyy10.;
drop _:;
run;

Peter_C · Posted 11-02-2018 05:00 PM

Looks like you want one row per client
That works well with a data step using BY statement (by client)
Collect test dates and test_ID in the easiest way, as an array.
Suppose there are no more than 10 tests for each client.
Data client_tests ;
Do test_n = 1 by 1 until( last.client) ;
Set your_dataset end= finished ;
By client;
Array testDate(10) ;
Array testID(10) $10 ;
IF TEST_N > dim( testdate) then do ;
Err_ct +1 ;
If err_ct =1 then
Put "Error: need to widen the array" / _all_ ;
Continue ;
testDate( test_n ) = test_date ;
TestID( test_n ) = test_id ;
End ;
If finished and err_ct >1 then
Put 'WARNING: found ' err_ct ' inputs outside array.' ;
Run;

You might not need as many as 10.
You could collect the max number of tests for a client within the data step.
As the test dates and IDs are stored in array variables testDate1 to testDate10 and testID1 to testID10 you could drop the vars test_date and test_id, along with ER_CT.
On each output row the variable test_n indicates 1 more than the number of tests for each client.
I have assumed that taking the sex value from the last client row provides correct results.

ballardw · Posted 11-02-2018 05:23 PM

@kt_080517 wrote:

This is my dataset:

Client Gender Test_ID Test_Date

A F 123 5/1/2015

A F 567 11/1/2015

B M 789 1/1/2017

B M 548 6/1/2017

C M 121 8/1/2016

C M 540
2/1/2017

I want this dataset:

Client Gender Test_ID_one Test_ID_two Test_Date_one Test_date_two

A F 123 567 5/1/2015 11/1/2015

B M 789 548 1/1/2017 6/1/2017

C M 121 540 8/1/2016 2/1/2017

I think this is a transpose? Any help would be great!

Another of these combine rows question and I just spent yet another couple of hours or so undoing data provided in that format so it can actually be used...

Repeat after me: I hate spreadsheets.

Ksharp · Posted 11-03-2018 07:31 AM

data have;
input Client $	Gender $ test_id $	Test_Date :mmddyy10.;
format test_date mmddyy10.;
cards;
A	F	123	5/1/2015
A	F	567	11/1/2015
B	M	789	1/1/2017
B	M	548	6/1/2017
C	M	121	8/1/2016
C	M	540	2/1/2017
;

proc summary data=have nway;
class  Client ;
output out=want idgroup(out[2] ( Gender  test_id Test_Date)=);
run;

SAS Programming

Bring data from second obs of client into first obs so all data is in one row

Re: Bring data from second obs of client into first obs so all data is in one row

Re: Bring data from second obs of client into first obs so all data is in one row

Re: Bring data from second obs of client into first obs so all data is in one row

Re: Bring data from second obs of client into first obs so all data is in one row

Re: Bring data from second obs of client into first obs so all data is in one row

Re: Bring data from second obs of client into first obs so all data is in one row

Re: Bring data from second obs of client into first obs so all data is in one row

Re: Bring data from second obs of client into first obs so all data is in one row

Re: Bring data from second obs of client into first obs so all data is in one row

Re: Bring data from second obs of client into first obs so all data is in one row

Re: Bring data from second obs of client into first obs so all data is in one row

Re: Bring data from second obs of client into first obs so all data is in one row

proc import xlsx with variable name at obs=5 dand data at obs=6

Data Science and Data Preparation Article Overview by Gerhard

Bringing It All Together! Using SAS Data Preparation to Integrate All ...

Data Team - Student Track

Configuring SAS Federated Data Source at Remote Data Agent (Cloud Data...

Follow Us

What is...

SAS Programming

Our biggest data and AI event of the year.

SAS Training: Just a Click Away

Follow Us

What is...