proc transpose nightmare

deleted_user · Posted 05-12-2009 06:22 PM

Hi, I transposed every variable in my dataset, which mostly worked; I now have the correct number of observations. However, I'm having a problem sorting out one set of variables.
Let's call them TEST. I have TEST1-TEST11 for every subject.

Here's an example:

CASE#1 TEST1 4545 TEST2 4545 TEST3 4400 TEST4 4433 TEST 5.....
CASE#2 TEST1 4322 TEST2 4322 TEST3 2432 TEST4 5555
CASE#3 TEST1 3333 TEST2 3333 TEST3 3333 TEST4 6543

As you can see, I have a problem, because for some reason I have all those repetitions.
I want to figure out a way in which I can count, for each individual subject in the study, which tests were done. I don't care about the random ordering of test1-test11; I just care about the test result NUMBERS.

I'd like to be able to say for CASE#1 that I have 4545, 4400 and 4433.
And for CASE#2 that the results were 4322, 2432 and 5555
And for CASE#3 that the results were 3333 and 6543.

In other words, I only want to know if the test result shows up at all, and what it is.
Is there some sort of loop that I can do? Will I have to reverse-transpose (but then I am pretty much back where I started...)?

ANY HELP WOULD BE SO MUCH APPRECIATED 🙂
I am stuck.

andreas_lds · Posted 05-13-2009 02:33 AM

Can you share your proc transpose code and some rows of the original dataset? Under normal circumstances proc transpose does not create duplicates.

sbb · Posted 05-13-2009 09:26 AM

If you don't care to transpose duplicate values, then sort your input file using NODUPKEY to remove duplicate NUMBERS instances for a given BY variable list. Then transpose your de-dup'd output file so that you only get one unique value.
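
A minimal sketch of that approach, assuming the data is available in long form (one row per recorded test) before transposing. The data set and variable names (tests_long, casenum, testvalue) are made up for illustration and are not from the original poster's data:

```sas
/* Hypothetical long-format input: one row per recorded test. */
data tests_long;
  input casenum testvalue;
  datalines;
1 4545
1 4545
1 4400
1 4433
2 4322
2 4322
2 2432
2 5555
3 3333
3 3333
3 3333
3 6543
;
run;

/* Keep one row per casenum/testvalue combination. */
proc sort data=tests_long out=tests_unique nodupkey;
  by casenum testvalue;
run;

/* One row per case, with the unique values in TEST1, TEST2, ... */
proc transpose data=tests_unique out=tests_wide(drop=_name_) prefix=TEST;
  by casenum;
  var testvalue;
run;

proc print data=tests_wide noobs;
run;
```

Because the sort orders by testvalue, the TEST columns come out in ascending order of value, not in the order the tests were recorded.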

Scott Barry
SBBWorks, Inc.

Cynthia_sas · Posted 05-13-2009 10:33 AM

If the original goal is to find out BY case number what the unique test values are, then a combo of the NLEVELS option and a BY statement in PROC FREQ may get what she wants. De-duping the original data set might still be a good idea, but, for example:
[pre]
ods listing;
proc freq data=sashelp.prdsale nlevels;
by country;
table division;
run;

[/pre]

If the -original- input data showed one case/test per obs, then the above technique would show the unique test values for each case.

cynthia
[pre]
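options nodate nonumber nocenter;

/* Sample data: one observation per recorded test. */
data testcase;
infile datalines;
/* DATE is read with the MMDDYY10. informat; RESULT is character. */
input date : mmddyy10. casenum testtype result $;
datalines;
11/15/2008 1 4545 redo
11/16/2008 1 4545 OK
11/16/2008 1 4400 pass
11/17/2008 1 4433 redo
11/18/2008 1 5454 OK
12/15/2008 2 4322 redo
12/16/2008 2 4322 OK
12/16/2008 2 2432 pass
12/17/2008 2 5555 OK
10/15/2008 3 3333 redo
10/16/2008 3 6543 OK
10/16/2008 3 4545 pass
;
run;

ods listing;
/* NLEVELS reports the number of distinct TESTTYPE values per CASENUM. */
/* The BY statement needs the data sorted by CASENUM, as it is here. */
proc freq data=testcase nlevels;
by casenum;
tables testtype;
run;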

[/pre]

deleted_user · Posted 05-19-2009 10:36 AM

I'm sorry Cynthia, I didn't understand what you meant in your last message.
I am not doing anything involving datalines, just using data and set statements.

I now have a unique BY variable for each case, and I want to pull out tests per case (only if they are different)....

The duplicates are there because there are multiple times that the test has been recorded; however, I just want to pull out the unique tests done per case.

I am such a novice I don't even understand that last code. Thanks 🙂

Cynthia_sas · Posted 05-19-2009 12:33 PM

Hi:
No need to apologize...I was sending you code that you could run to see the results. The DATALINES statement is a way to read data into SAS -- so in my case, it was a way to take some lines of data and read them into SAS. This has the benefit that you can -see- what the input data looks like as it is turned into a SAS dataset.

The relevant code in that example was the PROC FREQ:
[pre]
ods listing;
proc freq data=testcase nlevels;
by casenum;
tables testtype;
run;

[/pre]

PROC FREQ does counts and percents of counts in a very handy way and creates a report in the LISTING window (if you do not use the Output Delivery System).

If you need a TABLE in order to do further analysis, then you might need to use some other procedure or processing. It really depends on the end result that you want/need -- do you just need a report? What do you mean by "pull out tests per case"? Do you need these observations "only if they are different" to go into another analysis, or do you just need a report on the tests that are different for each case?
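
If a data set rather than a report is what is needed, PROC SQL is one possible route. This is only a sketch: it assumes the TESTCASE data set from the earlier post has already been created, and the output names (unique_tests, test_counts, n_tests) are made up for illustration:

```sas
/* Assumes the TESTCASE data set from the earlier post already exists. */
proc sql;
  /* One row per distinct case/test combination. */
  create table unique_tests as
  select distinct casenum, testtype
  from testcase
  order by casenum, testtype;

  /* Number of distinct tests per case. */
  create table test_counts as
  select casenum, count(distinct testtype) as n_tests
  from testcase
  group by casenum;
quit;
```

Either table can then feed another procedure or be printed with PROC PRINT.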

For example, if you want to rid your dataset of duplicates, then you can look at the PROC SORT procedure, which has several different options for deleting duplicate observations. If you need to remove duplicates based on some logical condition -- theoretically, you need to remove every duplicate test, but not if it occurred on Friday -- that would involve a DATA step because PROC SORT could not handle the logical test.
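
The "not if it occurred on Friday" case could be sketched roughly as follows. This again assumes the TESTCASE data set from the earlier post, treats "duplicate" as a repeated TESTTYPE within a CASENUM, and uses made-up output names (sorted, dedup_except_friday):

```sas
/* Sort so that repeats of the same test within a case are adjacent. */
proc sort data=testcase out=sorted;
  by casenum testtype date;
run;

data dedup_except_friday;
  set sorted;
  by casenum testtype;
  /* Keep the first record of each case/test combination,      */
  /* plus any later record dated on a Friday.                  */
  /* WEEKDAY() returns 1 for Sunday through 7 for Saturday.    */
  if first.testtype or weekday(date) = 6;
run;
```

The subsetting IF is where the logical condition lives; swapping in a different rule only means changing that one expression.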

Frequently, when you use SAS, in order to get from Point A to Point B, you might have to run several different procedures or DATA steps to get your data cleaned up and in the right shape for the analytical procedure you plan to use.

Sorry I was unclear in my post. I hope this helps.

cynthia
