deleted_user
Not applicable
Hi, I transposed every variable in my dataset, which mostly worked: I now have the correct number of observations. However, I'm having a problem sorting out one set of variables.
Let's call them TEST. I have TEST1-TEST11 for every subject.

Here's an example:

CASE#1 TEST1 4545 TEST2 4545 TEST3 4400 TEST4 4433 TEST 5.....
CASE#2 TEST1 4322 TEST2 4322 TEST3 2432 TEST4 5555
CASE#3 TEST1 3333 TEST2 3333 TEST3 3333 TEST4 6543

As you can see, I have a problem, because for some reason I have all those repetitions.
I want to figure out a way in which I can count, for each individual subject in the study, which tests were done. I don't care about the random ordering of test1-test11; I just care about the test result NUMBERS.

I'd like to be able to say for CASE#1 that I have 4545, 4400 and 4433.
And for CASE#2 that the results were 4322, 2432 and 5555
And for CASE#3 that the results were 3333 and 6543.

In other words, I only want to know if the test result shows up at all, and what it is.
Is there some sort of loop that I can do? Will I have to reverse-transpose (but then I am pretty much back where I started...)?

ANY HELP WOULD BE SO MUCH APPRECIATED 🙂
I am stuck.
6 REPLIES
andreas_lds
Jade | Level 19
Can you share your PROC TRANSPOSE code and some rows of the original dataset? Under normal circumstances PROC TRANSPOSE does not create duplicates.
sbb
Lapis Lazuli | Level 10
If you don't care to transpose duplicate values, then sort your input file using NODUPKEY to remove duplicate NUMBERS instances for a given BY variable list. Then transpose your de-dup'd output file so that you only get one unique value.
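
As a rough sketch of that idea, assuming the pre-transpose data has one row per recorded test: the data set and variable names here (TESTS_LONG, CASENUM, TESTTYPE) are made up for illustration and are not from the original poster's data.
[pre]
/* Hypothetical long-format input: one row per recorded test */
data tests_long;
  input casenum testtype;
  datalines;
1 4545
1 4545
1 4400
1 4433
2 4322
2 4322
2 2432
2 5555
3 3333
3 3333
3 3333
3 6543
;
run;

/* Keep only one row for each case/test-value combination */
proc sort data=tests_long out=tests_unique nodupkey;
  by casenum testtype;
run;

/* One row per case, unique values spread across TEST1, TEST2, ... */
proc transpose data=tests_unique out=tests_wide(drop=_name_) prefix=TEST;
  by casenum;
  var testtype;
run;
[/pre]
Because of the sort, the values within each case come out in ascending order rather than in the order they were recorded.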

Scott Barry
SBBWorks, Inc.
Cynthia_sas
SAS Super FREQ
If the original goal is to find out BY case number what the unique test values are, then a combo of the NLEVELS option and a BY statement in PROC FREQ may get what she wants. De-duping the original data set might still be a good idea, but, for example:
[pre]
ods listing;
proc freq data=sashelp.prdsale nlevels;
by country;
table division;
run;

[/pre]

If the -original- input data showed one case/test per obs, then the above technique would show the unique test values for each case.
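
If only the transposed (wide) data is available, one way to get back to one test per observation is an ARRAY loop in a DATA step. This is only a sketch: the data set WIDE and the names CASENUM and TEST1-TEST4 are assumptions modelled on the example in the question, and the real data would use TEST1-TEST11.
[pre]
/* Hypothetical wide data shaped like the example in the question */
data wide;
  input casenum test1-test4;
  datalines;
1 4545 4545 4400 4433
2 4322 4322 2432 5555
3 3333 3333 3333 6543
;
run;

/* Loop over the TEST variables and write one obs per non-missing value */
data long(keep=casenum testtype);
  set wide;
  array t{*} test1-test4;
  do i = 1 to dim(t);
    if not missing(t{i}) then do;
      testtype = t{i};
      output;
    end;
  end;
run;

/* Report the unique test values for each case */
proc freq data=long nlevels;
  by casenum;
  tables testtype;
run;
[/pre]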

cynthia
[pre]
options nodate nonumber nocenter;

/* Read sample data: one observation per recorded test */
data testcase;
infile datalines;
input date : mmddyy10. casenum testtype result $;
datalines;
11/15/2008 1 4545 redo
11/16/2008 1 4545 OK
11/16/2008 1 4400 pass
11/17/2008 1 4433 redo
11/18/2008 1 5454 OK
12/15/2008 2 4322 redo
12/16/2008 2 4322 OK
12/16/2008 2 2432 pass
12/17/2008 2 5555 OK
10/15/2008 3 3333 redo
10/16/2008 3 6543 OK
10/16/2008 3 4545 pass
;
run;

ods listing;
proc freq data=testcase nlevels;
by casenum;
table testtype;
run;

[/pre]
deleted_user
Not applicable
I'm sorry Cynthia, I didn't understand what you meant in your last message.
I am not doing anything involving datalines, just using data and set statements;

I now have a unique BY variable for each case, and I want to pull out tests per case (only if they are different)....

The duplicates are there because there are multiple times that the test has been recorded; however, I just want to pull out the unique tests done per case.

I am such a novice I don't even understand that last code. Thanks 🙂
Cynthia_sas
SAS Super FREQ
Hi:
No need to apologize...I was sending you code that you could run to see the results. The "datalines" statement is a way to read data into SAS -- so in my case, it was a way to take some lines of data and read them into SAS. This has the benefit that you can -see- what the input data looks like as it is turned into a SAS dataset.

The relevant code in that example was the PROC FREQ:
[pre]
ods listing;
proc freq data=testcase nlevels;
by casenum;
tables testtype;
run;

[/pre]

PROC FREQ does counts and percents of counts in a very handy way and creates a report in the LISTING window (if you do not use the Output Delivery System).

If you need a TABLE in order to do further analysis, then you might need to use some other procedure or processing. It really depends on the end result that you want/need -- do you just need a report? What do you mean by "pull out tests per case"? Do you need these observations "only if they are different" to go into another analysis or do you just need a report on the tests that are different for each case.
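
If a data set is what you need rather than a report, one possibility is the OUT= option on the TABLES statement. This sketch assumes the TESTCASE data set from my earlier post is still in your WORK library; the output name TEST_COUNTS is just a placeholder.
[pre]
/* Assumes WORK.TESTCASE from the earlier example already exists */
/* NOPRINT suppresses the report; OUT= writes the counts to a data set */
proc freq data=testcase noprint;
  by casenum;
  tables testtype / out=test_counts;
run;

/* TEST_COUNTS has one row per case and unique test value */
proc print data=test_counts;
run;
[/pre]
The output data set holds CASENUM, TESTTYPE, and the COUNT and PERCENT columns that PROC FREQ adds, so each row is one unique test for one case.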

For example, if you want to rid your dataset of duplicates, then you can look at the PROC SORT procedure, which has several different options for deleting duplicate observations. If you need to remove duplicates based on some logical condition -- theoretically, you need to remove every duplicate test, but not if it occurred on Friday -- that would involve a DATA step because PROC SORT could not handle the logical test.
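
To make the Friday example concrete, here is a minimal sketch using BY-group processing. It assumes the TESTCASE data set from my earlier post (with its DATE variable) is in your WORK library; SORTED and DEDUP are placeholder names.
[pre]
/* Assumes WORK.TESTCASE from the earlier example already exists */
proc sort data=testcase out=sorted;
  by casenum testtype date;
run;

data dedup;
  set sorted;
  by casenum testtype;
  /* Keep the first record of each test within a case,    */
  /* plus any repeat that was recorded on a Friday.       */
  /* WEEKDAY() returns 1 for Sunday through 7 for Saturday */
  if first.testtype or weekday(date) = 6;
run;
[/pre]
Without the Friday exception, PROC SORT with NODUPKEY and BY CASENUM TESTTYPE would do the same job in one step.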

Frequently, when you use SAS, in order to get from Point A to Point B, you might have to run several different procedures or DATA steps to get your data cleaned up and in the right shape for the analytical procedure you plan to use.

Sorry I was unclear in my post. I hope this helps.

cynthia
