BookmarkSubscribeRSS Feed
mick_g
Calcite | Level 5

I have a simple (I think) proc sort statement.

proc sort data = all_patients;

by date_add   patient_num   time_add   con_code;

run;

However, this code is only sorting date_add? 

What am I trying to do is have a start date/time for a patient in treatment then for each start date/time have all coresponding procedures.  So I can have multiple start/times on the same day. 

If I change the order of the proc sort to

patient_num date_add time_add con_code;

it appears to sort the data however, I need the date sorted in date_add patient_num time_add con_code.

Does anyone know why only the first field is sorted?  Does anyone know how to get the dataset to sort on all the fields?

Much appreciation for help,

9 REPLIES 9
Tom
Super User Tom
Super User

What evidence do you have that it did not sort properly?

Can you give more indication of what you need to do with the data?  You requirements seem to be in conflict.

If you really need to access the file in both DATE/PATIENT and PATIENT/DATE order then you could make two copies of the data.  Or use one or more indexes.  With indexes you can access the data in the index order without physically sorting the records.

art297
Opal | Level 21

I'm not really sure how you want it sorted.  The way you are calling it, the data will sort by date_add then, within each date_add, the data will be sorted in patient_num order and, within each patient_num, it will be sorted in time_add order and, within each time_add, it will be sorted by con_code.

How do you want it sorted?

Perhaps is will help if you post some sample data and how you want that data to look after it is sorted.

mick_g
Calcite | Level 5

The data needs to be sorted by date_add patient_num time_add con_code

which is this code

proc sort data = all_patients;

by date_add   patient_num   time_add   con_code;

run;

I know this is not sorting because it has sorted all con_code together;

so my data looks like this after I sort with the statement above.

Date_add      Patient_num  Time_add    con_code

7/8/2011        01233210       13.42.14     14

7/18/2011      21233210       18.56.50     14

7/18/2011      30192830       11.31.18     14

7/8/2011        01233210       13.42.15     62   this should be with the other 7/8/2011

art297
Opal | Level 21

Is date_add a SAS date or a text field?

DouglasMartin
Calcite | Level 5

It doesn't look to me like it is sorting at all. Even if Date_add was a character variable (which it shouldn't be) you wouldn't get those results. Are you getting any error messages? Are you looking at all_patients immediately after that step finishes?

art297
Opal | Level 21

Doug,

I sort of disagree .. if the data weren't EXACTLY as shown in Mick's post.  For example:

data have;

  input (Date_add Patient_num Time_add con_code) ($);

  cards;

07/8/2011        01233210       13.42.14     14

07/18/2011      21233210       18.56.50     14

07/18/2011      30192830       11.31.18     14

7/8/2011        01233210       13.42.15     62

;

proc sort data=have out=want;

  by Date_add Patient_num Time_add con_code;

run;

The reason I asked was that such a situation might be correctable with the addition of the sortseq option:

proc sort data=have out=want sortseq=linguistic (numeric_collation=on);

  by Date_add Patient_num Time_add con_code;

run;

DouglasMartin
Calcite | Level 5

As Tom and art297 said, it's not clear what you want. Your first sort will show (for example) all records with a date_add of 01JAN2011 then all records (for axample) with a date_add of 02JAN2011, and so on. For any given date it will show all patients in order that have that particular date. For any given date and patient it will show all times in order, and so on.

One possibility comes to mind - what are the types (character or numeric) of those fields? The type won't affect whether you can sort, but may affect the resulting order (e.g. "10AUG1999", "10DEC1999", "10FEB1999", "11AUG1995" are in ascending order if they are in a character field).

Cynthia_sas
SAS Super FREQ

The way I like to test questions like this is to make some fake data, where the data will clearly reveal, when sorted (or otherwise processed), which path or code alternative get me what I want.

So, for the code below, I get the 2 following outputs (see screen shots). My assumptions were that the dates and times were numeric variables, formatted with SAS date/time formats. Note that I changed the data somewhat from the original data lines posted.

cynthia

data have;
  infile datalines dlm=' ';
  input Date_add : mmddyy10. Patient_num
        Time_add : time8. con_code ;
  format date_add mmddyy10. time_add time8.;
datalines;
07/08/2011       11111111       01.11.11     11
07/08/2011       11111111       01.12.12     12
07/18/2011       11111111       01.13.13     13
07/18/2011       22222222       02.22.21     21
07/18/2011       22222222       02.22.22     22
07/18/2011       33333333       03.33.31     31
07/08/2011       44444444       04.44.41     41
07/08/2011       44444444       04.44.42     42
;
run;
        
proc sort data=have out=want;
  by Date_add Patient_num Time_add con_code;
run;
         
ods listing;
proc print data=want;
  title "1) by Date_add Patient_num Time_add con_code";
  title2 "All 07/08 dates appear before 07/18 dates";
  title3 "which puts patient 44444444 on 07/08 before 07/18 dates";

  var Date_add Patient_num Time_add con_code;
run;
        
proc sort data=have out=want2;
  by Patient_num Date_add Time_add con_code;
run;
            
proc print data=want2;
  title "2) by Patient_num Date_add Time_add con_code";
  title2 "patients are in order and then dates/times are in order";
  title3 "within each patient 'group' ";
  var Patient_num Date_add Time_add con_code;
run;
title;


output_sort1.jpgoutput_sort2.jpg
scmebu
SAS Employee

Another couple of things to try are (1) examining the output of PROC CONTENTS for the data set that was (supposedly) just sorted and (2) perform some type of BY processing using the sorted data set.  PROC CONTENTS will indicate whether the data set is marked as sorted and the BY processing will detect whether any observations are out of order.  As an example of BY processing, try:

  proc print data = all_patients;

  by date_add   patient_num   time_add   con_code;

  run;

just after the data set is sorted.  Out of order observations will result in an ERROR message in the log.

It is very unlikely that PROC SORT is not sorting.  Rather, it is most probably the case that the data set is sorted but, for some reason such as BY variable type, it isn't sorted the way you are expecting.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 8645 views
  • 1 like
  • 6 in conversation