BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
JJ_83
Obsidian | Level 7

Hello, 

I am having trouble sorting my data by multiple variables. 

proc sort data = ohstim.combined_stimulation_03;
          BY			    API_WELL_NO
		      DESCENDING 	DT_MODIFIED_STIMULATION
			             	DT_TREATMENT;
run;

Using this code, I get the below results for the observations with API_WELL_NO of "01133244410000". 

Sorting Issues.JPG

The observation I want to appear first is the one with the most recent DT_MODIFIED_STIMULATION value ("28MAR13") and the oldest DT_TREATMENT value ("12DEC12")

 

 

 

The first two variables (API_WELL_NO and DT_MODIFIED_STIMULATION) are sorting correctly, but the third variable (DT_TREATMENT) is not. I think this has something to do with the identical DT_MODIFIED_STIMULATION values. For some reason, when they are identical SAS seems unable to sort them correctly. 

 

I need these observations to be in the correct order because I am using the below code to remove duplicates, so that only the first entry for each API_WELL_NO is retained. 

proc sort data = ohstim.combined_stimulation_03 nodupkey equals;
	BY	API_WELL_NO;
run; 

My code is attached. Any help figuring this out would be greatly appreciated!!

 

 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

Longish shot but is your Dt_modified_stimulation variable perhaps a datetime value with a format that only shows the date portion? If that is the case it could be there is only one value for the Dt_treatment variable and so it appears out of sequence.

View solution in original post

8 REPLIES 8
Reeza
Super User

 

proc sort data = ohstim.combined_stimulation_03;
          BY			    API_WELL_NO
		      DESCENDING 	DT_MODIFIED_STIMULATION
		      DT_TREATMENT;
run;

 

Can you post the exact code you ran including your log? The descending should only apply to the first variable the way I understand it, but it seems to be applied to both dt_modified_simulation and DT_Treatment here for some reason. 

 

Your results seem to align with this code, not the code you posted above.

 

proc sort data = ohstim.combined_stimulation_03;
          BY			    API_WELL_NO
		      DESCENDING 	DT_MODIFIED_STIMULATION
		      descending DT_TREATMENT;
run;

 


@JJ_83 wrote:

My code is attached. Any help figuring this out would be greatly appreciated!!

FYI - your code isn't attached, your data sets are attached.

 


@JJ_83 wrote:

Hello, 

I am having trouble sorting my data by multiple variables. 

proc sort data = ohstim.combined_stimulation_03;
          BY			    API_WELL_NO
		      DESCENDING 	DT_MODIFIED_STIMULATION
			             	DT_TREATMENT;
run;

Using this code, I get the below results for the observations with API_WELL_NO of "01133244410000". 

Sorting Issues.JPG

The observation I want to appear first is the one with the most recent DT_MODIFIED_STIMULATION value ("28MAR13") and the oldest DT_TREATMENT value ("12DEC12")

 

 

 

The first two variables (API_WELL_NO and DT_MODIFIED_STIMULATION) are sorting correctly, but the third variable (DT_TREATMENT) is not. I think this has something to do with the identical DT_MODIFIED_STIMULATION values. For some reason, when they are identical SAS seems unable to sort them correctly. 

 

I need these observations to be in the correct order because I am using the below code to remove duplicates, so that only the first entry for each API_WELL_NO is retained. 

proc sort data = ohstim.combined_stimulation_03 nodupkey equals;
	BY	API_WELL_NO;
run; 

My code is attached. Any help figuring this out would be greatly appreciated!!

 

 

 

 


 

JJ_83
Obsidian | Level 7

Hi Reeza, 

 

Thank you for letting me know. I have attached my code exactly as I have run it, and my log. 

 

Thank you 

ballardw
Super User

Longish shot but is your Dt_modified_stimulation variable perhaps a datetime value with a format that only shows the date portion? If that is the case it could be there is only one value for the Dt_treatment variable and so it appears out of sequence.

Reeza
Super User
Why would it show literally different days though?
ballardw
Super User

I am thinking more about the associated date values and that the first after descending is not a unique value at the date level but could be when the time is included.

Consider:

data example;
   input x :datetime. y :date9.;
   format x dtdate9. y date9.;
datalines;
01Jan2020:01:00:00  08dec2019
01Jan2020:02:00:00  09dec2019
01Jan2020:03:00:00  10dec2019
01Jan2020:04:00:00  11dec2019
01Jan2020:05:00:00  12dec2019
01Jan2020:06:00:00  13dec2019
;

Proc sort data=example;
   by descending x;
run;

proc print data=example;
run;

With only ONE Y value for each X value it appears as if the Y is sorted descending (because of the values I picked) but the X has a single day value. I did not include the Y in the sort to minimize any confusion that a "descending" may have also applied to Y.

 

It should be unlikely, but "unlikely" does not mean "never happens". So if other suggestions don't solve then perhaps looking at it with a slightly twisted perspective may be needed.

 

A similar order issue can also occur with Date values that somehow end up with a decimal portion. I won't say how long it took me trace that issue down in a data set...

 

 

JJ_83
Obsidian | Level 7

Hi Ballardw, 

 

Thank you for the idea, it looks like that is exactly what is going on! I would never have thought of that!

 

Do you have any advice on how to remove the time stamp from the DT_MODIFIED_STIMULATION variable?

 

Thank you

ballardw
Super User

@JJ_83 wrote:

Hi Ballardw, 

 

Thank you for the idea, it looks like that is exactly what is going on! I would never have thought of that!

 

Do you have any advice on how to remove the time stamp from the DT_MODIFIED_STIMULATION variable?

 

Thank you


There are several choices. Easiest would be create a new variable with just the date component using the datepart function:

dateonly = datepart(Dt_modified_stimulation);
format dateonly date9.;

Pick your preferred date variable name instead of dateonly.

Other options would be to replace the existing variable the same way using your existing variable as the target name. You really need to change the format in this case.

Or you could change the time component to the start or end of the day using the INTNX function.

JJ_83
Obsidian | Level 7

Using the datepart approach worked perfectly! Thank you so much!

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 8 replies
  • 1626 views
  • 3 likes
  • 3 in conversation