- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Good day,
I am trying to extract the first observation (cntno=734) and the single observation with minimum Euclidean distance from the following SAS dataset named "reasonable_dates":
The final dataset should be just 2 observations: the row with cntno=734 and the row with the minimum of Euclid_dst. In this case that would be 1056.88 (numeric). I wish to keep all variables in the new dataset: cntno, batchno, and dates.
Current version is
Current version is 9.04.01M5P091317 and I do not have the SAS/STATS package.
First attempt:
proc sql;
create table only_mins as
select cntno, batchno, min(euclid_dst) as close_batch
from reasonable_dates
group by batchno;
quit;
Fails because output contains all values. I'm thinking the min() function does not work because I do not have the SAS/STATS package.
Second attempt:
data _null_;
set reasonable_dates;
firstsmall=.; secondsmall=.;
array k[*] euclid_dst:;
firstsmall=smallest(1,of k[*]);
secondsmall=smallest(2,of k[*]);
put firstsmall =;
put secondsmall = ;
stop;
run;
Fails to compile with log :
NOTE: Argument 1 to function SMALLEST(2,0) at line 739 column 17 is invalid
Third attempt:
proc means data=reasonable_dates min;
class cntno;
var euclid_dst;
output out=new(where=(_type_=1)) min=/autoname;
run;
The output is "new"
This is a problem because, well, it isn't a new dataset, Cntno is renamed as Batchno, and the other variable values are missing.
Thank you for looking at my post.
Jane
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Post sample data as text please, not as an image. No one really wants to type out your data.
MIN() function is not related to the SAS/STATS package, it isn't working because in SAS, when you select columns that are unique (batchno) it will return all unique values, because you haven't specified how to group it.
And in a data step, the MIN would mean the min of that row, not of the full column.
Can you also show what you want your output to look like?
If you just want the first two rows, have you considered obs=2 option?
proc sort data=have;
by euclid_dist;
run;
data want;
set have (obs=2);
run;
@jawhitmire wrote:
Good day,
I am trying to extract the first observation (cntno=734) and the single observation with minimum Euclidean distance from the following SAS dataset named "reasonable_dates":
The final dataset should be just 2 observations: the row with cntno=734 and the row with the minimum of Euclid_dst. In this case that would be 1056.88 (numeric). I wish to keep all variables in the new dataset: cntno, batchno, and dates.
Current version is
Current version is 9.04.01M5P091317 and I do not have the SAS/STATS package.
First attempt:
proc sql; create table only_mins as select cntno, batchno, min(euclid_dst) as close_batch from reasonable_dates group by batchno; quit;
Fails because output contains all values. I'm thinking the min() function does not work because I do not have the SAS/STATS package.
Second attempt:
data _null_; set reasonable_dates; firstsmall=.; secondsmall=.; array k[*] euclid_dst:; firstsmall=smallest(1,of k[*]); secondsmall=smallest(2,of k[*]); put firstsmall =; put secondsmall = ; stop; run;
Fails to compile with log :
NOTE: Argument 1 to function SMALLEST(2,0) at line 739 column 17 is invalid
Third attempt:
proc means data=reasonable_dates min; class cntno; var euclid_dst; output out=new(where=(_type_=1)) min=/autoname; run;
The output is "new"
This is a problem because, well, it isn't a new dataset, Cntno is renamed as Batchno, and the other variable values are missing.
Thank you for looking at my post.
Jane
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
The data set "reasonable_dates" has 10 observations and 7 variables
Cntno l_day l_time batchno b_day b_time Euclid_dst
734 053600 12jan2025 0 . . 0.00
0 . . 375 154200 05jan2025 1061.33
0 . . 376 113000 06jan2025 1064.12
0 . . 377 035400 07jan2025 1059.09
0 . . 382 054800 10jan2025 1056.88
I would like the output to have 2 rows: the row with cntno=734 (first observation) and the row with the minimum numeric value in the column Euclid_dst.
So the final dataset will look like this:
Cntno l_day l_time batchno b_day b_time Euclid_dst
734 053600 12jan2025 0 . . 0.00
0 . . 382 054800 10jan2025 1056.88
Jane
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@jawhitmire wrote:
First attempt:
proc sql; create table only_mins as select cntno, batchno, min(euclid_dst) as close_batch from reasonable_dates group by batchno; quit;
Fails because output contains all values. I'm thinking the min() function does not work because I do not have the SAS/STATS package.
Your example data shows only one record per level of BATCHNO. If the remainder of your data is that way then with only one record per batchno the group by forces one record per.
If you want the record with min value (ASSUMES there is only one, which is often a dangerous assumption) then perhaps
proc sql; create table only_mins as select cntno, batchno, min(euclid_dst) as close_batch from reasonable_dates having euclid_dist = min(Euclid_dist) quit;
and then you would have for the final two observation set
data want; set reasonable_dates (obs=1) only_mins; run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thank you for the reply.
It worked with only slight modifications : grouped by statement added
proc sql;
create table only_mins as
select cntno, batchno, min(euclid_dst) as close_batch, l_time, b_time
from reasonable_dates
group by cntno
having euclid_dst = min(Euclid_dst);
quit;
Thanks again, cheers!
Jane