Solved: Re: Using Proc sql select distinct and keep

Blandine · Posted 11-12-2020 12:29 PM

Hello,

I have a set of bridge data with different variables such as: bridge number (STRUCTURE_NUMBER_008), kind(STRUCTURE_KIND_043A ), owner(OWNER_022), inspection years, etc.

Because inspection year differs I have multiple rows of the same bridge number. I want to do a stratified sample selection base on kind and owner; However, I don't want my sample selection to select the same bridge given the repeated row of the same bridge number.

I was thinking of selecting one bridge number first in my data using proc sql and select distinct, but while doing that I want to keep the variable kind(STRUCTURE_KIND_043A ), owner(OWNER_022) so I can do the stratified sampling later on.

I am having error with my code.

proc sql;
create table SUP.IDNBI_PSC_FILES as
select distinct STRUCTURE_NUMBER_008
from SUP.IDNBI_PSC_SUPER (KEEP=STRUCTURE_KIND_043A OWNER_022);
quit;

log error

42 proc sql;
43 create table SUP.IDNBI_PSC_FILES as
44 select distinct STRUCTURE_NUMBER_008
45 from SUP.IDNBI_PSC_SUPER (KEEP=STRUCTURE_KIND_043A OWNER_022);
ERROR: The following columns were not found in the contributing tables: STRUCTURE_NUMBER_008.
46 quit;

ballardw · Posted 11-12-2020 01:05 PM

The KEEP= option means that the variable you want, STRUCTURE_NUMBER_008, is not available.

Put it on the keep.

Or more in line with SQL behavior perhaps:

proc sql;
create table SUP.IDNBI_PSC_FILES as
select distinct STRUCTURE_NUMBER_008,STRUCTURE_KIND_043A, OWNER_022
from SUP.IDNBI_PSC_SUPER ;
quit;

View solution in original post

ballardw · Posted 11-12-2020 01:05 PM

The KEEP= option means that the variable you want, STRUCTURE_NUMBER_008, is not available.

Put it on the keep.

Or more in line with SQL behavior perhaps:

proc sql;
create table SUP.IDNBI_PSC_FILES as
select distinct STRUCTURE_NUMBER_008,STRUCTURE_KIND_043A, OWNER_022
from SUP.IDNBI_PSC_SUPER ;
quit;

PGStats · Posted 11-12-2020 01:58 PM

I would suggest to keep only the last inspection for each bridge.

proc sql;
create table SUP.IDNBI_PSC_FILES as
select 
	STRUCTURE_NUMBER_008,
	STRUCTURE_KIND_043A, 
	OWNER_022,
	inspectionYear
from SUP.IDNBI_PSC_SUPER
group by STRUCTURE_NUMBER_008
having  inspectionYear = max(inspectionYear) ;
quit;

Replace inspectionYear by the appropriate name. Untested.

PG

Using Proc sql select distinct and keep

Re: Using Proc sql select distinct and keep

Re: Using Proc sql select distinct and keep

Re: Using Proc sql select distinct and keep

SAS Innovate 2025: Save the Date