Hello,
I have a code that works but I wonder if there is a shortest way in sql to do the same that the code below. The data are totally fictive but suppose that there only 4 possible combinations of destination-hotel. Is it possible to obtain sammary dataset in only one step? Thanks
data example1;
input name $ destination $ hotel $ price;
datalines;
John Paris Ritz 1254
John New-York Belfort 4896
Mary Paris Eifel 2452
Mary London Olympic 6965
Steeve Paris Ritz 2152
Mike New-York Belfort 1515
;proc sql;
create table step0 as
select distinct name
from example1
;proc sql;
create table step1 as
select distinct name, price as price_t1
from example1
where destination ='Paris' and hotel='Ritz'
;
proc sql;
create table step2 as
select distinct name, price as price_t2
from example1
where destination ='Paris' and hotel='Eifel'
;
proc sql;
create table step3 as
select distinct name, price as price_t3
from example1
where destination ='London' and hotel='Olympic'
;
proc sql;
create table step4 as
select distinct name, price as price_t4
from example1
where destination ='New-York' and hotel='Belfort'
;
proc sql;
create table summary as
SELECT *
FROM step0
LEFT JOIN step1 ON step0.name = step1.name
LEFT JOIN step2 ON step0.name = step2.name
LEFT JOIN step3 ON step0.name = step3.name
LEFT JOIN step4 ON step0.name = step4.name
;
I would suggest you separate your calculating from your output. SQL works on normalised data, whilst it is possible to get transposed data, it is not recommended or easy with SQL. It is simple to get aggregates:
proc sql;
create table WANT as
select NAME
,DESTINATION
,HOTEL
,sum(PRICE) as PRICE
from EXAMPLE1
group by NAME
,DESTINATION
,HOTEL;
quit;
Do note in the above indentation, alignements, closing steps with the right word (e.g. quit;).
The above gives you a list of the aggregates of the data, if you need this transposed in an output pdf or something then you can simply transpose the data, however if further processing is needed, then retain the normalised structure.
You can get this done in one step, you have to build an array of PRICE and HOTEL for each NAME and export when you change NAME. I can't test now but this should not have too many errors:
data SUMMARY ;
array HOTELS [99] $20 ;
array PRICES [99] ;
do _I=1 to NOBS;
set EXAMPLE1 nobs=NOBS;
if indexw(HOTEL, of HOTELS
end;
do _I=1 to NOBS;
set EXAMPLE1;
by NAME;
if first.NAME then call missing(of PRICES);
_POS=whichn(HOTEL ,of HOTELS
PRICES[_POS]=PRICE;
if last.NAME then output;
end;
drop _: HOTEL PRICE;
run;
I corrected a first error by line 11 with PRICES
data SUMMARY ;
array HOTELS [99] $20 ;
array PRICES [99] ;
do _I=1 to NOBS;
set EXAMPLE1 nobs=NOBS;
if indexw(HOTEL, of HOTELS
end;
do _I=1 to NOBS;
set EXAMPLE1;
by NAME;
if first.NAME then call missing(of PRICES
_POS=whichn(HOTEL ,of HOTELS
PRICES[_POS]=PRICE;
if last.NAME then output;
end;
drop _: HOTEL PRICE;
run;
A few errors 😕 . Fixed:
data SUMMARY2 ;
array HOTELS [9] $20 ;
array PRICES [9] ;
do _I=1 to NOBS;
set EXAMPLE1 nobs=NOBS;
if whichc(HOTEL, of HOTELS
end;
do _I=1 to NOBS;
set EXAMPLE1;
by NAME;
if first.NAME then call missing(of PRICES
_POS=whichc(HOTEL ,of HOTELS
PRICES[_POS]=PRICE;
if last.NAME then output;
end;
keep NAME PRICES:;
run;
Sort/index the data by NAME before.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.