BookmarkSubscribeRSS Feed
yawenyu
Obsidian | Level 7

Hello SAS community,

 

I am attempting to create a table with PROC SQL and need to find the distict count of several things. I am using SAS ver9.3.

 

I have a dataset I wish to manipulate as shown below:

 

Drink

Mocha

Americano

Cappuccino

Solo

Doppio

Earl Grey Latte

London Fog

Iced Green Tea

Iced Sweet Tea

Chocolate Milkshake

Vanilla Milkshake

Strawberry Milkshake

Mango Smoothie

Strawberry Smoothie

Banana Smoothie

Type

Coffee

Coffee

Coffee

Coffee

Coffee

Tea

Tea

Tea

Tea

Blended

Blended

Blended

Blended

Blended

Blended

Dairy

Yes

No

Yes

No

No

Yes

Yes

No

No

Yes

Yes

Yes

No

No

No

Price

4.00

3.00

4.00

1.00

2.00

4.00

4.00

2.00

3.00

5.00

5.00

5.00

4.00

3.00

3.00

 

 

I would like to create a table that looks something like this:

 

Label

Total drinks

   Price >=3

 

Drinks without dairy

   Price >=3

 

Drinks with dairy

   Price >=3

Coffee

5(100.0)

3(60.0)

 

3(60.0)

3(60.0)

 

2(40.0)

0(0.0)

Tea

4(100.0)

2(50.0)

 

2(50.0)

2(50.0)

 

2(50.0)

0(0.0)

Blended

6(100.0)

2(33.3)

 

3(50.0)

2(33.3)

 

3(50.0)

0(0.0)

Combined 

15(100.0)

7(46.7)

 

8(53.3)

7(46.7)

 

7(46.7)

0(0.0)

 

 

Basically, I am looking to build a table with distinct counts based on a specification. The Types are split into the variables, along with a combination of all 3. The numbers in the parenthesis are the percentages from the total, as seen in the first row.

 

I have created a table of the Labels using proc format:

 

proc format ;

  value row

    1 ='Total drinks'

    1.1 =' Price >=3'

    2 ='Drinks without dairy'

    2.1 =' Price >=3'

    3 ='Drinks with dairy'

    3.1 =' Price >=3'

        ;

quit;

 

proc format library=work cntlout=temp (keep=fmtname start end label) ;

quit;

 

I understand that I can use PROC SQL to find distinct counts, but I am unsure how to begin. Each PROC SQL only produces a table with 1 row and 1 column:

 

proc sql;

  create table test as

  select count(distinct Drink) as Coffee

  from work.dataset where Type='Coffee' and Price>=3;

quit;

 

Is there a way to use PROC SQL to select more than one observation? I would really appreciate a push in the right direction.

 

 

Thank you,

Yawen

5 REPLIES 5
ballardw
Super User

Do you want a data set or report table (html, pdf, rtf or text of some form)?

If you want a data set values like: 5(100.0) would have to be text and likely not to sort or such properly. And is there actually a requirement to place them in a single cell? I find data like that a tad difficult to read/comprehend at first glance, especially when there is no column indicator that one is count and the other a percentage (?).

 

Depending on your answer there are a number of approaches.

 

Also to post data it is best to use a data step. From existing data set Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the {i} icon or attached as text to show exactly what you have and that we can test code against.

 

What ever format you pasted that data as I more work than I want to do to reformat and write a data step.

yawenyu
Obsidian | Level 7

Hello Ballardw! Thank you for the tip. Yes, a dataset output is all that is needed. The formatting on this website is just to clearly show the differences in data. I also had issues with inserting tables so it was copied from a word document.

 

For sake of time, the output should be a sas dataset that looks like this (input dataset is the same format):

 

Capture.PNG

 

Is there a way to get distinct count without using PROC SQL? All sources online show PROC SQL only.

 

Thank you,

Yawen

ballardw
Super User

Actually your input data is part of the problem. Word tables filtered through the forum do not paste well into the SAS editor for creating data step code. Better is something like:

data have;
   infile datalines dlm=',';
   informat drink $25. type $10. Dairy $3. Price best8.;
   input  drink type Dairy price;
datalines;
Mocha,Coffee,Yes,4.00
Strawberry Milkshake,Blended,Yes,5.00
;
run;

as one way to handle data with embedded spaces.

 

 

And a data set in the form you are showing is sub-optimal for almost any purpose other than printing. Which is why I asked if you actually want a report.

ChrisNZ
Tourmaline | Level 20

1. Please provide usable sample data as @ballardw suggested

2- This code computes the same numbers as what you want. Would this suit your needs?

proc format ;
  value $fname (multilabel) 'A'-'B'='AB'
                            'B'-'C'='BC'
                            'A'-'Z'=' All';
run;
proc tabulate data=SASHElP.CLASS out=T;
  class SEX ;
  class NAME / mlf;
  format NAME $fname.;
  table NAME=' ', SEX=' '*(n colpctn);
run;
  

 

  F M
N ColPctN N ColPctN
All 9 100.00 10 100.00
AB 1 11.11 1 10.00
BC 1 11.11 . .
SuryaKiran
Meteorite | Level 14

Hi,

 

Depending on my undestanding on your need, here is the code. This code can be made even simpler when making it as a macro, due to time constraint I didn't recode it into a macro. You can try by understanding my code.

DATA one;
infile datalines dlm=",";
input  Drink :$20.  Type $	Dairy $	Price;
datalines;
Mocha,Coffee,Yes,4
Americano,Coffee,No,3
Cappuccino,Coffee,Yes,4
Solo,Coffee,No,1
Doppio,Coffee,No,2
Earl Grey Latte,Tea,Yes,4
London Fog,Tea,Yes,4
Iced Green Tea,Tea,No,2
Iced Sweet Tea,Tea,No,3
Chocolate Milkshake,Blended,Yes,5
Vanilla Milkshake,Blended,Yes,5
Strawberry Milkshake,Blended,Yes,5
Mango Smoothie,Blended,No,4
Strawberry Smoothie,Blended,No,3
Banana Smoothie,Blended,No,3
;
run;
Data one;
Set one;
run;

proc sort data=one;
by type  price ;
run;
DATA One_(Keep=Total price_3 type);
retain Total price_3;
set one;
by type;
If first.type then do;
			Total=1;
			price_3=0;
			end;
else Total+1;
if price>=3 then price_3+1; 
If last.type then output;
Label Total="Total Drinks" price_3="Price >=3";
run;

proc transpose data=One_ Out=need_1(drop=_Name_) Label=Label;
ID type;
run;
DATA need1(drop=coffee_ Tea_ Blended_);
FORMAT Label $40. coffee $8. Tea $8. Blended $8. Combined $8.;
set need_1(rename=(coffee=coffee_ Tea=Tea_ Blended=Blended_));
IF _N_=1 then DO;
COFFEE=CAT(coffee_,"(100)");
Tea=CAT(Tea_,"(100)");
Blended=CAT(Blended_,"(100)");
combined=CAT(SUM(Blended_,tea_,coffee_),"(100)");
CALL SYMPUT("Coffee_Total",Coffee_);
CALL SYMPUT("Tea_Total",Tea_);
CALL SYMPUT("Blended_Total",Blended_);
CALL SYMPUT("Combined_Total",SUM(Blended_,tea_,coffee_));
END;
ELSE DO;
Coffee=CAT(Coffee_,'(',(Coffee_*100/SYMGET("Coffee_Total")),')');
Tea=   CAT(Tea_,'(',(Tea_*100/SYMGET("Tea_Total")),')');
Blended=CAT(Blended_,'(',(Blended_*100/SYMGET("Blended_Total")),')');
Combined=CAT(SUM(Blended_,tea_,coffee_),'(',SUM(Blended_,tea_,coffee_)*100/SYMGET("Combined_Total"),')');
END;
RUN;


data two;
set one;
IF Dairy="No";
RUN;

proc sort data=two;
by type  price ;
run;
DATA Two_(Keep=Total price_3 type);
retain Total price_3;
set two;
by type;
If first.type then do;
			Total=1;
			price_3=0;
			end;
else Total+1;
if price>=3 then price_3+1; 
If last.type then output;
Label Total="Drinks Without Dairy" price_3="Price >=3";
run;

proc transpose data=Two_ Out=need_2 (drop=_Name_) Label=Label;
ID type;
run;

DATA need2(drop=coffee_ Tea_ Blended_);
FORMAT Label $40. coffee $8. Tea $8. Blended $8. Combined $8.;
set need_2(rename=(coffee=coffee_ Tea=Tea_ Blended=Blended_));
Coffee=CAT(Coffee_,'(',(Coffee_*100/SYMGET("Coffee_Total")),')');
Tea=   CAT(Tea_,'(',(Tea_*100/SYMGET("Tea_Total")),')');
Blended=CAT(Blended_,'(',(Blended_*100/SYMGET("Blended_Total")),')');
Combined=CAT(SUM(Blended_,tea_,coffee_),'(',ROUND(SUM(Blended_,tea_,coffee_)*100/SYMGET("Combined_Total"),0.11),')');
RUN;


data Three;
set one;
IF Dairy="Yes";
RUN;

proc sort data=Three;
by type  price ;
run;
DATA Three_(Keep=Total price_3 type);
retain Total price_3;
set Three;
by type;
If first.type then do;
			Total=1;
			price_3=0;
			end;
else Total+1;
if price>=3 then price_3+1; 
If last.type then output;
Label Total="Drinks With Dairy" price_3="Price >=3";
run;

proc transpose data=Three_ Out=need_3 (drop=_Name_) Label=Label;
ID type;
run;

DATA need3(drop=coffee_ Tea_ Blended_);
FORMAT Label $40. coffee $8. Tea $8. Blended $8. Combined $8.;
set need_3(rename=(coffee=coffee_ Tea=Tea_ Blended=Blended_));
Coffee=CAT(Coffee_,'(',(Coffee_*100/SYMGET("Coffee_Total")),')');
Tea=   CAT(Tea_,'(',(Tea_*100/SYMGET("Tea_Total")),')');
Blended=CAT(Blended_,'(',(Blended_*100/SYMGET("Blended_Total")),')');
Combined=CAT(SUM(Blended_,tea_,coffee_),'(',ROUND(SUM(Blended_,tea_,coffee_)*100/SYMGET("Combined_Total"),0.11),')');
RUN;

Data Final;
SET need1 need2 need3;
run;
Thanks,
Suryakiran

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 6873 views
  • 0 likes
  • 4 in conversation