Amali6
Quartz | Level 8
proc sgplot data=hotel.Hotel_bookings;
  scatter x=arrival_date_year y=country / group=hotel;
  title 'guest arrived from countries in three years';
run;

Hi all,

I want to find out how many guests arrived from various countries, in three different years, at two hotels. I am not getting the correct output. Looking for help, please!

hotel is my libname

hotel.Hotel_bookings is the dataset

arrival_date_year has the values 2015, 2016, 2017

country - my dataset has various countries inside

hotel- resort hotel and city hotel.

That is what the names in the code refer to. Could anyone please help me see where I am going wrong?

8 REPLIES
ballardw
Super User

You should show us some examples of what your hotel.hotel_bookings data set actually looks like.

 

And then describe exactly what you expect the output to look like.

If you mean to do this for individuals then likely this is going to be a very busy chart.

You may want to summarize before plotting.

 

Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the </> icon or attached as text to show exactly what you have and that we can test code against.
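
For illustration, a hand-typed DATA step built from a few of the sample rows posted further down this thread might look like the sketch below. The data set name work.hotel_sample and the character lengths are assumptions made for this example, and the code generated by the linked instructions may look different.

```sas
/* Sketch only: work.hotel_sample and the LENGTH values are assumptions.
   The rows are copied from the sample posted in this thread. */
data work.hotel_sample;
   infile datalines dlm=',' dsd truncover;
   length hotel $12 arrival_date_month $9 meal $9 country $3;
   input hotel is_canceled lead_time arrival_date_year arrival_date_month
         arrival_date_week_number arrival_date_day_of_month
         stays_in_weekend_nights stays_in_week_nights
         adults children babies meal country;
   datalines;
Resort Hotel,0,342,2015,July,27,1,0,0,2,0,0,BB,PRT
Resort Hotel,0,7,2015,July,27,1,0,1,1,0,0,BB,GBR
Resort Hotel,1,85,2015,July,27,1,0,3,2,0,0,BB,PRT
Resort Hotel,0,68,2015,July,27,1,0,4,2,0,0,BB,USA
Resort Hotel,0,18,2015,July,27,1,0,4,2,1,0,HB,ESP
;
run;
```

Code in this form can be pasted into a reply and run by anyone, which lets others test suggestions against the same values.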

Amali6
Quartz | Level 8
hotel	is_canceled	lead_time	arrival_date_year	arrival_date_month	arrival_date_week_number	arrival_date_day_of_month	stays_in_weekend_nights	stays_in_week_nights	adults	children	babies	meal	country
Resort Hotel	0	342	2015	July	27	1	0	0	2	0	0	BB	PRT
Resort Hotel	0	737	2015	July	27	1	0	0	2	0	0	BB	PRT
Resort Hotel	0	7	2015	July	27	1	0	1	1	0	0	BB	GBR
Resort Hotel	0	13	2015	July	27	1	0	1	1	0	0	BB	GBR
Resort Hotel	0	14	2015	July	27	1	0	2	2	0	0	BB	GBR
Resort Hotel	0	14	2015	July	27	1	0	2	2	0	0	BB	GBR
Resort Hotel	0	0	2015	July	27	1	0	2	2	0	0	BB	PRT
Resort Hotel	0	9	2015	July	27	1	0	2	2	0	0	FB	PRT
Resort Hotel	1	85	2015	July	27	1	0	3	2	0	0	BB	PRT
Resort Hotel	1	75	2015	July	27	1	0	3	2	0	0	HB	PRT
Resort Hotel	1	23	2015	July	27	1	0	4	2	0	0	BB	PRT
Resort Hotel	0	35	2015	July	27	1	0	4	2	0	0	HB	PRT
Resort Hotel	0	68	2015	July	27	1	0	4	2	0	0	BB	USA
Resort Hotel	0	18	2015	July	27	1	0	4	2	1	0	HB	ESP
Resort Hotel	0	37	2015	July	27	1	0	4	2	0	0	BB	PRT
Resort Hotel	0	68	2015	July	27	1	0	4	2	0	0	BB	IRL
Resort Hotel	0	37	2015	July	27	1	0	4	2	0	0	BB	PRT
Resort Hotel	0	12	2015	July	27	1	0	1	2	0	0	BB	IRL
Resort Hotel	0	0	2015	July	27	1	0	1	2	0	0	BB	FRA
Resort Hotel	0	7	2015	July	27	1	0	4	2	0	0	BB	GBR
Resort Hotel	0	37	2015	July	27	1	1	4	1	0	0	BB	GBR
Resort Hotel	0	72	2015	July	27	1	2	4	2	0	0	BB	PRT
Resort Hotel	0	72	2015	July	27	1	2	4	2	0	0	BB	PRT
Resort Hotel	0	72	2015	July	27	1	2	4	2	0	0	BB	PRT

These are a few of the columns in my dataset. From these, how can I plot the people who arrived at both hotels in all three years from the various countries in the dataset?

Please help me to solve this!

 

Thanks 

ballardw
Super User

Now describe what you mean by "for people arrived to both hotels in all three years"

There is no way I can see from that data to identify if any particular person arrived at any hotel in any given year.

So do you mean totals of some sort? That will require some sort of summary and filter likely.

I might guess that you want to display the total by hotel by year.  Scatter plots will not summarize data. You would have to do that prior to plotting the data. And if you mean "people" to be a total of adults, children and babies you will need to sum those prior to plotting as well.

Maybe something like this (untested, as no data step was provided, and an incomplete example, since you only show one "hotel" value):

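/* Total guests per booking; SUM() ignores missing values */
data temp;
   set hotel.hotel_bookings;
   people = sum(adults,children,babies);
run;

/* One row per hotel, country and year, holding the summed guest count */
proc summary data=temp nway;
   class hotel country arrival_date_year;
   var people;
   output out =work.plot (drop=_type_ _freq_) sum=;
run;

/* Plot the summarized totals, labelling each point with its country */
proc sgplot data=work.plot;
   scatter x=arrival_date_year y=people / group=hotel datalabel=country;
   title 'guest arrived from countries in three years';
run;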

 

 

And does the is_canceled variable have any role in this process?

Amali6
Quartz | Level 8

Thanks for making that clear! Sorry I didn't explain my variables properly. The variable is_canceled contains 0 and 1, where 0 marks bookings that were not canceled and 1 marks canceled bookings. The hotel variable contains the values City Hotel and Resort Hotel. My dataset has more than 1 lakh (100,000) observations, which is why I couldn't post it here.

My question is: is it possible to show in plots the number of people who arrived at both hotels in the three years?

Amali6
Quartz | Level 8

Hi,

When I tried your code, I got output like this:

[Screenshot: Amali6_0-1589395464219.png]

Could you explain this line in the code, please:

output out =work.plot (drop=_type_ _freq_) sum=;

Can I use the library I created instead of the work library in the code? I am also not clear about this part: (drop=_type_ _freq_) sum=;

Could you please explain?

 

Thanks in advance!
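
Note on the OUTPUT statement asked about above: the following is a minimal sketch, not a definitive answer from the original poster. It assumes the HOTEL libref is writable, and hotel.guest_totals is a hypothetical output name chosen for illustration. The automatic variables are kept so they can be inspected.

```sas
/* Sketch only: hotel.guest_totals is a hypothetical name, and the HOTEL
   libref is assumed to be assigned and writable. */
data work.temp;
   set hotel.hotel_bookings;
   people = sum(adults, children, babies);
run;

proc summary data=work.temp nway;
   class hotel country arrival_date_year;
   var people;
   /* OUT= names the output data set. A two-level name such as
      hotel.guest_totals writes it to the HOTEL library instead of WORK.
      SUM= with nothing after the equals sign stores each sum under the
      name of its analysis variable (here, people).
      No DROP= option here, so _TYPE_ and _FREQ_ stay in the output. */
   output out=hotel.guest_totals sum=;
run;

/* Look at the first rows, including _TYPE_ and _FREQ_ */
proc print data=hotel.guest_totals (obs=10);
run;
```

PROC SUMMARY adds _TYPE_ and _FREQ_ to the output automatically. _TYPE_ identifies which combination of CLASS variables a row summarizes, and _FREQ_ is the number of input rows (bookings) behind that row. The (drop=_type_ _freq_) data set option simply leaves those two columns out when they are not needed for the plot.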

Rick_SAS
SAS Super FREQ

Because you have discrete variables, I suggest a bar chart instead of a scatter plot. You can either use a stacked bar chart or a cluster bar chart. The stacked bars are probably better if you have many countries. For more information, see "Bar Charts with Stacked and Cluster Groups."

 

You don't say how you want the data displayed, so I chose two charts (one for each type of hotel) that show the number of visitors from each country for each year.  If you want the data displayed in some other way, the code can be modified:

 

/* Create sample data. I use a frequency variable (FREQ), but
   the bar chart will aggregate if the data set contains
   one observation per guest. */
data bookings;
call streaminit(1);
length country $15 hotel $6;
do arrival_date_year = 2015 to 2017;
   do country = "US", "UK", "China", "Japan";
      do hotel = "City", "Resort"; 
          Freq = rand("Poisson", 100);
          output;
      end;
   end;
end;
run;

proc sort data=bookings;
by hotel;
run;

title'Guest arrived from countries in three years';
proc sgplot data=bookings;
  by hotel;
  vbar arrival_date_year / response=Freq group=country 
            groupdisplay=stack seglabel;
  xaxis display=(nolabel);
  yaxis grid;
  run;

proc sgplot data=bookings;
  by hotel;
  vbar arrival_date_year / response=Freq group=country 
            groupdisplay=cluster;
  xaxis display=(nolabel);
  yaxis grid;
  run;
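
Applied to the real data instead of the simulated sample, a minimal sketch could look like the following. It assumes hotel.hotel_bookings has one row per booking with the variable names shown in the thread, and that is_canceled is numeric. The country codes in the WHERE statement are taken from the posted sample rows and are only an example, and work.bookings_sorted is a hypothetical name.

```sas
/* Sketch only: work.bookings_sorted is a hypothetical name, is_canceled
   is assumed to be numeric, and the country list is just an example. */
proc sort data=hotel.hotel_bookings out=work.bookings_sorted;
   by hotel;
run;

title 'Bookings by country and arrival year';
proc sgplot data=work.bookings_sorted;
   by hotel;
   /* Keep bookings that were not canceled, for a handful of countries */
   where is_canceled = 0
     and country in ('PRT', 'GBR', 'ESP', 'FRA', 'IRL', 'USA');
   /* No RESPONSE= option, so each bar shows a count of rows (bookings) */
   vbar arrival_date_year / group=country groupdisplay=cluster;
   xaxis display=(nolabel);
   yaxis grid label='Number of bookings';
run;
title;
```

This counts bookings rather than people. To chart people instead, one option is to compute people = sum(adults, children, babies) in a DATA step first and add response=people to the VBAR statement, which sums that variable within each bar by default.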
Amali6
Quartz | Level 8

Thanks for the solution, but I don't understand the DATA step you provided. Since the dataset is already imported into SAS, why create it in this step?

data bookings;
call streaminit(1);
length country $15 hotel $6;
do arrival_date_year = 2015 to 2017;
   do country = "US", "UK", "China", "Japan";
      do hotel = "City", "Resort"; 
          Freq = rand("Poisson", 100);
          output;
      end;
   end;
end;
run;

 

Rick_SAS
SAS Super FREQ

Because I don't have access to your data and you didn't provide data in a format that I could use.  You can ignore the DATA step. It is for me and others who do not have access to your data.
