BookmarkSubscribeRSS Feed
Ronein
Meteorite | Level 14

Hello

I want to create a report from data set  SASHELP.CARS.

Task1-I want to add a new column  to the raw data (sashelp.cars) with random date between 01.01.2021 and 05.01.2021

Task2- I want to create a summary report  with following statistics calculations:

SUM of MSRP

Number of rows(actually it is number of transactions)

Number of distinct Cylinders

 

These calculations will be done for each category of following variables:

Type

Origin

EngineSize

 

The calculation will be done for each date .

 

So in the report will have 17 columns:

 var name

 category name

 SUM of MSRP (for each date)
Number of rows  (for each date)
Number of distinct Cylinders (for each date)

 

What is the way to create the desired task please?

Ronein_0-1610960023529.png

 

proc format;
value $ typeF 
'Sedan','Hybrid','SUV'='Type_a'
'Sports'='Type_b'
'Truck','Wagon'='Type_c'
;
run;

proc format;
value $ OriginF
'Asia','Europe'='Asia,Europe'
'USA'='USA'
;
run;

proc format ;
value EngineSizeF
0-1.3='(a) till 1.3'
1.3-1.6='(b) 1.3-1.6'
1.6-2.0'=(c) 1.6-2.0'
2.0-high='(d) 2.0+';
run;

 

 

 

 

 

9 REPLIES 9
tarheel13
Rhodochrosite | Level 12
Once you have your dataset, you could do it with proc report or proc tabulate.
Ronein
Meteorite | Level 14

I know that can be done with proc report but my question was to see a code that do it and also to see a code that add dates column to raw data with random allocation of dates between 01/01/2021 and 05/01/2021

 

tarheel13
Rhodochrosite | Level 12

https://communities.sas.com/t5/SAS-Programming/How-to-generate-random-dates-in-sas-over-a-certain-pe...

 

Well, here is some code for getting the random dates. Do you know how to use proc report or proc tabulate? 

Ronein
Meteorite | Level 14
Thank you!
A few questions please:
1- I see that rand() is not working and then I changed it
2-I want to check the dates in order to verify that I get dates between 01/01/2021 and 05.01.2021
I expect to see 5 dates only!!
Why do I get much more???

%let start=01/01/2021;
%let end=05/01/2021;
Data _null_;
start_SAS_date=input("&start",ddmmyy10.);
end_SAS_date=input("&end",ddmmyy10.);
dif=end_SAS_date-start_SAS_date;
call symput('start',put(start_SAS_date,best.));
call symput('dif',put(dif,best.));
run;
%put &start;
%put &dif;


Data RawDataTbl;
set SASHELP.cars;
format date_new ddmmyy10.;
date_new=&start+rand('uniform')*&dif;
Run;
proc sort data=RawDataTbl(keep=date_new)nodup;
by date_new;
run;
PROC SQL;
create table check_dates as
select distinct date_new
from RawDataTbl
;
QUIT;
PaigeMiller
Diamond | Level 26

@Ronein wrote:
Thank you!
A few questions please:
1- I see that rand() is not working and then I changed it
2-I want to check the dates in order to verify that I get dates between 01/01/2021 and 05.01.2021
I expect to see 5 dates only!!
Why do I get much more???

%let start=01/01/2021;
%let end=05/01/2021;
Data _null_;
start_SAS_date=input("&start",ddmmyy10.);
end_SAS_date=input("&end",ddmmyy10.);
dif=end_SAS_date-start_SAS_date;
call symput('start',put(start_SAS_date,best.));
call symput('dif',put(dif,best.));
run;
%put &start;
%put &dif;


Data RawDataTbl;
set SASHELP.cars;
format date_new ddmmyy10.;
date_new=&start+rand('uniform')*&dif;
Run;
proc sort data=RawDataTbl(keep=date_new)nodup;
by date_new;
run;
PROC SQL;
create table check_dates as
select distinct date_new
from RawDataTbl
;
QUIT;

As I said, you just need to generate random months. The year you want is always 2021 and the day of the month is always 1, these are not random.

 

So, if you have generated random month numbers 1 through 5, then you want

date_new = mdy(random_month,1,2021);

 

--
Paige Miller
PaigeMiller
Diamond | Level 26

Add dates at random between Jan 2021 and May 2021 (from your example, the dates are always the first of the month, and the year is always 2021, so the only random thing would be the month)

 

Assign a new variable containing random integers 1 through 5 to each observation indicating the random month, then turn the random integer into a day/month/year.

--
Paige Miller
Ronein
Meteorite | Level 14
Sorry but I dont understand
I defined start as 01JAN2020 and end as 05DEC2020 and calculated the difference between end and start that his 5 days.

ballardw
Super User

@Ronein wrote:

I know that can be done with proc report but my question was to see a code that do it and also to see a code that add dates column to raw data with random allocation of dates between 01/01/2021 and 05/01/2021

 


 

Since you have 3 separate variables in the same column I don't think that Proc Report will be your friend.

Tabulate would have the date as the first Column dimension as a class variable with the other 3 nested inside:

table type origin enginesize,

        datevar *(msrp*sum   n <maybe you don't define what Nr_rows means>

                        <you also do not define what nr_distinct_cylinders is supposed to be>)

 

However If you intending what I think you are intending for the distinct cylinders you may be out of luck as that sort of summary likely needs to be done prior to the Tabulate or Report procedure as there is no "distinct values" statistic in Proc Tabulate or Report .

 

Are you expecting Proc Report or Tabulated to do a random column selection???

Adding values like dates belongs before any report procedure.

 

 

Cynthia_sas
SAS Super FREQ

Hi: When you get the dates into the data, you still have a basic assumption problem.

Cynthia_sas_0-1610984775842.png

 

You show 3 different variables and their values in the first 2 columns on the report. This is the type of thing that TABULATE will do very nicely. But PROC REPORT would NOT do this unless you restructure the data. So what you want to do, aside from the random dates, seems more of a TABULATE report. And I'm not sure how you would calculate number of distinct cylinders. It seems to me that number of distinct cylinders for TYPE will be different than number of distinct cylinders for ENGINESIZE and both of those will be different than number of distinct cylinders for ORIGIN. So that column doesn't seem to be something that you would be able to get in just one variable.

Cynthia

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 827 views
  • 0 likes
  • 5 in conversation