Hi experts,
my data sample (shortcut) looks like this:
Stock | Country | Analyst | Broker | Date | TP2P | |
1 | BCE | CA | Marc | RBC | 06/30/2015 | 0.24 |
2 | BCE | CA | Marc | RBC | 01/17/2017 | 0.32 |
3 | ATCO | CA | Marc | RBC | 05/23/2017 | -0.15 |
4 | Inditex | ES | Jorge | CAJA | 01/16/2018 | 0.03 |
5 | IBM | US | Jon | BNY | 05/30/2018 | 0.09 |
6 | IBM | US | Jon | BNY | 10/25/2016 | 0.01 |
7 | APPLE | US | Julia | BNY | 07/21/2017 | 0.09 |
8 | APPLE | US | Sarah | UBS | 07/30/2018 | 0.10 |
I am trying to build a table that provides the number of STOCKS, ANALYSTS and BROKER,
and the means of key variables by country.
The resulting table should be like this:
Panel A: | |||||
Country | # Stocks | # Analysts | # Broker | # Obs. | Mean (TP2P) |
CA | 2 | 1 | 1 | 3 | 0.14 |
ES | 1 | 1 | 1 | 1 | 0.03 |
US | 2 | 3 | 2 | 4 | 0.07 |
TOTAL | 5 | 5 | 4 | 8 | |
MEAN | 1.66 | 1.66 | 1.33 | 2.66 | 0.08 |
and another table (Panel B) with the following result:
Panel B: | ||
Year | # Obs. | #Analysts |
2015 | 1 | 1 |
2016 | 1 | 1 |
2017 | 3 | 2 |
2018 | 3 | 3 |
MEAN | 2 | 1.75 |
Can anyone help?
Check this:
data have;
infile datalines dlm=',';
input Stock :$8. Country :$8. Analyst :$8. Broker :$8. Date :$8.TP2P;
datalines;
BCE,CA,Marc,RBC,06/30/2015,0.24
BCE,CA,Marc,RBC,01/17/2017,0.32
ATCO,CA,Marc,RBC,05/23/2017,-0.15
Inditex,ES,Jorge,CAJA,01/16/2018,0.03
IBM,US,Jon,BNY,05/30/2018,0.09
IBM,US,Jon,BNY,10/25/2016,0.01
APPLE,US,Julia,BNY,07/21/2017,0.09
APPLE,US,Sarah,UBS,07/30/2018,0.10
;
run;
proc sql;
create table pre_want as
select monotonic() as row_no,
Country,
count(distinct stock) format=4.2 as Stock,
count(distinct Analyst) format=4.2 as Analyst,
count(distinct Broker) format=4.2 as Broker,
count(*) format=4.2 as Obs,
mean(TP2P) format=4.2 as mean_TP2P
from have
group by Country
;
quit;
proc sql;
create table want(drop=row_no) as
select * from pre_want
union
select monotonic()+10 as row_number,
'TOTAL' as Country,
sum(Stock) format=4.2 as Stock,
sum(Analyst) format=4.2 as Analyst,
sum(Broker) format=4.2 as Broker,
sum(Obs) format=4.2as Obs,
. format=4.2 as mean_TP2P
from pre_want
union
select monotonic()+100 as row_number,
'Mean' as Country,
mean(Stock) format=4.2 as Stock,
mean(Analyst) format=4.2 as Analyst,
mean(Broker) format=4.2 as Broker,
mean(Obs) format=4.2 as Obs,
mean(mean_TP2P) format=4.2 as mean_TP2P
from pre_want
order by 1
;
quit;
@Reezaa little bit. If you have a solution using SQL I would appreciate!
Check this:
data have;
infile datalines dlm=',';
input Stock :$8. Country :$8. Analyst :$8. Broker :$8. Date :$8.TP2P;
datalines;
BCE,CA,Marc,RBC,06/30/2015,0.24
BCE,CA,Marc,RBC,01/17/2017,0.32
ATCO,CA,Marc,RBC,05/23/2017,-0.15
Inditex,ES,Jorge,CAJA,01/16/2018,0.03
IBM,US,Jon,BNY,05/30/2018,0.09
IBM,US,Jon,BNY,10/25/2016,0.01
APPLE,US,Julia,BNY,07/21/2017,0.09
APPLE,US,Sarah,UBS,07/30/2018,0.10
;
run;
proc sql;
create table pre_want as
select monotonic() as row_no,
Country,
count(distinct stock) format=4.2 as Stock,
count(distinct Analyst) format=4.2 as Analyst,
count(distinct Broker) format=4.2 as Broker,
count(*) format=4.2 as Obs,
mean(TP2P) format=4.2 as mean_TP2P
from have
group by Country
;
quit;
proc sql;
create table want(drop=row_no) as
select * from pre_want
union
select monotonic()+10 as row_number,
'TOTAL' as Country,
sum(Stock) format=4.2 as Stock,
sum(Analyst) format=4.2 as Analyst,
sum(Broker) format=4.2 as Broker,
sum(Obs) format=4.2as Obs,
. format=4.2 as mean_TP2P
from pre_want
union
select monotonic()+100 as row_number,
'Mean' as Country,
mean(Stock) format=4.2 as Stock,
mean(Analyst) format=4.2 as Analyst,
mean(Broker) format=4.2 as Broker,
mean(Obs) format=4.2 as Obs,
mean(mean_TP2P) format=4.2 as mean_TP2P
from pre_want
order by 1
;
quit;
Hi Reeza,
I thought it would be that easy as you mentioned but unfortunately it doesn't work 😞
Here the code:
proc sql;
create table MyData.PRE_Table1 as
select monotonic() as row_no,
year format=BESTw. length=8,
count(distinct RIC) format=8.0 as RIC,
count(distinct ANCODE) format=8.0 as ANCODE,
count(distinct BROKERNAM) format=8.0 as BROKERNAM,
count(*) format=8.0 as Obs,
mean(TP2P) format=8.2 as mean_TP2P
from MyData.SAMP_ESTIMATOR
group by year
;
quit;
proc sql;
create table MyData.Tabelle_1(drop=row_no) as
select * from MyData.PRE_Table1
union
select monotonic()+10 as row_number,
'MEAN' as year,
sum(RIC) format=8.0 as RIC,
sum(ANCODE) format=8.0 as ANCODE,
sum(BROKERNAM) format=8.0 as BROKERNAM,
sum(Obs) format=8.0 as Obs,
. format=8.2 as mean_TP2P
from MyData.PRE_Table1
union
select monotonic()+100 as row_number,
'MEAN' as year,
mean(RIC) format=8.0 as RIC,
mean(ANCODE) format=8.0 as ANCODE,
mean(BROKERNAM) format=8.0 as BROKERNAM,
mean(Obs) format=8.0 as Obs,
mean(mean_TP2P) format=8.2 as mean_TP2P
from MyData.PRE_Table1
order by 1
;
quit;
The first table is created without any problems. But then in the second proc I get this error:
Sorry, I guess I do not know exactly how to change it in my code.
I changed it like this:
proc sql;
create table JORGE.PRE_Table1 as
select monotonic() as row_no,
year (Put(year, 4.)),
count(distinct RIC) format=8.0 as RIC,
count(distinct ANCODE) format=8.0 as ANCODE,
count(distinct BROKERNAM) format=8.0 as BROKERNAM,
count(*) format=8.0 as Obs,
mean(TP2P) format=8.2 as mean_TP2P
from JORGE.SAMP_ESTIMATOR
group by year
;
quit;
proc sql;
create table JORGE.Tabelle_1(drop=row_no) as
select * from JORGE.PRE_Table1
union
select monotonic()+10 as row_number,
'TOTAL' as year,
sum(RIC) format=8.0 as RIC,
sum(ANCODE) format=8.0 as ANCODE,
sum(BROKERNAM) format=8.0 as BROKERNAM,
sum(Obs) format=8.0 as Obs,
. format=8.2 as mean_TP2P
from JORGE.PRE_Table1
union
select monotonic()+100 as row_number,
'MEAN' as year,
mean(RIC) format=8.0 as RIC,
mean(ANCODE) format=8.0 as ANCODE,
mean(BROKERNAM) format=8.0 as BROKERNAM,
mean(Obs) format=8.0 as Obs,
mean(mean_TP2P) format=8.2 as mean_TP2P
from JORGE.PRE_Table1
order by 1
;
quit;
...and I get now the following error:
YEAR() function argument to be numeric. Remove put() function inside year()
year (Put(year, 4.)) ---> year(year)
Perform the calculations on the right grouped columns
Hi SuryaKiran,
Unfortunately it doesn't work. Now I have a new column called "_TEMA007" instead of "year". Furthermore, the years are all set now to "1995".
my code:
proc sql;
create table JORGE.PRE_Table2 as
select monotonic() as row_no,
year(year),
count(distinct RIC) format=8.0 as RIC,
count(distinct ANCODE) format=8.0 as ANCODE,
count(distinct BROKERNAM) format=8.0 as BROKERNAM,
count(distinct COMPLOC) fortmat=8.0 as COMPLOC,
count(*) format=8.0 as Obs,
mean(TP2P) format=8.2 as mean_TP2P
from JORGE.SAMP_ESTIMATOR
group by year
;
quit;
Even If I a run the whole code the problems is still there:
Do you have year column in JORGE.SAMP_ESTIMATOR dataset as numeric? then use put(year,4.) as year
When your appending two tables using UNION then you need to have same data types for same columns. In your case my guess is your first query has year numeric and the next queries are character.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.