BookmarkSubscribeRSS Feed
ssas
Calcite | Level 5
Hi Can any one help,
I got 2 queries.
1. I have table like the following.
I would like to find out the frequency of (Same) records with in the table with respect to their lname postcode fname.

Table- A1
fname lname pcode
Amer George Dd4 5st
Paul George Dd4 5st
Amer collin Dd4 5st
Amer George Dd5 6st
AJ BENNISON ME20 6JE
G B BOWEN ME5 8QR
L BRYANS ME7 1RB
M CAIN ME7 4EH
P CHAMBERS ME19 5QB
J CHAMPION ME3 9RE
Amer George Dd4 5st
FW CORDIER ME4 4XJ
M COUSENS ME8 9QS
L COUSIN ME5 7JY
Amer collin Dd4 5st
H DAVIES RH12 1PX
P DEEPROSE ME7 5QG
Amer George Dd5 6st
D GOVIER ME7 2UX
JACQ GRAY ME8 9QZ
JK GRIFFIN CT7 9RX
J GRIFFIN ME2 2SY

output should like:

fname lname pcode no. of times
alan john me4 5sd 2
ban dadkds mf3 7ds 1
dad fdsfd bg3 5fd 4




2. I have the 30 tables ( a1 to a30) like above A1 and I want to find out the how many records are in common, which records in which table and howmany timeswith respect to their lname, pcode, fname.
Output should look like

Fname Lname Pcode Freq. Tables
Amer George Dd4 5st 2 A5,a10
Paul George Dd4 5st 4 A3,a6,a9,a15
Amer collin Dd4 5st 1 A25
Amer George Dd5 6st 3 A17,a22,a26


Thanks in advance,
sams
8 REPLIES 8
deleted_user
Not applicable
1[pre]
PROC summary nway data= A1 ;
class fname lname pcode ;
output ;
run;
proc print ;
run;
[/pre]
will create the summary counts for "A1" in a default work data set.

2
You would want to loop through all your tables.
Assuming that in your libname SAMS, these tables are the only ones with names beginning "ABC", then you can use a dictionary table to generate the code above for each ABCxxx table. (see below).
Collecting all these together is (imho) best done with proc append as each summary is created, but first needs the table name to be added to results[pre]
data ;
length table $32;
retain table "ABC1" ;
set ; * picking up latest results;
run;
* then collecting results is just code like;
proc append base= collection ;
run ;[/pre]

Generating this code for every ABCxx data set in SAMS library needs a loop which can be provided in a data step reading dictionary.tables through view SASHELP.vTable and using CALL EXECUTE() to push the code to run.
Here is that packaging of the code generation[pre]* first ensure the "collection" is empty ;
proc delete data= collection;
run;
* now loop through the data sets ;
data _null_ ;
set sashelp.vtable ; * look through _ALL_ datasets known to SAS session;
where libname = 'SAMS' ; * use only those in libname SAMS ;
where also memname =: 'ABC' ; * and only tables beginning ABC ;
* now generate summary code ;
call execute( 'proc summary nway missing data=SAMS.' !! memname );
call execute( ' ; class fname lname pcode ; output ; run;' );
* now add table name to results ;
call execute( 'data ; length table $32; retain table ' );
call execute( quote( trim( memname )) );
call execute( ' ; set ; run; ' ) ;
* now append to collection ;
call execute( 'proc append base= collection; run; ' );
run;[/pre]
After that has run, you will have a dataset called collection holding the counts you need in column _FREQ_ .
I think PROC TABULATE provides the neatest way to show the matrix with column for each table and a row for each combination of the class variables Fname, lName, and pCode.
To handle the multiple columns, use ODS destination html or if excel is suitable, use ods destination tagsets.excelxp. For simplicity, here is the html[pre]ods html file='column combination analysis.html' ;
ods listing close ;
proc tabulate data= collection missing noseps ;
class table fname lname pcode ;
var _freq_ ;
table ( all fname*lname*pcode all ), _freq_=' '*sum=' '*f= comma8.
* ( all = 'all tables' table=' ' all = 'all tables' )
/ rts= 30 ;
run;
ods _all_ close ;[/pre]
Of course ODS provides many ways of improving the presentation, but I think nothing is as convenient as PROC TABULATE for preparing this content.

PeterC
ssas
Calcite | Level 5
Thanks for your reply,
For easy understanding i mentioned a1 to a30, but they have different names.
Each table represents sitename.
could you please give detailed code or explanation.
deleted_user
Not applicable
An SQL based way of doing this would be:

proc sql;
create table want
as
select lname, postcode, fname, sum(london) as london, sum(paris) as paris, sum(ny) as ny
from
(
select lname, postcode, fname, count(*) as london, 0 as paris, 0 as ny
from london
group by lname, postcode, fname
union
select lname, postcode, fname, 0 as london, count(*) as paris, 0 as ny
from paris
group by lname, postcode, fname
union
select lname, postcode, fname, 0 as london, 0 as paris, count(*) as ny
from ny
group by lname, postcode, fname
)
group by lname, postcode, fname
;
quit;

This won't be especially fast, but is easy to code and, I hope, intuitive to understand.

ProcMe ProcMe: typo, missed the last from


Message was edited by: ProcMe
ssas
Calcite | Level 5
Hi Proc Me,
I have submited the following code
proc sql;
create table brookfr
as
select lname, pcode, fname,sum(brfr.beatles1) as bbeat,sum(brfr.bmalone1) as bmalone,
sum(brfr.bombedout1) as bombout,sum(brfr.confusions1) as confuse,sum(brfr.daveorch1)as dave
from
(
select lname,pcode,fname,count(*) as bbeat,0 as bmalone, 0 as bombout,0 as confuse,0 as dave
from brfr.bbeatles1
group by lname, pcode, fname
union
select lname,pcode,fname,0 as bbeat,count(*) as bmalone,0 as bombout,0 as confuse,0 as dave
from brfr.bmalone1
group by lname, pcode, fname
union
select lname,pcode,fname,0 as bbeat,0 as bmalone, count(*) as bombout,0 as confuse,0 as dave
from brfr.bombedout1
group by lname,pcode,fname
union
select lname,pcode,fname,0 as bbeat,0 as bmalone,0 as bombout,count(*) as confuse,0 as dave
from brfr.confusions1
group by lname,pcode,fname
union
select lname,pcode,fname,0 as bbeat,0 as bmalone,0 as bombedout,0 as confuse,count(*) as dave
from brfr.daveorch1
group by lname,pcode,fname
)
group by lname,pcode,fname
;
quit;

AND I AM GETTING THE FOLLOWING ERROR

look_ds: what's on FROM clause?
ERROR: Unresolved reference to table/correlation name brfr.

look_ds: what's on FROM clause?
ERROR: Unresolved reference to table/correlation name brfr.

look_ds: what's on FROM clause?
ERROR: Unresolved reference to table/correlation name brfr.

look_ds: what's on FROM clause?
ERROR: Unresolved reference to table/correlation name brfr.

look_ds: what's on FROM clause?
ERROR: Unresolved reference to table/correlation name brfr.
ERROR: The SUM summary function requires a numeric argument.
ERROR: The SUM summary function requires a numeric argument.
ERROR: The SUM summary function requires a numeric argument.
ERROR: The SUM summary function requires a numeric argument.
ERROR: The SUM summary function requires a numeric argument.
LinusH
Tourmaline | Level 20
In the surrounding query (where you have sum(bfr.xxxxx) you cannot refer to bfr, the column will refer to the result of the inner query, which does not cary any bale name, or alias. So remove bfr. in the outer query, or add an alias bfr after the from (....) .

/Linus
Data never sleeps
ssas
Calcite | Level 5
Hi Mates,

Thanks for your help. I got the required output except the total.
i mean total of 0 and 1's of the every record.
for ex:
fname lname pcode total a1 a2 a3 a4....

jhon alex dg3 8fs 2 0 1 0 1 0 0 0

If it also resolved i will be very happy.

Thanks & Regards,
sams
deleted_user
Not applicable
have I understood this
You require:
{class,combinations}, total freq, {freq, in each table}
Adapting the call execute() approach with
1 use a table of table-names to be investigated
2 build summary for each table across the class combinations
3 for each class combination: merge into a "collection" table of results, the freq of the class-combination in that original table, and add its _freq_ to the overall total freq.

Here is some code which also generates test data[pre]** first build the list of data sets ;
data these_data_sets ;
informat memname $upcase32. ;
input memname ;
datalines ;
london
paris
new_York
;[/pre]and as test data for demo, create these datasets in SAMS which has the same path as WORK;[pre]libname sams (work) ;
data london;
input fname $ lname $ pcode & $ ;
cards ;
Amer George Dd4 5st
Paul George Dd4 5st
Amer collin Dd4 5st
Amer George Dd5 6st
AJ BENNISON ME20 6JE
;
data paris;
input fname $ lname $ pcode & $ ;
cards ;
Amer George Dd4 5st
Amer George Dd4 5st
Paul George Dd4 5st
Amer collin Dd4 5st
Amer George Dd5 6st
AJ BENNISON ME20 6JE
;
data new_York ;
input fname $ lname $ pcode & $ ;
cards ;
Amer collin Dd4 5st
H DAVIES RH12 1PX
P DEEPROSE ME7 5QG
Amer George Dd5 6st
D GOVIER ME7 2UX
;[/pre]
* next ensure the "collection" is empty before merging ;[pre]data collection;
retain fname lname pcode freq ;
length fname lname pcode $32 freq 8 ;
stop ;
run;[/pre]
* Generating summary code for every data set in SAMS library named in table these_data_sets needs a loop reading that table these_data_sets and using CALL EXECUTE() to push the code to run.
Here is that packaging of the code generation. Next loop through the data sets ; [pre]data _null_ ;
set these_data_sets( keep= memname ) ; * look through _ALL_ required datasets ;
* now generate summary code ;
call execute( 'proc summary nway missing data=SAMS.' !! memname );
call execute( ' ; class fname lname pcode ; output out= a; run;' );
* now merge into the collection ;
call execute( ' data collection ;' );
call execute( ' merge collection( in= _base) a( in= newbits ' );
call execute( ' keep= fname lname pcode _freq_ ' );
call execute( ' rename= ( _freq_ =' !! memname !! ' ));' );
call execute( ' by fname lname pcode ; ' );
call execute( ' if newbits then freq =sum( freq, ' !! memname !! ' ); ' );
call execute( ' run ; ' );
run ;[/pre] * finally report results;[pre]proc print n; sum _numeric_; run;[/pre]

I think that produces what is required.
Run it and see how much clearer it becomes when you can read the log with everything in order.
Here is the output I got from testing[pre]The SAS System 20:27 Wednesday, January 14, 2009

Obs fname lname pcode freq LONDON PARIS NEW_YORK

1 AJ BENNISON ME20 6JE 2 1 1 .
2 Amer George Dd4 5st 3 1 2 .
3 Amer George Dd5 6st 3 1 1 1
4 Amer collin Dd4 5st 3 1 1 1
5 D GOVIER ME7 2UX 1 . . 1
6 H DAVIES RH12 1PX 1 . . 1
7 P DEEPROSE ME7 5QG 1 . . 1
8 Paul George Dd4 5st 2 1 1 .
==== ====== ===== ========
16 5 6 5

N = 8[/pre]

PeterC
deleted_user
Not applicable
if you want to buy product I'll be happy to provide

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 907 views
  • 0 likes
  • 3 in conversation