Hi SAS experts,
I am trying to know, for a given year&province&school, what’s the most common other school that people enrol, when they enrolled in multiple school in the same time.
A quick example:
What I have:
id | prov | year | school |
1 | 29 | 2008 | 101 |
1 | 29 | 2008 | 102 |
2 | 35 | 2009 | 103 |
2 | 35 | 2009 | 104 |
2 | 35 | 2009 | 105 |
What I want:
id | prov | year | school | school2 |
1 | 29 | 2008 | 101 | 102 |
1 | 29 | 2008 | 102 | 101 |
2 | 35 | 2009 | 103 | 104 |
2 | 35 | 2009 | 103 | 105 |
2 | 35 | 2009 | 104 | 103 |
2 | 35 | 2009 | 104 | 105 |
2 | 35 | 2009 | 105 | 103 |
2 | 35 | 2009 | 105 | 104 |
In the table “want”, there is an additional column “school2” to record the other school that this person enrolled in a given year.
Greatly appreciate for any hints!
Please define "most common" and provide an example that actually uses a "most common". What your current example shows for output is "any other".
Please provide data in the form of a data step if you actually want working code. The "example" data does not lend itself to pasting into an editor and attempting to write code to read it.
Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the </> icon or attached as text to show exactly what you have and that we can test code against.
This computes an adjacency matrix where the vertex on the graph of schools is the count of how many times a student enrolled into school X and school Y in the same year. From this you want to get a table listing vertices that have the largest "popularity-value" from each node/"school" in the graph.
proc sql;
create table want as
select Distinct
t1.school as school1,
t2.school as school2,
count(*) as count
from have t1, have t2
where t1.id=t2.id
and t1.year=t2.year
and t1.school > t2.school
group by t1.school,t2.school
;
quit;
data want;
set want want (rename=(school1=school2 school2=school1));
proc sort;
by school1 descending count;
proc sort nodupkey;
by school1 ;
run;
I once took this free EdX internet course on Excel and data analysis with Felienne Hermans from the Delft University of Technology, Netherlands. She taught her students, using data yanked from IMDB.com, to calculate an actor's Bacon number using similar math to this. It was fun.
Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.
Register today!Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Select SAS Training centers are offering in-person courses. View upcoming courses for: