I have two datasets. One of the dataset contains 10 ID number for each observation (a total of 1000 observation). Hence the dataset is a 1000 by 10 matrix. Each ID number is an ID number for a location that contains a set of information, such as housing price, school quality, share of green space. It looks like:
c1 c2 c3 c4 ... c10 (columes are 10 IDs corresponding to each observations)
obs1 91 72 4 87 54
obs2 34 76 95 8 100
:
:
obs1000 1 3 90 78 64
Another dataset has all the information for all the locations (ID is the indicator which location it is). The dataset looks like:
housing price school quality index share of greenspace
id1 1000,000,000 2 50%
id2
:
:
id100 40,000,000 5 15%
I would like to construct a unified dataset by combining information from the two datasets. It will look like:
housing price school quaility index share of greenspace | housing price school quaility index share of greenspace
obs1 info for location with ID91 info for location with ID72
obs2 info for location with ID34 info for location with ID76
:
:
obs100
Therefore the final dataset is a 100 (number of observation) by 10 (10 ID for each observation )*3 (number of variables correspond to each ID) matrix.
I am having trouble to find a way to achieve the final goal. Any help will be much appreciated!
Thank you!!
Not sure what you are trying to do regarding the random selection part. Since I've never liked doing double transposes (which is what you have to do to get from wide to wider), I always do that kind of transposition with the macro you can download for free at:
http://www.sascommunity.org/wiki/A_Better_Way_to_Flip_(Transpose)_a_SAS_Dataset
That said, while I only tested this on a 5 id case (like you presented data for in your example), the following appears to do what you want:
data obs; input obs id1-id5; cards; 1 91 72 4 87 54 2 34 76 95 8 100 1000 1 3 90 78 64 ; proc transpose data=obs out=tobs (rename=(col1=id)); var id1-id5; by obs; run; data tobs (drop=_:); set tobs; pref=input(substr(_name_,3),8.); run; data locs; input id housing_price school_quality_index share_of_greenspace; cards; 91 1000000000 2 50 4 150000000 1 35 76 40000000 5 15 ; proc sql; create table need as select * from tobs a left join locs b on a.id eq b.id order by obs,pref ; quit; %transpose(data=need, out=want, by=obs, id=pref, guessingrows=1000, delimiter=_, var=housing_price school_quality_index share_of_greenspace)
HTH,
Art, CEO, AnalystFinder.com
Transpose dataset1 to a long format and then join it to the second dataset. You can then make it wide again with a transpose, but the long format is likely to be more useful.
Use PROC TRANSPOSE to flip the datasets.
After transpose data1 from wide to long format, each column has 10 IDs an observation chooses. The 10 IDs are randomly chosen from 1 to 100 IDs in the second datasets.
So if I want to merge the first dataset with the second dataset, I need to do it for each column.
That might work.
Actually, sounds like you want a 1000 X 10 X 4 matrix .. unless you don't need to keep track of the IDs in which case it would be a 1000 X 10 X3 matrix. Are you sure that's what you want? Do you need to keep track of the IDs?
Art, CEO, AnalystFinder.com
I don't think I need to track the IDs.
As long as the variables for each ID enter into the corresponding columes for each observation in the final dataset, It is OK to not have IDs.
So the first three colums in the final dataset are the housing price, school quality, and share of greenspace for ID 92, because that is the first choice for observation 1, The 3-6 columns are housing price, school quality, and share of green space for ID 72 (the second choice for observation 1). The last three columns will be housing price, school quality, and share of green space for ID 54 for observation 1.
Not sure what you are trying to do regarding the random selection part. Since I've never liked doing double transposes (which is what you have to do to get from wide to wider), I always do that kind of transposition with the macro you can download for free at:
http://www.sascommunity.org/wiki/A_Better_Way_to_Flip_(Transpose)_a_SAS_Dataset
That said, while I only tested this on a 5 id case (like you presented data for in your example), the following appears to do what you want:
data obs; input obs id1-id5; cards; 1 91 72 4 87 54 2 34 76 95 8 100 1000 1 3 90 78 64 ; proc transpose data=obs out=tobs (rename=(col1=id)); var id1-id5; by obs; run; data tobs (drop=_:); set tobs; pref=input(substr(_name_,3),8.); run; data locs; input id housing_price school_quality_index share_of_greenspace; cards; 91 1000000000 2 50 4 150000000 1 35 76 40000000 5 15 ; proc sql; create table need as select * from tobs a left join locs b on a.id eq b.id order by obs,pref ; quit; %transpose(data=need, out=want, by=obs, id=pref, guessingrows=1000, delimiter=_, var=housing_price school_quality_index share_of_greenspace)
HTH,
Art, CEO, AnalystFinder.com
Thank you very much Art! I will try it out and see how it works out!!
Thank you Art. Your method worked.
I really appreciate your help!!!
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.