Hi,
Got couple of queries here.
I have 2 datasets - A, B
The variable1 in dataset B has e.g name of students.
THe variable (only 1 column) in dataset A had e.g name of students and their place of birth.
I am confused as to whether merge / join the 2? If I merge, can I rename the variable(in question) in B to match the one in A?
Q1) What I need is a dataset that would have all the info as it is in B but the variable1 should also have the name of places of birth of the students wherever it matches.
Q2) 2 extra columns need to be added to the new dataset - that shows total score of previous year's. and avg total score.
say,
data have; input name $ totalscore year; datalines; John 241 2011 Mary 460 2015
How do I create the total score of the previous year's for that name?
As @Reeza suggested provide sample data sets that clearly describe information contained for a proper solution. Based on information provided seems like you are looking for something like this:
data A;
input Namecountry $20.;
datalines;
abcCanada
defUS
ghqCanada
;
data B;
input Rank name $ score avg_score;
datalines;
1 abc 23 20
2 def 21 23
;
proc sql;
create table want as
select b.*,substr(Namecountry,length(name)+1) as Country from B
left join A on
substr(Namecountry,1,length(name))=name;
quit;
Please provide sample input and expected output data.
In general you can either do a data step merge, sql join, or a format to add the information.
@Reeza thanks..
dataset A: (this was formed by concatenating 2 (name & country) variables from a previous dataset and all other columns were dropped)
Namecountry
dataset B:
Rank# name score avg score
Output: The country from A should be joined to B(name variable). Wherever there is a match.
Note: The labels are different in A & B.
Renaming the KEY label does not work.
What doesn't work? Please post what you've tried.
That isn't really sample data, please include some observations, fake if necessary, that can be used to replicate your issue.
As @Reeza suggested provide sample data sets that clearly describe information contained for a proper solution. Based on information provided seems like you are looking for something like this:
data A;
input Namecountry $20.;
datalines;
abcCanada
defUS
ghqCanada
;
data B;
input Rank name $ score avg_score;
datalines;
1 abc 23 20
2 def 21 23
;
proc sql;
create table want as
select b.*,substr(Namecountry,length(name)+1) as Country from B
left join A on
substr(Namecountry,1,length(name))=name;
quit;
absolutely those are the input lines.
Could you please tell what's the 'b' in your procsql I assume it's my dataset table B.
It worked.
Thanks..
Yes, A and B are table names
But, is there a SAS code to resolve it and not having used procsql.
It depends. Do you have duplicates in any of your tables?
Otherwise use a MERGE
Again, much easier if you post sample data and output to illustrate your issue.
Proc SQL is very powerful for joining and merging datasets, it's worth learning the basics.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.