DATA Step, Macro, Functions and more

Matching variables in 2 datasets

Accepted Solution Solved
Reply
Contributor
Posts: 24
Accepted Solution

Matching variables in 2 datasets

Hi,

 

Got couple of queries here.

I have 2 datasets - A, B

 

The variable1 in dataset B has e.g name of students.

THe variable (only 1 column) in dataset A had e.g name of students and their place of birth.

 

I am confused as to whether merge / join the 2? If I merge, can I rename the variable(in question) in B to match the one in A?

 

Q1)  What I need is a dataset that would have all the info as it is in B but the variable1 should also have the name of places of birth of the students wherever it matches.

 

Q2) 2 extra columns need to be added to the new dataset - that shows total score of previous year's. and avg total score.

 

say,

data have;
   input name $ totalscore year;
   datalines;
John 241 2011
Mary 460 2015

How do I create the total score of the previous year's for that name?


Accepted Solutions
Solution
‎05-01-2016 09:36 PM
Trusted Advisor
Posts: 1,228

Re: Matching variables in 2 datasets

Posted in reply to Antaralib

As @Reeza suggested provide sample data sets that clearly describe information contained for a proper solution. Based on information provided seems like you are looking for something like this:

 

data A;
input Namecountry $20.;
datalines;
abcCanada
defUS
ghqCanada
;

 

data B;
input Rank name $ score avg_score;
datalines;
1 abc 23 20
2 def 21 23
;

 

proc sql;
create table want as
select b.*,substr(Namecountry,length(name)+1) as Country from B
left join A on
substr(Namecountry,1,length(name))=name;
quit;

View solution in original post


All Replies
Super User
Posts: 19,772

Re: Matching variables in 2 datasets

Posted in reply to Antaralib

Please provide sample input and expected output data. 

 

In general you can either do a data step merge, sql join, or a format to add the information. 

 

Contributor
Posts: 24

Re: Matching variables in 2 datasets

@Reeza thanks..

 

dataset A: (this was formed by concatenating 2 (name & country)  variables from a previous dataset and all other columns were dropped)

Namecountry

 

dataset B:

Rank#  name  score avg score

 

 

Output:  The country from A should be joined to B(name variable). Wherever there is a match.

Contributor
Posts: 24

Re: Matching variables in 2 datasets

Posted in reply to Antaralib

Note: The labels are different in A & B.

 

Renaming the KEY label does not work.

Super User
Posts: 19,772

Re: Matching variables in 2 datasets

Posted in reply to Antaralib

What doesn't work? Please post what you've tried. 

 

That isn't really sample data, please include some observations, fake if necessary, that can be used to replicate your issue. 

Solution
‎05-01-2016 09:36 PM
Trusted Advisor
Posts: 1,228

Re: Matching variables in 2 datasets

Posted in reply to Antaralib

As @Reeza suggested provide sample data sets that clearly describe information contained for a proper solution. Based on information provided seems like you are looking for something like this:

 

data A;
input Namecountry $20.;
datalines;
abcCanada
defUS
ghqCanada
;

 

data B;
input Rank name $ score avg_score;
datalines;
1 abc 23 20
2 def 21 23
;

 

proc sql;
create table want as
select b.*,substr(Namecountry,length(name)+1) as Country from B
left join A on
substr(Namecountry,1,length(name))=name;
quit;

Contributor
Posts: 24

Re: Matching variables in 2 datasets

@stat_sas

absolutely those are the input lines.

Could you please tell what's the 'b' in your procsql I assume it's my dataset table B.

 

It worked.

 

Thanks..

 

 

Trusted Advisor
Posts: 1,228

Re: Matching variables in 2 datasets

Posted in reply to Antaralib

Yes, A and B are table names

Contributor
Posts: 24

Re: Matching variables in 2 datasets

But, is there a SAS code to resolve it and not having used procsql.

Super User
Posts: 19,772

Re: Matching variables in 2 datasets

Posted in reply to Antaralib

It depends. Do you have duplicates in any of your tables?

 

Otherwise use a MERGE

 

Again, much easier if you post sample data and output to illustrate your issue. 

 

Proc SQL is very powerful for joining and merging datasets, it's worth learning the basics.

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 9 replies
  • 398 views
  • 2 likes
  • 3 in conversation