BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
JKCho
Pyrite | Level 9

Hello!

 

I am a new SAS learner and particularly having difficulties in matching multiple datasets. Many classes have taught me various regression analysis but mana trimming...

 

What I want to do is...

I have two data files. there is a variable in common so that I can match these two files by this variable. However, my date files are panel data. 

 

For example,

Data1                                                  Data2

ID             Date(Daily)                         ID           Date(Monthly)

1000         09/09/2000                        1000      08/10/2000

1000         09/10/2000                        1000      09/10/2000

1000         09/11/2000                         2000        as same as others...

and so on.... for ID 1000                     and so on....

2000         09/01/2000

2000         same as above    

2000         same as above 

and so on.... for ID 2000

3000

and so on....

.

.

 

What I want to have is... to match by ID as below.

 

MatchedData

ID             Date(Daily)         Date(Monthly)   

1000         09/09/2000         .(missing data)

1000         09/10/2000          09/10/2000

1000         09/11/2000           .(missing data)

2000       Like the above...

3000

 

When I used many-to-many merge, this code created a huge data whose date numbers are Date(daily) multiplied by Date(Monthly).

This is not a thing I wish to have but I merely want to know the code for matching.

To say simple, I want to match both ID and date. What is demanding is there might be no available matching ID, date or both. For example, there is ID 1111 in data1 but not in data2, or ID 1111 with date 09/01/2000 but not exactly the same in data2. At this stage, I rather let them be; all unmatch rows are retained.

 

Please please PLEASE share your wisdom!! I know it is kind of easy but for me so demanding...

 

Also... if you are generous enough, please help me once again.

I have other data that are time series and want to combine these by 'year'.

One is yearly data and another is Monthly.

What I want to have is...

 

Matched

Variable1(from Monthly) Variable2(from yearly)    Monthly           Yearly

0000                                aaaa                              10/01/2000      2000

0000                                aaaa                              11/01/2000      2000

0000                                aaaa                              12/01/2000      2000

0000                                bbbb                              01/01/2001      2001

 

Since Variable2 are from Yearly data, there should be many duplicates in Matched and that is what I want. As I need to match by using different variables I do not even know how to start.

 

I appreciate your time on my questions and thank you for your help very much!!!!

1 ACCEPTED SOLUTION

Accepted Solutions
ChrisNZ
Tourmaline | Level 20

Welcome the the SAS community!

 

1. Please provide your data in the form of a data step (and check that the pasted code runs properly) when asking questions, like this:

data DATA_D ;
  input ID DATE : ddmmyy10.; 
  format DATE ddmmyy10.;
cards;
1000 09/09/2000                     
1000 09/10/2000                     
1000 09/11/2000                     
run;

data DATA_M;
  input ID DATE ddmmyy10.; 
  format DATE ddmmyy10.;
cards; 
1000 08/10/2000
1000 09/10/2000
run;

2. This should do what you need:

data WANT;
  merge DATA_D (in=D)
        DATA_M (in=M);
  by ID DATE;
  if D then DATE_D=DATE;
  if M then DATE_M=DATE;
  drop DATE;
  format DATE_D DATE_M ddmmyy10.;
run;

proc print noobs; run;
ID DATE_D DATE_M
1000 09/09/2000 .
1000 . 08/10/2000
1000 09/10/2000 09/10/2000
1000 09/11/2000 .

 

 

 

 

 

View solution in original post

3 REPLIES 3
ChrisNZ
Tourmaline | Level 20

Welcome the the SAS community!

 

1. Please provide your data in the form of a data step (and check that the pasted code runs properly) when asking questions, like this:

data DATA_D ;
  input ID DATE : ddmmyy10.; 
  format DATE ddmmyy10.;
cards;
1000 09/09/2000                     
1000 09/10/2000                     
1000 09/11/2000                     
run;

data DATA_M;
  input ID DATE ddmmyy10.; 
  format DATE ddmmyy10.;
cards; 
1000 08/10/2000
1000 09/10/2000
run;

2. This should do what you need:

data WANT;
  merge DATA_D (in=D)
        DATA_M (in=M);
  by ID DATE;
  if D then DATE_D=DATE;
  if M then DATE_M=DATE;
  drop DATE;
  format DATE_D DATE_M ddmmyy10.;
run;

proc print noobs; run;
ID DATE_D DATE_M
1000 09/09/2000 .
1000 . 08/10/2000
1000 09/10/2000 09/10/2000
1000 09/11/2000 .

 

 

 

 

 

Reeza
Super User

Please please PLEASE share your wisdom!! I know it is kind of easy but for me so demanding...

The quickest way to get help, is to put in some effort and then show what you've tried. That's usually more than enough to get a decent answer. 

 

JKCho
Pyrite | Level 9

well.. I am actually working on a massive data set but I am too clumsy. I have utilized my codes on the internet but all was not the answer...

maybe it is better to post a new post and attach my code. 

 

Thank you:)

 

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 1134 views
  • 1 like
  • 3 in conversation