How to merge data with restrictions

Reply
Occasional Contributor
Posts: 13

How to merge data with restrictions

[ Edited ]

Hello everyone,

 

I am currently working on a creating master file for further SAS Analysis. For the most part, all 4 files are formatted to produce the master file I desire but I cannot seem to bring the files together without losing data.

 

For starters file 1 is the base and the other 3 files will add to it. The relevant portion of dataset 1 looks like so: 

ID From To

1   0       0.5

1   0.5    1

2   0       1

3   0.3    0.7

 

In dataset 2 is as such:

IDD Off Count

1     0.2  10

1     0.7  5

2     0.5  8

3     0.3  20

 

Where IDD is the same as ID and Off in between From and To

 

3/4 are similar in concept and look like so:

3)

ID FROM TO Count2

1   0          1     1.65

2   0          1      2.0

3   0.3       0.7   1.56

 

4)

ID FROM TO Count3

1   0          1     18547

2   0          1      23506

3   0.3       0.7   6582

 

Where Count2 and Count are not related. I have tried the following code trying to merge 1 and 2 but the resultant (5) results in huge quantities of sections being lost (3000) observations to (700) observations.

proc sql;
      create table my.5 as
      select *
      from  my.1 as 1, my.2 as 2
where 1.ID = 2.IDD and 1.From<=2.Off and 1.To >= 2.Off;
quit;

Overall I wish the master file to look like:

 

ID From To Count Count2   Count3

1   0       0.5   10      1.65     18547

1   0.5    1       5       1.65     18457

2   0       1       8        2.0       23506

3   0.3    0.7    20     1.56      6582

 

Any suggestions?

Super User
Posts: 13,498

Re: How to merge data with restrictions

1) 1 , 2, 3 etc. are not valid SAS data set names OR sql alias

2) Select * doesn't rename any of the variables so you will have issues with multiple From, to and count2 clashing.

 

I would suggest adding things one at a time until you become more familiar with nesting queries and getting the variables renamed as needed.

 

See if this works for combining the first two sets:

proc sql;
   create table first as 
   select a.id,a.from, a.to, b.count
   from data1 as a
        left join 
        data2 as b
        on a.id=b.id
        and a.from le b.off le a.to
   ;
run;

If that works then use the resulting First as the A set in the next block of code.

 

You will have to explain the rules involved with deciding that both (0,0.5) and (0.5,1) (in the first set) matches to (0,1) but (0,1) in the first set does not match (0,1) in the third or fourth set.

Occasional Contributor
Posts: 13

Re: How to merge data with restrictions

Hello ballardw, thank you very much for your reply and interest in this matter.

 

1) Yes although I call them 1,2,3 and 4 those are not the original file names, the ones I use work no problem. But to ease confusion let us refer to them as sets: A, B, C, D and E(final).

 

2) In D I made a mistake I meant to indicate is as Count3

 

I am working on merging the files one at a time and not all at once.

 

I will try the code.

 

Frist in A,C and D ID/IDD is an identifier of a unique portion of a road. That portion is then defined by a continuous length identifier  FROM and TO.

 

In A there are two of the same IDs because another variable(not shown) is different the requiring to split it into 2 sections.

 

In B only the IDD (where IDD=ID) and at some Off there is a point of interest. Thus the first line in B will only fall into the gap between 0 and 0.5 and not the second and vice versa for the second observation.

 

 

Apart from the naming mistake, C and D are the same sections but the Count3 is the only observation and thus there are no splits.

 

Ask a Question
Discussion stats
  • 2 replies
  • 162 views
  • 0 likes
  • 2 in conversation