Merge

Accepted Solution Solved
Reply
Contributor
Posts: 53
Accepted Solution

Merge

Hi,

 

I have excel file with three sheets that related to each other by ID_key and I want to merge them in one SAS data set file. First, I imported them to sas as three data sets, then merge them by ID_key. However, I've found some of the ID_Key are duplicated, so i need either to change any duplication of ID_key in all three files to a new number or write code recognise the different between them based on the othe two variables (kroki and BrandID). below how is the files look like. I do not know how to do programing so any help would be appreciated and thanks in advance.

 

sheet1:

 

kroki          ID_key    type    BranID
9325194    39179    low      32
9322912    39179    high     31
9320769    39179    low      34
8645781    39179    low      33

 

sheet2:

 

kroki          ID_key   BranID
8645781    39179    33
8645781    39179    33
9320769    39179    34
9320769    39179    34
9322912    39179    31
9325194    39179    32
9325194    39179    32

 

sheet3:

 

kroki           ID_key    BranID
8645781     39179    33
8645781     39179    33
9320769     39179    34
9320769     39179    34
9322912     39179    31
9322912     39179    31
9325194     39179    32
9325194     39179    32

 

 

Best Regrds,

Samnan


Accepted Solutions
Solution
‎01-13-2016 02:40 PM
Super User
Super User
Posts: 7,955

Re: Merge

Create a unique key out of the three elements, for example:

data sheet1;
  set sheet1;
  unique_id=cats(put(kroki,7.),put(id_key,5.),put(branid,2.));
run;

Do that for each sheet, then merge based on unique_id variable.

View solution in original post


All Replies
Solution
‎01-13-2016 02:40 PM
Super User
Super User
Posts: 7,955

Re: Merge

Create a unique key out of the three elements, for example:

data sheet1;
  set sheet1;
  unique_id=cats(put(kroki,7.),put(id_key,5.),put(branid,2.));
run;

Do that for each sheet, then merge based on unique_id variable.

Contributor
Posts: 53

Re: Merge

dear RW9,

 

Super User
Posts: 19,815

Re: Merge

Why create a unique ID instead of merging with 3 by variables?

Contributor
Posts: 53

Re: Merge

hi @Reeza

 

because there are multiple ID_key values

Super User
Posts: 19,815

Re: Merge

@samnan I don't  think so. Could you provide an example of your data where you wouldn't get the same results if you used the 3 variables in your BY statement instead of creating a unique ID?

Contributor
Posts: 53

Re: Merge

dear @Reeza

 

now it gave same result when used the 3 variables in your BY statement, I the beginning i just used only one ID_key.

 

thanks for your help.

Trusted Advisor
Posts: 1,117

Re: Merge

How do you want the result dataset to look like? For example, there are two observations with kroki=8645781, ID_key=39179 and BranID=33 in both sheet2 and sheet3. Shall the dataset resulting from your MERGE step contain two, four or only one observation for this combination of kroki, ID_key and BranID?

 

In any case, I assume you want to have type='low' for these observations.

Contributor
Posts: 53

Re: Merge

Posted in reply to FreelanceReinhard

Dear FreelanceReinhard;

 

what i want is that for one unique value in the first sheet there may one or more values in the second and third sheet. the base for me is the ID_key. so, thanks for your help i got the first code run good

Trusted Advisor
Posts: 1,117

Re: Merge

Hello @samnan,

 

Ok, if you feel confident performing merges with "more than one data set with repeats of BY values", that's fine. That is, if for a particular value of unique_id there are m observations in sheet2 and n in sheet3 (and one in sheet1), your intention is to have max(m, n) such observations in your result dataset.

Contributor
Posts: 53

Re: Merge

Posted in reply to FreelanceReinhard

Hi FreelanceReinhard;

 

it is max, let is make it clear, it is crash data, so the first sheet is the crash info., the second is the vehicle info. (which may have more than one vehicle involved in a crash in this case we have ID_key) and the last one is occupant info. (which also may have more than one occupant involved in a crash). I hope it is now clear.

 

the issue now how to stop merge from repeating value when ocuupants more than vehicles (e.g. two vehicles and four occupants)

 

 thanks again

Contributor
Posts: 53

Re: Merge

sorry, 'it is not max'
Trusted Advisor
Posts: 1,117

Re: Merge

Hi @samnan,

 

Thanks for the clarification. So, the question is, what type of result dataset do you want:

  • one observation per crash (and perhaps aggregated information about vehicles and occupants such as "number of vehicles involved")?
  • (in general) more than one observation per crash (and possibly individual information about vehicles and occupants)?

Apparently, either option would require more specifications.

Contributor
Posts: 53

Re: Merge

Posted in reply to FreelanceReinhard

Hi @FreelanceReinhard

 

You are right, exactly that what i am looking for,

i should do both,

1- count number of vehicles and occupats per crash, but what i think i should do first is to merge all of them then using (sql count) i can count them. 

2- i need all info. (three sheets) in one raw to do regression analysis for each crash.

Trusted Advisor
Posts: 1,117

Re: Merge

What would be the dependent and potential independent variables in your regression model?

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 21 replies
  • 609 views
  • 0 likes
  • 5 in conversation