BookmarkSubscribeRSS Feed
Boa
Obsidian | Level 7 Boa
Obsidian | Level 7
 

Datasets with 5k records. i have to find all workers in the 3rd quartile of the amount they earned  And keep only lastname,total earning and their ID.

 

This is the raw data of both dataset

Earning data set, there are spendingid, customerid, totalprice 

Earning_id                  Worker_id                 totalearning            

  1212112                         12311111                        1900                      

  989898                           9908009                        2980                   

  3i31030                           3376247                          20000                    

  233233                            3332323                          6000

Worker dataset, there are customerid and firstname

  Worker_id                 lastname

  12311111                        Ellen

  9908009                         JOhn

  3376247                         Jay

 

 My code:

 

 data work.merged;

 set assign.earnings assign.workers 

 keep worker_id totalearning lastname;

 run;

  

proc print data=work.merged;

run;

  

Q) So basically i need to merge the two dataset first because the lastname does not contain in the earnings dataset and then showing the Q3 of those customer earnings.

Result:

worker_id  totalearnings lastname

121112       1900      

989898      2980

6 REPLIES 6
RW9
Diamond | Level 26 RW9
Diamond | Level 26

I don;t see any part in your code where you merge the data, the first datastep "sets" the two dataset one under the other - thats not merging.  Try:

proc sql;
  create table MERGED as
  select A.*,
         B.LASTNAME
  from   EARNING A
  left join WORKER B
  on     A.WORKER_ID=B.WORKER_ID;
quit;

This will add lastname onto your data, you could of course do the same in datastep with the merge rather than set statement.

 

 

Boa
Obsidian | Level 7 Boa
Obsidian | Level 7

data work.merged;

 set assign.earnings assign.workers 

 keep worker_id totalearning lastname;

 run;

 

// this is not merged? cause i setting both datasets together as one?

How do i show the data from Q3 onwards, i have 5k records so probably 4k records onwards, i would like to display. Thanks

RW9
Diamond | Level 26 RW9
Diamond | Level 26

"// this is not merged? cause i setting both datasets together as one?"

No, its not merged it is appeneded.

Set - this is the the term used in SAS to mean to append one dataset underneath the previous dataset.  So workers dataset starts appearing after the end of earnings dataset.

 

Merge - this is the term used in SAS to mean the combination of matches on the given by variables.  So records from workers get merged to records of earnings based on the by groups given.

 

"How do i show the data from Q3 onwards, i have 5k records so probably 4k records onwards, i would like to display. Thanks"

I do not know what you mean by this.  There is no mention in any of your provided test examples that show a quarter.  You have simply provided one dataset with employees and earnings and another with last names, which merge together based on worker_id.  I seen nothing about quarter, or in fact any date or information to that effect.  Please clarify your question, providing exact test data, in the form of datasteps, and what you expect to see out.

Boa
Obsidian | Level 7 Boa
Obsidian | Level 7

thanks for explaining the differences. So, after proc sql i did a proc rank which creates another variable to indicate the quartile range for each worker. for example, 1,2,3. My question is to display worker that falls under value 3 and keep only the 3 columns i mentioned above. Any advise on continuing the code?

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Well, from your text there:

data want;
  set data_from_rank (where=(rank_value=3) keep=worker_id totalearnings lastname);
run;

That assumes of course that you have merged on lastname to your data at some point previously.  But I still think your problem is the set part.  What it will do is put the data from table 1 and then the data from table 2, so last name will always be blank for the data your interested in:

Earning_id                  Worker_id                 totalearning               Lastname            

  1212112                         12311111                        1900                      

  989898                           9908009                        2980                   

  3i31030                           3376247                          20000                    

  233233                            3332323                          6000

/* start of second dataset */

                                           12311111                                                 Ellen

                                            9908009                                                 JOhn

                                            3376247                                                 Jay

 

ballardw
Super User

"Merge" implies that records are combined such as matching an customer address from one dataset to an amount billed from another on an account number, you have appended data sets. There is a MERGE statement one would expect in a datastep if merging is going on, not set.

 

Your data description is incomplete and may require multiple steps. First is the data that you have one record for the period of interest with the total or is it multiple records per worker? If multiple records then you need to combine them to get the total per worker.

Or does you data need to have earning information matched to the worker identifcation first?

 

It helps to provide some examples of what your data actually looks like. Best would be to use the instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... to create some datasteps of your data (or a subset) or dummy data that looks like your data but does not contain any sensitive information. You only need to include the variables needed.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 1138 views
  • 1 like
  • 3 in conversation