BookmarkSubscribeRSS Feed
frupaul
Quartz | Level 8

Hi Everyone,

 

I am trying to build a Hierarchy from the below file (source code title: code 1). The first level of the hierarchy should be CEO, and every subsequent level (represented by L2, L3 etc) should be correct.  One way of doing this is

- Loop through all the Ls and identify the first L with a value of CEO.

- The first L with a value of CEO will be Level 1, then the rest of the levels will automatically fall into place with Level 2 being Department Head, Level 3 being Senior manager etc.

- Once Levels are defined as such, these could be appended to give a table like table 2 below;

 

Note: This is extremely simplified. Original table contains about 100k records

 

Screen Shot 2018-07-25 at 22.18.25.png

 

 

 

code 1:

 data Ranking;
INFILE DATALINES DLM=',';
length CustID L1 $20 L2 $20 L3 $20 L4 $20 L5 $20 L6 $20;
input CustID L1 $ L2 $ L3 $ L4 $ L5 $ L6 $;
cards;
1,CEO,Department Head,Senior Manager,Manager, Senior Analyst, Peter
2,Major,Vice President, CEO, Department Head, Senior Manager,Manager
3,Technical Chief, CEO, Department Head, Senior Manager, Manager, Senior Analyst
;

 

The Finished product should look like this:

Screen Shot 2018-07-25 at 22.15.39.png

 

Can anyone think of a scripting that will get this result please?

 

Thanks,

 

 

 

7 REPLIES 7
Reeza
Super User

 

But...if they're all the same except the names, why not just hardcode it? 

How did Major disappear and where did it map to? Does it matter though, if they're all the same?

 

Assuming I'm misunderstanding your problem somehow, some general solutions:

 

1. Recode it so that the values are numerical and in order.

 

1=CEO

2=DepartmentHead

3=SeniorManager

etc.

 

2. Sort using Call SORTC

 

3. Reapply labels of 1=CEO

 

 

Another option is to flip your data to a long format and do this recoding, and a proc sort. You can use PROC TRANSPOSE to go from wide to long and vice versa. 

 

Last option, a whole bunch of if/then statements. 

 


@frupaul wrote:

Hi Everyone,

 

I am trying to build a Hierarchy from the below file (source code title: code 1). The first level of the hierarchy should be CEO, and every subsequent level (represented by L2, L3 etc) should be correct.  One way of doing this is

- Loop through all the Ls and identify the first L with a value of CEO.

- The first L with a value of CEO will be Level 1, then the rest of the levels will automatically fall into place with Level 2 being Department Head, Level 3 being Senior manager etc.

- Once Levels are defined as such, these could be appended to give a table like table 2 below;

 

Note: This is extremely simplified. Original table contains about 100k records

 

Screen Shot 2018-07-25 at 22.18.25.png

 

 

 

code 1:

 data Ranking;
INFILE DATALINES DLM=',';
length CustID L1 $20 L2 $20 L3 $20 L4 $20 L5 $20 L6 $20;
input CustID L1 $ L2 $ L3 $ L4 $ L5 $ L6 $;
cards;
1,CEO,Department Head,Senior Manager,Manager, Senior Analyst, Peter
2,Major,Vice President, CEO, Department Head, Senior Manager,Manager
3,Technical Chief, CEO, Department Head, Senior Manager, Manager, Senior Analyst
;

 

The Finished product should look like this:

Screen Shot 2018-07-25 at 22.15.39.png

 

Can anyone think of a scripting that will get this result please?

 

Thanks,

 

 

 


 

frupaul
Quartz | Level 8

Hi Reeza,

 

Major did not disappear. The task is to loop through L1 and see if it contains a value CEO. If it doesn't, then Loop through L2 for CEO.

 

In the example, custID 1 has CEO as value of L1. So CustID 1's level one value will be CEO. A value of CEO is where we want to start building the hierarchy which goes from level 1 to Level n

 

For CustID 2, neither L1 nor L2 has a value of CEO. However, L3 has a value of CEO. So for CUSTid 2, L3 will be Level 1. And automatically L4 will be level 2, L5 will be level 3. 

 

Same Logic applies for CUSTID3. 

 

Thats why I said first getting the Level 1 (which is where L1 or Ln is = 'CEO) for each CUSTID is a starting point. Once these are identified and labeled correctly then append for each CUSTID. However this is a long approach considering i have just under 100k records.

 

Does this make sense now?

Reeza
Super User
Not to me sadly. Is this working across a row or a column? Is the data as posted reflective of what you want? Does the input align with the output here?
frupaul
Quartz | Level 8

For each custid, we go across the row to find the first level (denoted as L1 to Ln at this stage) with a value of CEO. Once that first level is identified, it is labelled Level 1. So if l1='CEO' for any customer, that is automatically where the hierarchy begins. So it is considered Level 1. If L2='CEO' for another customer, that customer's hierarchy begins from there and is given a name Level 1.

 

That is the long approach which will require first identifying the starting point for each customer, and then labelling the variables level 1 to level n, and then appending to get the final output in the original post

 

If you followed this approach, the input will align with the output perfectly.

 

However, i believe there is a simpler way to do it

ChrisNZ
Tourmaline | Level 20

i believe there is a simpler way to do it

I don't see how (but maybe someone does). This works:

 

data HAVE;
  infile datalines dlm=',';
  length CUSTID L1 L2 L3 L4 L5 L6 $20;
  input (CUSTID L1 L2 L3 L4 L5 L6) ($);
cards;
1,CEO,Department Head,Senior Manager,Manager, Senior Analyst, Peter
2,Major,Vice President, CEO, Department Head, Senior Manager,Manager
3,Technical Chief, CEO, Department Head, Senior Manager, Manager, Senior Analyst
run;

data WANT; 
 set HAVE;
 array _L [6] L1-L6;
 CEO_POS= whichc('CEO',of _L[*]);
 if CEO_POS > 1 then do;
   NEW_POS=1;
   do I=CEO_POS to 6 ;
      _L[NEW_POS]=_L[I];
      _L[I]=' ';
      NEW_POS+1;
   end;
 end;
run;
 

 

CUSTID L1 L2 L3 L4 L5 L6
1 CEO Department Head Senior Manager Manager Senior Analyst Peter
2 CEO Department Head Senior Manager Manager    
3 CEO Department Head Senior Manager Manager Senior Analyst  

 

ballardw
Super User

Where do "Andrew" and "James" come from? They aren't in the example input data.

 

If you example output data really should not have exactly the same values for Level1 through Level5 then you should have at least one record to show input that generates a different order and the actual desired result. Which may mean that you have to provide every single possible value for L1 through L6 and the hierarchy.

 

And this isn't really a complex loop, it would be very straightforward if the data were clearly defined.

frupaul
Quartz | Level 8

Sorry i thought it was self explanatory. Here is the full example data set:

Screen Shot 2018-07-26 at 03.26.32.png

 

and code that generates that:

 

data Ranking;
INFILE DATALINES DLM=',' missover;
length CustID L1 $20 L2 $20 L3 $20 L4 $20 L5 $20 L6 $20 L7 $20 L8 $20;
input CustID L1 $ L2 $ L3 $ L4 $ L5 $ L6 $ L7 $ L8 $;
cards;
1,CEO,Department Head,Senior Manager,Manager, Senior Analyst, Peter, ,
2,Major,Vice President, CEO, Department Head, Senior Manager,Manager,Senior Analyst, Andrew
3,Technical Chief, CEO, Department Head, Senior Manager, Manager, Senior Analyst,James
;

 

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 478 views
  • 1 like
  • 4 in conversation