BookmarkSubscribeRSS Feed
jerry898969
Pyrite | Level 9
I have a dataset that has the following data structure

1-1 2-1 3-1 4-1 5-1
1-1 2-1 3-1 4-1 5-2
1-1 2-2 3-2 . .
1-1 2-2 3-2 4-2 5-3

I need a way to create a two column parent/child table without adding the same combination twice.

I need the final result to
1-1 2-1
2-1 3-1
4-1 5-1
4-1 5-2
1-1 2-2
2-2 3-2
3-2 4-2
4-2 5-3

I tried using an array but I didn't come close to getting the output I was looking for.

Thank you for your help.
8 REPLIES 8
deleted_user
Not applicable
try this

let say five variables are a1 b1 c1 d1 e1 and output variables are first and second.

data t2 (keep = first second);
set t1;
first = a1;
second = b1;
output;
first = b1;
second = c1;
output;
first = c1;
second = d1;
output;
first = d1;
second = e1;
output;
run;

GL.
deleted_user
Not applicable
add proc sort to remove duplicate combination.
Cynthia_sas
SAS Super FREQ
Hi:
I guess I don't understand your data -- what are your variable names??? What do the variable values actually look like...are they 1-1 and 2-1 for some variable value??? Or was that mode of showing the data meant to convey something??

For example, for these 4 rows,
[pre]
1-1 2-1 3-1 4-1 5-1
1-1 2-1 3-1 4-1 5-2
1-1 2-2 3-2 . .
1-1 2-2 3-2 4-2 5-3
[/pre]

...I see that 1-1 is repeated on all 4 rows. the first 2 rows are the same except instead of 5-1, the second row has 5-2 -- what does that mean?? What is the significance of the .. in the third row???? Do 2 columns/variables have missing values for variable 4 and variable 5??? What is the significance of the fact that the 3rd row (with the 2 missing values) has 2-2 and 3-2 instead of 2-1 and 3-1??

What is the logic by which you are doing the transform? How do your columns then translate to rows??? What are your variable names? Can you say a little more about how you consider this to be a "two column parent/child table"????

cynthia
jerry898969
Pyrite | Level 9
Cynthia,

this data was just an example of how my data is laid out. I have 5 columns within my data set. The first two columns are always populated the other 3 may or may not have data. I created a id column for each of my 5 label columns. So the first column may have

TEST
TEST
TESTING

So the id for this column would be

1 TEST
1 TEST
2 TESTING

The same goes for all of 5 columns. From there I have to take the id from label1 make that my parent to label2's id and then take label2's id and make it the parent for label3's id. When there are missing values I have to skip over those. I'm trying to make a hierarchical tree. The reason 1-1 shows up multiple times is because the column i'm basing my id off of has the same value for all 4 rows.

thanks to everyone for your their help
Cynthia_sas
SAS Super FREQ
I'm still confused. You have 5 columns in the first example and 2 columns in the second example...what are the column names??? VAR1, VAR2, VAR3, VAR4, VAR5??? You have now shown 2 different forms of the variables 1-1, 2-1, 3-1 ,etc; and 1 TEST, 1 TEST, 2 TESTING, etc.

cynthia
jerry898969
Pyrite | Level 9
Cynthia,
Sorry for the confusion.

Say I have 5 variable id's called var1-var5.
I need to take var1 and make it a parent of var2. So in my original example
var1 = 1-1 and var2 = 2-1. I need to create a data set that has 1-1 in the parent column and 2-1 in the child column.
After that I have to make var2 the parent of var3 so 2-1 will go in parent column and 3-1 will go in child column. This will
continue in the same way where var3 is parent to var4 and var4 to var5.
If there is a blank then nothing should go into the parent/child table.
My first example was fake string values.

The reason the 1-1 is duplicated is because the id is based a different field. I'm using the by statement on that field to create
an id column that increments when the field value changes. In terms of a tree all 4 rows start from the same node then they
split into two branches and so on.
1-1 2-1 3-1 4-1 5-1
1-1 2-1 3-1 4-1 5-2
1-1 2-2 3-2 . .
1-1 2-2 3-2 4-2 5-3

Thank you for your help
Flip
Fluorite | Level 6
data new;
set ;
array orig {5} var1-var5;
do i = 1 to 4;
v1 = orig(i);
v2 = orig(i+1);
if v2 ne '' then output;
end;
run;
proc sort nodupkey;
by v1 v2;
run;
jerry898969
Pyrite | Level 9
Flip & sasteach,

Thank you both for your help. Using your suggestions I got it working like I want. Now I hope the data model doesn't change. LOL

Thank you

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 2261 views
  • 0 likes
  • 4 in conversation