From the SAS Users LinkedIn Group, SAS User Daniele asks:
Hi!
I would like to transform a database with 10 numeric variables, where each observation is determined by the modalities of three binary variables in a database with 80 variables (2 * 2 * 2 * 10), where each observation is no longer repeated 8 times but only 1.
I provide an example to be clear: one row contains the number of men (variable binary number 1) minor (variable binary number 2) in jail (var binary number 3) assigned to the prison p (principal variable, therefore repeated 8 times), I would in new database one row containing the data for the prison p, where the number of minor men in jail will be a specific variable of the 80 newly created.
I think I have to use arrays or a set of proc transpose, but I have some problem to imagine how I can practically do.
I hope someone can give me some suggestions!
Thank you!
Original post: https://www.linkedin.com/groups/4000846/4000846-6227792732740427779
It isn't helpful to post a link that requires a login / account. Not everyone has an account on linkedin and some work locations block social media websites.
Example data of input and desired output go a long way especially if provided in the form of a data step. And if the data is not SAS then this isn't the best place to ask such questions.
Had to read through a couple of times before I understood that the "binary number" is just a value to carry over and it's not a math problem; now I think it's just a simple "narrow to wide" transpose problem, probably achievable with PROC TRANSPOSE or a DATA step on sorted data with FIRST and LAST processing and/or arrays. Here's a thread with a similar problem/solution.
(Note to @ballardw: I think @SAS_Cares is trying to source an answer here to help this user and others in the future. I think we can agree there are more experts hanging out here than on the LinkedIn group.)
Nothing can stimulate mental activity like a four-dimensional array.
You'll need to provide more information about the variable names (both those you have now, and those you would like to end up with).
OK, I'll bite. Lets assume you have 14 variables:
P (for prison)
binary1 (0 for male, 1 for female)
binary2 (0 for minor, 1 for adult)
binary3 (0 for in jail, 1 for free)
var1-var10 the original variables
Assume you want vars named like men_minor_jail_1 ... men_minor_jail_10, etc.
If your data are sorted by P, then this structure would work:
data want (drop=var1-var10 v);
array x{0:1,0:1,0:1,10}
men_minor_jail_1 - men_minor_jail_10
men_minor_free_1 - men_minor_free_10
men_adult_jail_1 - men_adult_jail_10
men_adult_free_1 - men_adult_free_10
women_minor_jail_1 - women_minor_jail_10
women_minor_free_1 - women_minor_free_10
women_adult_jail_1 - women_adult_jail_10
women_adult_free_1 - women_adult_free_10;
array vars {10} var1-var10;
do until (last.p);
set have;
by p;
do v=1 to 10;
x{binary1,binary2,binary3,v}=vars{v};
end;
end;
run;
xx
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.
Find more tutorials on the SAS Users YouTube channel.