BookmarkSubscribeRSS Feed
SAS_Cares
SAS Employee

From the SAS Users LinkedIn Group, SAS User Daniele asks:

 

Hi!

I would like to transform a database with 10 numeric variables, where each observation is determined by the modalities of three binary variables in a database with 80 variables (2 * 2 * 2 * 10), where each observation is no longer repeated 8 times but only 1.

I provide an example to be clear: one row contains the number of men (variable binary number 1) minor (variable binary number 2) in jail (var binary number 3) assigned to the prison p (principal variable, therefore repeated 8 times), I would in new database one row containing the data for the prison p, where the number of minor men in jail will be a specific variable of the 80 newly created.

I think I have to use arrays or a set of proc transpose, but I have some problem to imagine how I can practically do.

I hope someone can give me some suggestions!

Thank you!

 

Original post: https://www.linkedin.com/groups/4000846/4000846-6227792732740427779 

4 REPLIES 4
ballardw
Super User

It isn't helpful to post a link that requires a login / account. Not everyone has an  account on linkedin and some work locations block social media websites.

 

Example data of input and desired output go a long way especially if provided in the form of a data step. And if the data is not SAS then this isn't the best place to ask such questions.

ChrisHemedinger
Community Manager

Had to read through a couple of times before I understood that the "binary number" is just a value to carry over and it's not a math problem; now I think it's just a simple "narrow to wide" transpose problem, probably achievable with PROC TRANSPOSE or a DATA step on sorted data with FIRST and LAST processing and/or arrays.  Here's a thread with a similar problem/solution.

 

(Note to @ballardw: I think @SAS_Cares is trying to source an answer here to help this user and others in the future.  I think we can agree there are more experts hanging out here than on the LinkedIn group.)

 

 

 

It's time to register for SAS Innovate! Join your SAS user peers in Las Vegas on April 16-19 2024.
Astounding
PROC Star

Nothing can stimulate mental activity like a four-dimensional array.

 

You'll need to provide more information about the variable names (both those you have now, and those you would like to end up with).

 

mkeintz
PROC Star

OK, I'll bite.  Lets assume you have 14 variables:
     P  (for prison)

     binary1 (0 for male, 1 for female)

     binary2 (0 for minor, 1 for adult)

     binary3 (0 for in jail, 1 for free)
     var1-var10  the original variables

 

Assume you want vars named like   men_minor_jail_1 ... men_minor_jail_10, etc.

 

If your data are sorted by P, then this structure would work:

 

data want (drop=var1-var10 v);
   array x{0:1,0:1,0:1,10}
     men_minor_jail_1 - men_minor_jail_10
     men_minor_free_1 - men_minor_free_10
     men_adult_jail_1 - men_adult_jail_10
     men_adult_free_1 - men_adult_free_10

     women_minor_jail_1 - women_minor_jail_10
     women_minor_free_1 - women_minor_free_10
     women_adult_jail_1 - women_adult_jail_10
     women_adult_free_1 - women_adult_free_10;

   array vars {10} var1-var10;

   do until (last.p);
     set have;
     by p;
     do v=1 to 10;
       x{binary1,binary2,binary3,v}=vars{v};
     end;
   end;
run;

 

xx

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

What is ANOVA?

ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 4 replies
  • 1299 views
  • 1 like
  • 5 in conversation