DATA Step, Macro, Functions and more

"Flattening" a dataset (I don't know how to ask this question)

Accepted Solution Solved
Reply
Frequent Contributor
Posts: 77
Accepted Solution

"Flattening" a dataset (I don't know how to ask this question)

I want to take a dataset that has things grouped by a number, and then a subgroup.

For instance, Group 1 has subgroups 1a, 1b, 1c, 1d, etc. (the number of subgroups is dynamic) and each subgroup could be at one of three different levels (1,2,3).

The data I have now, each subgroup would be it's own observation. What I want is to compress it down so that I have a single group at each level - and for all of the subgroups that are at the same level to be in that observation.

Again, I don't know how to phrase the question that I want to ask, so I've tried to provide examples. How do I take the data in the "have" dataset from below and turn it into the "want" dataset?

Thank you for your time.

data have;

    length unitid $6 group $5 subgroup $7 level $1;

      infile datalines dlm=',' dsd;

      input unitid $ group $ subgroup $ level $;

    return;

datalines;

"000001","05.02","05.0201","1"

"000001","05.02","05.0202","1"

"000001","05.02","05.0203","1"

"000001","05.02","05.0207","1"

"000001","05.02","05.0299","2"

"000001","05.02","05.0299","3"

"000002","05.02","05.0201","1"

"000002","05.02","05.0203","1"

"000002","05.02","05.0207","1"

"000002","05.02","05.0202","2"

"000002","05.02","05.0203","2"

"000002","05.02","05.0207","2"

"000002","05.02","05.0202","3"

;

run;

data want;

    length id $6 group $6 level $1 

            subgroup1 $1 subgroup2 $1 subgroup3 $1 subgroup4 $1 subgroup5 $1;

    infile datalines dlm=',' dsd;

    input id $ group $ level $

            subgroup1 $ subgroup2 $ subgroup3 $ subgroup4 $ subgroup5 $;

    return;

datalines;

"000001","group1","1","a","b","c","d",

"000001","group1","2",,,,,"e"

"000001","group1","3",,,,,"e"

"000002","group1","1","a",,"c","d",

"000002","group1","2",,"b","c","d",

"000002","group1","3",,"b",,,

;

run;


Accepted Solutions
Solution
‎02-05-2014 09:32 AM
PROC Star
Posts: 7,363

Re: "Flattening" a dataset (I don't know how to ask this question)

Given your latest 'want' you could do it with proc transpose:

data have;

    length unitid $6 group $6 subgroup $7 level $1;

      infile datalines dlm=',' dsd;

      input unitid $ group $ subgroup $ level $;

    return;

datalines;

"000001","group1","a","1"

"000001","group1","b","1"

"000001","group1","c","1"

"000001","group1","d","1"

"000001","group1","e","2"

"000001","group1","e","3"

"000002","group1","a","1"

"000002","group1","c","1"

"000002","group1","d","1"

"000002","group1","b","2"

"000002","group1","c","2"

"000002","group1","d","2"

"000002","group1","b","3"

;

proc transpose data=have out=want (drop=_Smiley Happy prefix=subgroup;

  var subgroup;

  by  unitid group level;

run;

View solution in original post


All Replies
Super User
Super User
Posts: 6,502

Re: "Flattening" a dataset (I don't know how to ask this question)

Are the subgroups to be numbered independently within the group/unitid or assigned numbers independent of which group they are in?  Hard to tell from your example as both values of unitid have the same set of distinct subgroup values.

PROC Star
Posts: 7,363

Re: "Flattening" a dataset (I don't know how to ask this question)

You didn't answer Tom's question but, from what you have provided, here is one way:

data want (drop=_Smiley Happy;

  set have (drop=group rename=(subgroup=_subgroup));

  by unitid level;

  group="group1";

  array subgroup(5) $;

  retain subgroup:;

  if first.level then call missing(of subgroup(*));

  if _subgroup eq '05.0201' then subgroup(1)='a';

  else if _subgroup eq '05.0202' then subgroup(2)='b';

  else if _subgroup eq '05.0203' then subgroup(3)='c';

  else if _subgroup eq '05.0207' then subgroup(4)='d';

  else if _subgroup eq '05.0299' then subgroup(5)='e';

  if last.level then output;

run;

Frequent Contributor
Posts: 77

Re: "Flattening" a dataset (I don't know how to ask this question)

Sorry, I had the wrong "have" dataset example.

data have;

    length unitid $6 group $5 subgroup $7 level $1;

      infile datalines dlm=',' dsd;

      input unitid $ group $ subgroup $ level $;

    return;

datalines;

"000001","group1","a","1"

"000001","group1","b","1"

"000001","group1","c","1"

"000001","group1","d","1"

"000001","group1","e","2"

"000001","group1","e","3"

"000002","group1","a","1"

"000002","group1","c","1"

"000002","group1","d","1"

"000002","group1","b","2"

"000002","group1","c","2"

"000002","group1","d","2"

"000002","group1","b","3"

;

run;

data want;

    length id $6 group $6 level $1 

            subgroup1 $1 subgroup2 $1 subgroup3 $1 subgroup4 $1 subgroup5 $1;

    infile datalines dlm=',' dsd;

    input id $ group $ level $

            subgroup1 $ subgroup2 $ subgroup3 $ subgroup4 $ subgroup5 $;

    return;

datalines;

"000001","group1","1","a","b","c","d",

"000001","group1","2",,,,,"e"

"000001","group1","3",,,,,"e"

"000002","group1","1","a",,"c","d",

"000002","group1","2",,"b","c","d",

"000002","group1","3",,"b",,,

;

run;

Frequent Contributor
Posts: 77

Re: "Flattening" a dataset (I don't know how to ask this question)

The example I provided has each subgroup (in this case a,b,c,d,e) going into a specific column.

However, it would work if they just went into the first available column:

data want;

    length id $6 group $6 level $1 

            subgroup1 $1 subgroup2 $1 subgroup3 $1 subgroup4 $1 subgroup5 $1;

    infile datalines dlm=',' dsd;

    input id $ group $ level $

            subgroup1 $ subgroup2 $ subgroup3 $ subgroup4 $ subgroup5 $;

    return;

datalines;

"000001","group1","1","a","b","c","d",

"000001","group1","2","e",,,,

"000001","group1","3","e",,,,

"000002","group1","1","a","c","d",,

"000002","group1","2","b","c","d",,

"000002","group1","3","b",,,,

;

run;

Solution
‎02-05-2014 09:32 AM
PROC Star
Posts: 7,363

Re: "Flattening" a dataset (I don't know how to ask this question)

Given your latest 'want' you could do it with proc transpose:

data have;

    length unitid $6 group $6 subgroup $7 level $1;

      infile datalines dlm=',' dsd;

      input unitid $ group $ subgroup $ level $;

    return;

datalines;

"000001","group1","a","1"

"000001","group1","b","1"

"000001","group1","c","1"

"000001","group1","d","1"

"000001","group1","e","2"

"000001","group1","e","3"

"000002","group1","a","1"

"000002","group1","c","1"

"000002","group1","d","1"

"000002","group1","b","2"

"000002","group1","c","2"

"000002","group1","d","2"

"000002","group1","b","3"

;

proc transpose data=have out=want (drop=_Smiley Happy prefix=subgroup;

  var subgroup;

  by  unitid group level;

run;

Respected Advisor
Posts: 3,124

Re: "Flattening" a dataset (I don't know how to ask this question)

In this case, it would be a classic proc transpose approach, not tested.

proc transpose data=have out=want(drop=_name_) prefix=subgroup ;

  by unitid group level;

  var subgroup;

  run;

Haikuo

Well, Art beats me and tested it. Art, do you have a job beside being president? Smiley Happy

PROC Star
Posts: 7,363

Re: "Flattening" a dataset (I don't know how to ask this question)

: I know that you were being sarcastic/cynical in your comment, but in a nice way, thus I'll answer your question.  I've stated some of this before, but it doesn't hurt repeating.  While this is my 40th year using SAS, I am not a programmer.  I hold a PhD in Educational Psychology and am more of a researcher/analyst/problem solver/administrator than a programmer per se.  I started using the various discussion forums to help me keep abreast of all of the new things that SAS offers each year and participating has been a combination of getting the information I need and paying-it-forward for all of the help I have and will obtain in the future.

However, to answer you question, yes I currently do have a number of jobs although not all are paying jobs:

1) grandfather (currently have 3 grandkids and growing)

2) president of a company involved in a fascinating web development product using SAS

3) president of the Toronto Area SAS Society

4) member of the SAS-L meetup executive board

5) reviewer at sascommunity.org

6) most recently asked to become VP of a recruitment firm

7) currently working with and on an extremely useful SGF paper

8) currently preparing for an SGF panel discussion I've been asked to be on .. talking about why I participate on the various forums

Does that answer your question?

Respected Advisor
Posts: 3,124

Re: "Flattening" a dataset (I don't know how to ask this question)

Yes, Sir, It does completely. Smiley Happy  That is a heck of Bio you have over there and It seems to me you have your plate plenty full. What always intrigues me is the secret of people like you who have a busy schedule, yet, still can make time for something you enjoy doing, like helping people on this forum.

Haikuo

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 8 replies
  • 531 views
  • 3 likes
  • 4 in conversation