turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- SAS Programming
- /
- Base SAS Programming
- /
- "Flattening" a dataset (I don't know how to ask th...

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

02-04-2014 10:17 PM

I want to take a dataset that has things grouped by a number, and then a subgroup.

For instance, Group 1 has subgroups 1a, 1b, 1c, 1d, etc. (the number of subgroups is dynamic) and each subgroup could be at one of three different levels (1,2,3).

The data I have now, each subgroup would be it's own observation. What I want is to compress it down so that I have a single group at each level - and for all of the subgroups that are at the same level to be in that observation.

Again, I don't know how to phrase the question that I want to ask, so I've tried to provide examples. How do I take the data in the "have" dataset from below and turn it into the "want" dataset?

Thank you for your time.

data have;

length unitid $6 group $5 subgroup $7 level $1;

infile datalines dlm=',' dsd;

input unitid $ group $ subgroup $ level $;

return;

datalines;

"000001","05.02","05.0201","1"

"000001","05.02","05.0202","1"

"000001","05.02","05.0203","1"

"000001","05.02","05.0207","1"

"000001","05.02","05.0299","2"

"000001","05.02","05.0299","3"

"000002","05.02","05.0201","1"

"000002","05.02","05.0203","1"

"000002","05.02","05.0207","1"

"000002","05.02","05.0202","2"

"000002","05.02","05.0203","2"

"000002","05.02","05.0207","2"

"000002","05.02","05.0202","3"

;

run;

data want;

length id $6 group $6 level $1

subgroup1 $1 subgroup2 $1 subgroup3 $1 subgroup4 $1 subgroup5 $1;

infile datalines dlm=',' dsd;

input id $ group $ level $

subgroup1 $ subgroup2 $ subgroup3 $ subgroup4 $ subgroup5 $;

return;

datalines;

"000001","group1","1","a","b","c","d",

"000001","group1","2",,,,,"e"

"000001","group1","3",,,,,"e"

"000002","group1","1","a",,"c","d",

"000002","group1","2",,"b","c","d",

"000002","group1","3",,"b",,,

;

run;

Accepted Solutions

Solution

02-05-2014
09:32 AM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to GregG

02-05-2014 09:32 AM

Given your latest 'want' you could do it with proc transpose:

data have;

length unitid $6 group $6 subgroup $7 level $1;

infile datalines dlm=',' dsd;

input unitid $ group $ subgroup $ level $;

return;

datalines;

"000001","group1","a","1"

"000001","group1","b","1"

"000001","group1","c","1"

"000001","group1","d","1"

"000001","group1","e","2"

"000001","group1","e","3"

"000002","group1","a","1"

"000002","group1","c","1"

"000002","group1","d","1"

"000002","group1","b","2"

"000002","group1","c","2"

"000002","group1","d","2"

"000002","group1","b","3"

;

proc transpose data=have out=want (drop=_ prefix=subgroup;

var subgroup;

by unitid group level;

run;

All Replies

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to GregG

02-04-2014 10:50 PM

Are the subgroups to be numbered independently within the group/unitid or assigned numbers independent of which group they are in? Hard to tell from your example as both values of unitid have the same set of distinct subgroup values.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to GregG

02-04-2014 11:15 PM

You didn't answer Tom's question but, from what you have provided, here is one way:

data want (drop=_;

set have (drop=group rename=(subgroup=_subgroup));

by unitid level;

group="group1";

array subgroup(5) $;

retain subgroup:;

if first.level then call missing(of subgroup(*));

if _subgroup eq '05.0201' then subgroup(1)='a';

else if _subgroup eq '05.0202' then subgroup(2)='b';

else if _subgroup eq '05.0203' then subgroup(3)='c';

else if _subgroup eq '05.0207' then subgroup(4)='d';

else if _subgroup eq '05.0299' then subgroup(5)='e';

if last.level then output;

run;

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to GregG

02-05-2014 09:01 AM

Sorry, I had the wrong "have" dataset example.

data have;

length unitid $6 group $5 subgroup $7 level $1;

infile datalines dlm=',' dsd;

input unitid $ group $ subgroup $ level $;

return;

datalines;

"000001","group1","a","1"

"000001","group1","b","1"

"000001","group1","c","1"

"000001","group1","d","1"

"000001","group1","e","2"

"000001","group1","e","3"

"000002","group1","a","1"

"000002","group1","c","1"

"000002","group1","d","1"

"000002","group1","b","2"

"000002","group1","c","2"

"000002","group1","d","2"

"000002","group1","b","3"

;

run;

data want;

length id $6 group $6 level $1

subgroup1 $1 subgroup2 $1 subgroup3 $1 subgroup4 $1 subgroup5 $1;

infile datalines dlm=',' dsd;

input id $ group $ level $

subgroup1 $ subgroup2 $ subgroup3 $ subgroup4 $ subgroup5 $;

return;

datalines;

"000001","group1","1","a","b","c","d",

"000001","group1","2",,,,,"e"

"000001","group1","3",,,,,"e"

"000002","group1","1","a",,"c","d",

"000002","group1","2",,"b","c","d",

"000002","group1","3",,"b",,,

;

run;

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to GregG

02-05-2014 09:04 AM

The example I provided has each subgroup (in this case a,b,c,d,e) going into a specific column.

However, it would work if they just went into the first available column:

data want;

length id $6 group $6 level $1

subgroup1 $1 subgroup2 $1 subgroup3 $1 subgroup4 $1 subgroup5 $1;

infile datalines dlm=',' dsd;

input id $ group $ level $

subgroup1 $ subgroup2 $ subgroup3 $ subgroup4 $ subgroup5 $;

return;

datalines;

"000001","group1","1","a","b","c","d",

"000001","group1","2","e",,,,

"000001","group1","3","e",,,,

"000002","group1","1","a","c","d",,

"000002","group1","2","b","c","d",,

"000002","group1","3","b",,,,

;

run;

Solution

02-05-2014
09:32 AM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to GregG

02-05-2014 09:32 AM

Given your latest 'want' you could do it with proc transpose:

data have;

length unitid $6 group $6 subgroup $7 level $1;

infile datalines dlm=',' dsd;

input unitid $ group $ subgroup $ level $;

return;

datalines;

"000001","group1","a","1"

"000001","group1","b","1"

"000001","group1","c","1"

"000001","group1","d","1"

"000001","group1","e","2"

"000001","group1","e","3"

"000002","group1","a","1"

"000002","group1","c","1"

"000002","group1","d","1"

"000002","group1","b","2"

"000002","group1","c","2"

"000002","group1","d","2"

"000002","group1","b","3"

;

proc transpose data=have out=want (drop=_ prefix=subgroup;

var subgroup;

by unitid group level;

run;

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to GregG

02-05-2014 09:32 AM

In this case, it would be a classic proc transpose approach, not tested.

proc transpose data=have out=want(drop=_name_) prefix=subgroup ;

by unitid group level;

var subgroup;

run;

Haikuo

Well, Art beats me and tested it. Art, do you have a job beside being president?

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Haikuo

02-05-2014 10:12 AM

: I know that you were being sarcastic/cynical in your comment, but in a nice way, thus I'll answer your question. I've stated some of this before, but it doesn't hurt repeating. While this is my 40th year using SAS, I am not a programmer. I hold a PhD in Educational Psychology and am more of a researcher/analyst/problem solver/administrator than a programmer per se. I started using the various discussion forums to help me keep abreast of all of the new things that SAS offers each year and participating has been a combination of getting the information I need and paying-it-forward for all of the help I have and will obtain in the future.

However, to answer you question, yes I currently do have a number of jobs although not all are paying jobs:

1) grandfather (currently have 3 grandkids and growing)

2) president of a company involved in a fascinating web development product using SAS

3) president of the Toronto Area SAS Society

4) member of the SAS-L meetup executive board

5) reviewer at sascommunity.org

6) most recently asked to become VP of a recruitment firm

7) currently working with and on an extremely useful SGF paper

8) currently preparing for an SGF panel discussion I've been asked to be on .. talking about why I participate on the various forums

Does that answer your question?

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to art297

02-05-2014 10:29 AM

Yes, Sir, It does completely. That is a heck of Bio you have over there and It seems to me you have your plate plenty full. What always intrigues me is the secret of people like you who have a busy schedule, yet, still can make time for something you enjoy doing, like helping people on this forum.

Haikuo