BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
scouttier
Calcite | Level 5

Hi all,

I am very, very new to SAS. I have a dataset with about 300 variables with either a 1 or a 0 if that person does or does not meet those qualifcations.

 

So something like

PersonVar1Var2Var3Var4
521010
321010
1000010
801011

 

should be

 

PersonVar1Var3Var4
52110
32110
100010
8011

1

 

Let me know if you have any ideas!! I am very new to SAS so have no idea where to even begin, but this is needed for my work : )

Thanks

1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

Run a proc summary, transpose the result, and create a drop statement dynamically from that:

data have;
input person $ var1 var2 var3;
datalines;
52 1 0 1 0
32 1 0 1 0
100 0 0 1 0
80 1 0 1 1
;
run;

proc summary data=have;
var _numeric_;
output out=check (drop=_type_ _freq_) sum=;
run;

proc transpose data=check out=check2;
var _all_;
run;

data _null_;
set check2 end=eof;
if _n_ = 1 then call execute('data want; set have; drop');
if col1 = 0 then call execute(' ' !! trim(_name_));
if eof then call execute('; run;');
run;

Note that I posted example data in a data step with datalines; this makes it easy for everyone to recreate the exact same dataset with just copy/paste and submit.

View solution in original post

8 REPLIES 8
PaigeMiller
Diamond | Level 26

Here's my solution.

 

Do nothing to var2.

 

In any code you write, do not mention or use var2.

--
Paige Miller
Kurt_Bremser
Super User

Run a proc summary, transpose the result, and create a drop statement dynamically from that:

data have;
input person $ var1 var2 var3;
datalines;
52 1 0 1 0
32 1 0 1 0
100 0 0 1 0
80 1 0 1 1
;
run;

proc summary data=have;
var _numeric_;
output out=check (drop=_type_ _freq_) sum=;
run;

proc transpose data=check out=check2;
var _all_;
run;

data _null_;
set check2 end=eof;
if _n_ = 1 then call execute('data want; set have; drop');
if col1 = 0 then call execute(' ' !! trim(_name_));
if eof then call execute('; run;');
run;

Note that I posted example data in a data step with datalines; this makes it easy for everyone to recreate the exact same dataset with just copy/paste and submit.

PaigeMiller
Diamond | Level 26

Yes, Kurt that does seem to solve the problem, but I prefer my solution better, especially since I strive to be a "maximally efficient programmer".

--
Paige Miller
Kurt_Bremser
Super User

I guess the OP wanted a solution for automatically cleaning out irrelevant columns from datasets. And it tickled me to create something completely data-driven 😉

scouttier
Calcite | Level 5

Thanks for the response! This looks very easy to follow and recreate. Though I tried running this, and it just gave me my table back with the 0's still there, the table I am trying to filter is "my_table". What went wrong??

 

proc summary data=my_table;

var _numeric_;

output out=check (drop=_type_ _freq_) sum=;

run;

proc transpose data=check out=check2;

var _all_;

run;

data _null_;

set check2 end=eof;

if _n_ = 1 then call execute('data want; set unix.my_table; drop');

if col1 = 0 then call execute(' ' !! trim(_name_));

if eof then call execute('; run;');

run;

scouttier
Calcite | Level 5

Nevermind, it's because my variables were stored as text! Thanks a lot for the help!

ballardw
Super User

@scouttier wrote:

Hi all,

I am very, very new to SAS. I have a dataset with about 300 variables with either a 1 or a 0 if that person does or does not meet those qualifcations.

 

 

Let me know if you have any ideas!! I am very new to SAS so have no idea where to even begin, but this is needed for my work : )

Thanks


Could you please describe why removing variables with values of all 0 are actually needed.

 

If I were to encounter such data I would first report back to the data source/collector/compiler/designer or whoever is in charge of the project about the behavior of variables with all any value (1, 0 or 99999). Sometimes invariant data occurs due to any of number of issues such as transcription errors, errors in extraction code or creation of a text file used for import or even an incorrect format assignment in SAS (the values by default may look like 0 but actually contain other values such as display format is F1.0 for instance and values are less than 0.5).

 

It may be that this is part of a multiple phase collection effort and one or more phases may all be known that "in phase XXX we expect all 0 values for variables X, Y and Z" in which case removing them would really be the wrong thing to do.

 

I have actually received data with a similar issue values of only 1 and 0 where we expected 1 to 50. The actual issue was the program creating a fixed column text file was writing another variable over the columns for the specific field.

scouttier
Calcite | Level 5

Hello.

I created the table in Excel to describe if a person has a qualification or not, and I later need to modify it by joining it to another table / transposing it. It's taking way too long with all the unnecessary variables : )

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 8 replies
  • 4819 views
  • 3 likes
  • 4 in conversation