BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
scouttier
Calcite | Level 5

Hi all,

I am very, very new to SAS. I have a dataset with about 300 variables with either a 1 or a 0 if that person does or does not meet those qualifcations.

 

So something like

PersonVar1Var2Var3Var4
521010
321010
1000010
801011

 

should be

 

PersonVar1Var3Var4
52110
32110
100010
8011

1

 

Let me know if you have any ideas!! I am very new to SAS so have no idea where to even begin, but this is needed for my work : )

Thanks

1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

Run a proc summary, transpose the result, and create a drop statement dynamically from that:

data have;
input person $ var1 var2 var3;
datalines;
52 1 0 1 0
32 1 0 1 0
100 0 0 1 0
80 1 0 1 1
;
run;

proc summary data=have;
var _numeric_;
output out=check (drop=_type_ _freq_) sum=;
run;

proc transpose data=check out=check2;
var _all_;
run;

data _null_;
set check2 end=eof;
if _n_ = 1 then call execute('data want; set have; drop');
if col1 = 0 then call execute(' ' !! trim(_name_));
if eof then call execute('; run;');
run;

Note that I posted example data in a data step with datalines; this makes it easy for everyone to recreate the exact same dataset with just copy/paste and submit.

View solution in original post

8 REPLIES 8
PaigeMiller
Diamond | Level 26

Here's my solution.

 

Do nothing to var2.

 

In any code you write, do not mention or use var2.

--
Paige Miller
Kurt_Bremser
Super User

Run a proc summary, transpose the result, and create a drop statement dynamically from that:

data have;
input person $ var1 var2 var3;
datalines;
52 1 0 1 0
32 1 0 1 0
100 0 0 1 0
80 1 0 1 1
;
run;

proc summary data=have;
var _numeric_;
output out=check (drop=_type_ _freq_) sum=;
run;

proc transpose data=check out=check2;
var _all_;
run;

data _null_;
set check2 end=eof;
if _n_ = 1 then call execute('data want; set have; drop');
if col1 = 0 then call execute(' ' !! trim(_name_));
if eof then call execute('; run;');
run;

Note that I posted example data in a data step with datalines; this makes it easy for everyone to recreate the exact same dataset with just copy/paste and submit.

PaigeMiller
Diamond | Level 26

Yes, Kurt that does seem to solve the problem, but I prefer my solution better, especially since I strive to be a "maximally efficient programmer".

--
Paige Miller
Kurt_Bremser
Super User

I guess the OP wanted a solution for automatically cleaning out irrelevant columns from datasets. And it tickled me to create something completely data-driven 😉

scouttier
Calcite | Level 5

Thanks for the response! This looks very easy to follow and recreate. Though I tried running this, and it just gave me my table back with the 0's still there, the table I am trying to filter is "my_table". What went wrong??

 

proc summary data=my_table;

var _numeric_;

output out=check (drop=_type_ _freq_) sum=;

run;

proc transpose data=check out=check2;

var _all_;

run;

data _null_;

set check2 end=eof;

if _n_ = 1 then call execute('data want; set unix.my_table; drop');

if col1 = 0 then call execute(' ' !! trim(_name_));

if eof then call execute('; run;');

run;

scouttier
Calcite | Level 5

Nevermind, it's because my variables were stored as text! Thanks a lot for the help!

ballardw
Super User

@scouttier wrote:

Hi all,

I am very, very new to SAS. I have a dataset with about 300 variables with either a 1 or a 0 if that person does or does not meet those qualifcations.

 

 

Let me know if you have any ideas!! I am very new to SAS so have no idea where to even begin, but this is needed for my work : )

Thanks


Could you please describe why removing variables with values of all 0 are actually needed.

 

If I were to encounter such data I would first report back to the data source/collector/compiler/designer or whoever is in charge of the project about the behavior of variables with all any value (1, 0 or 99999). Sometimes invariant data occurs due to any of number of issues such as transcription errors, errors in extraction code or creation of a text file used for import or even an incorrect format assignment in SAS (the values by default may look like 0 but actually contain other values such as display format is F1.0 for instance and values are less than 0.5).

 

It may be that this is part of a multiple phase collection effort and one or more phases may all be known that "in phase XXX we expect all 0 values for variables X, Y and Z" in which case removing them would really be the wrong thing to do.

 

I have actually received data with a similar issue values of only 1 and 0 where we expected 1 to 50. The actual issue was the program creating a fixed column text file was writing another variable over the columns for the specific field.

scouttier
Calcite | Level 5

Hello.

I created the table in Excel to describe if a person has a qualification or not, and I later need to modify it by joining it to another table / transposing it. It's taking way too long with all the unnecessary variables : )

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 3286 views
  • 3 likes
  • 4 in conversation