BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Irene_IP
Fluorite | Level 6

Hi everyone!

I have a SAS dataset with more than 1000 variables distributed like this:

ID      Auto      Auto_2     Auto_3     City     City_2     Sport     Sport_2     Sport_3     Sport_4 ... and so on (many different variables repeated different times).

I need to:

1) keep only those rows that have missing values in all the variables ending with "_number", (or, equally, drop those rows that have values in any of the variables ending with "_number").

For example:?

   ID     Auto   Auto_2    Auto_3    City         City_2    

    01     Fiat    Ford                      Milan                                ---> Drop row

    02     Fiat                                 Milan        Rome               ---> Drop row

    03     Fiat    Ford          Ford      Milan        Rome               ---> Drop row

    04     Fiat                                 Milan                                ---> KEEP ROW

If it can help, if Variable_n is missing, then Variable_(n+1) is missing too.

2) keep only the columns without the "_number".

In the previous example only  ID Auto City must be kept.

The problem in both cases is the same: I can't write all the variables in a drop/keep statement or in a IF condition because they're too many.

Is there a smart way to manage variables like these? Maybe something like lists...? I don't know.

Thanks a lot

Irene

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User

Code: Program

data have;
input (ID   Auto   Auto_2   Auto_3   City   City_2  ) ($); 
cards;
  01 Fiat Ford . Milan . 
  02 Fiat . . Milan Rome 
  03 Fiat Ford Ford Milan Rome 
  04 Fiat . . Milan . 
;
run;
proc transpose data=have(obs=0) out=temp;
var _all_;
run;  

proc sql;
select _name_ into : list separated by ',' from temp where prxmatch('/_\d+$/',strip(_name_));
select count(*) into : n from temp where prxmatch('/_\d+$/',strip(_name_));
quit;

data want;
set have;
if cmiss(&list)=&n;
run;

View solution in original post

7 REPLIES 7
Reeza
Super User

You can only use short cut references for variable lists when you have prefixes Smiley Sad

Otherwise you can create a macro variable with the list of variables by querying the SASHELP.VCOLUMN table.

Irene_IP
Fluorite | Level 6

I've never used the SASHELP.VCOLUMN table, I'll have a look!

Saying shortcut references do you mean, for example "Var:" to use all variables starting with "Var"? I've aIready tried this method in a DROP statement, but I think I can't use it in a IF statement.. Am I wrong?

Any ideas for the IF statement?

Thanks

Irene

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Well, you could put the _numbers in an array, then loop over the array:

data want;

     set have;

     array elem{*} auto-city2;

     do i=2 to dim(elem{*});

          if elem{i}="" and elem{i-1}="" then flag="Y";

     end;

run;

Then drop based on flag (note untested code!).

Personally, in these instances you may be better off normalising your data, i.e. have on row per data element:

ID          SECTION          NUMBER          VALUE

1            AUTO                1                          Fiat

1            AUTO                2                          Ford

...

Astounding
PROC Star

Note that you have a special condition in your data:  If Variable_N is missing, then Variable_N+1 is also missing.  This means that you don't have to process all your variables.  You could process only the "_2" variables and ignore the rest.

Here's a beginning to a solution.  See if you need more than that.

proc contents data=have noprint out=_contents_ (keep=name);

run;

*** Just so you know what you are working with here ...;

proc print data=_contents_ (obs=10);

run;

data keeplist _2_list;

  set _contents_;

  last_part = scan(name, -1, '_');

  if last_part='2' then output _2_list;

  else if input(last_part, ??5.) < 2 then output keeplist;

run;

*** Again, just so you know what you are working with here ...;

proc print data=keeplist (obs=10);

proc print data=_2_list (obs=10);

run;

proc sql;

  select trim(name) into : keeplist separated by ' ' from keeplist;

  select trim(name) into : _2_list separated by ' ' from _2_list;

quit;

%put &keeplist;

%put &_2_list;

The rest will be (by comparison) easier, but see if this much makes sense.

Good luck.

Ksharp
Super User

Code: Program

data have;
input (ID   Auto   Auto_2   Auto_3   City   City_2  ) ($); 
cards;
  01 Fiat Ford . Milan . 
  02 Fiat . . Milan Rome 
  03 Fiat Ford Ford Milan Rome 
  04 Fiat . . Milan . 
;
run;
proc transpose data=have(obs=0) out=temp;
var _all_;
run;  

proc sql;
select _name_ into : list separated by ',' from temp where prxmatch('/_\d+$/',strip(_name_));
select count(*) into : n from temp where prxmatch('/_\d+$/',strip(_name_));
quit;

data want;
set have;
if cmiss(&list)=&n;
run;
Ksharp
Super User

Dictionary.columns would cost some time to query these information.

If you just want get variable's name ,proc transpose is good and fast.

For your second Q.

Code: Program

data have;
input (ID   Auto   Auto_2   Auto_3   City   City_2  ) ($); 
cards;
  01 Fiat Ford . Milan . 
  02 Fiat . . Milan Rome 
  03 Fiat Ford Ford Milan Rome 
  04 Fiat . . Milan . 
;
run;
proc transpose data=have(obs=0) out=temp;
var _all_;
run;  

proc sql;
select _name_ into : list separated by ' ' from temp where not prxmatch('/_\d+$/',strip(_name_));
quit;

data want;
set have;
keep &list;
run;
Irene_IP
Fluorite | Level 6

Thank you so much!

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 1436 views
  • 8 likes
  • 5 in conversation