jtrousd
Calcite | Level 5

Hey guys. I'm looking to replace missing values in a table with zero. I have written the attached code. What confuses me is that the first part of the program works, where it replaces missing msd11 values. After that, however, nothing is replaced. For the life of me I can't figure out why. I've used different indexes (j, k, l, etc.). I've tried splitting them all up into different data steps. Nothing works for me. The last thing I checked was that these really are the variable names, and they are. As listed, the variables I'm trying to change are:

msd1101 through msd1112 (this is the one that works)

qsd111 through qsd114

ssd20111 through ssd20112

a_msd1101 through a_msd1112

a_qsd111 through a_qsd114

a_ssd20111 through a_ssd20112

Edit: I have written simpler code that replaces all missing values with zero (per SAS documentation), but I just want to know why this one isn't working.

Accepted Solution
Tom
Super User

Your array definitions are confused.  You are defining multiple arrays that refer to the same variables.  _NUMERIC_ references all numeric variables defined so far in the data vector.  For the first array, that means all of the numeric variables in the input dataset.  The later arrays also include the variable I that you introduced in the first DO loop.
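As a rough illustration of that point (a sketch, not the original poster's program): the dataset `have` and its three variables below are made up for the demonstration, and the array names `a` and `b` are arbitrary. Both arrays are declared with _NUMERIC_, so they start with the same variables; the second is declared after the loop, so it should also pick up the loop counter I.

```sas
/* Hypothetical input, only for this demonstration */
data have;
  input msd1101 qsd111 ssd20111;
  datalines;
1 . 3
;
run;

data _null_;
  set have;
  array a _numeric_;          /* numeric variables known so far: the three from HAVE */
  do i = 1 to dim(a);         /* first use of I adds it to the data vector */
    if a[i] = . then a[i] = 0;
  end;
  array b _numeric_;          /* same variables again, now followed by I */
  n_a = dim(a);               /* created after both arrays, so in neither */
  n_b = dim(b);
  length first_a first_b $32;
  first_a = vname(a[1]);      /* name of the variable behind element 1 */
  first_b = vname(b[1]);
  put n_a= n_b= first_a= first_b=;
run;
```

If the log reports the same first variable name for both arrays and one more element in `b` than in `a`, then looping over `b` from 1 to a small fixed number only revisits variables that `a` already covered.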

You are also not looping over the full array.  Use the DIM() function to dynamically determine how many variables are in an array.

From your problem description I would code the array definitions and loops this way.

array msd11 msd1101 - msd1112;
do i = 1 to dim(msd11);
  if msd11(i) = . then msd11(i) = 0;
end;
drop i;


You only need one "DROP I;" statement, but the extra ones do not cause any trouble.  A number of people use _N_ as the loop variable because SAS will have already defined it and will always drop it.
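A minimal sketch of the _N_ variant, assuming an input dataset named `have` and an output dataset named `want` (both placeholder names). One caveat: once the loop has run, _N_ no longer holds the iteration count for the rest of that iteration, so this only suits steps that do not otherwise rely on _N_.

```sas
data want;
  set have;                        /* HAVE and WANT are placeholder names */
  array msd11 msd1101 - msd1112;
  /* _N_ is an automatic variable: it already exists and is never written to the output */
  do _n_ = 1 to dim(msd11);
    if msd11[_n_] = . then msd11[_n_] = 0;
  end;
run;
```

No DROP statement is needed here, because no new loop variable is created.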


You can also just use the DO OVER loop syntax instead as you are really not using the index to encode any meaningful information.


array msd11 msd1101 - msd1112;
do over msd11;
  if msd11 = . then msd11 = 0;
end;
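
Applying the same idea to all six variable groups from the question could look like the sketch below. It assumes the variable names are exactly as listed in the question and uses placeholder dataset names `have` and `want`; the array name `allvars` is arbitrary. Because none of the groups needs different handling, a single array listing every range is enough.

```sas
data want;
  set have;                          /* HAVE and WANT are placeholder names */
  /* one array over every group named in the question */
  array allvars
    msd1101    - msd1112
    qsd111     - qsd114
    ssd20111   - ssd20112
    a_msd1101  - a_msd1112
    a_qsd111   - a_qsd114
    a_ssd20111 - a_ssd20112;
  do i = 1 to dim(allvars);          /* DIM() returns the number of elements */
    if allvars[i] = . then allvars[i] = 0;
  end;
  drop i;                            /* keep the loop counter out of WANT */
run;
```

Note that if a listed variable does not exist in the input dataset, the ARRAY statement creates it as a new numeric variable, so a misspelled range shows up as unexpected extra columns rather than as an error.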


ballardw
Super User

You only need one DROP statement.

I think your problem is a misunderstanding of what using _numeric_ in the array statement does. It places all numeric variables in the array, in EVERY array, and in the same order. Since the indices of your arrays other than MSD11 are all less than or equal to the number of items treated in the first array (12), you are just checking the same 12 or fewer variables over and over.

If you want ALL missing numeric values set to 0, then try:

array msd11 {*} _numeric_;
do i = 1 to dim(msd11);
  if msd11[i] = . then msd11[i] = 0;
end;

Otherwise, you need to specifically list each set of variables you were thinking of in each of your array declarations.

Also, your example code was short enough that you should include it in the post.


jtrousd
Calcite | Level 5

Thanks guys!

Haikuo
Onyx | Level 15

Or you can just:

proc stdize data=have reponly method=sum missing=0 out=want;
   var msd1101 - msd1112;
run;

Haikuo
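
The same PROC STDIZE call can presumably cover the other variable groups by extending the VAR statement. This is a sketch under the assumption that the names match those listed in the question; `have` and `want` are placeholders, as in the post above. REPONLY asks the procedure to replace missing values without standardizing the data, and MISSING=0 supplies the replacement value.

```sas
/* HAVE and WANT are placeholder dataset names */
proc stdize data=have out=want reponly missing=0;
  var msd1101    - msd1112
      qsd111     - qsd114
      ssd20111   - ssd20112
      a_msd1101  - a_msd1112
      a_qsd111   - a_qsd114
      a_ssd20111 - a_ssd20112;
run;
```

Unlike the array approach, this needs no loop variable, but every name in the VAR statement must already exist in the input dataset.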

