Desktop productivity for business analysts and programmers

Expand data table

Accepted Solution Solved
Reply
Frequent Contributor
Posts: 79
Accepted Solution

Expand data table

Have:

dataset named 'colList':

 

ColNeed
Col_1
Col_2
Col_3

dataset named 'sample':

Drop 1 Col_2 Col_1 Nope_3
a      2     1      b

Want:

Col_1 Col_2 Col_3
1     2     .

Bascially, in 'sample', drop columns that are not listed in 'colList', add additional columns that listed in 'colList' but not already in 'sample'. Finally, maintain the same order as in 'colList' (from top to bottom).

 

Question:

Currently struggled to find the optimal codes, though it seems there might be multiple approaches.

 

Note:

To generate the data needed, please use:

data colList;
	input ColNeed $;
cards;
Col_1
Col_2
Col_3
;
run;

data sample;
	input 'Drop 1'n $ Col_2 Col_1 Nope_3 $;
cards;
a 2 1 b
;
run;

 

 


Accepted Solutions
Solution
‎04-23-2017 03:10 AM
Esteemed Advisor
Posts: 7,294

Re: Expand data table

First, thank you for asking what appeared to be a simple problem, but required quite a bit of thought to come up with a working solution.

 

As mentioned in my previous post I had thought that you might be confronting a discrepancy between how SAS and WPS function, which there is (though minor in this case), but neither will output variables unless they are explicitly initialized. Interestingly, though, both will accept missing values as an acceptable initialization value.

 

As such, try the following code. It appears to work correctly on both SAS and WPS and, more importantly, solves your current problem:

 

options validvarname=any;

data colList;
	input ColNeed $;
cards;
Col_1
Col_2
Col_3
;
run;

data sample;
	input 'Drop 1'n $ Col_2 Col_1 $ Nope_3 $;
cards;
a 2 1 b
;
run;

proc sql noprint;
  select ColNeed,
    case type
      when 'char' then catx(' ','retain',ColNeed,"' ';")
      else catx(' ','retain',ColNeed,".;")
    end
      into :keeps separated by ' ',
           :cols separated by ' '
        from colList a
          full join (select name,type from dictionary.columns
            where libname='WORK' and
                  memname='SAMPLE') b
              on a.ColNeed=b.name
                where ColNeed is not NULL
  ;
quit;

data want (keep=&keeps.);
  &cols.
  set sample;
run;

Art, CEO, AnalystFinder.com

 

 

View solution in original post


All Replies
Grand Advisor
Posts: 17,358

Re: Expand data table

This is a variation of your earlier problem. The solution is essentially the same. 

 

You can either use a PROC SQL to order vars or look at using a Length or retain in a datastep to maintain order. 

Esteemed Advisor
Posts: 7,294

Re: Expand data table

I would do it as follows:

 

proc sql noprint;
  select ColNeed into: cols separated by ' '
    from colList
  ;
quit;

data want (keep=&cols.);
  retain &cols.;
  set sample;
run;

Art, CEO, AnalystFinder.com

 

Frequent Contributor
Posts: 79

Re: Expand data table

[ Edited ]

Hi Art,
Used exactly the same codes, but seems it's not picking up Col_3.

Codes:

data colList;
	input ColNeed $;
cards;
Col_1
Col_2
Col_3
;
run;

data sample;
	input 'Drop 1'n $ Col_2 Col_1 Nope_3 $;
cards;
a 2 1 b
;
run;

proc sql noprint;
  select ColNeed into: cols separated by ' '
    from colList
  ;
quit;

data want (keep=&cols.); /* this part has some problems */
  retain &cols.;
  set sample;
run;

Logs:

45         data want (keep=&cols.);
46           retain &cols.;
47           set sample;
48         run;

NOTE: Variable Col_3 is uninitialized.
WARNING: The variable Col_3 in the DROP, KEEP, or RENAME list has never been referenced.
NOTE: There were 1 observations read from the data set WORK.SAMPLE.
NOTE: The data set WORK.WANT has 1 observations and 2 variables.
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds

Thanks!

 

Grand Advisor
Posts: 17,358

Re: Expand data table

Post the code and log please. 

Frequent Contributor
Posts: 79

Re: Expand data table

Please note Col_1 and Col_2 are not necessarily all numeric.
Grand Advisor
Posts: 17,358

Re: Expand data table


ayin wrote:
Please note Col_1 and Col_2 are not necessarily all numeric.

I don't believe that has any bearing on the solutions. Post your code and log that you say does not work. 

Frequent Contributor
Posts: 79

Re: Expand data table

Please see posts above.

Array, Length can't be used here because Col_1, Col_2, Col_3 are not necessarily the same type (numeric/characters)
Grand Advisor
Posts: 17,358

Re: Expand data table

@ayin if you edit an earlier post as a response that doesn't mean we'll automatically notice this. It also makes it hard to make sense of the thread if things change.

 

You're correct that RETAIN won't work. Do you have any other way to indicate the variable type or a value that it can be initialized to if the variable is missing from a dataset.

 

The other standard way to accomplish this is to have a dummy dataset that has all the attributes and then you append the empty table with your 'have' dataset and it will have all the desired traits. But you require more information than you currently have. 

 

 

Grand Advisor
Posts: 9,578

Re: Expand data table




options validvarname=any;

data colList;
	input ColNeed $;
cards;
Col_1
Col_2
Col_3
;
run;

data sample;
	input 'Drop 1'n $ Col_2 Col_1 Nope_3 $;
cards;
a 2 1 b
;
run;

%macro vname;
proc sql noprint;
  select ColNeed into: cols separated by ' '
    from colList;
    
  select count(*) into : n from  colList
   where upcase(ColNeed) not in (
    select upcase(name) as name from dictionary.columns
     where libname='WORK' and memname='SAMPLE');
     
%if &n ne 0 %then %do;
  select ColNeed into : name separated by ' ' from  colList
   where upcase(ColNeed) not in (
    select upcase(name) as name from dictionary.columns
     where libname='WORK' and memname='SAMPLE');
%end;
quit;
data want ;
  set sample;
%if &n ne 0 %then %do;
  retain &name. .;
%end;
  keep &cols.;
run;
%mend;

%vname


Esteemed Advisor
Posts: 7,294

Re: Expand data table

[ Edited ]

NOTE: The following post isn't correct, but I'm leaving it here to show the train of thought in trying to solve the OP's problem.

 

I don't know if this is or isn't the problem you're confronting, but a similar issue was just discussed on a different forum yesterday.

 

The behavior your getting is a current difference between the behaviors of SAS and WPS. Regardless of whether you're using SAS or WPS try the following:

 

proc sql noprint;
  select ColNeed into: cols separated by ' '
    from colList
  ;
quit;

data want (keep=&cols.);
  retain &cols. .;
  set sample;
run;

While SAS doesn't require it, WPS doesn't add a variable to its PDV unless it has been initialized or set to some value. Interestingly, missing is one of those values.

 

Art, CEO, AnalystFinder.com

 

Grand Advisor
Posts: 17,358

Re: Expand data table

@art297 I had the same behaviour using SAS 9.4 M3

Solution
‎04-23-2017 03:10 AM
Esteemed Advisor
Posts: 7,294

Re: Expand data table

First, thank you for asking what appeared to be a simple problem, but required quite a bit of thought to come up with a working solution.

 

As mentioned in my previous post I had thought that you might be confronting a discrepancy between how SAS and WPS function, which there is (though minor in this case), but neither will output variables unless they are explicitly initialized. Interestingly, though, both will accept missing values as an acceptable initialization value.

 

As such, try the following code. It appears to work correctly on both SAS and WPS and, more importantly, solves your current problem:

 

options validvarname=any;

data colList;
	input ColNeed $;
cards;
Col_1
Col_2
Col_3
;
run;

data sample;
	input 'Drop 1'n $ Col_2 Col_1 $ Nope_3 $;
cards;
a 2 1 b
;
run;

proc sql noprint;
  select ColNeed,
    case type
      when 'char' then catx(' ','retain',ColNeed,"' ';")
      else catx(' ','retain',ColNeed,".;")
    end
      into :keeps separated by ' ',
           :cols separated by ' '
        from colList a
          full join (select name,type from dictionary.columns
            where libname='WORK' and
                  memname='SAMPLE') b
              on a.ColNeed=b.name
                where ColNeed is not NULL
  ;
quit;

data want (keep=&keeps.);
  &cols.
  set sample;
run;

Art, CEO, AnalystFinder.com

 

 

Frequent Contributor
Posts: 79

Re: Expand data table

Thank you @art297!

It worked for the sample data listed above. When I tried to implement it on real data, however, all character values are reduced to only have length of 1.

 

For example, if you test it on

options validvarname=any;

data colList;
	input ColNeed $;
cards;
Col_1
Col_2
Col_3
;
run;

data sample;
	infile datalines dlm=",";
	input 'Drop 1'n $ Col_2 Col_1 $ Nope_3 $;
datalines;
GOOD,3,UNIVERSITY,cc
;
run;

The final output is 

Col_1 Col_2 Col_3
U     3     .

All numeric/date values are dealt perfectly.

 

Please let me know how we can tweak your codes a bit and keep character values to its orginal length?

Esteemed Advisor
Posts: 7,294

Re: Expand data table

Just move the retain macro one after the set statement:

 

data want (keep=&keeps.);
  set sample;
  &cols.
run;

Art, CEO, AnalystFinder.com

 

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 14 replies
  • 321 views
  • 5 likes
  • 4 in conversation