BookmarkSubscribeRSS Feed
cbsxb
Calcite | Level 5
Hello, need your help.

I have two datasets (A and B) which have same variable names but different length and type. Periodically, I need to add B records to A. How can I automatically update B variable attributes to the same of A? Thanks.
15 REPLIES 15
ArtC
Rhodochrosite | Level 12
this post went unanswered from a coupleof weeks ago. Did you get the solution that you needed? If not are either of your data sets of a size that allow pre processing?
Ksharp
Super User
Hi.
ArtC.I only solve part question ,namely only make all variables in A and B to have the same length.


[pre]
data a;
set sashelp.class;
keep name sex weight;
run;
data b;
length name $ 50 sex $ 20 weight 3;
name='Peter'; sex='M'; weight=56.3; output;
name='Patrick'; sex='M';weight=56.2;output;
name='ArtC'; sex='M'; weight=69.9;output;
run;

%let table_a=A;
%let table_b=B;

%macro alter;
proc sql;
select cat( name ,case (type)
when ('char') then cat(' char(',length,')')
when ('num') then cat(' ',length)
end )
into : change separated by ','
from dictionary.columns
where libname='WORK' and memname="&table_a"
;

alter table &table_b
modify &change ;
quit;
%mend;

%alter




[/pre]


Ksharp

Message was edited by: Ksharp
Patrick
Opal | Level 21
Hi
A similar approach but without the need to re-write the table.

data a;
attrib
key label='this is the key & nothing else'
a format=8. label='test & test';
do key=1 to 5;
a=1;
b='abc';
c=1;
output;
end;
run;

data b;
do key=6 to 10;
a=1;
b='abcd';
d=5;
output;
end;
run;

%let rename=;
%let assign=;
%let length=;
%let drop=;
proc sql noprint;
select
cats(BASE.name,'=xyz_',BASE.varnum)

,cats(BASE.name,'=xyz_',BASE.varnum,';')

,case
when BASE.type='char' then
catx(' ',BASE.name,cats('length=$',BASE.length))
else catx(' ',BASE.name,cats('length=',BASE.length))
end

,cats('drop xyz_',BASE.varnum,';')

into :rename separated by ' '
, :assign separated by ' '
, :length separated by ' '
, :drop separated by ' '

from dictionary.columns BASE, dictionary.columns DATA

where upcase(BASE.name)=upcase(DATA.name)
and base.length ne data.length
and BASE.libname='WORK' and BASE.memname='A'
and DATA.libname='WORK' and DATA.memname='B'

order by BASE.varnum
;
quit;

%put rename: &rename;
%put assign: &assign;
%put length: &length;
%put drop: &drop;


data V_B / view=V_B;
set B(rename=(&rename));
attrib &length;
&assign
&drop
run;

proc append base=A data=V_B force nowarn;
quit;
Ksharp
Super User
Hi.Patrick
My code is also not to re-write dataset, it is directly to modify the attribute of variables.
It is good to see you for a long time not to see your post.
Is it still too busy to post?

Cheers
Ksharp
Patrick
Opal | Level 21
Hi Ksharp

Yes, unfortunately (?) I've been really busy lately.

And: I still think that changing the length of a variable needs re-creating the SAS file.

Length is not a simple variable attribute like a format or a label. I assume that's also the reason why you can't change it with Proc Datasets the way you change other attributes.

Considering that SAS stores data rectangular the SAS file must be re-created in order to change the variable length (a view kind of also re-creates the table in memory).

Just run the following code and compare the file size of test1 and test2 and you will see:

options compress=no;

libname test 'c:\temp';

data test.test1 test.test2;
length var $ 10;
do i=1 to 1000;
var=put(i,8.);
output test.test1;
output test.test2;
end;
run;

proc sql;
alter table test.test2
modify var char(1000);
;
quit;


Cheers
Patrick Message was edited by: Patrick
Peter_C
Rhodochrosite | Level 12
nothing seems to solve this so far, so:
data B ;
* load B with the type+length attributes of A ;
if 0 then set A ;
* this will load ALL columns of A as well (not just those in B) ;

* load values from B ;
set B ;
run ;

* and add B records to A;
proc append base= A data= B force ;
run;
The data step reloading B, will get error[pre]
ERROR: Variable XXXXXXX has been defined as both character and numeric[/pre] if you have a variable on A and B with different types.

Character data in B will truncate if A lengths are not wide enough to hold B strings!

Similarly, if numeric columns in A have shorter (stored) lengths than corresponding columns in B, numeric precision will be affected!


CBSXB
To avoid the "ERROR"-condition above, what are your rules for conversion between character and numeric data types?
Do you need special handling when $lengths in A are not wide enough for corresponding columns in A?


peterC
Patrick
Opal | Level 21
Hi Peter

I was very aware that the code I suggested doesn't handle character/numeric missmatch - and it's on purpose that my code will throw an error for such a case(enough is enough...).

Also on purpose I've set up the example in a way that strings get truncated - just to show that may be modifying variable lengths is not really the solution to the problem.

I believe what one should do is to define appropriate variable lengths explicitly (attrib statement) so length matching issues won't occur at all.

The amendment to the code I've posted could be to compare variable lengths between base and transaction data set and change the length in the data set where it's shorter.

Cheers
Patrick
Peter_C
Rhodochrosite | Level 12
Patrick
It's easy to agree with you.
I might go further than enter a sensible policy discussion.
The original poster requested techniques to solve what looks like a real problem when generalised. However, we don't know enough about the environment of the problem - perhaps it is simpler than the original question implies (to us).
It would be good to hear more from that poster.

Regards
Peter
Ksharp
Super User
Dear Peter:
I also think the biggest challenge is to change variable type between two tables.
So, I think it might will be more easy that firstly changing all the variables type into character, after proc append then change the variable type which is numeric originally in dataset A. How do you think?

Ksharp
Cheers from BeiJing ,China
Ksharp
Super User
Hi.
Patrick.
I also notice this interesting thing that proc datasets can not change the length of variable,
But proc sql can do really,Is it strange?.
If you use proc contents or see the attribute of dataset ,then you will see the length of variable is really change after sql.
But data new; set old; attrib ...; will really load dataset once ,it yields to low efficient.

Ksharp Message was edited by: Ksharp
Patrick
Opal | Level 21
Hi Ksharp

But this is exactly the point: Also SQL alter table must re-create the SAS file.

Length is just more than only a simple variable attribute in the descripter part. The length directly connects to how data gets stored - so changing the length just must result in re-writing the data part whatever SAS syntax is used.

Cheers
Patrick
Peter_C
Rhodochrosite | Level 12
interesting thought occurs.

SAS stores data as we request it (having requested formats, lengths, compression, etc.).

to be used in a PROC or a data step, short length numerics are expanded to length 8. When we request compression, the storage must be expanded to be used.

New string lengths, could "be requested".
It would be similar to "requesting a new column order".
No data moves (except in data set header) until the data portion is actually used.
The original data would not be affected - just new data derived from the old data would be subject also to these "requests"
These "requests" could be made with PROC DATASETS.

well that is just in my imagination - it is not what happens - as we know it for real!

peterC SAS already achieves this and more! The technology is not in PROC DATASETS - it's referred to as DATA set VIEW!
very little not already built in!
peterC 07:55 29-Apr-11.
Message was edited by: Peter.C
Ksharp
Super User
Hi.Patrick.
It looks like You have more insight than me .But I have not found something about it in documentation. Education!
:)
Ksharp
LeonidBatkhan
Lapis Lazuli | Level 10

This scenario is described in a blog post Changing variable type and variable length in SAS datasets where you can find SAS code example on how to change variable attributes of a SAS data table such as type and length. It seems to be a common problem when appending SAS data tables with slighty different column attributes.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 15 replies
  • 1985 views
  • 0 likes
  • 7 in conversation