BookmarkSubscribeRSS Feed
Yiting
Quartz | Level 8

Are they the same? just sas data table files? TIA.

11 REPLIES 11
RW9
Diamond | Level 26 RW9
Diamond | Level 26

What is a hddata file?  I can find no standard explanation of what that file extension is, so its not a well know application.  Where did you get it?  Can you open it in notepad and see text?  This is the problem with people using non-standard file extensions.  

.sas7bdat is a SAS software proprietary file format which contains data, is a binary file, and can be directly opened in SAS.

.xlsx for instance is a proprietary MS file format for storing Excel files.

.hddata - is something apparently made up by someone to make everyone else's life harder.

 

Try opening in notepad or some text editor, maybe its a csv or other delimited file which they have just named badly, if not go back to source and ask them to explain.

Yiting
Quartz | Level 8

Hi

 

Thank you. No, this is not some data I got, but something from some codes I got. They are like this:

 

data Strokeproc;
 set Strokes2.hddata;

 § do something here
run;

 

I did suspect hddata is a column name is the Stroke2 data file, but no. There is no such a column.

 

I am quite confused by the use of . in sas codes. Is . always used between a data file name and a column name and library name and data set name? Can you have a look at the codes below and guess what a and b and Strokeep are? Strokeep is not found as a column in fakedataiii...

 

proc sql;
 create table Strokeproc2 as
 select a.indate, b.*
 from fakedataiii.Strokeep a left join trans b     
 on a.lopenr=b.lopenr;
quit;

RW9
Diamond | Level 26 RW9
Diamond | Level 26

SQL is slightly different.  It uses something called aliases to refer to input datasets so:

 

select a.indate,   <- take the variable indate from the dataset with alias a

from fakedataiii.Strokeep a   <- set alias a to reference library fakedataiii, dataset strokeep.


left join trans b    <- set alia b to reference work library, dataset trans

on a.lopenr=b.lopenr;  <- join alias a and b on variable lopenr from both

 

 

This on the other hand:

 set Strokes2.hddata;

Is telling you that in the library strokes2 - which refers to a physical path on your machine or network drive, there is a dataset called hddata.  This hddata will have either a sas7bdat or sd2 file extension (although might be an older one).  

Yiting
Quartz | Level 8

Thank you so much! 

 

I thought hddata was extension name of dataset Strokes2! Or a column of dataset Strokes2. Right, it should be that Strokes2 is the library name and hddata dataset name. "set" should always be followed by a dataset, right?

 

The SQL part is also confusing to me. I do not have the data, only the codes. Is it correct to say a is the alias of Strokes2.Strokeep and b is the alias of trans? And isn't it better to write as this (according to this link: https://blogs.sas.com/content/sasdummy/2010/02/11/alias-a-table-by-any-other-name/)

 

proc sql;
 create table Strokeproc2 as
 select a.indate, b.*
 from Strokes2.Strokeep as a left join trans as b     
 on a.lopenr=b.lopenr;
quit;

 

 

Kurt_Bremser
Super User

@Yiting wrote:

Is it correct to say a is the alias of Strokes2.Strokeep and b is the alias of trans? And isn't it better to write as this (according to this link: https://blogs.sas.com/content/sasdummy/2010/02/11/alias-a-table-by-any-other-name/)

 




Indeed you are right about the aliases. It is up to you what kind of aliases you use. t1, t2 etc are just what the SAS query builder uses internally, so you could use that strategy, or use single characters as we "lazy typers" prefer 😉

 

The difference between two-level dataset names like Strokes2.Strokeep and single-level names like trans is that names without a library do implicitly point to the temporary library WORK.

Yiting
Quartz | Level 8

Thanks!  I jus do not understand in the codes I originally posted, the from line, how can the original coder omit "as" and still make the codes run... I thought you must have as the indicate who is alias for whom...

Kurt_Bremser
Super User

"as" in the SQL "from" part of a select is optional. You can use it or omit it, the only important thing there is the correct setting of the comma to separate contributing tables.

When using "create table" (or similar), "as" is a mandatory keyword, OTOH.

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Can I suggest you do two things to help:

Run through the basic SAS videos, you can find in:

https://video.sas.com/

 

Then run through the basic SQL usage:

https://www.w3schools.com/sql/

 

This will cover most of your questions.

In SAS:
<libname>.<dataset> 

or

work.<dataset> 

The above defaults when library is not provided.

 

In SQL:

<table> as <alias>

or 

<table> <alias> 

The above the as is assumed.

Yiting
Quartz | Level 8

thank you very much! I was told those were SAS codes, so I digged hard in SAS, not to know that they were actually SQL codes... ! I did not know SAS and SQL have a relationship! with the second link, everything clear now.

ballardw
Super User

@Yiting wrote:

Hi

 

Thank you. No, this is not some data I got, but something from some codes I got. They are like this:

 

data Strokeproc;
 set Strokes2.hddata;

 § do something here
run;

 

I did suspect hddata is a column name is the Stroke2 data file, but no. There is no such a column.

 

I am quite confused by the use of . in sas codes. Is . always used between a data file name and a column name and library name and data set name? Can you have a look at the codes below and guess what a and b and Strokeep are? Strokeep is not found as a column in fakedataiii...

 

proc sql;
 create table Strokeproc2 as
 select a.indate, b.*
 from fakedataiii.Strokeep a left join trans b     
 on a.lopenr=b.lopenr;
quit;


In a data step the usage for the . between items is mostly LIBRARY.DATASETNAME

Other uses involve SCL programming such as HASH objects and methods.

SQL, which originated outside of SAS and was incorporated later has the usage TABLE.COLUMN. Since datasets are treated as tables in PROC SQL which makes a reasonable attempt to conform to standard "SQL" that means the Library.dataset syntax would be confusing. SO datasets are often referred to with an alias to associate the table part of the table.column syntax with the standard SAS library.dataset. It does not help that a large number of SQL users are lazy and do not use the predicate AS to indicate explicitly where the alias is occurring:

proc sql;
   create table Strokeproc2 as
   select a.indate, b.*
   from fakedataiii.Strokeep as a 
        left join 
        trans as b     
   on a.lopenr=b.lopenr;
quit;

makes it easier to see that a.lopenr is the lopenr variable in fadedataii.strokeep and b.lopenr is in the data set trans.

 

Yiting
Quartz | Level 8

Thank you very much! Yes, without AS, it was so hard to see who is who... !

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 11 replies
  • 3168 views
  • 5 likes
  • 4 in conversation