SAS - Standardising dataset attributes across projects


Background:

I have multiple old projects that I need to standardise (prj01-prj10). Each is stored under its own libname, and each has around 30 datasets (note: not all studies have the same 30 datasets).

The variable names have remained consistent across projects. However, over the years, the labels and formats assigned to these variable names have changed in places. Examples below:

Attribute inconsistencies between studies:

    data prj01.users(label='user identifiers') ;
      attrib userid label='username' format=$20. ;
    run ;

    data prj02.users(label='user identifiers') ;
      attrib userid label='name of user' format=$15. ;
    run ;

Attribute inconsistencies within studies:

    data prj02.users(label='user identifiers') ;
      attrib userid label='name of user' format=$15. ;
    run ;

    data prj02.orders(label='orders') ;
      attrib userid  label='name of user' format=$15.
             orderno label='order number' format=8. ;
    run ;

I have written a program to report the inconsistencies. However, I now need to generate 'tidy' copies of the projects, giving them a standardised structure. My current thinking is that I should create a dataset of standard variables, as below, which I can add to and adjust until I have them all defined in there:

    data standards ;
      attrib userid  label='username                                ' format=$20.
             orderno label='order number                            ' format=8. ;
    run ;

Question:

From the standards dataset, what is the best way to apply the attributes wherever these variables exist?

I will write the output datasets to new libnames, e.g. prj01.users --> prjstd01.users, and put errors to the log if there are variables whose attributes have changed such that the variable length is getting truncated.

Create a dictionary table containing your standards:

    name     label         format
    userid   username      $20.
    orderno  order number  8.
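If the standards already exist as variable attributes on a dataset, as in the question, one way to get the name/label/format layout above is to read them back from the dictionary tables. This is a minimal sketch, assuming the attrib dataset is WORK.STANDARDS; the output name standards_dict is hypothetical, and it would take the place of standards in the join.

```sas
/* Sketch: turn the attributes stored on WORK.STANDARDS into rows.   */
/* Assumes the attrib dataset from the question lives in WORK.       */
/* standards_dict is a hypothetical name chosen for illustration.    */
proc sql;
  create table standards_dict as
  select
    name,
    label,
    format
  from
    dictionary.columns
  where
    libname eq 'WORK'          /* dictionary tables store libname in upper case */
    and memname eq 'STANDARDS' /* ... and memname too */
  ;
quit;
```

Keeping the standards as attributes on a dataset means they can be edited with an ordinary attrib statement, while the dictionary query gives the row-per-variable form that the join needs.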

Join the standards table to the dictionary table containing the column names in your library:

    proc sql;
      create table standards2 as
      select
        d.memname,
        s.name,
        s.label,
        s.format
      from
        sashelp.vcolumn d
        inner join standards s
        on d.name = s.name
      where
        libname eq 'PRJ01'  /* libname is stored in upper case in the dictionary tables */
      order by
        d.memname,
        s.name
      ;
    quit;

To get this:

    memname    name     label         format
    users      userid   username      $20.
    orders     userid   username      $20.
    orders     orderno  order number  8.

Then read that data set, using put statements to create a proc datasets step that performs the modifications.

    filename gencode temp;

    data _null_;
      set standards2 end=eof;
      by memname;      /* needed for first.memname */
      file gencode;    /* write the generated code to the temp file */
      if _n_ = 1 then put "proc datasets lib=prj01 nolist;";
      if first.memname then put "  modify " memname ";";
      put "  label " name "='" label "';";
      put "  format " name format ";";
      if eof then put "quit;";
    run;

    /* run the generated code, echoing it to the log */
    %include gencode / source2;
    filename gencode clear;

(Stolen from this paper.)

You should be able to modify this to match the rest of your requirements (copying to new libraries, iterating over projects).
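As a rough illustration of those two remaining requirements, the steps could be wrapped in a macro loop. This is only a sketch under several assumptions: the macro name standardise_all and its parameters are hypothetical, the target libnames prjstd01-prjstd10 are assumed to be assigned already, the standards table has the name/label/format columns shown earlier, no label contains a single quote, and every project has at least one matching variable (otherwise the generated file would be empty). It does not include the truncation check.

```sas
/* Hypothetical wrapper: copy each project to its prjstdNN library, */
/* then apply the standard labels and formats to the copy.          */
%macro standardise_all(first=1, last=10);
  %local i src dst;
  %do i = &first %to &last;
    %let src = prj%sysfunc(putn(&i, z2.));     /* e.g. prj01    */
    %let dst = prjstd%sysfunc(putn(&i, z2.));  /* e.g. prjstd01 */

    /* copy every dataset so the originals stay untouched */
    proc copy in=&src out=&dst memtype=data;
    run;

    /* columns in the copy that have a standard defined */
    proc sql;
      create table standards2 as
      select d.memname, s.name, s.label, s.format
      from dictionary.columns d
           inner join standards s
           on upcase(d.name) = upcase(s.name)
      where d.libname eq "%upcase(&dst)"
      order by d.memname, s.name;
    quit;

    /* generate and run the proc datasets step for this library */
    filename gencode temp;
    data _null_;
      set standards2 end=eof;
      by memname;
      file gencode;
      if _n_ = 1 then put "proc datasets lib=&dst nolist;";
      if first.memname then put "  modify " memname ";";
      put "  label " name "='" label +(-1) "';"; /* +(-1) drops the space put adds after label */
      put "  format " name format ";";
      if eof then put "quit;";
    run;
    %include gencode / source2;
    filename gencode clear;
  %end;
%mend standardise_all;

%standardise_all(first=1, last=10)
```

Modifying the copy rather than the source library keeps the original projects available for comparison, which matters if the reporting program is rerun afterwards to confirm that the inconsistencies are gone.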

