Data::Storage |
Data::Storage - Interface for accessing various Storage implementations for Perl in an independent way
Data Storage

"Where is the wisdom? Lost in the knowledge. Where is the knowledge? Lost in the information." - T.S. Eliot

"Where is the information? Lost in the data. Where is the data? Lost in the #@$%?!& database." - Joe Celko

from: MacPerl: Power and Ease - Chapter 15
url: http://www.macperl.com/ptf_book/r/MP/330.Data_Storage.html
- should encapsulate Tangram, DBI, DBD::CSV and LWP:: to access them in an unordinary (more convenient) way ;)
- introduce a generic layered structure, refactor *SUBLAYER*-stuff, make (e.g.) this possible:
    Perl Data::Storage[DBD::CSV] -> Perl LWP:: -> Internet HTTP/FTP/* -> Host Daemon -> csv-file
- provide generic synchronization mechanisms across arbitrary/multiple storages based on ident/checksum
  maybe it's possible to have schema-, structural- and semantical modifications synchronized???
- might be similar to http://sourceforge.net/projects/perl-repository
    # connect to LDAP
    my $ldapLocator = Data::Storage::Locator->new(
        ldap => {
            type => "NetLDAP",
            dsn => "ldap:host=192.168.10.150;binddn='cn=root, o=netfrag.org, c=de';pass=secret",
            basedn => "o=netfrag.org, c=de",
            want_transactions => 0,
            syncable => 1,
        },
    );
    my $ldapStorage = Data::Storage->new($ldapLocator);
    $ldapStorage->connect();

    # connect to MAPI
    my $mapiLocator = Data::Storage::Locator->new(
        outlook => {
            type => "MAPI",
            showProfileChooser => $self->{config}->get("mapi_showProfileChooser"),
            ProfileName => $self->{config}->get("mapi_ProfileName"),
            ProfilePass => $self->{config}->get("mapi_ProfilePass"),
            syncable => 1,
        },
    );
    my $mapiStorage = Data::Storage->new($mapiLocator);
    $mapiStorage->connect();
Synchronization functionality is now provided by the Data::Transfer::Sync module.
    my $nodemapping = {
        'LangText' => 'langtexts.csv',
        'Currency' => 'currencies.csv',
        'Country'  => 'countries.csv',
    };

    my $propmapping = {
        'LangText' => [
            [ 'source:lcountrykey' => 'target:country' ],
            [ 'source:lkey'        => 'target:key' ],
            [ 'source:lvalue'      => 'target:text' ],
        ],
        'Currency' => [
            [ 'source:ckey'  => 'target:key' ],
            [ 'source:cname' => 'target:text' ],
        ],
        'Country' => [
            [ 'source:ckey'  => 'target:key' ],
            [ 'source:cname' => 'target:text' ],
        ],
    };
    sub syncResource {

        my $self = shift;
        my $node_source = shift;
        my $mode = shift;
        my $opts = shift;

        $mode ||= '';
        $opts->{erase} ||= 0;

        $logger->info( __PACKAGE__ . "->syncResource( node_source $node_source mode $mode erase $opts->{erase} )");

        # resolve metadata for syncing requested resource
        my $node_target = $nodemapping->{$node_source};
        my $mapping = $propmapping->{$node_source};

        if (!$node_target || !$mapping) {
            # logger.... "no target, sorry!"
            print "error while resolving resource metadata", "\n";
            return;
        }

        if ($opts->{erase}) {
            $self->_erase_all($node_source);
        }

        # create new sync object
        my $sync = Data::Transfer::Sync->new(
            storages => {
                L => $self->{storage}->{backend},
                R => $self->{storage}->{resources},
            },
            id_authorities => [qw( L ) ],
            checksum_authorities => [qw( L ) ],
            write_protected => [qw( R ) ],
            verbose => 1,
        );

        # sync
        # todo: filter!?
        $sync->syncNodes( {
            direction => $mode,          # | +PUSH | +PULL | -FULL | +IMPORT | -EXPORT
            method => 'checksum',        # | -timestamp | -manual
            source => "L:$node_source",
            source_ident => 'storage_method:id',
            source_exclude => [qw( id cs )],
            target => "R:$node_target",
            target_ident => 'property:oid',
            mapping => $mapping,
        } );

    }
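The internals of Data::Transfer::Sync are not shown here, so the following is only a rough sketch of what a property mapping and a checksum comparison could look like. The helpers `map_record` and `record_checksum` are hypothetical names invented for illustration and are not part of Data::Storage or Data::Transfer::Sync. The sketch assumes that records are plain hash references and that an MD5 digest over the sorted field values is an acceptable checksum.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Mapping in the same shape as one entry of $propmapping:
# pairs of 'source:<field>' => 'target:<field>'.
my $mapping = [
    [ 'source:ckey'  => 'target:key'  ],
    [ 'source:cname' => 'target:text' ],
];

# Hypothetical helper: translate one source record into a target record.
sub map_record {
    my ($record, $map) = @_;
    my %out;
    for my $pair (@$map) {
        my ($src, $dst) = @$pair;
        (my $src_field = $src) =~ s/^source://;
        (my $dst_field = $dst) =~ s/^target://;
        $out{$dst_field} = $record->{$src_field};
    }
    return \%out;
}

# Hypothetical helper: checksum over the sorted field/value pairs,
# skipping bookkeeping fields (compare 'source_exclude').
sub record_checksum {
    my ($record, @exclude) = @_;
    my %skip  = map { $_ => 1 } @exclude;
    my @parts = map  { "$_=" . (defined $record->{$_} ? $record->{$_} : '') }
                grep { !$skip{$_} } sort keys %$record;
    return md5_hex(join "\x1f", @parts);
}

my $source   = { id => 7, cs => 'stale', ckey => 'EUR', cname => 'Euro' };
my $target   = map_record($source, $mapping);       # { key => 'EUR', text => 'Euro' }
my $checksum = record_checksum($source, qw(id cs));
```

With something like this, a record would only need to be written to the target when its checksum differs from the one stored during the previous run; excluding `id` and `cs` keeps the checksum independent of bookkeeping fields.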
    # create a new synchronization object
    my $sync = Data::Transfer::Sync->new( 'sync_version' => $sync_version, __parent => $self );

    # configure the synchronization-object
    $sync->configure(
        source => {
            storage => {
                handle => $mapiStorage,
                #isIdentAuthority => 1,
                #isChecksumAuthority => 1,
                #writeProtected => 1,
            },
        },
        target => {
            storage => {
                handle => $ldapStorage,
                #idAuthority => 1,
                #isChecksumAuthority => 1,
                #isWriteProtected => 0,
            },
        },
        verbose => 1,
    );
This module heavily relies on DBI and Tangram, but adds a lot of additional bugs and quirks. Please look at their documentation and/or this code for additional information.
For full functionality:

    DBI from CPAN
    DBD::mysql from CPAN
    Tangram 2.04 from CPAN (hmmm, 2.04 won't do in some cases)
    Tangram 2.05 from http://... (2.05 seems okay but there are also additional patches from our side)
    Class::Tangram from CPAN
    DBD::CSV from CPAN
    MySQL::Diff from http://adamspiers.org/computing/mysqldiff/
    ... and all their dependencies
Data::Storage is a module for accessing various "data structures / kinds of structured data" stored inside various "data containers". We tried to use the AdapterPattern to implement a wrapper-layer around known CPAN modules (e.g. DBI, Tangram, XML::Simple).

References:
- http://c2.com/cgi/wiki?AdapterPattern
- http://home.earthlink.net/~huston2/dp/adapter.html
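To illustrate the adapter idea in isolation, here is a minimal sketch. It is not the actual Data::Storage code: the `Sketch::*` packages, the `fetch` method and the `Memory`/`Lines` types are all made up for this example. It only assumes that each backend can be wrapped behind one common method and selected through a locator-like `type` key.

```perl
use strict;
use warnings;

# Hypothetical adapter 1: records held in a plain Perl hash.
{
    package Sketch::Handler::Memory;
    sub new   { my ($class, %args) = @_; return bless { data => $args{data} || {} }, $class }
    sub fetch { my ($self, $node) = @_; return $self->{data}{$node} || [] }
}

# Hypothetical adapter 2: records parsed from "key;text" lines.
{
    package Sketch::Handler::Lines;
    sub new   { my ($class, %args) = @_; return bless { text => $args{text} || {} }, $class }
    sub fetch {
        my ($self, $node) = @_;
        my $text = $self->{text}{$node};
        return [] unless defined $text;
        return [ map  { my ($k, $t) = split /;/, $_, 2; +{ key => $k, text => $t } }
                 grep { length } split /\n/, $text ];
    }
}

# Front end: callers use the same API regardless of the backend.
{
    package Sketch::Storage;
    my %handler_for = (
        Memory => 'Sketch::Handler::Memory',
        Lines  => 'Sketch::Handler::Lines',
    );
    sub new {
        my ($class, %locator) = @_;
        my $type = defined $locator{type} ? $locator{type} : '';
        my $impl = $handler_for{$type}
            or die "unknown storage type '$type'\n";
        return bless { handler => $impl->new(%locator) }, $class;
    }
    sub fetch { my ($self, $node) = @_; return $self->{handler}->fetch($node) }
}

my $mem   = Sketch::Storage->new(type => 'Memory',
                data => { Currency => [ { key => 'EUR', text => 'Euro' } ] });
my $lines = Sketch::Storage->new(type => 'Lines',
                text => { Currency => "EUR;Euro\nUSD;US Dollar\n" });

print scalar @{ $_->fetch('Currency') }, " record(s)\n" for $mem, $lines;
```

The calling code only ever talks to `Sketch::Storage`, so swapping the backend means changing the locator arguments rather than the calls themselves.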
You will get a better code structure (not bad for later maintenance) in growing Perl projects, especially when using multiple database connections at the same time. You will be able to switch between different _kinds_ of implementations used for storing data, and your code will use the very same API to access these storage layers. For now, switching still means changing the implementation. Maybe in the future you will be able to switch "on-the-fly" without changing a single bit of code, but that's not the focus.
Having this, we were able to implement a generic data synchronization module more easily; please look at Data::Transfer.
The Data::Storage module is Copyright (c) 2002 Andreas Motl. All rights reserved. You may distribute it under the terms of either the GNU General Public License or the Artistic License, as specified in the Perl README file.
Larry Wall for Perl, Tim Bunce for DBI, Jean-Louis Leroy for Tangram and Set::Object, Sam Vilain for Class::Tangram, Jochen Wiedmann and Jeff Zucker for DBD::CSV & Co., Adam Spiers for MySQL::Diff and all contributors.
Data::Storage is free software. IT COMES WITHOUT WARRANTY OF ANY KIND.
o interface with Jeff Zucker's AnyData:: modules, e.g. AnyData::Storage::RAM
o what about DBD::RAM? (DBD::RAM - a DBI driver for files and data structures)
o use DBD::Proxy!
o what about DBIx::AnyDBD?
o enhance schema information:
    - DBIx::SystemCatalog
    - DBIx::SystemCatalog::MSSQL?
    - Data::Reporter
``DBI-Error [Tangram]: DBD::mysql::st execute failed: Unknown column 't1.requestdump' in 'field list'''
... occurs when operating on object attributes that have not been introduced yet. This should be detected and appended to or replaced by:

    "Schema-Error detected, maybe (just) an inconsistency. Please check if your declaration in schema-module "a" matches structure in database "b" or try to run"

    db_setup.pl --dbkey=import --action=deploy
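One way such a detection could look is sketched below. The function `explain_db_error` is a hypothetical helper, not an existing Data::Storage method, and the sketch assumes the MySQL error text keeps the "Unknown column '...' in 'field list'" wording shown in the error above.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a raw driver error into the friendlier hint
# described above; any other error is passed through unchanged.
sub explain_db_error {
    my ($error, $schema_module, $database) = @_;
    if ($error =~ /Unknown column '([^']+)' in 'field list'/) {
        my $column = $1;
        return "Schema-Error detected (unknown column '$column'), "
             . "maybe (just) an inconsistency. "
             . "Please check if your declaration in schema-module \"$schema_module\" "
             . "matches structure in database \"$database\" or try to run "
             . "\"db_setup.pl --dbkey=import --action=deploy\"";
    }
    return $error;
}

my $raw = "DBI-Error [Tangram]: DBD::mysql::st execute failed: "
        . "Unknown column 't1.requestdump' in 'field list'";
print explain_db_error($raw, 'a', 'b'), "\n";
```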
Compare schema (structure diff) with database ...
... when issuing "db_setup.pl --dbkey=import --action=deploy" on a database with an already deployed schema, add "--update" to lift the schema inside the database to the currently declared schema. You will have to approve removals and changes on field level, while new objects and new fields are introduced silently without any interaction. In future versions there may be additional options to control silent processing of removals and changes.

The following CRUD table applies to the actions occurring on Classes and Class variables when deploying schemas. Don't mix this up with CRUD actions on Objects; these are already handled by (e.g.) Tangram itself.

Classes:
    C create   -> yes, handled automatically
    R retrieve -> no, not subject of this aspect since it is about deployment only
    U update   -> yes, automatically for Class meta-attributes, yes/no for Class variables (see the rules below)
    D delete   -> yes, just by user-interaction

Class variables:
    C create   -> yes, handled automatically
    R retrieve -> no, not subject of this aspect since it is about deployment only
    U update   -> yes, just by user-interaction; maybe automatically if it can be determined that data wouldn't be lost
    D delete   -> yes, just by user-interaction

The point is that it should not be easy to lose data while this is in pre-alpha stage, and losing data is definitely quite easy when schemas can be modified and redeployed easily. As we can see, creation of Classes and new Class variables is handled automatically, and this is believed to be the most common case under normal circumstances.
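The create/delete rules above can be sketched as a small planning step. This is an illustration only, not how db_setup.pl works internally: `plan_deployment` is a hypothetical function, schemas are assumed to be simple hashes of class name to field list, and updates of existing fields (type changes) are left out.

```perl
use strict;
use warnings;

# Hypothetical helper: compare declared and deployed schemas
# (class => [fields]) and classify each difference as automatic
# (creations) or as needing user approval (removals).
sub plan_deployment {
    my ($declared, $deployed) = @_;
    my @plan;
    for my $class (sort keys %$declared) {
        if (!exists $deployed->{$class}) {
            push @plan, { action => 'create class', name => $class, auto => 1 };
            next;
        }
        my %have = map { $_ => 1 } @{ $deployed->{$class} };
        my %want = map { $_ => 1 } @{ $declared->{$class} };
        push @plan, { action => 'create field', name => "$class.$_", auto => 1 }
            for grep { !$have{$_} } sort keys %want;
        push @plan, { action => 'delete field', name => "$class.$_", auto => 0 }
            for grep { !$want{$_} } sort keys %have;
    }
    push @plan, { action => 'delete class', name => $_, auto => 0 }
        for grep { !exists $declared->{$_} } sort keys %$deployed;
    return \@plan;
}

my $plan = plan_deployment(
    { Currency => [qw(key text rate)],   Country  => [qw(key text)] },   # declared
    { Currency => [qw(key text legacy)], LangText => [qw(key)] },        # deployed
);

printf "%-13s %-16s %s\n", $_->{action}, $_->{name},
    $_->{auto} ? 'automatic' : 'needs approval' for @$plan;
```

Only the entries flagged `auto` would be applied silently; everything else would wait for confirmation, which matches the goal of not losing data by accident.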
- Get this stuff together with UML (Unified Modeling Language) and/or standards from ODMG.
- Make it possible to load/save schemas in XMI (XML Metadata Interchange), which seems to be most commonly used today, perhaps handle objects with OIFML. Integrate/bundle this with a web-/html-based UML modeling tool or some other interesting stuff like the "Co-operative UML Editor" from Uni Darmstadt. (web-/java-based)
- Enable Round Trip Engineering. Keep code and diagrams in sync. Don't annoy/bother the programmers.
- Add support for some more handlers/locators to be able to access the following standards/protocols/interfaces/programs/apis transparently:
    + DBD::CSV (via Data::Storage::Handler::DBI)
    (-) Text::CSV, XML::CSV, XML::Excel
    - MAPI
    - LDAP
    - DAV (look at PerlDAV: http://www.webdav.org/perldav/)
    - Mbox (use formail for separating/splitting entries/nodes)
    - Cyrus (cyrdeliver - what about cyrretrieve (export)???)
    - use File::DiffTree, use File::Compare
    - Hibernate
    - "Win32::UserAccountDb"
    - "*nix::UserAccountDb"
    - .wab - files (Windows Address Book)
    - .pst - files (Outlook Post Storage?)
    - XML (e.g. via XML::Simple?)
- Move to t3, look at InCASE
- some kind of security layer for methods/objects
    - acls (stored via tangram/ldap?) for functions, methods and objects (entity- & data!?)
    - where are the hooks needed then?
        - is Data::Storage & Co. okay, or do we have to touch the innards of DBI and/or Tangram?
        - an attempt to start could be:
            - 'sub getACLByObjectId($id, $context)'
            - 'sub getACLByMethodname($id, $context)'
            - 'sub getACLByName($id, $context)'
              ( would require a kinda registry to look up these very names pointing to arbitrary locations (code, data, ...) )
- add more hooks and various levels
- better integrate introduced 'getObjectByGuid'-mechanism from Data::Storage::Handler::Tangram
Specs:
    UML 1.3 Spec: http://cgi.omg.org/cgi-bin/doc?ad/99-06-08.pdf
    XMI 1.1 Spec: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02.pdf
    XMI 2.0 Spec: http://cgi.omg.org/docs/ad/01-06-12.pdf
    ODMG: http://odmg.org/
    OIFML: http://odmg.org/library/readingroom/oifml.pdf

CASE Tools:
    Rational Rose (commercial): http://www.rational.com/products/rose/
    Together (commercial): http://www.oi.com/products/controlcenter/index.jsp
    InCASE - Tangram-based Universal Object Editor
    Sybase PowerDesigner: http://www.sybase.com/powerdesigner

UML Editors:
    Fujaba (free, university): http://www.fujaba.de/
    ArgoUML (free): http://argouml.tigris.org/
    Poseidon (commercial): http://www.gentleware.com/products/poseidonDE.php3
    Co-operative UML Editor (research): http://www.darmstadt.gmd.de/concert/activities/internal/umledit.html
    Metamill (commercial): http://www.metamill.com/
    Violet (university, research, education): http://www.horstmann.com/violet/
    PyUt (free): http://pyut.sourceforge.net/
    (Dia (free): http://www.lysator.liu.se/~alla/dia/)
    UMLet (free, university): http://www.swt.tuwien.ac.at/umlet/index.html
    Voodoo (free): http://voodoo.sourceforge.net/
    Umbrello UML Modeller: http://uml.sourceforge.net/
UML Tools: http://www.objectsbydesign.com/tools/umltools_byPrice.html
Further readings:
    http://www.google.com/search?q=web+based+uml+editor&hl=en&lr=&ie=UTF-8&oe=UTF-8&start=10&sa=N
    http://www.fernuni-hagen.de/DVT/Aktuelles/01FHHeidelberg.pdf
    http://www.enhyper.com/src/documentation/
    http://cis.cs.tu-berlin.de/Dokumente/Diplomarbeiten/2001/skinner.pdf
    http://citeseer.nj.nec.com/vilain00diagrammatic.html
    http://archive.devx.com/uml/articles/Smith01/Smith01-3.asp