Talk:Versioning Structured Data

From BrightByte

Semantic Wikipedia stores data as attributes, such as (Berlin) (has population) (number), and relations, such as (Berlin) (is capital of) (Germany), but real life is a bit more complicated. Each fact will need to be qualified.

(Berlin) (has population) (number) (date this was valid) (citation for the information)
(Berlin) (was capital of) (Bismarck's Germany) from (date) to (date) (citation)
(Berlin) (was capital of) (Hitler's Germany) from (date) to (date) (citation)
(Berlin) (was capital of) (East Germany) from (date) to (date) (citation)
(Berlin) (was capital of) (Reunited Germany) from (date) to (date) (citation)

How would you cope with this?

Yes, qualifying properties and relations is an additional issue: date/time ranges, source information, units of measurement, margins of error, etc. RDF handles this with a construct called reification, but it is cumbersome and inefficient. Topic Maps make it a bit easier. With a document-oriented database that allows complex values, storing and managing this information should be easy; the tricky bits are a) declaring what is valid and b) parsing the legacy text-based, mixed-up values. If the database supports only references to complex objects, not complex values, managing this kind of information becomes more annoying and less efficient, though still possible (RDF's reification works like this). -- Daniel 09:00, 4 August 2010 (UTC)
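
As a rough illustration of the "complex values" approach mentioned above, here is a minimal Python sketch. The class names (`Statement`, `Record`), the field names, and the "placeholder citation" strings are all assumptions made up for this example; they are not taken from any existing schema or extension, and the dates are only there to make the lookup concrete.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class Statement:
    """One qualified fact, stored as a complex value inside a record (hypothetical layout)."""
    prop: str
    value: object
    valid_from: Optional[date] = None  # None means "no known start"
    valid_to: Optional[date] = None    # None means "still valid"
    citation: Optional[str] = None

    def valid_on(self, day: date) -> bool:
        # A statement applies if the day falls inside its (possibly open) range.
        after_start = self.valid_from is None or self.valid_from <= day
        before_end = self.valid_to is None or day <= self.valid_to
        return after_start and before_end


@dataclass
class Record:
    """All statements about one subject, kept together in one document."""
    subject: str
    statements: list = field(default_factory=list)

    def values_on(self, prop: str, day: date) -> list:
        # Return the values of a property that were valid on the given day.
        return [s.value for s in self.statements
                if s.prop == prop and s.valid_on(day)]


# Illustrative data only; the citations are placeholders.
berlin = Record("Berlin", [
    Statement("capital of", "East Germany",
              date(1949, 10, 7), date(1990, 10, 2), "placeholder citation"),
    Statement("capital of", "Reunited Germany",
              date(1990, 10, 3), None, "placeholder citation"),
])

print(berlin.values_on("capital of", date(1960, 1, 1)))
print(berlin.values_on("capital of", date(2000, 1, 1)))
```

The qualifiers travel with the value itself, so no separate statement object has to be looked up by reference, which is the inconvenience of the reification style described above.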

Hmmm

If you already have records with IDs, then the versioning doesn't seem very different from pages in a normal wiki. But how do you assign those IDs and what do the records contain? What does a "reference to another record" look like? I think you need many small experiments as proof of concept before we have to worry about scalability and version history. --LA2

Not very different, no. The idea was to outline how versioning can be implemented on top of existing document-based database systems.
Data granularity is of course an issue, but as a rule of thumb, all properties of one subject (that is, a thing described by a Wikipedia article) would constitute a "data record". References to other records would be a pair of IDs, as explained. -- Daniel 09:33, 6 August 2010 (UTC)
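
To make the question of what a "reference to another record" looks like more concrete, here is a small Python sketch. This thread does not spell out which two IDs form the pair, so the sketch assumes a record ID plus a revision number; the `VersionedStore` class and its methods are hypothetical and only stand in for a real document database.

```python
from typing import Dict, List, Tuple

# Assumption: a reference is (record_id, revision_number).
Ref = Tuple[str, int]


class VersionedStore:
    """Toy in-memory store that keeps every revision of every record."""

    def __init__(self) -> None:
        self._revisions: Dict[str, List[dict]] = {}

    def save(self, record_id: str, properties: dict) -> Ref:
        # Each save appends a new revision; old revisions are never changed.
        history = self._revisions.setdefault(record_id, [])
        history.append(dict(properties))  # shallow copy of the properties
        return (record_id, len(history) - 1)

    def load(self, ref: Ref) -> dict:
        record_id, revision = ref
        return self._revisions[record_id][revision]

    def latest(self, record_id: str) -> Ref:
        return (record_id, len(self._revisions[record_id]) - 1)


store = VersionedStore()
germany_v0 = store.save("Germany", {"name": "Germany"})
# The Berlin record points at one specific revision of the Germany record.
berlin_v0 = store.save("Berlin", {"capital of": germany_v0})
germany_v1 = store.save("Germany", {"name": "Germany", "continent": "Europe"})

print(store.load(berlin_v0)["capital of"])
print(store.latest("Germany"))
```

With this layout, a record is just the dictionary of all properties of one subject, and an old revision of Berlin keeps pointing at the revision of Germany it was written against.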

DataTransclusion Extension

I am not sure if this is the only place to ask questions about the DataTransclusion extension. Is it possible to get a couple of examples of what to put in the LocalSettings.php file to create sources, say one for a fictional database and one for web access? Thanks.