2005-10-21

More Wilbur2 progress

I have made good progress with the Wilbur2 manual. It is very useful to write a document that - supposedly - would help others understand your thinking (about design, that is). I have discovered things that I now want to change. Here are some of them:

  1. I no longer see any reason to have two separate packages for the Wilbur code, so I will merge the two; for compatibility reasons "NOX" will be made a nickname of the "WILBUR" package.

  2. It seems that proper handling of information about where triples came from cannot be postponed any longer (I was inspired by this thread on the Semantic Web Interest Group mailing list). The current Wilbur design does not allow "duplicate" triples, but records at most one source with every triple. What if multiple documents assert the same triple - we still would like only one in the database, right? Consequently, the triple class will be changed to allow multiple sources, and when a source is deleted (as happens when you reload a source, for example), only those triples are deleted that came only from that source.

  3. Regarding #2, we still need to offer the option of deleting a triple outright, regardless of how many sources assert it, when individual triples are being deleted.

  4. I am wondering if there are applications where one really does not need to record where triples came from?

  5. Many function signatures in the "Data Source Loading Protocol" will change, as I have rethought the design; no worries, though, because db-load will still stay the same.
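
To make the bookkeeping in #2 concrete, here is a minimal sketch in Python rather than Lisp. It is not Wilbur code: the class `TripleStore` and its methods `add`, `delete_source` and `sources_of` are hypothetical names chosen for illustration, and the real triple class may look quite different. It only shows one way of keeping a single copy of each triple together with the set of sources that assert it.

```python
class TripleStore:
    """Hypothetical illustration: one entry per triple, with a set of sources."""

    def __init__(self):
        # Maps a (subject, predicate, object) tuple to the set of sources
        # that asserted it. A triple is stored once, however many sources
        # assert it.
        self._sources = {}

    def add(self, triple, source):
        # Adding an already-known triple only records one more source.
        self._sources.setdefault(triple, set()).add(source)

    def delete_source(self, source):
        # Detach the source from every triple; a triple disappears only
        # when no other source still asserts it.
        for triple in list(self._sources):
            remaining = self._sources[triple]
            remaining.discard(source)
            if not remaining:
                del self._sources[triple]

    def sources_of(self, triple):
        # Copy of the source set, or an empty set for an unknown triple.
        return set(self._sources.get(triple, ()))

    def triples(self):
        return set(self._sources)


db = TripleStore()
shared = ("ex:a", "ex:knows", "ex:b")
db.add(shared, "doc1")
db.add(shared, "doc2")
db.add(("ex:a", "ex:name", "A"), "doc1")

# Reloading doc1 starts by deleting it as a source.
db.delete_source("doc1")
```

After the last call, the shared triple is still present (doc2 still asserts it), while the triple that came only from doc1 is gone.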

Any comments regarding any aspects of Wilbur design are always welcome.

Posted by ora at 10:06

Comments

  1. I have a vCard parser that relies on NOX; the XML parser is a very neat design, and isn't too tied to Wilbur. I think they should be separate for conceptual reasons.

  2. I wasn't aware of the source-overwriting aspect! Hmm.

  3. db-del-triple, db-del-triple-with-source?

  4. During programmatic creation. E.g., when I'm recording meta-data about the application, users, etc. in RDF, it's not being loaded from anywhere. Also, any "pure RDF" app, because triples have no notion of a source.

Looking forward to the manual :)

Posted by: Rich at October 21, 2005 11:13 AM

Hmm...

  1. We can have NOX as a separate component, but I will import the NOX symbols into WILBUR so that package prefixes are not necessary anymore.

  2. Sorry...

  3. I was thinking more an optional parameter for DB-DEL-TRIPLE to indicate source; omitting would nuke the triple for good.

  4. Agreed.

Thanks!

Posted by: Ora Lassila at October 21, 2005 03:27 PM

  1. Sounds good!
  2. No worries :) I think I was under the impression that multiple triple objects were distinct if their sources were different, but of course that's not the case when stored in a DB -- I do enough stuff with multiple databases (and thus lists of triples) that I'd got used to working with triples outside of the indexed DBs (hence the equality functions in twinql, IIRC).
  3. That works for me!
  4. Perhaps, given this, it would be necessary to have NIL (or T; I suppose it doesn't matter) as an allowable source. Then we can DB-DEL-TRIPLE and specify a source (by providing a node), those with no source (NIL), or 'nuke all' (T).

Posted by: Rich at October 22, 2005 07:48 PM
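
The deletion options discussed in this thread can be sketched as follows. This is an illustration in Python, not the actual DB-DEL-TRIPLE: the names `SourcedTripleDB`, `del_triple` and the `ALL` sentinel are hypothetical. `None` stands in for NIL (a triple created programmatically, with no source), and `ALL` stands in for T ("nuke all"), which is also the default when the source argument is omitted.

```python
# Sentinel playing the role of T in the discussion: delete regardless of source.
ALL = object()


class SourcedTripleDB:
    """Hypothetical illustration of a delete operation with an optional source."""

    def __init__(self):
        # triple -> set of sources; None marks "no source" (NIL).
        self._sources = {}

    def add(self, triple, source=None):
        self._sources.setdefault(triple, set()).add(source)

    def del_triple(self, triple, source=ALL):
        if triple not in self._sources:
            return
        if source is ALL:
            # Source omitted: remove the triple for good.
            del self._sources[triple]
            return
        # Otherwise detach only the given source (a node, or None for
        # the sourceless assertion) and drop the triple if nothing is left.
        remaining = self._sources[triple]
        remaining.discard(source)
        if not remaining:
            del self._sources[triple]

    def sources_of(self, triple):
        return set(self._sources.get(triple, ()))

    def __contains__(self, triple):
        return triple in self._sources


db = SourcedTripleDB()
t = ("ex:app", "ex:user", "ex:rich")
db.add(t)              # created programmatically, no source
db.add(t, "doc1")      # also asserted by a loaded document

db.del_triple(t, None)     # removes only the sourceless assertion
db.del_triple(t, "doc1")   # last source gone, so the triple goes too
```

One design question this leaves open is whether "no source" should behave like any other source, as it does here, or be treated specially.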