[clug] Document metadata conversion?

Brad Hards bradh at frogmouth.net
Sat Jul 16 21:12:18 MDT 2011


Hi,

I'm working on a parser that reads RTF and converts it to other formats, such 
as a QTextDocument or Open Document Format (.odt).
It's at https://launchpad.net/rtf-qt

I've got a "user expectation" question about metadata.

Let's say the RTF document has "date/time created", "date/time modified" and 
"date/time printed" metadata. If I'm using my parser as an import filter (to a 
word processor such as Calligra Words), then presumably I should just pass all 
that through. Conversion to ODF is an implementation detail that the user 
shouldn't be aware of - they're just "opening the RTF file".

However, if I'm working on a command line tool that does the conversion and 
writes out ODF, I'm not sure what the right thing to do is. In particular, 
should the "date/time created" of the output document be the value from the 
input document, or the time at which the conversion was run?

Conceptually, is this metadata a property of the file, or of the document? 

It appears to me that some metadata (like the author or title) is a property 
of the document. Other metadata (like the generator) is a property of the file.
But this leads to a lot of subjectivity.
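
To make the question concrete, here is a rough sketch in Python. It is not 
code from rtf-qt: the field names, the convert_metadata function and the 
preserve_dates switch are all made up for illustration. It assumes that title 
and author always follow the document, that the generator always describes 
the output file, and it leaves the treatment of the dates as an explicit 
policy choice.

```python
from datetime import datetime, timezone

# Assumed split, for illustration only: fields that follow the document...
DOCUMENT_FIELDS = {"title", "author"}
# ...and the date fields whose treatment is the open question.
DATE_FIELDS = {"created", "modified", "printed"}


def convert_metadata(source, generator, now=None, preserve_dates=True):
    """Build output-file metadata from the source document's metadata.

    source is a dict of metadata read from the input document.
    generator names the tool writing the output file.
    preserve_dates chooses between the two candidate behaviours.
    """
    if now is None:
        now = datetime.now(timezone.utc)

    # Document-level fields are passed through unchanged.
    out = {k: v for k, v in source.items() if k in DOCUMENT_FIELDS}

    if preserve_dates:
        # Policy A: the dates belong to the document, so copy them across.
        out.update({k: v for k, v in source.items() if k in DATE_FIELDS})
    else:
        # Policy B: the dates belong to the file, so the output file was
        # created when the conversion ran; modified/printed are dropped.
        out["created"] = now

    # The generator always describes the tool that wrote this file.
    out["generator"] = generator
    return out
```

An import filter would presumably call this with preserve_dates=True. The 
question is which default a command line converter should use.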

Has anyone got a good approach to this? Thoughts or suggestions?

Brad

