[clug] Watercooler - was Open Source Developers in CBR

Wed Dec 29 04:48:47 UTC 2021

Thought I might revive this thread with an update on my grand project.

First, some background. When I retired I had the title of "Software
Architect". The role was to do the high-level design for software
systems. I came to realise that there were very few tools available to
assist, with much of the work being done with word processors.

The result was that the design documents were often out of date and
frequently a confused mess that made it difficult to find the
information you were looking for.

My idea was to put all the design information into a knowledge database
and then create reports as needed and tailored to the task at hand.
Thus the network engineers could get a network view of the design, the
database administrators a database view, etc.

A knowledge database is usually based on a linked data system, such as
RDF, OWL (the Web Ontology Language), or Neo4j. This provides the
flexibility necessary when trying to describe a design.
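
As a rough illustration of that flexibility, here is a minimal Python sketch of a design held as subject-predicate-object triples. Everything in it (the `ex:` identifiers, the `match` helper, the sample facts) is invented for the example and is not taken from the C++ RDF library mentioned later in this post.

```python
# Illustrative only: a design held as (subject, predicate, object) triples.
triples = [
    ("ex:web01", "ex:type", "ex:Server"),
    ("ex:web01", "ex:connectsTo", "ex:db01"),
    ("ex:db01", "ex:type", "ex:Database"),
    ("ex:db01", "ex:hostedOn", "ex:rack7"),
    # A non-standard fact needs no schema change, just one more triple.
    ("ex:web01", "ex:note", "shares a power feed with db01"),
]


def match(store, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return [
        t for t in store
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# Everything known about one component, whatever shape the facts take.
about_web01 = match(triples, s="ex:web01")

# Every component of a given type.
servers = match(triples, p="ex:type", o="ex:Server")
```

The point of the sketch is that the odd fact (the shared power feed) sits alongside the regular ones without any table having to be redesigned first.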

The trouble is that this flexibility makes reporting a bit
"interesting". Report systems based on the rigid structures of
relational databases are not useful except for those parts of the
design that are well structured. The danger is that the important parts
of the design (the non-standard bits) get left out of the reports.

What is needed is something that can explore the knowledge database
with some fuzzy guidelines [such as "networking"] and organise the
results into a coherent document.
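
One simple way to picture such an exploration is a bounded walk outwards from a seed topic. The Python sketch below is only an assumption about what "fuzzy guidelines" might look like in practice: the `explore` function, the `max_hops` limit, and the sample triples are all hypothetical, and organising the results into a document is not attempted here.

```python
from collections import deque

# Illustrative sample data; the identifiers are made up.
triples = [
    ("ex:web01", "ex:inSubnet", "ex:dmz"),
    ("ex:dmz", "ex:topic", "ex:networking"),
    ("ex:fw01", "ex:protects", "ex:dmz"),
    ("ex:web01", "ex:runs", "ex:nginx"),
    ("ex:db01", "ex:topic", "ex:storage"),
]


def explore(store, seed, max_hops=2):
    """Collect triples within max_hops links of seed, in either direction."""
    seen = {seed}
    found = []
    queue = deque([(seed, 0)])
    while queue:
        node, hops = queue.popleft()
        if hops == max_hops:
            continue  # Do not expand nodes on the boundary.
        for t in store:
            s, _, o = t
            if node not in (s, o):
                continue
            if t not in found:
                found.append(t)
            other = o if s == node else s
            if other not in seen:
                seen.add(other)
                queue.append((other, hops + 1))
    return found


networking_view = explore(triples, "ex:networking", max_hops=2)
```

With these sample facts the walk picks up the subnet, the server in it, and the firewall protecting it, while the storage fact and the more distant software fact stay out of the networking view.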

So, for the last year I have been studying academic papers on text
generation and developing various tools that I think will be required
to do some experiments in text generation. Currently I have a C++ RDF
library, a GUI for viewing and editing RDF, an inference engine, a
program for creating markdown from an RDF description of a document,
and a program that converts an RDF grammar tree into English sentences.
I am currently working on a rule engine.

The next step is to test the rule engine to see if I can create reports
from well structured data, such as an RDF schema. Then it will be time
to try to teach the computer to write English! I think 2022 is going to
be interesting.

I must admit that after reading the academic papers on the subject I
was a bit depressed. The main stumbling block was sorting the data from
the database into a coherent sequence. They all used the approach of
developing a plan based on some communicative goal and then trying to
find data to satisfy the plan. This has the problem that some important
data can be left out if it doesn't happen to fit the plan - the same
problem I started with. Attempts to use all the data ended up with
exponential running times.
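
The omission problem can be shown with a toy example. The Python sketch below is not the method of any particular paper; the `plan` slots, the `fill_plan` helper, and the sample facts are all hypothetical, chosen only to show how a fixed plan silently drops whatever it has no slot for.

```python
# Illustrative facts about one component; the contents are made up.
facts = [
    {"kind": "purpose", "text": "web01 serves the public site"},
    {"kind": "location", "text": "web01 sits in the DMZ"},
    {"kind": "quirk", "text": "web01 shares a power feed with db01"},
]

# A fixed plan: an ordered list of the kinds of fact the report expects.
plan = ["purpose", "location"]


def fill_plan(plan, facts):
    """Pick the first unused fact for each slot; return (chosen, left_out)."""
    chosen = []
    for slot in plan:
        for fact in facts:
            if fact["kind"] == slot and fact not in chosen:
                chosen.append(fact)
                break
    left_out = [f for f in facts if f not in chosen]
    return chosen, left_out


chosen, left_out = fill_plan(plan, facts)
```

Here the plan is satisfied, yet the shared power feed, arguably the most important fact, ends up in `left_out` because no slot asked for it.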

It occurred to me that I usually start with what I want to say and
then devise a suitable plan for expressing it, not the other way
around. I devised a small experiment to try that idea and was
pleasantly surprised to find that it put the data in a nice coherent
order and did so as quickly as any good sorting routine.

Probably getting ahead of myself, but I cannot help wondering whether
that coherent sorting routine could be adapted to two dimensions so
that it can be used for composing diagrams and data entry forms.

So, that is the state of my grand project. If you want to know more,
are interested in linked data or software architecture, or are a whiz
at English linguistics, please feel free to contact me.

Brenton

> 
> Links:
> https://sourceforge.net/p/ocratato-sassy/nlg/code/ 
> 
> https://sourceforge.net/p/ocratato-sassy/rdfxx/code/
>