
Interesting project. Out of curiosity, why did you compare against kdb+, since the models are very different? kdb+ is mostly used for in-memory time-series, while yours seems to be a graph-oriented DB. Also, why did you choose to build your own language instead of using an existing one [1]?

[1] https://en.wikipedia.org/wiki/GQL_Graph_Query_Language



The reason is that we were positioning ourselves to deal with customers whose financial data is stored as time-series. We aren't hoping to compete with kdb+ on speed (that would be hopeless), but we have a prototype of a Constraint Logic Programming (CLP(fd))-based approach to time queries which is very expressive, and which we hope to roll out on hub in the main product in the near future.
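To make the CLP(fd) idea concrete: a time variable is treated as an integer over a finite domain, and a query is a conjunction of constraints that narrow that domain. This is only an illustrative sketch (a brute-force toy, not the actual prototype); the window and constraints are made up for the example:

```python
# Toy finite-domain constraint solving: a time variable ranges over
# a finite integer domain, and each constraint prunes it.

def solve(domain, constraints):
    """Return the values in `domain` satisfying every constraint."""
    return [t for t in domain if all(c(t) for c in constraints)]

# Hypothetical query: timestamps (seconds since midnight) that fall
# in a trading window AND land exactly on the hour.
seconds_in_day = range(0, 86400)
answers = solve(
    seconds_in_day,
    [
        lambda t: 32400 <= t <= 36000,  # between 09:00 and 10:00
        lambda t: t % 3600 == 0,        # exactly on the hour
    ],
)
print(answers)  # [32400, 36000]
```

A real CLP(fd) engine propagates the constraints symbolically instead of enumerating, which is what makes the approach expressive without being hopeless on performance.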

The graph database is still in its infancy and there are a lot of graph query languages about. We played around with using some that already exist (especially SPARQL) but decided that we wanted a number of features that were very non-standard (such as CLP(fd)).

Using JSON-LD as the definition, storage and interchange format for the query language has advantages. Since we can marshal JSON-LD into and out of the graph, it is easy to store queries in the graph. It is also very simple to write query libraries for a range of languages by just building up a JSON-LD object and sending it off.
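The "just build a JSON-LD object and send it off" point can be sketched in a few lines. The keys below are hypothetical, not the actual WOQL vocabulary, but they show why a client library needs nothing more than plain dicts and a serializer:

```python
import json

# A "query" is just a JSON-LD document built from plain dicts.
# (Field names here are illustrative, not the real WOQL schema.)
query = {
    "@type": "Triple",
    "subject": {"@type": "Variable", "name": "Person"},
    "predicate": "name",
    "object": {"@type": "Variable", "name": "Name"},
}

payload = json.dumps(query)          # ship this to the server...
assert json.loads(payload) == query  # ...and it round-trips losslessly
```

Because the query is itself a JSON-LD document, the same marshalling path that stores any other document in the graph can store the query too.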

We are firmly of the belief that datalog-style query languages which favour composability will eventually win the query language wars - even if it is not our particular variety which does so. Composability was not treated as centrally as it should have been in most of the graph languages.
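The composability claim is easy to demonstrate once queries are values rather than strings: two queries combine into a third with no string splicing. A minimal sketch, with a hypothetical "And" wrapper mirroring datalog conjunction:

```python
# When queries are plain data, composition is just data construction.
# (The "Triple"/"And" shapes are illustrative, not the real schema.)

def triple(s, p, o):
    return {"@type": "Triple", "subject": s, "predicate": p, "object": o}

def woql_and(*queries):
    return {"@type": "And", "clauses": list(queries)}

born_in = triple("v:Person", "born_in", "v:City")
capital = triple("v:City", "capital_of", "v:Country")

# Conjunction of the two sub-queries, reusable as a sub-query itself.
combined = woql_and(born_in, capital)
assert combined["clauses"] == [born_in, capital]
```

Contrast with string-based query languages, where composing two queries means splicing text and hoping variable names and clause boundaries line up.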


I have two questions regarding the time-series aspect of TerminusDB:

1) Does TerminusDB support raw one-dimensional time-series data, for example an electrocardiogram (ECG)? At the moment we store them, together with the metadata, in column-based CSV format. FYI, one minute of raw ECG data is around 1 MB.

2) For automated ECG analysis the data is transformed using a time-frequency distribution (two-dimensional), and the intermediate data must be kept in-memory for feature extraction. I'm wondering whether TerminusDB can support this intermediate time-frequency format/structure as well? FYI, the one-minute time-frequency ECG data transformed from (1) needs around 4 GB of working memory. For real-time analysis of a longer ECG duration from (1), for example 30 minutes (30 MB of data), we need around 3 TB of working memory.
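The jump from 4 GB to ~3 TB makes sense under a quadratic-growth assumption (typical of Wigner-Ville-style time-frequency distributions, where an N-sample signal yields an N x N plane) - that assumption is mine, inferred from the figures above, not stated by the commenter:

```python
# Back-of-the-envelope check of the figures above, assuming the
# time-frequency representation grows quadratically with duration.

GB_PER_MINUTE_SQUARED = 4  # 1 minute of ECG -> ~4 GB TFD (given)

def tfd_memory_gb(minutes):
    return GB_PER_MINUTE_SQUARED * minutes ** 2

print(tfd_memory_gb(1))   # 4 (GB)
print(tfd_memory_gb(30))  # 3600 (GB), i.e. ~3.5 TB - the ~3 TB ballpark
```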


Composability is front and center in GQL, so you may want to consider it.


We will 100% consider it and have been engaging with the community about the best approach. Cypher is by far the biggest graph query language and its proponents seem to have the most weight in the conversation so far, but we are going to try to represent datalog as far as possible. Even if WOQL isn't the end result, we think datalog is the best basis for graph query, so we'll keep banging the drum (especially as more people realize how important composability is).



