So in essence, you're motivated by the same underlying concept as the Plan 9/Inf...

grandalf · on Oct 29, 2014

> what advantages does having the RDBMS be the prime metaphor over the file system?

Easy queryability and easy joining of data across a whole datacenter.

This makes it easy to think about system data across all sorts of boundaries that the file metaphor makes somewhat cumbersome.

trhway · on Oct 29, 2014

> This makes it easy to think about system data across all sorts of boundaries that the file metaphor makes somewhat cumbersome.

while file metaphor became big in previously SQL domain of data processing, i.e. the whole ecosystem of HDFS and everything on top of it

electrum · on Oct 29, 2014

At the end of the day, almost everyone queries those HDFS files with query language using Pig, Hive, Presto, etc.

geofft · on Oct 30, 2014

"Everything is a file" means that you need to parse files to get useful data out (think things like /etc/mtab and /proc/mounts), which is closely tied to another UNIX philosophy, that tools should generate plain text and parse it using generic text-processing tools. This is great for getting things done quickly. It's also great for security holes (think CVE-2011-1749 and related issues; arguably, see also Shellshock).

One advantage of "everything is a table" is that your structures are well-formatted and there's no risk of problems when you put a space in a pathname. For most implementations of "table", you can also have the data formats be well-typed. This brings reliability and security benefits.

I think there's validity to Rob Pike's argument in many contexts -- for instance, you absolutely won't see me defending the semantic web over the greppable/Googleable one. But in the specific case of text files with a single, well-defined structure, his own argument seems to imply that there's no sense in a second tool having to infer the structure on its own.

(The usual way this is worked around these days is separate files for each field, or files designed to be parseable, which is why Linux's /proc/*/ is such a mess. Compare /proc/self/stat and /proc/self/status, and /proc/self/mounts and /proc/self/mountinfo. Also look around /sys a bit.)

harelba · on Oct 31, 2014

There's a command line tool called q, which allows performing SQL-like queries directly on text files, basically treating text as data and auto detecting column types.

http://harelba.github.io/q/

geofft · on Nov 2, 2014

Neat, but auto-detection is exactly what I don't want. We have structure on one side. Why round-trip it through an unstructured format and attempt to guess the exact same structure on the other side? If I guess wrong, it's a security hole.