
So in essence, you're motivated by the same underlying concept as the Plan 9/Inferno developers: define a small set of abstractions and apply them ruthlessly, in contrast to the myriad non-uniform interfaces one ends up using every day in an operating system.

For Plan 9, it was "everything is a file" and the power of using simple operations like bind mounts to create complex software interactions that would otherwise require monolithic protocol and library stacks anywhere else.

Here, it's... "everything is a table"? I'm not very familiar with table-oriented programming, but what advantages does having the RDBMS be the prime metaphor over the file system really bring? Structure? Rob Pike had some interesting words on that: http://slashdot.org/story/50858



> what advantages does having the RDBMS be the prime metaphor over the file system?

Easy queryability and easy joining of data across a whole datacenter.

This makes it easy to think about system data across all sorts of boundaries that the file metaphor makes somewhat cumbersome.
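To make the "easy joining" point concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `processes` and `listeners` tables, their columns, and the sample rows are all hypothetical, made up for illustration; they are not the schema of any particular system.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with two hypothetical system tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE processes (host TEXT, pid INTEGER, name TEXT);
        CREATE TABLE listeners (host TEXT, pid INTEGER, port INTEGER);
        INSERT INTO processes VALUES
            ('web1', 101, 'nginx'),
            ('web1', 202, 'sshd'),
            ('db1',  101, 'postgres');
        INSERT INTO listeners VALUES
            ('web1', 101, 80),
            ('web1', 202, 22),
            ('db1',  101, 5432);
        """
    )
    return conn


def listeners_by_name(conn):
    """Join process names to listening ports across every host at once."""
    return conn.execute(
        """
        SELECT p.host, p.name, l.port
        FROM processes AS p
        JOIN listeners AS l ON l.host = p.host AND l.pid = p.pid
        ORDER BY p.host, l.port
        """
    ).fetchall()


if __name__ == "__main__":
    for row in listeners_by_name(build_demo_db()):
        print(row)
```

With files, the same question would typically mean collecting and parsing two differently formatted text sources per host and matching them up by hand.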


> This makes it easy to think about system data across all sorts of boundaries that the file metaphor makes somewhat cumbersome.

Meanwhile, the file metaphor has become big in data processing, which was previously SQL's domain: the whole ecosystem of HDFS and everything built on top of it.


At the end of the day, almost everyone queries those HDFS files with a query language anyway, using Pig, Hive, Presto, etc.


"Everything is a file" means that you need to parse files to get useful data out (think things like /etc/mtab and /proc/mounts), which is closely tied to another UNIX philosophy, that tools should generate plain text and parse it using generic text-processing tools. This is great for getting things done quickly. It's also great for security holes (think CVE-2011-1749 and related issues; arguably, see also Shellshock).

One advantage of "everything is a table" is that your structures are well-formatted and there's no risk of problems when you put a space in a pathname. For most implementations of "table", you can also have the data formats be well-typed. This brings reliability and security benefits.

I think there's validity to Rob Pike's argument in many contexts -- for instance, you absolutely won't see me defending the semantic web over the greppable/Googleable one. But in the specific case of text files with a single, well-defined structure, his own argument seems to imply that there's no sense in a second tool having to infer the structure on its own.

(The usual way this is worked around these days is separate files for each field, or files designed to be parseable, which is why Linux's /proc/*/ is such a mess. Compare /proc/self/stat and /proc/self/status, and /proc/self/mounts and /proc/self/mountinfo. Also look around /sys a bit.)
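As a rough illustration of the space-in-a-pathname problem, here is a small Python sketch of what a consumer of /proc/mounts has to do. The kernel writes a space inside a mount point as the octal escape \040 (tab, newline and backslash are escaped the same way), so a reader that only splits on whitespace gets the right number of fields but the wrong path. The sample line and the function names are made up for the example.

```python
import re

# Matches a backslash followed by exactly three octal digits, e.g. \040.
_OCTAL_ESCAPE = re.compile(r"\\([0-7]{3})")


def unescape(field):
    """Turn octal escapes such as \\040 back into the characters they stand for."""
    return _OCTAL_ESCAPE.sub(lambda m: chr(int(m.group(1), 8)), field)


def parse_mount_line(line):
    """Parse one /proc/mounts-style line into a dict of its first four fields."""
    device, mountpoint, fstype, options = line.split()[:4]
    return {
        "device": unescape(device),
        "mountpoint": unescape(mountpoint),
        "fstype": fstype,
        "options": options.split(","),
    }


if __name__ == "__main__":
    # Hypothetical entry for a tmpfs mounted at "/mnt/my disk".
    sample = r"tmpfs /mnt/my\040disk tmpfs rw,nosuid 0 0"
    print(sample.split()[1])                       # naive reader: escaped path
    print(parse_mount_line(sample)["mountpoint"])  # decoded path
```

Every tool that reads the file has to know about this escaping convention on its own; a typed table column would carry the path as-is.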


There's a command-line tool called q, which lets you run SQL-like queries directly on text files, basically treating text as data and auto-detecting column types.

http://harelba.github.io/q/


Neat, but auto-detection is exactly what I don't want. We have structure on one side. Why round-trip it through an unstructured format and attempt to guess the exact same structure on the other side? If I guess wrong, it's a security hole.
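A toy sketch of the failure mode, in Python. This is not q's actual detection logic, just a deliberately simple "try int, then float, else string" guesser of the kind such tools tend to resemble, compared with loading the same row under a schema the producer already knew. The sample row is made up.

```python
def guess(value):
    """Guess a type for one text field: int first, then float, else leave as str."""
    for cast in (int, float):
        try:
            return cast(value)
        except ValueError:
            pass
    return value


def load_guessing(row):
    """Load a row of text fields, inferring each field's type independently."""
    return [guess(field) for field in row]


def load_with_schema(row, schema):
    """Load a row of text fields using explicitly declared column types."""
    return [cast(field) for cast, field in zip(schema, row)]


if __name__ == "__main__":
    # Hypothetical row: a ZIP code, an identifier that looks numeric, a name.
    row = ["02139", "1e3", "alice"]
    print(load_guessing(row))                      # ZIP code and identifier are altered
    print(load_with_schema(row, (str, str, str)))  # fields survive unchanged
```

The guesser silently drops the leading zero and turns the identifier into a float, whereas the declared schema preserves what the producing side meant.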



