One nice thing I found today is that ScyllaDB has a pretty nice JSON feature, th...

geenat · on April 23, 2023

So ScyllaDB allows you to choose how much Consistency in CAP you desire, in order to gain performance. Pretty interesting.

This is in contrast to CockroachDB, Postgres, etc., where Consistency in CAP is always enforced and there's no way to turn it off.

You turn it on or off simply by specifying IF NOT EXISTS (called "lightweight transactions" or LWT), which is insanely simple: https://www.scylladb.com/2020/07/15/getting-the-most-out-of-...

I wish CockroachDB had a way to trade consistency for speed when desired.

Another side note: in my limited investigation so far, when using LWT in ScyllaDB, CockroachDB INSERT and ScyllaDB INSERT perform about evenly. Still investigating, but this makes sense: only when you avoid IF NOT EXISTS does ScyllaDB pull away massively in performance.

geenat · on April 22, 2023

One strategy I've considered:

Write Postgres in a way that's fully compatible with CockroachDB, with the intention of transitioning to CockroachDB if needed.

Basically:

1. BIGINT, UUID, TEXT, or composite primary keys only.
2. Limited use of sequential indexes (e.g. timestamp; not the end of the world, but it creates a hotspot and will require a hash-sharded index).
3. Avoid advanced features such as stored procedures.
4. Avoid foreign keys, or use them very lightly (performance).
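To illustrate point 2: sequential keys such as timestamps sort next to each other, so all new writes land in the same key range. A hash-sharded index spreads them out by prefixing the key with a bucket derived from a hash. Here is a minimal sketch of that idea in Python. The function names and bucket count are hypothetical, and `zlib.crc32` is only a stand-in hash, not what CockroachDB uses internally.

```python
import zlib
from typing import Tuple

BUCKET_COUNT = 8  # hypothetical value, chosen only for illustration


def shard_bucket(key: str, buckets: int = BUCKET_COUNT) -> int:
    """Map a key to a bucket with a stable hash (crc32 as a stand-in)."""
    return zlib.crc32(key.encode("utf-8")) % buckets


def sharded_key(key: str, buckets: int = BUCKET_COUNT) -> Tuple[int, str]:
    """Prefix the key with its bucket so neighbouring keys sort apart."""
    return (shard_bucket(key, buckets), key)


# Sixty consecutive timestamps, i.e. a purely sequential write pattern.
timestamps = [f"2023-04-22T12:00:{s:02d}" for s in range(60)]

# Without sharding these all sort into one contiguous range;
# with the bucket prefix they are spread over several ranges.
used_buckets = {shard_bucket(ts) for ts in timestamps}
print(f"{len(timestamps)} sequential keys spread over {len(used_buckets)} buckets")
```

The trade-off is that a range scan over the original key now has to visit every bucket, which is why this is worth doing only for indexes that actually see sequential writes.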

Going straight to CockroachDB seems logical as well, but you'll hit the performance wall much sooner on 1-2 servers, and you'll need a cluster to match Postgres.

Nathanba · on April 23, 2023

If crdb had some kind of perpetual fallback license like JetBrains, then I would be far happier with the prospect of eventually switching to crdb as an option. But "contact us" pricing is not exactly unscary pricing. If you think about it, even the infamously opaque Microsoft SQL Server has a pricing table nowadays. So I tried to calculate some kind of pricing comparison, looking at the CPU alone for now. I'm seeing that they ask for $0.46/h for m5.large and $6.34/h for their m5.8xlarge (managed dedicated offering). Using the AWS price calculator and selecting these instance types, it would cost roughly 4x less to rent the AWS instances directly. Paying 4 times your raw server cost for licensing is not exactly a dream come true.
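The arithmetic behind that "roughly 4x" can be sketched as follows. The managed prices are the ones quoted above. The AWS figures are an assumption: on-demand Linux prices for us-east-1 as I remember them, which vary by region and over time, so plug in whatever the calculator shows you.

```python
# Managed dedicated prices quoted in the comment above (USD per hour).
managed = {"m5.large": 0.46, "m5.8xlarge": 6.34}

# ASSUMPTION: AWS on-demand Linux prices in us-east-1 (USD per hour).
# These are not from the original comment; check current pricing.
aws_on_demand = {"m5.large": 0.096, "m5.8xlarge": 1.536}


def markup(instance: str) -> float:
    """Ratio of the managed price to the raw instance price."""
    return managed[instance] / aws_on_demand[instance]


for name in managed:
    print(f"{name}: {markup(name):.1f}x the raw instance price")
```

With those assumed AWS prices the ratio comes out at about 4.8x for m5.large and 4.1x for m5.8xlarge.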

but scylladb really has so many weird footguns. I just discovered that you can't even query for NULL values. It just doesn't work, they have no support for WHERE x = NULL even with an index. That's crazy to me, if I ever make a mistake and have a few NULL'ed values then I can't even find them again so I can set them to an empty string to make them queryable. I just don't know, I almost feel like scylladb has too many tradeoffs.

geenat · on April 24, 2023

Found a solution to address the lack of WHERE x = NULL

SELECT * FROM users;

Then you can plug the primary keys into:

UPDATE users SET address='' WHERE user_id IN (9affeac3-5d92-111d-779c-55eb6d78a806, ...) IF address=null;

After that, the formerly null rows are queryable, so you can SELECT them to do your repair:

SELECT * FROM users WHERE address='';

_______________________

Another footgun: No default values for columns in CREATE TABLE. Very annoying.

Another footgun: Only "=" and "IN (...)" are supported in WHERE for partition keys. https://docs.scylladb.com/stable/cql/dml.html#the-where-clau...

This means, you cannot do UPDATE WHERE user_id > 0 ....

I'm starting to really appreciate the effort CockroachDB went to in making their version of "CQL" Postgres compatible, even though architecturally they are both key-value store databases under the hood. It's just a shame CockroachDB has far lower performance because of enforced consistency, with no way to turn it off.

To be fair, ScyllaDB is removing footguns with every new version, just not fast enough for my taste.

geenat · on April 23, 2023

Can confirm the inability to select null values. Have not yet found a reasonable pattern / workaround. The reasoning seems to be that null values are not stored: it's displayed as null in cqlsh, but in reality, that data just does not exist.

For the record, CockroachDB does not store null either, but it allows querying null.

A workaround may be stepping through the full table manually using the PK (which can never be null) checking for null "in app" and cleaning them that way.
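A rough sketch of that in-app scan, in Python. Rows are modelled as plain dicts so the example runs without a cluster. In a real application they would come from paging through a full `SELECT` with your driver. The function names, the `user_id` / `address` columns, and the batch size are all hypothetical.

```python
from typing import Any, Dict, Iterable, Iterator, List


def keys_with_null(rows: Iterable[Dict[str, Any]], pk: str, column: str) -> Iterator[Any]:
    """Yield the primary key of every row where `column` is missing or None."""
    for row in rows:
        if row.get(column) is None:
            yield row[pk]


def chunked(items: Iterable[Any], size: int) -> Iterator[List[Any]]:
    """Group items into lists of at most `size`, e.g. one IN (...) list each."""
    batch: List[Any] = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


# Stand-in for the result of a full table scan.
rows = [
    {"user_id": 1, "address": "1 Main St"},
    {"user_id": 2, "address": None},   # explicit null
    {"user_id": 3},                    # column never written
    {"user_id": 4, "address": ""},     # empty string is not null
]

for batch in chunked(keys_with_null(rows, "user_id", "address"), size=100):
    # Each batch is a list of keys you could feed into a cleanup UPDATE.
    print("rows needing cleanup:", batch)
```

Because both functions are generators, the scan never holds more than one batch of keys in memory, which matters on a large table.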

Noticed INSERTS are treated as UPSERT unless you use IF NOT EXISTS... ScyllaDB doesn't give a f__k if it's overwriting an existing row, lol. I can live with the UPSERT footgun, though. Agreed the null footgun is far more annoying.
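To make the difference concrete, here is a toy in-memory model in Python of the two behaviours as described above. It is only an illustration: the class and method names are made up, and it ignores everything that makes the real thing expensive (write timestamps, replication, the consensus round behind LWT).

```python
from typing import Any, Dict


class ToyTable:
    """Toy model of INSERT semantics; not how ScyllaDB is implemented."""

    def __init__(self) -> None:
        self.rows: Dict[Any, Dict[str, Any]] = {}

    def insert(self, pk: Any, values: Dict[str, Any]) -> None:
        """Plain INSERT: silently overwrites the given columns (an upsert)."""
        self.rows.setdefault(pk, {}).update(values)

    def insert_if_not_exists(self, pk: Any, values: Dict[str, Any]) -> bool:
        """INSERT ... IF NOT EXISTS: write only if the row is absent.

        Returns whether the write was applied.
        """
        if pk in self.rows:
            return False
        self.rows[pk] = dict(values)
        return True


users = ToyTable()
users.insert(1, {"name": "alice"})
users.insert(1, {"name": "bob"})  # no error, the row is overwritten
applied = users.insert_if_not_exists(1, {"name": "carol"})
print(users.rows[1]["name"], applied)
```

The conditional version has to read before it writes, which fits the observation above that the performance gap only opens up once you avoid IF NOT EXISTS.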

heipei · on April 22, 2023

We're running ScyllaDB on Docker via Nomad and it's been an absolute breeze. One simple container with a handful of open ports. I don't get what's complicated about ScyllaDB deployment.