Vale.sh – A Linter for Prose

zerojames · on Sept 3, 2023

I saw Vale earlier today. The tool helps you comply with technical style guides like those by Microsoft and Google, as well as the out-of-the-box Vale rules (I am unsure how far the built-in rules go; I got way more suggestions after installing the Microsoft and Google extensions).

A few notes while getting it set up for macOS:

    1. brew install vale
    2. Create .vale.ini (in root dir, I think)
    3. A minimal setup that lets you lint with Microsoft and Google style guides:

        MinAlertLevel = suggestion
        StylesPath = styles

        Packages = Google, Microsoft

        [*]
        BasedOnStyles = Vale, Google, Microsoft

    4. Run `vale sync` to install the Google and Microsoft style guide extensions.
    5. Run `vale {file}.md` to run the linter.

hddqsb · on Sept 5, 2023

The built-in rules only include spelling according to the docs: https://vale.sh/docs/topics/styles/#built-in-style

mooreds · on Sept 3, 2023

Have tried to use this a few times but it's a big lift. Like adding unit tests to an existing application, it's a good idea to start small. And vale doesn't make this super easy (at least I couldn't figure out an easy way to do this). If I could wave my wand, I'd want a way to:

* apply vale to just the doc I was working on

* have a minimal set of rules

* add to them over time

At $curjob, we have a detailed public list of doc rules ( https://github.com/FusionAuth/fusionauth-site/blob/master/Do... ) and as our team expands, I'd love to have them applied rigorously. vale seems like a good fit, but there's an activation energy that I haven't been able to get over yet.

I am not aware of any other cli tools similar to this, though, so totally admire the team behind it.

starkparker · on Sept 3, 2023

Ironically, perhaps, considering its popularity among technical writers, Vale's documentation is ambiguous in places and difficult to navigate.

MilStdJunkie · on Sept 4, 2023

It's in keeping with traditional techpubs software being an unmitigated dumpster fire. :)

Not Vale, though. Vale's pretty slick; although personally I've always gravitated to RedPen.cc and LanguageTools[1] more than Vale. Grammarly, yes, Grammarly is also in this natlang linter space, but has possibly the least responsible data safety rules of any VSC extension I've ever seen. It's harsh, but I comprehensively advise customers away from Grammarly if they have any kind of data restriction at all.

[1] LT for no other reason than ASD STE-100 checking - it's old tech

block_dagger · on Sept 3, 2023

In the first bullet point of the linked document, there is an inconsistency in the very style it’s trying to enforce, namely capitalization. UUID vs Id. Consistency is hard!

mooreds · on Sept 4, 2023

I don't know why we use `Id` for identifier, but that's consistent across our codebase and documentation (we may have missed a few spots, but that's an error on our part).

Not sure why it is inconsistent to have UUID be all caps and Id be title cased? They feel different to me. But maybe I'm missing something.

mgaunard · on Sept 3, 2023

Acronyms and abbreviations are different things though.

tomjakubowski · on Sept 3, 2023

A not-untypical style (as used by the BBC) is to write initialisms like UUID in all caps, and acronyms like Nasa in title case.

Initialisms are pronounced by saying each letter in order. Acronyms are pronounced as if they were words.

"ID" is a funny case because it's an ordinary abbreviation, and not really an initialism* or acronym, but it's pronounced just like an initialism. I think I'd prefer it "ID".

I would definitely avoid writing it "Id", lest it be confused with Freud's concept.

* unless as Norm Macdonald joked, the "D" is short for "dentification"

cpeterso · on Sept 4, 2023

“ID” can mean “identity document”, such as a passport.

User23 · on Sept 4, 2023

This is, to my half American ears, bizarre prescriptivism. We pronounce NASA as “nasa” and UUID as “yoo-id” (and GUID as “goo-id”). And ID is pronounced “eye-dee.” And of course we pronounce OK (short for “oll korrect”) as “okay.”

uuuuuuuuuid · on Sept 4, 2023

Is “yoo-id” a common way of pronouncing UUID? I’ve only ever spoken each letter in turn.

dharmab · on Sept 4, 2023

I've only ever heard U-U-I-D, but I've also heard GUID pronounced as a single syllable like Gwid.

kstrauser · on Sept 3, 2023

But what is it? Is this like Grammarly? Maybe LanguageTool? I paged through most of the docs and I’m still not entirely sure what this does or why I might want to use it.

awoimbee · on Sept 3, 2023

Looks like grammarly but open source, with custom rules and easy to add to CI.

I currently use crate-ci/typos in my CI runs (it's great, humans often miss dumb typos), vale looks like typos but supercharged (but limited to markdown files?).

jamilton · on Sept 3, 2023

What's unclear about it? My impression from the window screenshot alone is you choose rule docs and a text doc, and it tells you what parts of the text doc violate parts of the rule docs. It could be used to help make sure writing (likely for some corporate or technical purpose) follows a set style guide. Organizations already often follow formal style guides in their writing; I don't know if they have their own similar tools to automatically check whether their style is being adhered to.

So like Grammarly, but you bring your own rules.

tester457 · on Sept 3, 2023

It needs a better landing page

Modified3019 · on Sept 3, 2023

Definitely. Though they apparently weren't very happy with someone bringing it up: https://github.com/errata-ai/vale.sh/issues/46

kstrauser · on Sept 3, 2023

The wording of that issue could come across as overly aggressive. The submitter should test it with a linter.

XCSme · on Sept 6, 2023

My understanding is that it is like LanguageTool, but where you can define your own rules using some YAML file (?).

NeutralForest · on Sept 3, 2023

Yeah not clear at all tbh

armchairhacker · on Sept 3, 2023

This seems like a tool I'll be using, and this is an almost meaningless criticism, but why the name?

There's already the Vale programming language (https://vale.dev/), but moreover, I don't get the meaning of "vale". You could call it something like "Englint" which actually hints its purpose.

ushakov · on Sept 3, 2023

Other interesting projects in the space:

- nlprule: https://github.com/bminixhofer/nlprule

- prosemd: https://github.com/kitten/prosemd-lsp

- cargo spellcheck: https://github.com/drahnr/cargo-spellcheck

- typosaur: https://typosaur.com

notamy · on Sept 3, 2023

Since there’s a few confused comments, and I was(/am?) confused too, I think this is pretty literally what the title says. I think a good metaphor would be "eslint but for written text instead of code."

kstrauser · on Sept 3, 2023

"Linter" is broad. Turns out it has the feature I most hoped for, language server protocol support, but it wasn't mentioned until far down into the integration docs.

A free, local, grammar checker that I can integrate into my existing editor? Pretty cool!

Update: Oh, cool! After installing the third party language server they recommend that wraps Vale, I have decent realtime grammar checking in my editor of choice.

h1fra · on Sept 3, 2023

For those wondering it's a tool like eslint or prettier (or any code formatter) but for actual text. It checks the formatting, the style, the words themselves and the formulation.

I have used it previously. It's nice for documentation, especially when engineers need to write it, but it can be very daunting to set up and depressing when you receive hundreds of automated comments in a single PR.

mhitza · on Sept 3, 2023

Very interesting! Just the other day I tried proselint for the first time, and became annoyed that I'd have to hack something together to only have it lint over the text in my markdown file. It doesn't ignore code blocks, as it just supports text, and is not aware of any file format.
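One way to hack that together, as a minimal sketch: blank out fenced code blocks before handing the text to a plain-text linter, keeping the line count unchanged so reported line numbers still match the original file. The name `blank_fenced_code` is made up for this example, and it only handles fences made of three backticks or three tildes, not indented code blocks or inline code.

```python
# Fence markers are built with multiplication so this snippet can itself
# sit inside a fenced block without confusing Markdown renderers.
FENCE_MARKERS = ("`" * 3, "~" * 3)


def blank_fenced_code(markdown: str) -> str:
    """Replace fenced code blocks with empty lines, preserving line numbers."""
    out = []
    open_marker = None  # marker that opened the current fence, or None
    for line in markdown.splitlines():
        stripped = line.lstrip()
        if open_marker is None:
            marker = next((m for m in FENCE_MARKERS if stripped.startswith(m)), None)
            if marker is not None:
                open_marker = marker
                out.append("")  # drop the opening fence line itself
            else:
                out.append(line)  # ordinary prose: keep as-is
        else:
            if stripped.startswith(open_marker):
                open_marker = None  # closing fence ends the block
            out.append("")  # drop code lines and the closing fence
    return "\n".join(out)
```

The cleaned text could then be written to a temporary file and passed to whatever plain-text linter you prefer.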

Vale seems to do this out of the box, which is great, but the suggestions I get, while better than nothing, are still very rudimentary compared to Grammarly[1] (which I haven't used for at least a year at this point).

For example, I've enabled all styles. The alex and write-good ones gave actionable suggestions. Readability however had suggestions of the form "Try to keep the Automated Readability Index (8.83) below 8.", "Try to keep the SMOG grade (11.35) below 10.", "Try to keep the Coleman–Liau Index grade (9.57) below 9". If I knew what that was maybe I could improve my score. Is there another FLOSS tool that can turn those into actionable steps?
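For anyone else wondering what those scores are: two of them come from simple published formulas over character, word, and sentence counts. The sketch below uses the standard formulas for the Automated Readability Index and the Coleman–Liau index with deliberately naive tokenization, so its numbers will probably not match Vale's output exactly. The function names are made up for this example. SMOG is left out because it needs syllable counting.

```python
import re


def text_stats(text):
    """Return (letters, digits, words, sentences) using naive tokenization."""
    words = re.findall(r"[A-Za-z0-9]+(?:'[A-Za-z]+)?", text)
    if not words:
        raise ValueError("text contains no words")
    letters = sum(ch.isalpha() for w in words for ch in w)
    digits = sum(ch.isdigit() for w in words for ch in w)
    # Each run of '.', '!' or '?' counts as one sentence end; minimum of one.
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    return letters, digits, len(words), sentences


def automated_readability_index(text):
    """ARI = 4.71 * (characters / words) + 0.5 * (words / sentences) - 21.43."""
    letters, digits, words, sentences = text_stats(text)
    return 4.71 * (letters + digits) / words + 0.5 * words / sentences - 21.43


def coleman_liau_index(text):
    """CLI = 0.0588 * L - 0.296 * S - 15.8, with L and S per 100 words."""
    letters, _, words, sentences = text_stats(text)
    letters_per_100 = letters / words * 100
    sentences_per_100 = sentences / words * 100
    return 0.0588 * letters_per_100 - 0.296 * sentences_per_100 - 15.8
```

Read off the formulas, the actionable part is roughly this: both scores drop when words get shorter and when the same number of words is split into more sentences.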

[1] And LanguageTool, even if sometimes advertised by FLOSS folk as a Grammarly alternative, it really isn't. Every time I try it, I feel disappointed. Has anyone tried their premium option to see if it's better?

skydhash · on Sept 3, 2023

I have the premium option and it’s nice. I use it primarily as a spellchecking tool. And it sometimes detects when I use the wrong style (English is my third language).

dang · on Sept 3, 2023

Related. Others?

Vale.sh: open-source linter for prose - https://news.ycombinator.com/item?id=31782688 - June 2022 (3 comments)

Vale: A syntax-aware linter for prose - https://news.ycombinator.com/item?id=30479010 - Feb 2022 (1 comment)

LucasOe · on Sept 3, 2023

I always struggle with keeping my writing clear and concise, maybe because English isn't my native tongue, so this looks like an incredibly helpful tool to me. Does Vale.sh have the ability to check the prose used in comments? Using this for doc comments would be useful for writing better documentation. The example only shows it being used for markdown files.

hackerfake · on Sept 3, 2023

I don't get what exactly this is and couldn't find an explanation

j7ake · on Sept 3, 2023

This is one step closer for me to type more prose in vim!

pjot · on Sept 3, 2023

My favorite is a tool called write-good

https://github.com/btford/write-good

cratermoon · on Sept 3, 2023

There's a 'write good' style available for Vale: https://github.com/errata-ai/write-good/blob/master/README.m...

xwowsersx · on Sept 3, 2023

Just curious, how exactly are you using this? You'll write stuff in a file and run `write-good stuff-i-wrote.md` and then look through the suggestions and modify as needed?

pjot · on Sept 3, 2023

antiframe · on Sept 4, 2023

I'm curious what "native" means in write-good's description: "Naive linter for English prose for developers who can't write good and wanna learn to do other stuff good too." It appears to require a node server.

xigoi · on Sept 4, 2023

It says “naive”, not “native”…

antiframe · on Sept 5, 2023

It should come as no surprise that I'm not a native English speaker.

mushufasa · on Sept 3, 2023

I love writing my own thoughts and essays in Vim, and often have the problem that when I export the text to a proper word processor to send to someone, the spelling and grammar is messed up. This is despite running the built-in vim spellcheck. So I'm interested to see if this will help, and I'm excited to try it out!

andatki · on Sept 3, 2023

I’ve been editing my Markdown files with it and wrote about it here: https://andyatkinson.com/blog/2023/05/26/better-writing-vale

microflash · on Sept 4, 2023

I've been using Vale for several years now to steer my writing style. It is easy to customize and simple to work with. The only issue that I faced is that many available styles have too corporatey rules, which is why I'm maintaining my own style for personal writing.

vinckr · on Sept 3, 2023

I wrote a blog post about how I use Vale and how you can do it too - https://vinckr.com/blog/open-source-stylecheck/

lindig · on Sept 3, 2023

Links are unreadable in that blog post on iOS/Safari.

cratermoon · on Sept 3, 2023

More helpfully, the background for <code> tags is #302b5f, a very dark desaturated blue and the font color is #262626, very dark gray (mostly black)

darkhorse13 · on Sept 4, 2023

Vale is awesome. I used it to proof-read my documentation for Halfmoon: https://www.gethalfmoon.com/docs/

nmstoker · on Sept 3, 2023

Does this actually work well in practice?

The rules seem like they would be incredibly brittle and not necessarily able to deal with the countless variations that are seen in English.

playingalong · on Sept 3, 2023

When is my IT department gonna approve it for use at work?

Kiro · on Sept 3, 2023

> (, , and )

What is this?

lgas · on Sept 3, 2023

Are you using a text only browser? They are ([apple icon], [windows icon], and [linux icon]).

vasco · on Sept 3, 2023

I get the same issue on regular up to date android stock browser.

thiht · on Sept 4, 2023

Maybe you are blocking custom fonts? The icons are apparently loaded from FontAwesome.

There should definitely be an accessibility text on these.

xwowsersx · on Sept 3, 2023

This looks great!

By the way, I can't be the only one using ChatGPT to rewrite docs and messages, right? I regularly tell ChatGPT: "Rewrite the following, don't make it too formal:.." or "rewrite the following optimizing for brevity:..." and it usually spits something out much clearer than my first revision. Sometimes there's just one sentence I was struggling to write clearly and this helps a ton. Really valuable in this remote work world we live in where so much of our communication happens in writing.

jskherman · on Sept 3, 2023

I got introduced to https://goblin.tools on the Fediverse and found it handy ever since for some of the usual things I ask ChatGPT. On it, there's the "Formalizer" for this use case, and "Magic To Do" to help break down tasks (and there's the option to generate time estimates too).

xwowsersx · on Sept 3, 2023

Oh neat, thanks for sharing!

Aurornis · on Sept 3, 2023

I tried this with mixed results. I’d spend so much time prompting, rerunning it, proofreading, and making minor changes that I wasn’t really coming up with a more efficient workflow. Rewriting the text myself a couple times produces more concise results after.

Playing with ChatGPT certainly felt like more fun at first. It feels like you’re making more progress until you look at the clock and realize you’ve spent 10 minutes messing around in ChatGPT where another 60 seconds of rewriting would have worked fine.

jackthetab · on Sept 4, 2023

Nope, you're not the only one. I do the exact thing, including saying "Rewrite the following, don't make it too formal". I think the results are excellent.

tommica · on Sept 3, 2023

Yeah, it's a good use case for it. I also use it to translate messages for me in languages that I am weak in, and I really like its ability to summarize things.

jasonjmcghee · on Sept 3, 2023

As there's a theme in comments here - I'm surprised people are so confused by the landing page. Between "a linter for prose" and the screenshot at the top of the page, I found it quite clear what the use case and point of the tool was.

Personally interested where the confusion is - maybe unfamiliarity with "linting" or "prose"?

dmarchand90 · on Sept 3, 2023

I think it's the lack of examples. Also, because this is very cutting edge, I'd like to know a bit of the technology behind it to know the strengths and limits. E.g. is it based on hard-coded grammar and thus a bit stiff and dumb? Or is it LLM-based and therefore with data control issues?

It would also be nice to see examples of what it can do. Again, if it's hard-coded grammar, it would be nice to see the complexity it can handle; if it's an LLM, the lengths the token limits can manage.

jonahx · on Sept 3, 2023

I didn't think it was bad but there were some points of confusion that could be improved:

1. "Your style, our editor" Wait is this an editor or a command line tool? I was able to figure it out but the word "editor" is ambiguous here.

2. "that brings your editorial style guide to life." Ok, so do I have to write all my own rules? Or is it like eslint where I choose a base config with ability to override, and possibly add more? These questions are so fundamental I would like to see succinct answers at least hinted at in another tag line or short paragraph, even if I can figure them out by digging into the docs.

layer8 · on Sept 3, 2023

It’s the lack of examples (I actually browsed the site for 2–3 minutes for some and didn’t find any), and that the screenshot doesn’t show the input prose text that the output applies to.

Instead of the long list of integrations, more exposition and context would be helpful, or at least an immediate link to an introduction page providing that. In earlier times, software documentation would provide a whole introductory book chapter providing the context, use cases, and some examples.

As a side note, regarding the screenshot, I’m wondering whether the tool also provides a rationale for each item, as the displayed text isn’t very informational on the why.

alpaca128 · on Sept 3, 2023

> Personally interested where the confusion is

There is no clear explanation of what it actually is, just a vague sentence saying "brings your editorial style guide to life" which can mean a lot of things. And the screenshot is a wall of text in the smallest font size across the entire page (even smaller than the labels of the logos that make up 50% of the area), so if it was intended as carrier of important information it wasn't the right choice.

The HN post title says more in four words than the entire landing page.

playingalong · on Sept 3, 2023

The name "linter" is not as widespread as some think. In some ecosystems these are just called "static analysis tools" or something similar. I came to learn the term "linter" after having exposure to a few tools of this kind, several years into my programming career.

hhh · on Sept 3, 2023

I found it clear when I looked at the site, but never thought about prose being used in this way. I didn't know the actual definition though, and after googling it's clear.

jrm4 · on Sept 3, 2023

I'm trying to figure out why I instinctively hate this.

I think it's probably because the purpose of computer languages is different from human language. It's certainly valuable to have a linter to make sure the code works, and there's probably value in "style" as well for readability.

But when you get to human language, I don't know, feels like you have the potential to suck out soul/creativity etc. As in, I imagine you throw a great poem or something in here and of course it will tear it apart.

bayindirh · on Sept 3, 2023

I think iA Writer does this well. It has a "mark fillers" setting which crosses out cliches, fillers and other things which dilute the sentences. It's just a finger pointing at "you might want to reconsider this", and if it's indeed something I don't want there, I take it out, otherwise leave it in.

I strive to write with a very neutral tone, and without any strong words, so it really helps me.

You can see the results of that process at https://blog.bayindirh.io

ggoo · on Sept 3, 2023

> I strive to write with a very neutral tone, and without any strong words, so it really helps me.

Curious, why? I would think this would make your writing more "boring", which would make it less likely for people to pay attention to.

Swizec · on Sept 3, 2023

> I would think this would make your writing more "boring“

As a fellow iA Writer enthusiast – it makes your writing less boring because it nudges you to get rid of filler.

for example: “Usually you can just remove some words that iA Writer suggests are filler and your sentence improves so much” —> “You can remove words that iA Writer suggests are filler and your sentence improves”

bayindirh · on Sept 3, 2023

First of all, why not give that blog a go, and share your opinions of it here or via e-mail, if you prefer?

Now, on to your question.

I don't believe words have to be strong or provocative or divisive to be true. Life is not black and white, and neither is my choice of words. Also, I practice zen in my daily life, so my writing is both a reflection of my inner state and the state I aspire to be in, at the same time.

Yes, as a human being, I want my blog to be read, and get some feedback occasionally, but at the end, it's a blog for myself. An instrument for taking note of my life and my journey on this pale blue dot.

ggoo · on Sept 4, 2023

I have no disagreement on the truthiness of words, my comment was really only about the enjoyment of your readers. Regardless, it sounds like you have a vision for what you want your writing to be, and so far as I know, it's working well for you.

As for feedback, since you asked, I'll be honest and say that it does read pretty boring... Very monotone with too many idioms. Feels like what you read from schoolchildren writing about a topic they're not really interested in.

I'll also say, there's no _you_ in this writing. Reviewing the latest entry, "Practice and Experience" - no stories, metaphors, or even examples of real events to impress your point. People understand a lot through storytelling, often the only takeaway we will have will be a story or metaphor. More importantly though, they are an opportunity to identify with the reader and share something about you and your life.

Apologies if that felt harsh.

bayindirh · on Sept 5, 2023

Thanks for your honest feedback. I greatly appreciate and value that, and there are no hard feelings, and no, it wasn't harsh.

> I'll also say, there's no _you_ in this writing.

It's intentional to remove myself from my writing, because it's not about me. I'm not trying to put myself out there, and say "look at me". These are distilled mind notes. A way of sharing what I learnt about life, without me.

The writings I publish are intended to make readers reflect on themselves and see themselves, or get some personal insight about themselves or life. Maybe they are also fascinated by this, or they never observed that angle on life. If they think for 5 seconds about the subject itself, that's nice. If they say it's boring, that's fine.

On the other hand, what I'm sharing there is highly personal. It's just devoid of bells, whistles and blinkenlights. Much like a Dieter Rams or Bauhaus design, in a sense.

> Feels like what you read from schoolchildren writing about a topic they're not really interested in.

That's an interesting take. It's true that I'm not aiming for a literary tour de force there, but it's not true that I'm not interested in the subject, it's actually the contrary. The language is simplified to that point to make it straightforward and direct.

Making indirect statements, and slowly approaching points while not quite touching them is very easily accomplished by constructing freight-train long sentences by chaining seldomly used vocabulary end to end with small punctuation marks, as if they were small fragile cotton strings knotted meticulously, and with care, if one decides to write in that way.

However, writing with no fillers has a feel of density and directness, which makes things appear with no fanfare. It's up to the reader to process this "thing" they just encountered.

> no stories, metaphors, or even examples of real events to impress your point.

"Practice and Experience" is a distillation of at least two decades of observation and self-reflection. If I decided to add the stories and examples that paved the way to the realizations made in that piece, it'd be a novella. Not that I remember every detail of it, either.

That blog lives true to its tag line "tail -f /dev/brain0". When I understand something, I draft an entry. It sits and simmers for some time, refined occasionally, and when it reaches a density and purity I like, I publish.

However, thanks again for your honest feedback. I'll be saving this.

starkparker · on Sept 3, 2023

If you use it for all general prose, Vale is terrible for those reasons. So is dogmatic adherence to any style guide.

For docs, technical writing, and other formal content where you have multiple authors and consistency matters, Vale can be a fantastic tool to remind users of situations where rules help maintain standards without needing to dedicate time to editing. It can also be terrible, if it's used to force awkwardness to satisfy rules arbitrarily, especially in something like CI (which is where I see Vale abused most often - never block a docs contribution on a prose style rule violation that has no functional effect on the content, just iterate on the language).
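A minimal sketch of that "don't block on style" policy for CI, assuming the linter's alerts have already been parsed into a mapping of file name to a list of dicts that each carry a `Severity` key. That shape, the `BLOCKING` set, and both function names are assumptions for this example; check your linter's actual output format before relying on it.

```python
# Assumption: only alerts at this severity should fail the build.
BLOCKING = {"error"}


def ci_exit_code(alerts_by_file, blocking=BLOCKING):
    """Return 1 if any alert has a blocking severity, otherwise 0."""
    for alerts in alerts_by_file.values():
        for alert in alerts:
            if alert.get("Severity", "").lower() in blocking:
                return 1
    return 0


def count_by_severity(alerts_by_file):
    """Count alerts per severity so non-blocking ones can still be reported."""
    counts = {}
    for alerts in alerts_by_file.values():
        for alert in alerts:
            severity = alert.get("Severity", "unknown").lower()
            counts[severity] = counts.get(severity, 0) + 1
    return counts
```

Warnings and suggestions still show up in the summary, but only the severities listed in `BLOCKING` stop a contribution from merging.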

petesergeant · on Sept 3, 2023

This is for when three different people on your marketing team decide to come up with “opensource”, “open source”, and “open-source” respectively when writing whitepapers.
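To make that concrete, here is a plain-Python sketch of the idea behind a terminology check: map each unwanted variant to the preferred spelling and report where the variants occur. It illustrates the concept only and is not Vale's rule format or implementation; the `PREFERRED` table and the `find_variants` name are hypothetical, and the choice of "open source" as the house style is just for the example.

```python
import re

# Hypothetical house style: variant spellings mapped to the preferred form.
PREFERRED = {
    "opensource": "open source",
    "open-source": "open source",
}


def find_variants(text, preferred=PREFERRED):
    """Return (line_number, found, suggestion) for each non-preferred variant."""
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(v) for v in preferred) + r")\b",
        re.IGNORECASE,
    )
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in pattern.finditer(line):
            found = match.group(0)
            findings.append((lineno, found, preferred[found.lower()]))
    return findings
```

Growing the table one entry at a time is also a cheap way to start small and add rules as the team runs into new inconsistencies.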

CharlesW · on Sept 3, 2023

To "yes, and" this, it gets even worse — employees often mess up their own company, technology, and people names. This seems like a useful tool for helping alert writers to potentially embarrassing and/or brand-damaging gaffes.

jhbadger · on Sept 3, 2023

(human) Copy editors have long been part of the professional writing field, though. It's not like applying rules to writing is something new. Yes, "genius" writers can and do break the rules, but 99% of writers aren't them.

andrepd · on Sept 3, 2023

One thing is a human editor with taste and understanding of the text. Another very different is a dumb LLM.

trws · on Sept 3, 2023

Having used it myself, though maybe not the usual way, I have a feeling you might like it more with a bit of context. If you take a huge rule set, like Microsoft’s style for example, it can definitely have the effect you’re talking about. That’s what I love about vale though. It makes it very easy to build up rules for my own projects, import ones we care about, and not run anything else. I work on a number of API and programming language standards where specific “words of power” are used, and must be used, to have specific effects. Being able to write rules for common mistakes with these has helped me many times. Having one for weasel words I can run on my papers, or those of my graduate students, can save a good deal of time. It’s the flexibility to make it fit my needs/wants on each thing that makes it so useful. To the point about poetry, I haven’t tried it, but it wouldn’t shock me if you could make some rules useful to certain styles of rhyming or verse to help catch technical mistakes.

kstrauser · on Sept 3, 2023

I use(d) Grammarly to double-check my personal blog posts. A lot of its purely technical catches, like duplicating words, mismatching tenses, mismatching plurals, etc. are great. I usually ignore its style suggestions altogether, especially when I'm deliberately breaking a rule because the "wrong" text reads better.

sidpatil · on Sept 4, 2023

It depends on what you're trying to communicate and why. Controlled languages [1] have been around for a while — the goal is to simplify the semantics and syntax, in order to improve consistency and ease of use/understanding.

[1] https://en.wikipedia.org/wiki/Controlled_natural_language

pseudotrash · on Sept 3, 2023

Similar reaction here. That said I'd love the idea of a locally hosted https://hemingwayapp.com/ to help with keeping things short and simple ... this linter sadly isn't it.

satvikpendem · on Sept 3, 2023

I don't really understand this, what is a linter for prose? Like Grammarly? Is it an end-user product or for a programmatic interface? If the former, why are there "integrations?" What does it mean to integrate this into Kong, for example? Does it work through VSCode?

Edit: Ok, now I see, it is like an offline version of Grammarly for the editor of your choosing, as well as a lint tool for CI for your docs, like docs.konghq.com.