Wisp: A light Lisp written in C++ (github.com/adam-mcdaniel)
140 points by azhenley on Dec 28, 2020 | 77 comments


Aside from legitimate gripes about a recycled name, there's a lot of undue negativity in this thread.

Not one person cares that you, HN-LISPER, are unimpressed that OP wrote yet-another-Lisp-interpreter-in-language-X, or that you're unhappy with the style of their parentheses. Save it.

As "easy" an exercise as it may be, I will guarantee you that OP learned a lot about Lisp, the host language, its build infrastructure, etc. The project is currently unlicensed (you should think about licensing it, OP!), but the source is available for others to study and learn from, too.

So, I say bravo!



I've been studying various Lisps as a hobby. I've been around a long time, and I've never seen a community so touchy about something like having "Lisp" in the name while not quite being a Lisp. The scorn is real. newLisp, for example, has been called a garbage lisp that no one in their right mind would use.

tl;dr: Go ahead and use S-expression syntax, but don't put "lisp" in the name. Rich Hickey knew this when he made Clojure.


Reminds me of the femtolisp README :)

Almost everybody has their own lisp implementation. Some programmers' dogs and cats probably have their own lisp implementations as well. This is great, but too often I see people omit some of the obscure but critical features that make lisp uniquely wonderful. These include read macros like #. and backreferences, gensyms, and properly escaped symbol names. If you're going to waste everybody's time with yet another lisp, at least do it right damnit.

https://github.com/JeffBezanson/femtolisp


Comments like that make me sad. They illustrate that phenomenon where zealots often have these curious blind spots in their understanding of the focus of their zeal.

For example, I would personally not consider macros (let alone reader macros or gensym) to be something that a language must have in order to be called a lisp. Such an opinion implies that the first true lisp was not introduced until 15 years after the first language that called itself lisp, and tacitly casts McCarthy himself as one of those hapless poseurs who tried to pass off an unconvincing lisp wannabe as the real deal.

(At least according to the timeline given in Steele and Gabriel's history of lisp. https://dreamsongs.com/Files/HOPL2-Uncut.pdf)

Getting twitchy about the syntax also seems somewhat odd to me, considering that Lisp 1.5 did things like

  cons (x (y))
    => (x y)


I left out some context in the README. It's opinionated, but it's also sort of a self-deprecating joke, i.e. "why am I writing another Lisp"


Yeah, that's a pretty important bit of context!


The person who made this comment, Jeff Bezanson, went on to create the Julia language, so I don't mind that he's opinionated.


I just finished wasting 2 hours of my life trying to get femtolisp to build on Windows 10. It built on Ubuntu 20 just fine. It was missing a bunch of stuff that should never have been missing. Oh well, it was just a curiosity.


julia --lisp IIRC. It's part of the Julia parser so a working version is in every Julia installation.


A few nits:

(1) C++ has std::to_string nowadays. You don't need to roll your own.

(2) Environment::get would be more efficient if you used a hash table (e.g. unordered_map) instead of many ifs.

(3) It's better to pass std::string and std::vector by const reference, to avoid making a copy. Also take a look at std::string_view.

(4) const char[] is generally preferable to #define.
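For illustration, the four suggestions might look like this in one place. This is a sketch against an assumed shape of the code, not the actual Wisp sources; `Environment` here is heavily simplified and `kPrompt`/`starts_with_paren` are invented names.

```cpp
#include <string>
#include <string_view>
#include <unordered_map>

// (1) std::to_string instead of a hand-rolled conversion.
std::string int_to_str(int n) { return std::to_string(n); }

// (2) A hash table makes Environment::get a single lookup
//     instead of a chain of ifs.
struct Environment {
    std::unordered_map<std::string, int> bindings;
    // (3) Take the name by const reference to avoid copying the string.
    int get(const std::string& name) const { return bindings.at(name); }
};

// (3) std::string_view avoids copies for read-only string parameters.
bool starts_with_paren(std::string_view src) {
    return !src.empty() && src.front() == '(';
}

// (4) A typed constant instead of #define (constexpr is the modern
//     spelling of the const char[] suggestion).
constexpr char kPrompt[] = ">>> ";
```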


I can't help but think of this line from "The Joy of Clojure"...

Go to any open source project hosting site, and perform a search for the term Lisp Interpreter. You'll likely get a cyclopean mountain of results from this seemingly innocuous term.


Lisp and Forth might be the only languages out there where writing a basic interpreter is a reasonable weekend project and a decent way to get your feet wet with some new language you want to learn.


Ok wow. Any few-weekends length resources you know about?



I actually hacked up a simple forth-like system, after reading a brief howto here on hackernews:

https://github.com/skx/foth/

Here's the thread which has the barebones overview which inspired me:

https://news.ycombinator.com/item?id=13082825

I could have taken it further, but the implementation there is not "real" in the sense that there is no real return-stack, so you can't implement IF-statements using the lower-level primitives.

That said it is a good starting point, and I had some fun doing it. I'd guesstimate it is more of a single weekend project though, rather than longer.
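A stack machine of this kind really is tiny. Here is a hedged C++ sketch of the core loop (the word set and the `forth_eval` name are mine); like the project above, it has no return stack and hence no IF:

```cpp
#include <sstream>
#include <stack>
#include <string>

// A minimal Forth-like evaluator: a data stack and a handful of words.
int forth_eval(const std::string& src) {
    std::stack<int> stack;
    std::istringstream in(src);
    std::string word;
    while (in >> word) {
        if (word == "+" || word == "*") {
            int b = stack.top(); stack.pop();
            int a = stack.top(); stack.pop();
            stack.push(word == "+" ? a + b : a * b);
        } else if (word == "dup") {
            stack.push(stack.top());
        } else {
            stack.push(std::stoi(word));   // anything else is a number
        }
    }
    return stack.top();                    // result is the top of the stack
}
```

With this, `forth_eval("2 3 + dup *")` evaluates to 25.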


This is one of my favourites [0] that goes a lot further than most, but fits the timeframe. Error handling, macros, standard library all included.

[0] http://www.buildyourownlisp.com/



Sorry but, Wisp already exists, twice! It's Lisp without parentheses:

http://dustycloud.org/blog/wisp-lisp-alternative/

and a little Clojure-like LISP in JavaScript:

https://github.com/Gozala/wisp


Python's semantic whitespace but for Lisp... is not a sentence I thought I'd ever type.


Paul Graham almost suggests this in the first chapter of ANSI Common Lisp in 1995.


It's the first thing I thought about when learning Lisp.


Exactly what I was going to complain about, thank you for doing it for me, have my upvote.

Please people, check your names before creating more confusion. Sometimes it is OK to use the same name, but here it will only lead to confusion.


Cool project! As I read through I noticed that there doesn’t appear to be any explicit garbage collection, just stack-based deallocation. Is there something about the subset of lisp you implemented that removes the need for GC?


I was sort of expecting an implementation of the wisp standard [0], but apparently not.

Though, the code is pretty clear and easy to follow.

[0] https://srfi.schemers.org/srfi-119/srfi-119.html


Lisp user and general author of unpopular opinions, here.

There's certain amount of "eye rolling" that greets these "I wrote a Lisp interpreter" articles, because 1. Lisp interpreters are incredibly simple and uninteresting, and 2. Lisp compilers have been around for more than 60 years now http://bitsavers.org/pdf/mit/rle_lisp/LISP_I_Programmers_Man....

It would be more interesting if you wrote a Lisp compiler in C++ which included a REPL, or even compiled Lisp to C++.

Also, a Lisp user would never write code like this:

            ))
    ))
Lastly, seeing Lisp interpreters still being blabbed about perpetuates the untrue meme that Lisp is inherently slow, or, ugh, interpreted. Which is sad.


Lisp is 10 lines of code. If this is not your starting point, you are already lost: https://youtu.be/OyfBQmvr2Hc


Might be a little off topic, but what's with the "written in" obsession? From my perspective it's irrelevant compared to good testing, design, architecture, community engagement, track record, performance, and usability.

Call me crazy, but how does the language you do this in become relevant? As a reference, perhaps? Only the best of intentions here; have I missed a particular POV that makes this extra relevant?


It's relevant:

* For projects where one of the main purposes is for the author of the project to learn said language

* For projects which expose an API in their implementation language. In the case of an implementation of a programming language, they might offer embeddability in the host language, or a foreign function interface with the host language.


Why do we consider it normal that APIs are exposed for the implementation language of a project (library, interpreter, application, whatever) but usually not other languages?

Is there any evidence that the "user" of an API is likely to prefer the same language that was used for the original thing?

(BTW this does not seem to be the case for networked APIs such as REST services).


The only really stable, standardized, well-supported ABI out there is the C ABI. Communicating across it is a chore. On the side of the language publishing the API, you need to write a C-friendly façade that translates everything your language has that C doesn't into a much more Spartan semantic space. You'll be losing or having to re-implement pretty much everything you like about that language: any sort of dynamic dispatch, memory management semantics, exceptions, etc. And then, on the client language side, you probably want to write another façade in order to, to the extent that it's feasible, restore all those sorts of niceties.

Direct FFIs are possible, but generally don't happen except in exceptional cases because of the cost involved in writing a new FFI (and accompanying set of gotchas) for every possible language whose packages you might want to consume. The best and worst thing about the C ABI is that it's pretty much the least common denominator.
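As a concrete illustration of such a C-friendly façade, here is a minimal hypothetical sketch (the `Counter` class and the `counter_*` names are invented for the example): the C++ object lives behind an opaque pointer, and every nicety (RAII, overloads, exceptions) is flattened into plain functions.

```cpp
// A C++ class we want to expose across the C ABI.
class Counter {
    int n = 0;
public:
    int bump() { return ++n; }
};

// The façade: unmangled names, opaque handle, manual lifetime management.
extern "C" {
    void* counter_new() { return new Counter; }
    int counter_bump(void* c) { return static_cast<Counter*>(c)->bump(); }
    void counter_free(void* c) { delete static_cast<Counter*>(c); }
}
```

A client language binds the three `counter_*` symbols and sees only pointers and ints; everything else about the C++ side is invisible.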

Even in platforms like Java and .NET that theoretically smooth over these problems, we find that it never works out that well in practice. Scala and F# users end up having to write compatibility layers in order to present Java- and C#-friendly APIs, and they end up writing their own versions of just about every possible library in order to have packages that don't feel awkward to use in their language of choice. The only real exception I know of to that pattern on the JVM or the CLR is between C# and Visual Basic, and that is only because Microsoft was very careful to ensure that the two languages' semantics were similar enough to virtually guarantee no cross-language friction.


COM/UWP are doing just fine in the Windows world.


UWP in particular is very nice. But I would argue that, for most of us, the "W" stands for "not well-supported."


Thunking across FFI often takes a lot of work to make it ergonomic.


I agree, "machine learning framework written in <lang>" is relevant, not sure i agree about an interpreter though :p


I am surely an outlier, being a developer, but I like to know the project language without clicking through, as an overview of what people are doing.

I'd suggest that in many forums it is irrelevant to most and should be left off, but I suspect HN is not a representative sample of the general public and many may share my sentiment.


Yeah, but why? Is it a feeling? Is it because it's not relevant if you can't immediately read the code since it's not a language you usually read/write in? Once again just curious :)


With HN, from the comments I get an idea of whether or not the language as a tool fits the solution in a unique way, or if it is a novelty.

So even if I cannot read the code, I can mentally get an idea that tool X is a good fit for problem Y should it ever come up for me.


it's relevant in this case because most simple Lisp-like interpreters are written in... Lisp (or at least one of the many other Lisp-like languages).

Interpreting a simple lisp-like in lisp is borderline trivial. Interpreting a simple lisp-like in an imperative language demonstrates a significantly more fundamental understanding of how lisp-like languages work.
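To illustrate how small the imperative version can still be, here is a hedged toy sketch in C++ that evaluates integer s-expressions with `+` and `*` only (the `sexpr_eval` name is mine; a real Lisp would need a proper reader, symbols, environments, and so on):

```cpp
#include <cctype>
#include <string>

// Evaluate a tiny s-expression like "(+ 1 (* 2 3))". Supports only
// integers, +, and *; a toy to show the shape, not a real Lisp.
int sexpr_eval(const std::string& s, size_t& pos) {
    while (pos < s.size() && isspace(s[pos])) ++pos;
    if (s[pos] == '(') {
        ++pos;                              // skip '('
        char op = s[pos++];                 // '+' or '*'
        int acc = (op == '+') ? 0 : 1;      // identity element for the op
        while (true) {
            while (pos < s.size() && isspace(s[pos])) ++pos;
            if (s[pos] == ')') { ++pos; break; }
            int v = sexpr_eval(s, pos);     // recurse into sub-expressions
            acc = (op == '+') ? acc + v : acc * v;
        }
        return acc;
    }
    int n = 0;                              // otherwise: parse an integer
    while (pos < s.size() && isdigit(s[pos])) n = n * 10 + (s[pos++] - '0');
    return n;
}

int sexpr_eval(const std::string& s) { size_t pos = 0; return sexpr_eval(s, pos); }
```

So `sexpr_eval("(+ 1 (* 2 3))")` returns 7.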


Sometimes the name of the implementation language even creeps into the project name, like with the MIDI sequencer Jazz++.


I like it actually as a mental filter. If the language is the main defining quality it usually means the project is nowhere near production ready, and can be safely ignored for now.


It becomes important for the trustworthiness of the underlying implementation. Some need way more bulletproofing than others.


> Might be a little off topic, but what’s with the ”written in” obsession? From my perspective it’s so irrelevant compared to well tested, design, architectured, community engagement, track record, performance and usability.

As far as I am concerned, there is one aspect where it is relevant to me: I compile all the software that runs on my machine. If it is written in C or C++, there is a good chance I won't need to compile a horrible chain of dependencies (or, more likely, try and fail to compile that chain).

That matters to me, because I try a lot of alternative/niche/toy languages, and of those published within, say, the last ten years, half won't even compile. It is a disaster. If it is written in shell + C, I can fix a couple of errors easily (if there are more, I give up, because there is no point in wasting my time on something that claims to be the next programming marvel and yet is itself programmed like shit). People want to be fancy and pick toolchains that are not battle-tested like Autotools+C or CMake+C++, and that's a recipe for pain.

For example, today I tried to compile two languages written in Ada, and I had to give up. A mess. Every time I have tried to build an Ada project that wasn't mine, it has been horrible. That's at odds with the image of Ada (clean and strict to the point of being stiff), but that's my repeated experience with building it.

I am not even talking about the anti-feature of compilers written in the very language they aim to compile that do not even provide a minimal bootstrapping compiler written in C or a similarly fundamental and widespread language. They're generally very proud of that accomplishment :-/

So yes, a project in C (or C++) offers the best chances of success (for building, which is the point I developed here).

Also, considering toy languages, I often want to make a dirty hack to add a missing feature. If it is written in, say, Haskell, there is zero chance I can do it. With good old imperative languages like C, Python, or Pascal, I have a chance, and I have already done it.

There is also the case of all the languages built on the JVM. There are really many. But even when they look good to me, they are limited by the opinionated choices of the JVM, and I don't want to undergo the multiple programming neuroses of Gosling. Also, anything Java-based has always been terribly, awfully, painfully, excruciatingly slow on any of my machines; I suppose it's not a universal experience, but it has always been mine.

------

> compared to well tested, design, architectured, community engagement, track record, performance and usability.

To each his own: I care about good testing, usability, and, depending on the intended use, performance. I also care a lot about proper, exhaustive documentation, which is missing from your list and from most programming language projects.

But I don't give a damn about its design and architecture; what matters to me is the external features, not the internals. Language designers care way too much about the internals and neglect the outside, which is what matters to the end user. I don't care if it makes your parser more complicated, if it makes your grammar irregular, if you need more passes, or if you need to dirty your hands with context; I want source code in your language to flow, for writing and for reading.

Community? I don't really care, and I have seen it be detrimental to many projects, which lose their compass by wanting to incorporate every feature request and following the mood of the year; I'd rather they were developed behind closed doors and only published when they reach beta status, to maintain consistency, instead of jumping from one unpolished feature to the next. I am aware that's totally against the current trend, where people publish repositories with a README.md before writing the first line of code or documentation :-)


First off: interesting thoughts! I couldn't agree more about the importance of making things easy to build. I'm not sure I agree about C++ and C applications being particularly easy to build, though. Personally I have not seen any particular correlation with any language, but I have too small a sample size to say anything definitive. It would make for an interesting deep dive, though.

>To each his own: I care about good testing, usability, and, depending on the intended use, performance. I also care a lot about proper, exhaustive documentation, which is missing from your list and from most programming language projects.

>But I don't give a damn about its design and architecture; what matters to me is the external features, not the internals. Language designers care way too much about the internals and neglect the outside, which is what matters to the end user. I don't care if it makes your parser more complicated, if it makes your grammar irregular, if you need more passes, or if you need to dirty your hands with context; I want source code in your language to flow, for writing and for reading.

I think there's a disconnect here between good design and architecture and their result. By definition, you can't have a good design and architecture that make your program difficult to use and illogical externally. You can't have good tests without good design and architecture. Good design and architecture go hand in hand with usability too.

A deep dive is beyond the scope here. But concisely: good designs and architectures make the software easy to understand and easy to adapt to external changes of requirements. Since it is easy to reason about, it is also easy to reason about bugs when they inevitably show up. If the software does not have these properties, it does not have a good design and architecture. You might think a certain architect cares more about the internals than the externals, but then the architect is bad and should step back and think about the end user and the direction of the software project. Or perhaps there's a communication issue going on, human <-> human style.

Good design and architecture are also equivalent to, actually superior to, good documentation. Documentation that defines the external goals is not required if you have good tests that show them. Hell, documentation about the code is not needed if the design and architecture are good. And self-documenting code can never lie. Is it difficult to achieve? Yes. But not trying will put you into a corner you can't get out of and inevitably lead to the rewrite-from-scratch. Unless of course you have a very strictly defined end goal and complete definition of done, though you could still paint yourself into a corner that is very difficult to get out of.

>Community? I don't really care, and I have seen it being detrimental to many projects, which lose their compass by wanting to incorporate every feature request and following the mood of the year; I'd rather they were developed behind closed doors and only published when they reach beta status, to maintain consistency, instead of jumping from one unpolished feature to the next. I am aware that's totally against the current trend, where people publish repositories with a README.md before writing the first line of code or documentation :-)

Not sure I see the correlation between pushing READMEs with intent and publishing unfinished feature upon unfinished feature. Aren't READMEs in fact documentation? Well well :p I believe you can work in the open without jumping from unfinished feature to unfinished feature. Polish can come in the middle of a development cycle, not just at the end. A vibrant community around a piece of software also means many contributors, which makes the software resilient to people moving on to other things.

I might have sounded critical, but I enjoyed reading your response. Thank you for it :) I also believe you gave me clues as to why the "written in" header is so popular.

The way you interpret certain values I gave was also insightful. I certainly don't have the same opinions, but nonetheless I can see where you're coming from. So I just had to respond :) Thank you!


This demonstrates a modern distortion in software engineering where people are trained to hate languages they don't use and to reinvent the wheel in each new language. Each new language that is introduced seems to reinforce this incorrect perception, and it seems we can barely do anything other than reinvent the same old, well-tested libraries in the next "language du jour".


Modern?

Nope, language flamewars were pretty much alive on BBSs and USENET.

I have had my share against C on comp.lang.c and comp.lang.c.moderated.


Ah, finally a clear and concise explanation of quoting and special forms.


It's a bit misleading, though. QUOTE is a special operator, which prevents data from being evaluated. But special operators are in general not about quoting: they provide various built-in evaluation rules.

Quoting means that a thing is returned as itself and not evaluated. Thus it is not interpreted in any other way.

Take IF: it does not quote its arguments, it just conditionally evaluates them based on the result of the test form. It implements control flow. Thus IF evaluates things rather than quoting them; they are not returned as they were in the source code.

    (if
      test-form    ; evaluated first
      then-form    ; evaluated when test-form is true
      else-form)   ; evaluated when test-form is false
Thus each of the above three argument forms might be evaluated, but only one of the last two, depending on the evaluation of the first form. In a compiled Lisp, the then-form and else-form will not be stored as quoted forms in the code, they will be compiled to some other language (often machine language, C, or similar), just as the rest of the source code. Thus after compilation all these source forms will disappear and only executable code will be left. Thus there is nothing to quote by the IF operator.

Special operators usually form the core of the needed built-in control-flow and scoping operators. Typically in Lisp they are also not user-extensible in the standard language: the user can't write new special operators.

Many other additional operators are derived from those via macros, which transform source code to new source code which eventually only contains function calls and special forms.
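The distinction above is visible in the shape of a typical evaluator: the interpreter dispatches on the operator before touching the arguments, and a special form like IF evaluates only the branch it needs. A hypothetical C++ sketch (the `Expr` type and names are invented for the example, not taken from Wisp):

```cpp
#include <vector>

// A toy expression tree: either a literal number or an operator node.
struct Expr {
    enum Kind { Lit, If, Add } kind;
    int value = 0;              // used when kind == Lit
    std::vector<Expr> args;     // used when kind == If or Add
};

int eval(const Expr& e) {
    switch (e.kind) {
    case Expr::Lit:
        return e.value;
    case Expr::If:
        // Special form: evaluate the test, then ONLY one branch.
        return eval(e.args[0]) != 0 ? eval(e.args[1]) : eval(e.args[2]);
    case Expr::Add:
        // Ordinary operator: all arguments are evaluated first.
        return eval(e.args[0]) + eval(e.args[1]);
    }
    return 0;
}
```

The IF case is the whole point: nothing is "quoted", the untaken branch is simply never evaluated.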


I'd prefer C++ interpreter written in Lisp...


How about C?

https://news.ycombinator.com/item?id=25531871

https://github.com/vsedach/Vacietis

================

Vacietis is a C compiler for Common Lisp systems.

Vacietis works by loading C code into a Common Lisp runtime as though it were Lisp code, where it can then be compiled or evaled. The loaded C code has the same function calling convention as regular CL code and uses the same numerical representations. C memory is backed by regular Common Lisp arrays.

Vacietis comes with a libc implemented in portable Common Lisp.


I've always wanted one that worked by taking C code and outputting actual Common Lisp code. But I guess that is significantly harder since nobody seems interested in doing it.


I'm surprised that we don't see more compilers and interpreters written in Lisp, since people seem to believe that writing compilers in Lisp is easier than in other languages. In fact, most compilers and interpreters are written in C.


I don't think this is true anymore. From the TIOBE top 10 (that hallowed source of truth), I count only three languages whose main implementation is written in C (Python, PHP, R). Extending to top 20 only adds two more (Perl and Ruby). All of those are interpreters. Most compilers seem to be implemented in the language they are compiling. It appears that C++ is the most popular implementation language, although C is number two - but neither constitute a majority.


However, my note is about Lisp in particular. Apart from a few cases, Lisp is little used to implement compilers for other languages.


we should have more static lang interpreters, with repls


I'll probably start a language war with this comment, I sure hope not.

I'm not exactly sure what you mean by just "static"; I'm assuming static types. But wouldn't a language with static types and a repl be less powerful than a dynamic one? In the end, you want to be able to redefine everything via the repl in your runtime, and if there are types to be checked all the time and the compiler has to do extra steps for each eval, it'll get in your way and make it harder to monkey-patch things.


I don't think monkey patching is required. Even without a dedicated interpreter, you can just do what sbcl does under the hood: maintain what is essentially a tempfile, keep compiling it, and report the output for every new line/block of code. (I don't know how sbcl actually works; being a lisp, I presume it doesn't actually use a tempfile but instead an in-memory list of the code the user has submitted, and then invokes the compiler as needed for new input, with the resulting state available as symbols.)

A cycle that went something like:

    1. Write the line to a file
    2. Compile the file with default flags
    3. Run the binary in gdb, causing gdb to dump certain info of interest (stack info, variable contents, etc.)
    4. Read the sections/symbols and merge this data into the gdb data (maybe get it from gdb)
    5. Dump this data into your repl and refresh it, waiting for user input.
You could have a few basic commands in the repl:

Modify the compiler args; add/remove/edit a line; attach gdb.

Emacs or vim could easily do this, and I suspect there are people who already do almost exactly this.

You could also just use any editor/ide with debug integration and keep a scratch file around...
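The write/compile/run cycle above can be sketched as a single function. This is a hypothetical illustration, assuming a `c++` compiler on the PATH and a writable working directory; the gdb step is omitted, and the `compile_and_run` name is mine:

```cpp
#include <cstdlib>
#include <fstream>
#include <string>

// Write a snippet, compile it, run it, and hand back what it printed.
// A real version would run step 3 under gdb and carry state across calls.
std::string compile_and_run() {
    std::ofstream("snippet.cpp")                               // step 1
        << "#include <cstdio>\n"
           "int main() { std::printf(\"%d\\n\", 6 * 7); }\n";
    if (std::system("c++ -o snippet snippet.cpp") != 0) return "";  // step 2
    if (std::system("./snippet > snippet.out") != 0) return "";     // step 3
    std::ifstream out("snippet.out");
    std::string line;
    std::getline(out, line);                                   // step 5: read the output back
    return line;
}
```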


> I don't think monkey patching is required

We're talking about a repl here, not just a basic shell. You'll need to be able to redefine functions and everything else at runtime, so a repl is all you need to run your program. A lisp with a standard repl runs miles around any editor/IDE with debugger support, as the workflow with repls, especially ones that can nest on exception and so on, is much faster.


It is a bit more complicated. Sbcl, as an example, does check types on defun forms. Some mismatches it will just warn about, but some will prevent compilation from completing.


Julia is kind of a nice example of what's possible with a dynamic language and static types.


Julia does not (semantically) have static types. It just proves that you don't need static typing to have an incredibly rich and useful type system, and you don't need static types to have top notch performance.


Exactly, in case of Julia I think it all boils down to multiple dispatch and ahead-of-time compilation. But it's still dynamic!


It was not about comparing; rather, learning C/C++ is often a less fun experience due to either the write/compile/debug cycle or the heavy tooling required.

A C/C++ repl would let people fiddle around and build rapid intuition. Then they can go without.


It exists!

Cling- https://root.cern/cling/ (and the earlier Cint- http://www.hanno.jp/gotom/Cint.html)

I used to occasionally use them for local experimenting with libraries and language features.


Such REPLs exist since XDE for Mesa in Xerox PARC, and there are some for C++, C#, F#, OCaml, Haskell, Java, among others.



Is that even possible? Practically, not theoretically.


Not a complete transpiler, but I read an interview with Pascal Costanza about a JVM implementation in Common Lisp and the ability to port everything then supported in Java to Lisp [0].

"...I decided to implement a Java Virtual Machine in Common Lisp - under normal circumstances, I wouldn't have dared to do this, because this is quite a complex undertaking, but I had read in several places that Lisp would be suitable for projects that you would normally not dare to do otherwise, so I thought I would give it a try. Over the course of 8 weeks, with something like 2 hours per day, or so (because I was still doing other stuff during the day), I was able to get a first prototype that would execute a simple "Hello, World!" program. On top of that, it was a portable (!) just-in-time compiler: It loaded the bytecode from a classfile, translated it into s-expressions that resemble the bytecodes, and then just called Common Lisp's compile function to compile those s-expressions, relying on macro and function definitions for realizing these "bytecodes as s-expressions." I was really impressed that this was all so easy to do.

The real moment of revelation was this: to make sure to reuse as many of the built-in Common Lisp features as possible, I actually translated Java classes into CLOS classes, and Java methods into CLOS methods. Java's super calls posed a problem, because it was not straightforward to handle super calls with plain call-next-method calls. Then I discovered user-defined method combinations, which seemed like the right way to solve this issue, but I was still stuck for a while. Until I discovered that moving a backquote and a corresponding unquote around actually finally fixed everything. That was a true Eureka moment: In every other programming language that I am aware of, all the problems I encountered until that stage would have required a dozen redesigns, and several attempts to start the implementation completely from scratch, until I would have found the right way to get everything in the right places."

[0] - lisp-univ-etc.blogspot.com/2012/04/lisp-hackers-pascal-costanza.html


Well, there's this [0] Scheme->C compiler, so if you were to port Cling [1] (C++ interpreter) to it, then you'd have achieved that goal pretty well. It'd be a large project, but practical to achieve.

[0] https://github.com/sph-mn/sph-sc

[1] https://github.com/root-project/cling


Of course it is. A friend of mine wrote a C99 interpreter in Python in high school. He had the wonderful idea of not implementing any undefined behaviour that wasn't strictly necessary :) We used it to test our C code later on in the course.

It probably won't be very fast, though. There are parts of C++ that don't map very well to Common Lisp, which means some parts will have more overhead than necessary.


Very impressive for a high-schooler.

> He had the wonderful idea of not implementing any undefined behaviour that wasn't strictly necessary

What does this mean? It sounds like the way an ordinary C compiler works: it ignores the possibility of undefined behaviour occurring, as the C standard allows it to do this.


I just remember that it crashed if you tried to modify a string literal, which IIRC is undefined. And that it was very diligent about scope.

If you had signed integer over/underflow it wrote insulting error messages.


C99, sure. C++?


Of course it is doable. Nontrivial of course. C++ is probably not the language I would start with.


It seems you could dlopen (etc.) a code snippet compiled in a repl, but it's seemingly not often thought of; not sure why. (I've done it, but I just prefer straight compiling, as in "I am the repl".)

P.S. Here are the two most concise refs I found by googling.

https://www.quora.com/What-is-the-purpose-of-dlopen-dlload-i...

https://man7.org/linux/man-pages/man3/dlopen.3.html
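A hedged sketch of that approach (assuming a `c++` compiler on the PATH; on older glibc you also need to link with -ldl, and the `run_snippet`/`snippet` names are invented for the example):

```cpp
#include <dlfcn.h>
#include <cstdlib>
#include <fstream>

// Compile a snippet to a shared object, dlopen it, and call into it.
int run_snippet() {
    std::ofstream("snip.cpp")
        << "extern \"C\" int snippet() { return 6 * 7; }\n";
    if (std::system("c++ -shared -fPIC -o snip.so snip.cpp") != 0) return -1;
    void* lib = dlopen("./snip.so", RTLD_NOW);   // load the freshly built code
    if (!lib) return -1;
    auto fn = reinterpret_cast<int (*)()>(dlsym(lib, "snippet"));
    int result = fn ? fn() : -1;
    dlclose(lib);
    return result;
}
```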


Probably not, in any language. If it were possible, it would have been done a dozen times.


It has been done several times, at least.

http://www.hanno.jp/gotom/Cint.html

https://github.com/root-project/cling

https://www.softintegration.com

You can argue whether some of those are strictly interpreters, versus just a REPL hooked up to a compiler (as in the case of Cling). But they do exist.



