Pleasant surprise to see Unison mentioned!
The developer experience in Unison (and Unison Cloud) has been wonderful for me. I try to write as much in it as possible, for hobby projects or side projects for friends and family.
Abilities (what Unison calls algebraic effects) are really ergonomic to use, the learning curve is a lot lower than an IO monad datatype, and the code reads just like Python in practice! Code-in-a-database means I don't have to fumble with long compilation times and Git; it brings joy to just hacking on something over the weekend, because I just get to write code.
The article mentions its drawbacks, and they are real, especially FFI imo. The Unison team mentioned they are planning to include FFI, and it's going to be interesting to see what gets compromised.
But no other language (currently) hits this sweet spot of abstraction (not too little, not too much) with an enjoyable DX, for me.
Docs being first class, go-to-definition and all that is one of my favorite things to show off when mentioning Unison: https://share.unison-lang.org/@unison/base
> with an enjoyable DX
How is the tooling? Usually these paradigm shift ideas fail on interop even if they’re amazing in theory. The ambition level of Unison seems to be absolutely gigantic, so my first thought is that interop with anything outside their idealized world would be poor.
The tooling is surprisingly good for such a "young" language. I've had experience with other budding languages, like Nim, Zig, and ReasonML, throughout the years (not comparing the languages themselves, just their tooling; they all have tradeoffs), and Unison is slick. But expectations should probably be managed: it's not IntelliJ or Visual Studio.
I'd describe the tooling as "zen-like". When I sit down on weekends to write some Unison, I'll pop up my editor in a left tab, and on the right there's ucm. Since the LSP is included in the codebase manager, I didn't have to set up much. The LSP works well, though it could use some more code actions to control ucm.
I'll write some code and throw in a `> ` eval expression to quickly mimic a REPL; I'll see the evaluation on the right. Eventually I'll switch to the right side and add my pieces, or write a `test>` expression, which I can add to my codebase as well. Pushing my changes, creating a branch, or switching projects also happens there. Maybe I'm not sure what a piece of code was named? Or I'd like to search by function signature? Both can be done in the right tab (ucm) with the `find` command.
Documentation is first class: I can browse the doc comments attached to variables/functions/types, and I can link other terms (pieces of code) in those docs; those become discoverable as well, as soon as they're added to the codebase.
I believe a big part of why this is all possible is that when your code is stored like this, tooling around it becomes a lot less complex. No need to build ad-hoc parsers, compilation caches, or syntax checkers. I'm no tooling expert, but for example a visual editor based on the terms in your codebase sounds a lot easier to build than a visual editor for TypeScript; it requires far less mental gymnastics.
ISTM that packing all that goodness (?) into the compiler stack itself is moving away from what I want where the compiler is a plain-Jane source transform and its inputs and outputs are very well specified and predictable. That’s where the ecosystem of tools comes in, and when done right I think tools should be very easy and straightforward to use and create. I often feel modern languages get this wrong by “including” too many “batteries” instead of specifying and documenting their build / dependency semantics clearly and providing good tools or libraries for working with and composing those elements of build.
Far too few language ecosystems provide any kind of libraries for it whatsoever, even for working with syntax / AST. I really can only think of Clang and Go, and Go’s tool libraries aren’t very accessible IMO.
This is a good point, and it deserves a lot more discussion.
I personally don't mind that my code lives somewhere other than a text file, because as it is now I have to duct-tape together a huge array of tools to get to production in languages like TypeScript and Elixir (the languages we use at $WORK). I don't want to deal with that any more than I need to.
I hope modern languages reach Unison's level of swiftness someday, but until then I'm really happy with Unison, because code-in-database is only one of the several aspects I like about it (no dependency conflicts being another huge part!).
Clang is a good example, I know, and Scala (https://www.chris-kipp.io/blog/an-intro-to-the-scala-present...) also has some nifty stuff. But even in Scala things aren't as seamless as in Unison.
Unison does things very differently, and it might not be everybody's cup of tea. But I do hope that popular languages take the good bits and help programmers do less chores and more fun stuff.
> I often feel modern languages get this wrong by “including” too many “batteries” instead of specifying and documenting their build / dependency semantics clearly
Do you have some examples? Nix somewhat aims to solve this, but it has its own drawbacks. A lot of people these days _expect_ a battery included environment: package manager, build tooling, linting, formatting, a language server, documentation generators. Unison solves a lot of these in one go because of the way it handles code, I think that's elegant and worth exploring.
I’m not sure there is really more to say here. A lot of people don’t seem to like strong type systems very much either, but that doesn’t mean they’re not wrong. ;)
That said, it clearly has some interesting ideas! I’m just not sure I understand why the database needs to be the compiler. It seems like an awful lot of design decisions baked into the stack instead of offered as APIs or protocols. And it’s obvious that we can consider the filesystem itself to be a sort of database — Git famously is implemented entirely in filesystem semantics, for example, so it’s also not clear to me why this content addressing can’t simply be done in a cache dir, or some such, the way many build systems do for their compilation units.
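The cache-dir idea this comment alludes to can be sketched as a toy in Python (this is an illustration of the general ccache-style technique, not any particular build system's actual scheme; `cached_build` and the fake "compile" step are invented for the example):

```python
import hashlib
import pathlib

def cached_build(source: str, cache_dir: pathlib.Path) -> str:
    """ccache-style content addressing: the output file is keyed by a
    hash of the input, so a plain directory acts as the store."""
    key = hashlib.sha256(source.encode()).hexdigest()
    out = cache_dir / key
    if out.exists():
        return out.read_text()   # cache hit: identical input, reuse output
    result = source.upper()      # stand-in for an actual compile step
    out.write_text(result)
    return result
```

The filesystem here plays exactly the "database" role: lookups and invalidation both reduce to "does a file with this hash exist?".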
> I’m not sure there is really more to say here.
I'm sure that there can be some very interesting discussions about code-in-database and interop with tooling!
> I’m just not sure I understand why the database needs to be the compiler.
I'm not entirely sure these two are tightly bound. For example, `ucm` uses a SQLite database, but Share (their GitHub equivalent) uses PostgreSQL afaik.
I believe a previous iteration used Git as the backing store, but they didn't go through with it. I'm not sure about this, though.
I've never used unison but I've used other, more traditional, REPL-based languages (think, Mathematica or Common Lisp with SLIME). To me one of the biggest problems with that style of programming is that sometimes code that you have changed on disk persists in the memory image, in the form of closures, creating unexplainable "ghost bugs" that disappear when you restart the REPL (or worse, that will only appear after you restart the REPL).
This system of making code immutable and storing it in a database makes me think somebody noticed how much of a footgun this behavior of REPLs is and thought... "mmmmhh what if we added a second barrel to the gun so that you can be sure to always shoot your own feet?"
I know what you mean with those other tools, but this doesn't happen in Unison. The reason those systems are somewhat flaky is that the cache of what's in memory can diverge from the "source of truth" which is a bag of constantly mutating text files. Maybe put another way, cache invalidation is hard in those systems.
When the source of truth is instead a database, content-addressed by hash, cache invalidation is simple - if the hash has changed, a cached result is invalid and needs recomputing. If the hash is the same, you're good. We use this approach in many places throughout Unison and it's quite robust.
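The invalidation rule described above can be sketched in Python (a toy model only; the function names and the string stand-in for compilation are invented, and real content addressing hashes a normalized AST plus dependency hashes, not raw text):

```python
import hashlib

def content_hash(definition: str, dep_hashes: list[str]) -> str:
    """Hash a definition together with its dependencies' hashes, so a
    change anywhere in the transitive graph yields a new key."""
    h = hashlib.sha256()
    h.update(definition.encode())
    for d in sorted(dep_hashes):
        h.update(d.encode())
    return h.hexdigest()

cache: dict[str, str] = {}  # hash -> cached compilation result

def compile_cached(definition: str, dep_hashes: list[str]) -> str:
    key = content_hash(definition, dep_hashes)
    if key in cache:                      # same hash => result still valid
        return cache[key]
    result = f"compiled({definition})"    # stand-in for real compilation
    cache[key] = result
    return result
```

The point is that there is no separate invalidation step to get wrong: a stale entry is simply never looked up, because the changed code produces a different key.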
This is why I never fully embraced the REPL-driven development style espoused by CL users. Even if I am using hot reloading during my development workflow, I never keep a long-running development image and I never expect to distribute or run that image as the finished product. I always build a fresh clean image/executable (eg via ASDF) and run a separate test suite.
Interestingly this problem has arisen more recently in the Jupyter notebook ecosystem, and is being rediscovered by a generation of non-programmer data analysts / scientists and the programmers that need to support them. Some interesting solutions like "reactive" notebooks have arisen out of that. It would be very interesting to have something like that for Common Lisp. Maybe Mathematica already has it.
Hi there, I'm one of the creators of Unison, feel free to AMA!
Suppose that a codebase has two different functions with two different purposes, but they currently have the same implementation. (Say, the "factorial" function is used to implement that operator in a user-facing REPL, but also some growable-array implementation happens to use a factorial function for its allocation policy.) Then if one function starts needing a different implementation (e.g., a new allocation policy), then how do you reliably separate the two in this model? It would seem like there is no way to disambiguate a function apart from its current implementation.
If I understand your question, this would work much the same way as any other language. Suppose you have:
allocationPolicy = 23484
-- two usages of allocationPolicy
foo = allocationPolicy + 1
bar = allocationPolicy + 99
You then later realize you want different allocation policies to be used in different parts of your app. You first might want to rename the existing `allocationPolicy`: move.term allocationPolicy defaultAllocationPolicy
At this point, all the code still references that hash, which now has the name `defaultAllocationPolicy`. Next, if (say) you wanted `foo` to reference a different definition, you'd `edit foo` and introduce:
fooAllocationPolicy = 283
foo = fooAllocationPolicy + 1
Then `update` and you're done. The new version of `foo` references `fooAllocationPolicy` while `bar` continues to reference `defaultAllocationPolicy`. A couple of other notes -
* It's very rare for independent implementations to end up with the same hash. It probably only happens for some very simple definitions that exist in base. (Like the identity function, say)
* If a hash has multiple names in your project because you've used `alias.term` to do so explicitly, the pretty-printer picks one using a deterministic rule (it prefers names you've given that hash in your project, then it consults library dependencies). If you really want to give two definitions different hashes even though they are functionally the same, you can introduce a minor change, like an unused binding.
* The type of a definition is part of its hash, so sometimes you might specialize a more generic function with a more narrow type signature, and this gets its own hash.
* The OP is slightly out of date re: patches. We use something simpler now for updates and merges. When you update, we compare the new and old namespace to obtain a diff, which is applied to the ASTs in the namespace. If the result typechecks, you're done. If not, we make a minimal scratch file for you to get typechecking - it will contain the minimal transitive dependents of the change.
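Two of the notes above (identical implementations sharing a hash, and the type being part of the hash) can be modeled with a toy sketch in Python (`term_hash` and its crude whitespace "normalization" are invented for illustration; Unison hashes a real normalized AST):

```python
import hashlib

def term_hash(source: str, type_signature: str) -> str:
    """Toy content addressing: the hash covers both a normalized form
    of the implementation and its type signature."""
    normalized = " ".join(source.split())  # crude stand-in for AST normalization
    return hashlib.sha256(f"{type_signature}|{normalized}".encode()).hexdigest()

# Two independently written but identical definitions share a hash...
a = term_hash("f x = x + 1", "Nat -> Nat")
b = term_hash("f x = x + 1", "Nat -> Nat")

# ...while an unused binding, or a narrower type, yields a distinct one.
c = term_hash("f x = let unused = 0 in x + 1", "Nat -> Nat")
d = term_hash("f x = x + 1", "Int -> Int")
```

This is why the "unused binding" trick works: it changes the implementation's normalized form without changing its behavior, which is enough to give the definition its own identity.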
Thank you for the more detailed explanations. I was mainly wondering about the more philosophical concern of "unintended identical hashes" causing coupling between parts of the codebase with different purposes (which may want to evolve apart in the future), which you say is thankfully rare in practice.
For instance, say you have a function which generates a certain business report, and your boss wants you to fiddle with the formatting every quarter. Your colleague has a similar function to generate a report with the same data, but according to their own boss's quarterly formatting requirements. With content-based identity, it would seem like you have to be wary of your own and your colleague's reports ever aligning, lest they have the same hash and lose their distinct identities.
> If you really want to give two definitions different hashes even though they are functionally the same, you can introduce a minor change, like an unused binding.
Interesting. Is there at least any way to detect this condition before it occurs? (I.e., to know if you're trying to define a new function which happens to have the same hash as an existing one.)
Yes, unison detects this condition when it occurs and tells you about it.
> two different functions ... have the same implementation
Then they are not different functions.
> It would seem like there is no way to disambiguate a function apart from its current implementation.
Right, that's the whole point.
In Unison the name of a function is a hash of its implementation. Change the implementation and you change the identity.
> how do you reliably separate the two in this model?
There is no way to confuse the two in this model.
My understanding is that the unique hash is really the true identifier of the implementation. So when you change one implementation, the two functions will have two different hashes? I'm not sure if this addresses exactly your problem.