sogaiu/notrepl-notes.md

Last active January 30, 2025 11:37


notrepl notes


notrepl

"notrepl" is a specification and implementation of a protocol that helps with the experience of developing in certain ways. It may start as a "subset" of nREPL, though see below for elaboration. The name is a nod to the fact that the term "repl" (somewhat like the term "lisp") seems to be something that some folks feel very strongly about being used in certain ways.

Subset nREPL

For reasons (TM), we want to consider nREPL. However, the term "nREPL" by itself is ambiguous so let's try to be clearer. There are implementations that hint at supporting nREPL and there are documents and discussions that touch on something by that name, but so far we have had no success in locating a programming-language neutral specification document [1].

Bare Minimum Ops

From examining various implementations, it seems that supporting some form of clone and eval ops could lead to a somewhat functional setup in certain scenarios. These two ops seem to be implemented in all of the usable servers examined. These will be our initial target ops.
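To make the two target ops a little more concrete, here is a rough sketch of what clone and eval requests might look like on the wire. It assumes the bencode transport and the `op`, `id`, `session`, and `code` field names commonly seen in nREPL implementations; the session value `abc` is made up, and the `bencode` helper is a hypothetical minimal encoder, not the one used by notrepl.

```python
def bencode(value):
    """Encode ints, strings, lists, and dicts as bencode bytes."""
    if isinstance(value, bool):
        raise TypeError("bencode has no boolean type")
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        data = value.encode("utf-8")
        return b"%d:%s" % (len(data), data)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(item) for item in value) + b"e"
    if isinstance(value, dict):
        # Bencode dictionaries list their keys in sorted order.
        # Keys are assumed to be strings here.
        body = b"".join(bencode(key) + bencode(value[key]) for key in sorted(value))
        return b"d" + body + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


# A clone request asks the server for a new session.
clone_msg = bencode({"op": "clone", "id": "1"})

# An eval request carries code plus the session obtained from clone.
# The session value "abc" is a placeholder for illustration only.
eval_msg = bencode({"op": "eval", "id": "2", "session": "abc", "code": "(+ 1 2)"})
```

Since every value is either an integer, a length-prefixed string, a list, or a dictionary, a server in almost any language should be able to parse these without much machinery.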

Potential Ops

If there is some initial success with the aforementioned ops, perhaps we'll consider describe, completions, lookup, and/or load-file. The latter three may not meaningfully work while an eval (or other) op is being processed, but it might be better for it to be possible to use them rather than not.

Uncertain

It's unclear what degree of session support will be practical for many language runtimes. Thus the close and ls-sessions ops are not being considered initially.

Not Likely

The interrupt op seems like it could be a non-trivial / impractical endeavor for many programming languages. Indeed, even for Clojure there was a brief period where it was unclear whether it could continue to be supported.

It's possible that in practice it works usefully in many cases (at least for Clojure), but from the perspective of appropriately cleaning up resources and the existence of native code extensions, it seems possible that a generic solution that handles all cases gracefully is not possible.

The middleware idea is interesting but it seems like it could be a fair bit of complexity. Perhaps too much to be worth it.

Technical Background

Rough Overview

AFAICT, most programming languages don't support Clojure-like vars or namespaces along with threading so we'll have to see what kind of system is practical.

Specifically, the Janet programming language supports fibers (cooperative multi-tasking) and threads (of the operating system persuasion) as well as environments in the form of tables (like maps or associative arrays). It's not obvious how these could be made to fit the original nREPL model where multiple threads are able to "see" and "interact" with the same set of namespaces in a coherent manner.

This is relevant for at least the idea of "sessions" where there could simultaneously be two sessions where one is dedicated to performing user-initiated code evaluations and the other is used for tasks such as completion and documentation lookup. It's unclear how one can do these sorts of things simultaneously referencing the same underlying data in typical programming language runtimes.

Spelled Out in a Bit More Detail

In Janet there are two obvious approaches to simultaneous processing within a single process. One is via operating system threads and the other is via fibers. Let's consider a bit how each mechanism might be applied.

Suppose we used a single thread for each session. In Janet, each thread has its own VM and each such VM has its own heap. So each session would be associated with its own VM, separate from those of all other sessions. Since data is not shared between two threads in Janet by default, each session would be accessing a different set of data (e.g. environment tables). Perhaps it might be possible to copy and update data between threads, but this seems complicated and potentially inefficient.

Now suppose we used a single thread with a fiber per session. Janet's fibers implement a form of cooperative multi-tasking and that cooperation must be accounted for explicitly when expressing the code for a fiber. In general, arbitrary user code will not be written in a way to yield control (often and soon enough) to account for the existence of sessions. As a consequence, this each-session-gets-a-fiber approach is likely to lead to situations where while one fiber is "busy" (and hasn't yielded), the other fibers are "dormant". In terms of sessions, it seems to imply that only one session can be active at a time. The user experience may suffer if user-initiated evaluations are long-running and lead to long wait-times for the user. For example, completion requests may not be serviceable while user evaluations are being processed. It's possible though that while waiting for evaluations to complete, most users wouldn't (or would rarely) make use of completion or documentation lookups, so in practice fibers might turn out to be ok.
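The "busy" versus "dormant" situation can be illustrated with a toy model. This is only an analogy: Python generators stand in for Janet fibers, and the scheduler and task names (`run_round_robin`, `polite_eval`, `greedy_eval`, `completion`) are hypothetical.

```python
from collections import deque


def run_round_robin(tasks):
    """Resume each generator in turn until all finish; return the event log."""
    log = []
    queue = deque(tasks.items())
    while queue:
        name, task = queue.popleft()
        try:
            log.append((name, next(task)))  # runs until the task's next yield
            queue.append((name, task))
        except StopIteration:
            pass  # task finished, drop it
    return log


def polite_eval():
    # Code written with cooperation in mind yields between steps.
    for step in range(2):
        yield f"eval step {step}"


def greedy_eval():
    # Arbitrary user code: all the work happens before the only yield.
    sum(range(1_000_000))
    yield "eval finished"


def completion():
    yield "completion served"


polite_log = run_round_robin({"eval": polite_eval(), "completion": completion()})
greedy_log = run_round_robin({"eval": greedy_eval(), "completion": completion()})
```

In `polite_log` the completion request is served between eval steps, while in `greedy_log` it is served only after the whole evaluation has finished.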

For the moment, the fiber-per-session approach will be attempted first, as trying to carry out the thread-per-session approach seems more complicated.

Thus we will start with the aim of forgoing true simultaneous processing of ops. This might seem disappointing but as it's unclear to us whether this whole endeavor is practical, we'd like to find out sooner, hence the focus on the bare minimum.

Note that apart from a fiber per session, there needs to be a fiber that coordinates the session fibers.

It turned out that the nursery construct in spork's rpc.janet seemed to be a good starting point.

Workflow

Setup

From the command line, user starts a notrepl server for their project. It seems reasonable to do this from the root of the project directory. On startup it probably makes sense to report an IP address (let's stick to 127.0.0.1 for now) and port information. It may be that writing the port to a .nrepl-port file would be helpful to make determining this information from a client or other program easier.
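A minimal sketch of the port file idea follows, assuming the file holds nothing but the port number as text. The function names `start_listener` and `read_port` are hypothetical, and letting the operating system pick a free port is just one option.

```python
import socket
from pathlib import Path


def start_listener(project_root, host="127.0.0.1"):
    """Listen on a free port and record it in <project_root>/.nrepl-port."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind((host, 0))  # port 0 asks the OS to choose a free port
    server.listen()
    port = server.getsockname()[1]
    Path(project_root, ".nrepl-port").write_text(str(port))
    return server, port


def read_port(project_root):
    """What a client or other program might do to discover the port."""
    return int(Path(project_root, ".nrepl-port").read_text().strip())
```

A client started from the project root could then call `read_port(".")` instead of asking the user to type the port in.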

In an editor, user navigates to the project root directory or opens a file in the project and starts a notrepl client from their editor to connect to the already running server. Possibly it could be nice if starting the server and client could be done via the editor. Not going down that route initially.

At this point, if all has gone well, the clone op will have been sent from the client and the server will have sent back a response containing a new session identifier.
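On the server side, handling clone might be sketched roughly as below. The `new-session` and `status` keys follow what nREPL implementations commonly send, which is an assumption here; `handle_clone` and the `sessions` table are hypothetical names, and messages are shown already decoded into dictionaries.

```python
import uuid


def handle_clone(request, sessions):
    """Create a session and build the response for a decoded clone request."""
    # The session identifier is sent as a string, not an integer.
    session_id = str(uuid.uuid4())
    sessions[session_id] = {}  # placeholder for a per-session environment
    response = {"new-session": session_id, "status": ["done"]}
    if "id" in request:
        # Echo the request id so the client can match the response to it.
        response["id"] = request["id"]
    return response
```

The client would then include the returned identifier as the `session` value in later requests.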

Use

User ensures a file from the project is open in their editor.

The user can send some form for evaluation or perhaps the content of their buffer (loaded from a file).

Testing Ideas

Try evaluating things using rep? Since some Janet forms are Clojure forms, this might be feasible in some cases.

Try to think of other clients that might be tested with.

Tests with network can hang and then it may be unclear where. Always have timeouts?
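One way to keep a test from hanging is to put a deadline on every read. The sketch below shows the idea with Python's standard `socket` module rather than Janet's networking functions; `read_with_timeout` is a hypothetical helper.

```python
import socket


def read_with_timeout(sock, max_bytes=4096, timeout=1.0):
    """Return received bytes, or None if nothing arrives within `timeout` seconds."""
    sock.settimeout(timeout)
    try:
        return sock.recv(max_bytes)
    except socket.timeout:
        return None
```

A test can then fail with a message naming the step that timed out, instead of blocking forever with no hint about where.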

Session values may vary...this can influence the expected output values. Is there a good way to ignore / fuzzily-match varying values like these? Another thing that is similar seems to be content in error messages / stack traces. For example, file paths, line, column info, etc.
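One possible approach is to normalize messages before comparing them, masking the parts that are expected to vary. The sketch below is only an illustration: the `path:line:column` location format, the `<ANY>` and `<LOCATION>` placeholders, and the `normalize` helper are all assumptions, not the format of actual Janet error output.

```python
import re


def normalize(message, volatile_keys=("session", "new-session")):
    """Return a copy of a decoded message with varying values masked."""
    cleaned = {}
    for key, value in message.items():
        if key in volatile_keys:
            cleaned[key] = "<ANY>"
        elif isinstance(value, str):
            # Mask things shaped like "some/path:line:column".
            cleaned[key] = re.sub(r"\S+:\d+:\d+", "<LOCATION>", value)
        else:
            cleaned[key] = value
    return cleaned
```

Expected and actual messages would both go through `normalize` before being compared, so a test only fails on differences that matter.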

Scope Notes

Clone op
- Server will respond, but in reality there will be only one session per network connection
  - Sessions can't be executed simultaneously (with current fiber model anyway) and only one environment is being provided to evaluate. However, multiple network connections can be handled and each one does have a separate environment table...
Eval op
- Support multiple top-level forms in a single eval request
  - Needed because of the "send buffer" type of command.
  - Capture stdout and stderr for each top-level form separately and send back separately as well.
- Do not support evaluation of partial forms.
  - It's the client's responsibility to send one or more complete forms.
  - Server will treat partial forms as errors and report them.
Only support working from a single file per notrepl server running instance
- Current working directory of the server / program process affects evaluation of some forms (e.g. import, os/cwd, etc.). Not having to track this and switch is easier.
- Working with multiple files might imply having to juggle multiple environments which might depend on each other. Not having to do this is easier, how to do it appropriately is unclear, and how often one would use switching between files is unclear.
- Restart the notrepl server if wanting to work with another file
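Regarding the eval op's rule that partial forms are errors, a client (or a test) could do a rough pre-check before sending. The sketch below only tracks delimiters, double-quoted strings, and `#` line comments; it is a simplification and does not handle everything Janet's reader does (long strings, for example). A real server would presumably rely on the language's own parser. The name `form_status` and its return values are hypothetical.

```python
PAIRS = {")": "(", "]": "[", "}": "{"}


def form_status(source):
    """Classify source text as 'complete', 'partial', or 'mismatched'."""
    stack = []
    in_string = False
    in_comment = False
    escaped = False
    for ch in source:
        if in_comment:
            in_comment = ch != "\n"  # comments run to end of line
        elif in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch == "#":
            in_comment = True
        elif ch in "([{":
            stack.append(ch)
        elif ch in PAIRS:
            if not stack or stack.pop() != PAIRS[ch]:
                return "mismatched"
    # Anything still open means the input stops mid-form.
    return "partial" if stack or in_string else "complete"
```

Text holding several complete top-level forms, as with a "send buffer" command, is reported as `complete`, while text that stops mid-form is reported as `partial`.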

Possible Next Steps

Modify tests so that they incorporate timing out. Without this, tests can hang with no clue about which test is problematic.
- net/read and timeouts
- read-stream / write-stream need timeouts?
- can ev/cancel be used with ev/read / ev/write to stop things? or maybe ev/with-deadline?
- or use a dynamic variable to set timeouts just for testing...
Consider how / if timeouts should be used for clients / servers. Can ev/cancel and the like also be employed for this?
- Server needs to receive and respond to messages from a client for which no particular timing is pre-arranged. Is "blocking" therefore appropriate?
- Clients also need to be ready to receive messages from a server and it's unknown when or how many messages will arrive. However, a "done" key shows up in the last message from a server.
- Blocking reads don't seem great if they are going to "hang" client / server code. Are there any good solutions?
  - Timeouts with finite number of retries...only for client?
  - Periodic checking with timeouts?
  - "Waiting" efficiently?
Study Lazuli's code with a focus on:
- how a connection is established
- how the clone op works
- how the eval op works
Go through this documentation and "clean up" so that it's more coherent.
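On the client-side question above about not knowing how many messages will arrive, one option is to keep collecting responses until one is marked as done, with an upper bound so the loop cannot run forever. The sketch assumes messages are already decoded into dictionaries and that the final one carries `done` inside a `status` list, as nREPL implementations commonly do; `collect_until_done` and `max_messages` are hypothetical.

```python
def collect_until_done(messages, max_messages=100):
    """Gather decoded response messages up to and including the one marked done."""
    collected = []
    for message in messages:
        collected.append(message)
        if "done" in message.get("status", []):
            return collected
        if len(collected) >= max_messages:
            break
    raise RuntimeError("gave up before seeing a done status")
```

Combined with per-read timeouts, this gives a client two ways to stop waiting: no data within the deadline, or too many messages without a done marker.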

Hold Off On

Implementation of documentation and completion ops should be delayed until after some minimal connection plus evaluation has succeeded via a modified Lazuli. There does not seem to be much point in working on these if basic functionality is not feasible.
Including a timestamp (for both client and server) might be nice when investigating issues. A downside at the moment is that expected values / actual values are a bit more complicated to express because timestamps vary. If the similar issue with session values is addressed, this issue may be as well.

Potentially Handy Things

Some readable illustration of protocol use. Currently the usages are pretty good for this.

Client and server code having debugging output for both bencoded bits and decoded bits.
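A sketch of what such debugging output might look like, showing the raw bencoded bytes next to their decoded form. The `bdecode` function is a hypothetical minimal decoder that assumes well-formed input, and `debug_line` and its output format are made up for illustration.

```python
def bdecode(data, pos=0):
    """Decode one bencoded value starting at `pos`; return (value, next_pos)."""
    lead = data[pos:pos + 1]
    if lead == b"i":
        end = data.index(b"e", pos)
        return int(data[pos + 1:end]), end + 1
    if lead in (b"l", b"d"):
        items, pos = [], pos + 1
        while data[pos:pos + 1] != b"e":
            item, pos = bdecode(data, pos)
            items.append(item)
        if lead == b"l":
            return items, pos + 1
        # Dictionaries are flat key/value sequences: pair them up.
        return dict(zip(items[0::2], items[1::2])), pos + 1
    # Otherwise a string of the form <length>:<bytes>.
    colon = data.index(b":", pos)
    length = int(data[pos:colon])
    start = colon + 1
    return data[start:start + length].decode("utf-8"), start + length


def debug_line(direction, raw):
    """Format a message for logging in both its raw and decoded forms."""
    decoded, _ = bdecode(raw)
    return f"{direction} raw={raw!r} decoded={decoded!r}"
```

Seeing both forms side by side helps separate encoding problems from problems in how a message is interpreted.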

Unclear Points and Questions

Forms and Current Working Directories

In Janet, some forms depend on the current working directory for their evaluation result. Typical examples include import and require forms that use relative paths, but these are not the only ones (e.g. (os/dir ".")). When a user is about to send a form from a particular file, should its current directory (on the server end) be sent along so that the janet process can change its working directory to match?

Note that in general, processes appear to support the idea of a current working directory, but this is not (by default) something that is per-thread. That is, if the current working directory is changed (say via os/cd) in one thread, the value of the current working directory in another thread (for the same process) will be affected because there is only one value per process by default.
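The per-process nature of the working directory is easy to observe. The sketch below uses Python threads rather than Janet ones, on the assumption that the underlying operating system behavior is the same; `cwd_seen_after_thread_chdir` is a hypothetical helper.

```python
import os
import threading


def cwd_seen_after_thread_chdir(target_dir):
    """Change directory in a worker thread; report what the calling thread sees."""
    original = os.getcwd()
    worker = threading.Thread(target=os.chdir, args=(target_dir,))
    worker.start()
    worker.join()
    seen = os.getcwd()  # read from the calling thread, not the worker
    os.chdir(original)  # restore so the caller is not left somewhere else
    return seen
```

The calling thread reports the directory that the worker switched to, even though it never changed directory itself.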

Switching to Different Files in an Editor

Suppose there are two files A and B for a given project and a user has been evaluating things (via the network connection) from file A. Now suppose the user switches to file B. What should happen if a user evaluates something in file B? Should there be any indication to the running process that the user is now editing file B?

One hack / approach might be to have a way to "restart" the server each time there is a switch to another file. Not sure if there is an op that does this already...

Possible to Not Have Sessions?

Sessions seem like a nice idea, but how much of them can be supported in typical language runtimes? Can we get by without them? Specifically, could we skip the clone op entirely? Even rep uses the clone op though...

Maybe if there is a session value from the client it can be echoed back but internally it could be ignored. That is, perhaps it can "mean (almost) nothing" to the server.
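The echo-but-ignore idea might be sketched like this, with messages shown as decoded dictionaries. The `respond` helper is hypothetical and the `id` and `session` field names are assumed from common nREPL usage.

```python
def respond(request, payload):
    """Build a response that echoes the client's id and session untouched."""
    response = dict(payload)
    for key in ("id", "session"):
        if key in request:
            # The server never inspects these values; it only returns them.
            response[key] = request[key]
    return response
```

A client that insists on sessions still sees its own value coming back, while the server keeps just one environment per connection.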

Should the Number of Connections Be Limited to One?

Currently, multiple connections work and each one has a separate environment.

Should There Be a Way to Shut Down the Server?

Either programmatically or otherwise?

Convenient for testing?

Should There Be a Way to Log Server Activity?

Convenient for investigation?

Bugs

Server is sending back integers for sessions. These should be strings. Fixed. Discovered via testing with rep. rep was segfaulting so used rr to capture and did some investigation.
Placing a timeout value in the wrong location. At least once it was not placed within the call to net/read, but rather after it.

Things That Helped A Lot

ongoing discussions with pyrmont
cmiles74's bencode was a good starting point for mutating into a somewhat modified version and becoming familiar with bencode.
spork/rpc.janet, nursery, nathaniel smith's structured concurrency article

References

https://vorpus.org/blog/notes-on-structured-concurrency-or-go-statement-considered-harmful/

[1] We did come across some docs that were generated from source code, but these had programming-language specific portions and they were incomplete in various ways.
