
posts [uts-0168]

My Day 1 with jj [uts-016M]

> Discuss on [/r/git](https://www.reddit.com/r/git/comments/1lvbzo8/my_day_1_with_jj_jujutsu/) and [Bluesky](https://bsky.app/profile/iutensil.bsky.social/post/3ltnz23tp2k2z)

## TL;DR

I became productive with [jj](https://github.com/jj-vcs/jj) (Jujutsu, *dauntless* version control) on day 1. This is my story, along with my mindset changes, and delicious recipes. Scroll down to the end for a short list of when to use what command/alias.

## Why try `jj`?

Well, *dauntless* is not the official slogan, but I really need it to describe a liberating feeling much stronger than "fearless".

Git is an intimidating black box for some, but I don't fear it in my daily work, as I'm skilled enough to easily recover work if I mess up.

`jj`, on the other hand, gives me the ultimate power in the river of time:

> In `jj`, I can go anywhere in history, do anything, be done with it, and history will reshape itself accordingly.

I didn't know this when I first read [Steve's Jujutsu Tutorial](https://steveklabnik.github.io/jujutsu-tutorial/) months ago; it's well written, but still a haystack in which to find the needle that would click with me.

[jujutsu on tangled](https://blog.tangled.sh/blog/stacking) intrigued me, by telling me "Jujutsu is built around structuring your work into meaningful commits", and showing how it helps contributors to iterate on changes with dependencies on other changes, individually reviewed, reworked, and merged. But what if I just use it on my personal repo, and don't care much about clean history?

[A Newbie's First Contribution to (Rust for) Linux: Git gud](https://blog.buenzli.dev/rust-for-linux-first-contrib/#git-gud) shows the author's Git workflow for contributing to the Linux kernel with a lot of fast-moving forks. The configuration for `rebase` and `rerere` really seems to solve the roadblocks of contributing to a thriving project. But the author said: "Jujutsu is much better at this kind of workflow and what I was actually using most of the time. I first discovered this workflow while using Jujutsu and only later found out you could do something similar with Git." A refreshing remark! Maybe I do need native and friendly support for more advanced features, even just to ease my mind.

Eventually, it was pksunkara's [Git experts should try Jujutsu](https://pksunkara.com/thoughts/git-experts-should-try-jujutsu/), his side-by-side workflow comparisons with Git, and his [`jj` configuration full of aliases](https://gist.github.com/pksunkara/622bc04242d402c4e43c7328234fd01c) that gave me the first boost, so I finally decided to spend a day with `jj`. I'm definitely no Git expert, but those aliases really look tempting; what if I could pick a few favorites?

I started the journey with a minimal goal. I just wanted to recover my Git workflow in `jj`, as `jj` coexists with a local Git repo, and I need `jj` to work with my GitHub (or any code forge I might migrate to in the future). I learned much more, and it was all much easier than I thought.

## Protect my main branch

First thing, I went to my GitHub settings and [protected my main branch](https://docs.github.com/en/repositories/configuring-branches-and-merges-in-your-repository/managing-protected-branches/about-protected-branches), to at least forbid force push, in case (my misuse of) `jj` messes up the remote. ([Here](https://ricky-lim.github.io/blog/protect-your-main-branch-in-github/) is how to do even better from CLI.)
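For reference, a rough sketch of doing the minimal part of this from the CLI with `gh` (the linked post goes further). This assumes the classic branch-protection endpoint and that it's run inside the repo so `gh` can fill in `{owner}/{repo}`; treat the exact payload as an assumption to adapt:

```
# hypothetical minimal example: forbid force pushes on main, leave everything else off
gh api -X PUT "repos/{owner}/{repo}/branches/main/protection" --input - <<'EOF'
{
  "required_status_checks": null,
  "enforce_admins": false,
  "required_pull_request_reviews": null,
  "restrictions": null,
  "allow_force_pushes": false
}
EOF
```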

## Install and configure `jj`

I use [just](https://github.com/casey/just) to write code snippets as tasks. Installing `jj` on macOS and preparing the configuration is simply:

```make
prep-jj:
    which jj || brew install jj
    mkdir -p ~/.config/jj
    cp -f dotfiles/.config/jj/config.toml  ~/.config/jj/config.toml
    jj config set --user user.name "$(git config --get user.name)"
    jj config set --user user.email "$(git config --get user.email)"
    jj config list --no-pager
```

The [`dotfiles/.config/jj/config.toml`](https://github.com/utensil/forest/blob/main/dotfiles/.config/jj/config.toml) is an adapted version of pksunkara's, and all changed places are annotated with `[NOTE]` comments and inspiring links. Copied to the right path, it becomes the user-wide configuration.

`jj config set --user` writes additional, more personal information into the same file (assuming Git has already been configured with it), which is why those lines come after the copy. Finally, the configuration is listed for inspection.

Every time I've updated the configuration, or I'm on a new machine, I run `just prep-jj` again and I'm all set. Your way of managing code snippets might be different, but it all boils down to a similar few lines.

You might want to change the line `editor = 'hx'` back to your favorite editor, as mine is now [Helix](https://github.com/helix-editor/helix).

It also assumes that you have installed [delta](https://github.com/dandavison/delta), a syntax-highlighting pager for git, diff, grep, and blame output. It's eye candy, and makes diffs much clearer.
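If delta isn't installed yet, a one-liner in the same spirit as the task above should do (the Homebrew formula is named `git-delta`, if I recall correctly):

```
which delta || brew install git-delta
```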

## Initialize the Git repo

`jj` can coexist, or officially, *colocate* with a local Git repo. I place all my local Git repos under `~/projects/` so my just task looks like this:

```make
init-jj PROJ:
    #!/usr/bin/env zsh
    cd ~/projects/{{PROJ}}
    jj status || jj git init --colocate
```

`jj status` will detect if it's already a `jj` repo; otherwise we initialize `jj` to colocate with Git. I try to make all my tasks re-entrant, so I can easily tweak and rerun them.

Actually, pksunkara already aliases this to `jj init`, but here we go with the original command to avoid lock-in. Later on, though, I'll use more aliases, and you may consult [the configuration](https://github.com/utensil/forest/blob/main/dotfiles/.config/jj/config.toml) for the full commands.

## Thrown into limbo

Once `jj` is initialized, we are no longer on any branch in Git. If we go back to `git` or `lazygit`, we'll see that we are in a detached-HEAD state.

I don't know about you, but my feeling was like floating in the air, with no solid ground under my feet.

I felt much better when I learned that what I commit as *change*s in `jj` will still be reflected in Git as commits, and commits can still find their parents.

We can go back and edit a *change*, then commit again; it becomes a new Git commit, but its identity as the original *change* remains the same. So changes can still find their parent changes.

Actually, a branch in Git is just a reference that always points to the latest commit of the branch, so its history can be traced via parents.

`jj` has the *bookmark* for pointing to a *change*. Unlike in Git, when you commit more changes on top of it, the *bookmark* doesn't move to the latest change automatically; however, *we* can move it.

With these two powerful concepts, we are lifted off the ground and start flying among changes, with bookmarks marking our path.

## The working copy

OK, we are not on any branch, so where are we?

`jj log` (or simply `jj l`, or even just `jj`) gave me something like this at the top:

```
@  rrptyzpt 0f23c873 (empty) (no description set)
◆  vmzsxwzu f800c4ed My last commit on the main branch
```

That's a lot to take in!

`f800c4ed` and  `My last commit on the main branch` can be recognized as my last Git commit hash and message.

`vmzsxwzu` is the *change id* assigned to it, along with a diamond ◆ indicating that it's immutable: it's written in the stars by Git, so we can't edit it.

But what's the commit-like stuff on top of it? I didn't like it, and even tried to `jj abandon` it, but it keeps coming back with a new *change id* and a new commit hash.

Finally, I accepted it as the "working copy", indicated by `@`. Unlike in Git, where I'd be on HEAD with some local changes yet to be staged, in `jj` I'm already on the future commit, as the working copy. That's why we sometimes need `@-`, which points to the parent of `@`, the last committed *change*.

`(empty)` means I haven't made any edits to this *change*. `(no description set)` means I haven't described what I'm going to do for this change (via `jj desc`).

I'd feel much better if it were just `(no description)`, as the `set` makes me feel guilty for not providing a description ahead of time. And it turns out that I don't need to! Even after I'm done with it and have moved on (`jj` keeps the change in its history, like automatically doing `git stash`). If I have trouble recognizing the change without a description, I can always use `jj d` to inspect the diff.

`jj s` (status) shows it more verbosely:

```
The working copy has no changes.
Working copy  (@) : rrptyzpt 0f23c873 (empty) (no description set)
Parent commit (@-): vmzsxwzu f800c4ed main | My last commit on the main branch
```

The change id of my working copy begins with `rr` (bold and colored, indicating we can refer to it uniquely with `rr`).

It's a bit annoying to enter and quit the pager and not be able to see the latest changes afterwards, so I've configured `jj` to use no pager and to only occupy about 2/3 of the screen.
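For reference, the pager part boils down to one setting (the log-limiting part lives in an alias in the linked config; treat the exact values below as an assumption rather than the file's contents):

```
jj config set --user ui.paginate never
# and in config.toml, an alias can cap the log output, e.g.
# [aliases]
# l = ["log", "-n", "20"]
```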

There's also alias `jj lp` that shows private (i.e. not pushed) changes, and `jj lg` that shows all changes, including elided changes that you just pushed.

## Commit

I was working on a repo into which I'll also add my just tasks and configuration for `jj`. So I edited the justfile in my editor Helix, and saved it to disk.

The alias `jj d` shows the diff, which confirms that `jj` has seen our edits.

We can now commit the change. `jj commit` will commit everything, opening the configured editor for commit message (just save and quit to submit, quit without saving to cancel).

Aliases like `jj ci` will also fire up a built-in diff editor to interactively select which files, or even which parts (called sections), to commit. Arrow keys fold/unfold files/sections, space selects, `c` commits, and `q` quits. (I couldn't figure out how to commit until I tried `c`; there is no visual hint at all, unless you use the mouse to click on the menus, but I didn't realize I could do that on day 1, so I tried every possible key combination to trigger the menu, with no luck.)

I often find myself repeatedly using `jj ci` to review, select and commit, to split my local edits into a few independent changes.

With alias `jj cim <message>`, we can directly supply the commit message from CLI without opening an editor, then select which parts to commit.

## New

After committing, we'll again be on an empty change with no description, so we can continue making new edits.

But if, after working for a while, I suddenly want to discard my (uncommitted) edits, I can do `jj n rr` to start over from the last change. The uncommitted edits will still be there in `jj` as a change, but local files will be reset to the state when we committed `rr`.

Here `rr` could be any other change id, any bookmark, or any *revset* in general. We'll learn more about *revset*s later; for now, we just need to know that a change id, a bookmark, and the `@` and `@-` that we know so far are all *revset*s.

`jj n` (new) effectively spins off a new history line from the parent change. We can go along the line for a few changes, and spin off from anywhere again, without having to create a named branch.
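For example, using the short change ids from the log above, spinning off parallel lines is just (a sketch):

```
jj n rr   # a new empty change on top of rr; local files reset to rr's state
# ...edit, jj ci, repeat...
jj n vm   # spin off another line directly from the old main commit
```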

## Edit

In Git, I seldom amend a commit; usually I just use `reword` in lazygit to amend the commit message, or `reset --mixed` (`gm` in lazygit) to discard history after a commit, then stage and commit in a better way again.

In `jj`, if I don't like a change, I could go back and edit it:

```
jj e <revset>
```

will auto-stash any dirty edits and reset local files to the state of that `<revset>` (read: change). Then I can happily make any edits, or commit it with an alternative selection of files or sections.

We don't need to commit explicitly: when we use `jj n <revset>` or `jj e <revset>` to jump anywhere else, the local edits are automatically committed to the original change.
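So a quick fix to an earlier change can be as small as this sketch (reusing the `rr` change id from above):

```
jj e rr   # jump back to change rr; local files now match rr
# ...fix a typo in the editor, no staging or explicit commit needed...
jj n rr   # or jump anywhere else; the fix is recorded in rr, and descendants are rebased automatically
```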

## Evolution of a change

Wait, what if I messed up when editing a commit, and left it, can I go back?

At that point, it's beyond what `jj undo` can easily do, especially since we might be confused about what we did and didn't do.

The first thing we can do is inspect how a change has evolved over time with

```
jj evolog -r <revset>
```

It looks like this:

```
❯ jj evolog -r uo
○  uoxoknux 92fb8d3a (no description set)-- operation a0693515ab31 (2025-07-09 11:54:16) snapshot working copy
○  uoxoknux hidden 1f73cdfd (no description set)-- operation 79813357f450 (2025-07-09 11:52:58) snapshot working copy
○  uoxoknux hidden c2bf8f4f (empty) (no description set)-- operation a22c21ce9fc1 (2025-07-09 11:52:32) new empty commit
```

We can inspect the diff of each hidden commit by its commit hash via `jj dr <revset>` (alias for `diff -r`), where `<revset>` should be the commit hash, to disambiguate.

Then we can use either `jj n` or `jj e` on the commit hash of our choosing, to continue work.

Note that the change id becomes ambiguous, so we need to `jj ad` (abandon) one of the commit hashes so we can continue to refer to the change by its change id.

It looks like this for me:

<details>
<summary>click to expand</summary>
<pre>
❯ jj n 1f73
Working copy  (@) now at: unyrkpru 003d7750 (empty) (no description set)
Parent commit (@-)      : uoxoknux?? 1f73cdfd (no description set)
Added 0 files, modified 1 files, removed 0 files

❯ jj e 1f73
Working copy  (@) now at: uoxoknux?? 1f73cdfd (no description set)
Parent commit (@-)      : mxowtxvr 68b37035 (no description set)

❯ jj evolog -r uo
Error: Revset `uo` resolved to more than one revision
Hint: The revset `uo` resolved to these revisions:
  uoxoknux?? 92fb8d3a (no description set)
  uoxoknux?? 1f73cdfd (no description set)

Hint: Some of these commits have the same change id. Abandon the unneeded commits with `jj abandon <commit_id>`.

❯ jj ad 92f
Abandoned 1 commits:
  uoxoknux?? 92fb8d3a (no description set)
Rebased 1 descendant commits onto parents of abandoned commits
</pre>
</details>

We may also use `jj op diff --op <operationid>` to inspect the meta changes of an operation (it has `--from`/`--to` variants, so it also works on a range of operations).
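For instance, with the operation ids from the evolog output above:

```
jj op log                      # list recent operations and their ids
jj op diff --op a0693515ab31   # what that snapshot operation changed
```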

## PR with an ad hoc branch

> see also the official doc [Working with GitHub: Using a generated bookmark name](https://jj-vcs.github.io/jj/latest/github/#using-a-generated-bookmark-name)

If I like my changes so far on the current history line, I can open an unnamed PR by first committing the changes, then using the alias

```
jj prw
```

where `prw` stands for PR working copy.

Per our config, it will push to a branch named `jj/<changeid>` on origin, then open a PR via `gh` (GitHub's official CLI for interacting with issues, PRs, etc.). It will ask you for a title (defaulting to the description of the first change) and a body; we can just press enter all the way to submit, and change anything later.

Personally I think such a tiny ad hoc PR is a good way to record the proposed changes, even for a personal repo. But if pushing to a branch is all you need, the alias `jj pscw` (push changes of working copy) will do.
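Putting it together, a tiny ad hoc PR is roughly this (the commit message here is hypothetical):

```
jj cim "Fix a typo in the justfile"   # commit, selecting hunks interactively
jj prw                                # push to jj/<changeid> on origin and open a PR via gh
```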

After merging the PR, we need

```
jj git fetch
```

to let both Git and `jj` learn about the changes.

Here is how `jj l` looked after I opened a PR for the change "Make a start on trying jj" as `#1`, merged it on GitHub, then fetched the changes:

```
◆  rzyozkll 929fe190 main Make a start on trying jj (#1)
│ 
◆  vmzsxwzu f800c4ed My last commit on the main branch
```

`gh pr view 1 -w` can open the PR in the browser.

If we have made edits to the change (but not new changes on top of it, because those would be pushed to their own `jj/<changeid>` branch), like with `jj e` or `jj sqar` (introduced below), we can use

```
jj pscw
```

where `pscw` stands for pushing local changes of working copy.

The result looks like this:

```
Changes to push to origin:
  Move sideways bookmark jj/stwxmxoqkyym from 33c46b992efc to 4d3c2a60e0d2
remote: Resolving deltas: 100% (3/3), completed with 3 local objects.
```

If we have made other changes on top of the change, we can treat `jj/<changeid>` as a named branch like below (i.e. `jj bs jj/<changeid> -r <latest changeid>` and `jj psb jj/<changeid>`).

## PR with a named branch

This would best resemble a usual GitHub workflow.

First we create a local bookmark:

```
jj bs <bookmark> -r <revset>
```

`<revset>` could be just a change id; `@-` comes in handy to refer to the last committed change.

`<bookmark>` is the name of the bookmark, just like a normal branch name. And `bs` is the alias for `bookmark set`.

We can see if the bookmark is correctly placed from `jj l`. If we don't like it, we can either `jj undo` or `jj bd <bookmark>` (bookmark delete).

Then we can PR that bookmark:

```
jj prb <bookmark>
```

This pushes a branch named `<bookmark>` to GitHub, then uses `gh` to open a PR for it.

It looks like working on a branch in Git, but after we commit more changes, we need to call `jj bs <bookmark> -r <revset>` to set the bookmark to the desired `<revset>`, then

```
jj psb <bookmark>
```

Note that the alias begins with `ps` instead of `pr`, meaning "push bookmark".
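End to end, the named-branch flow looks roughly like this (with a hypothetical bookmark name):

```
jj bs my-feature -r @-   # point the bookmark at the last committed change
jj prb my-feature        # push it and open a PR via gh
# ...commit more changes...
jj bs my-feature -r @-   # move the bookmark forward
jj psb my-feature        # push the update to the same PR branch
```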

## Fetching

What if we committed some changes on GitHub, like accepting a review suggestion? We need to use `jj git fetch` to make `jj` aware of the changes. We'll see something like

```
bookmark: our-name-of-bookmark@origin [updated] tracked
```

`origin` is the remote on GitHub. `[updated]` means `jj` noticed the changes on the remote, `tracked` means the local bookmark is also automatically updated.

If it's not tracked, we can use

```
jj bt our-name-of-bookmark@origin
```

to track it.

Here is how `jj l` looked after I opened a PR for a few changes in this way and then fetched:

```
@  nozpulso fd589b2a (empty) (no description set)
○  kpxkrxvm a22ed20a configure-jj* Add frequently used jj command alias
○  tvvoymuu 6808370c Keybindings for builtin diff editor
○  ylnlzpvx b20ac2d1 Adapt jj config from pksunkara
○  rnkttumw f6296b0a Improve configs in `just init-jj`
◆  rzyozkll 929fe190 main Make a start on trying jj (#1)
◆  vmzsxwzu f800c4ed My last commit on the main branch
```

## Rebase

But I didn't get such a clean history from the start. I'd begun with a few fixup commits, and one of the commits mixed together different changes. Initially, I pushed it with an ad hoc branch and accepted a few review suggestions on GitHub.

After opening the PR, I noticed a weird non-empty commit `xm`:

```
○  ...more changes...
◆  rzyozkll 929fe190 main Make a start on trying jj (#1)
│ ○  xmnykxym 4dcb41d9 (no description set)
├─┘
◆  vmzsxwzu f800c4ed My last commit on the main branch
```

After inspecting the diff, I believed its changes had been absorbed into `rz` (`#1`), so I rebased `xm` on top of `rz` to see if there was any diff left.

```
❯ jj rb -s xm -d rz 
Rebased 1 commits to destination
```

where `rb` stands for `rebase`.

I just need to remember that I'll be rebasing source (`-s`) onto destination (`-d`).

Now I have

```
○  ...more changes...
│ ○  xmnykxym 2f622327 (empty) I think this is what absorbed into #1
├─┘
◆  rzyozkll 929fe190 main Make a start on trying jj (#1)
```

Nice, it's indeed empty, so I can be assured no work is lost.

It's an extremely simple use case, but I'm happy to be able to transplant a parallel change onto another change, without much hassle.

## Squash

There are some other complications:

```
@  ...more changes...
│ ○  zquplvzx 2b572336 configure-jj Update dotfiles/.config/jj/config.toml
│ ○  nmuntxon 746cb300 Fix typo found by copilot
├─┘
○  ...some changes in between...
○  ylnlzpvx b20ac2d1 Adapt jj config from pksunkara
○  rnkttumw f6296b0a Improve configs in `just init-jj`
```

I want to absorb typo fixes `nm` and `zq` into `yl`, how do I do that?

It's as simple as correctly specifying `from` and `to`:

```
❯ jj sqa --from zq --to y
Rebased 3 descendant commits
Working copy  (@) now at: srnwoqzq c831d68c (empty) (no description set)
Parent commit (@-)      : kpxkrxvm 7d911838  jj/vzzzxqvnptov* | Add frequently used jj command alias
Added 0 files, modified 1 files, removed 0 files
```

where `sqa` is short for squash. Note from the output that any changes after them will be properly rebased.

There is also a shortcut if I just want to absorb a change into its parent: `jj sqar <revset>`.

I keep using the word "absorb", but `jj absorb` does a completely different thing (it splits a change, letting other changes absorb it, by "moving each change to the closest mutable ancestor where the corresponding lines were modified last"; that sounds more like "dissolve" to me).

## Summary

I could go on and try splitting a change, but the above is enough for day 1.

This doesn't demonstrate the full power of `jj`, but imagining how the same things would be done in Git, I'd definitely rather spend the time with `jj`.

I'll wrap up with a summary of the commands/aliases that would be enough for daily work:

### Getting started

| alias | when to use |
|-------|-------------|
| `jj init` | init `jj` in a local Git repo |

### Show

| alias | when to use |
|-------|-------------|
| `l`   | show log (bare `jj` works too) |
| `lp`  | show private log |
| `lg`  | show all log, even elided changes |
| `s`   | show status |
| `w`   | show status and diff |
| `d` | show the diff of current change |
| `dr <revset>` | show the diff of revision `<revset>` |

### Edits

| alias | when to use |
|-------|-------------|
| `n <revset>` | create a new change based on `<revset>` |
| `cm <message>` | commit everything with `<message>` |
| `ci` | commit the change, choose interactively, write commit message in `$EDITOR` | 
| `e <revset>` | edit any `<revset>` after commit |

`<revset>` could be a short change id or a bookmark, and defaults to `@` (the working copy).

After editing, just go anywhere else (with `e` or `n`) and the changes will be committed; or an explicit commit will do.

### Working with remote branch

| alias | when to use |
|-------|-------------|
| `git fetch` | update local bookmarks from remote branches |
| `bt <branch>@origin` | track remote branch |
 
### PR with named branch

| alias | when to use |
|-------|-------------|
| `bs <bookmark> -r <revset>` | move bookmark `<bookmark>` to `<revset>` |
| `psb <bookmark>` | push bookmark to remote branch named `<bookmark>` |
| `prb <bookmark>` |  create a PR from `<bookmark>` to the default branch |

### PR with ad hoc branch

| alias | when to use |
|-------|-------------|
| `prw` | open (committed) working copy as a PR |
| `pscw` | push changes of working copy to remote branch named `jj/<changeid>` |

### History cleanup

| alias | when to use |
|-------|-------------|
| `undo` | undo last jj operation when messed up |
| `rb -s <src> -d <dst>` | rebase `<src>` onto `<dst>` |
| `sqar <revset>` | squash `<revset>` into its parent |
| `sqa --from <src> --to <dst>` | squash `<src>` into `<dst>` |
| `ad <revset>` | abandon a change |
| `evolog <revset>` | show how a change evolved over time |
| `op log` | show log of operations on jj |
| `bd <bookmark>` | delete a bookmark |

These aliases can be found in this [config file](https://github.com/utensil/forest/blob/main/dotfiles/.config/jj/config.toml) which is based on pksunkara's [gist](https://gist.github.com/pksunkara/622bc04242d402c4e43c7328234fd01c).

Trying Zig's self-hosted x86 backend on Apple Silicon [uts-016K]

> TL;DR: I used `colima` to run an x86_64 Docker container (Ubuntu) on Apple Silicon, to quickly test `zig build` with the LLVM backend and with the self-hosted backend, because the news is both exciting and concerning (for missing all the goodies from LLVM).

After "Self-Hosted x86 Backend is Now Default in Debug Mode" dropped, I immediately wanted to try it out, but I only have an Apple Silicon machine (a Mac Mini M4 Pro).

## Running an x86 container on Apple Silicon

According to Intel-on-ARM and ARM-on-Intel, I'm supposed to be able to run x86_64 Zig using `lima` with Apple's native virtualization framework and Rosetta. After some fiddling and searching, I realized that I could just use `colima` to run an x86_64 container on an ARM64 VM, which is also backed by `lima`.

OK, let's get started:

First, install `colima` and prepare it properly:

```bash
# we need `docker` CLI as the client
which docker || (brew install docker; brew link docker)

# (optional) while we are at it, get `docker-compose` and `kubectl` too
which docker-compose || brew install docker-compose
which kubectl || brew install kubectl

# install colima
which colima || brew install colima

# this is to prevent other docker daemons from interfering
docker context use default
```

Next, let's start colima with Apple's native virtualization framework and Rosetta:

```bash
colima start --vm-type=vz --vz-rosetta
```

I had already used colima before, but without these flags; there was a warning saying they were ignored, so I had to delete the existing colima VM and start over.

(Warning: the following command will also DELETE all existing images! So I commented it out to prevent accidental execution.)

```bash
# colima delete
```
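
After deleting, start colima again with the flags above; `colima status` is a quick way to check that the VM is back up:

```bash
# only after accepting the loss of existing images:
# colima delete
colima start --vm-type=vz --vz-rosetta
colima status
```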

Now, we can pull an x86_64 Docker container with Ubuntu:

```bash
# assuming `docker login` has been done already
docker pull --platform linux/amd64 ubuntu:jammy
```

and start it (`--rm` means the container is removed after it exits, so we'll lose changes made inside; remove this option if you want to keep the container):

```bash
docker run --rm -it --platform linux/amd64 ubuntu:jammy bash
```

Inside the container, let's confirm that we are indeed running x86_64:

```bash
uname -m
```

Cool, I'm seeing `x86_64`!

## Running `zig build`

Now, we can install Zig and try it out:

```bash
# we need a few basic utils
apt update
apt install -y wget xz-utils software-properties-common git

# Download the corresponding Zig version with self-hosted x86 backend
wget https://ziglang.org/builds/zig-x86_64-linux-0.15.0-dev.769+4d7980645.tar.xz
tar xvf zig-x86_64-linux-0.15.0-dev.769+4d7980645.tar.xz

# Make it available in PATH
export PATH=/zig-x86_64-linux-0.15.0-dev.769+4d7980645/:$PATH

# Verify its version and that it runs
zig version
# got: 0.15.0-dev.769+4d7980645
```
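
As an optional sanity check, `file` can confirm the downloaded binary really is an x86_64 ELF (the `file` utility isn't in the base image, so install it first):

```bash
apt install -y file
file $(which zig)   # expect something like: ELF 64-bit LSB executable, x86-64 ...
```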
Let's create a simple Zig project to test building it:

```bash
mkdir projects
cd projects
mkdir zig_x86_64
cd zig_x86_64

zig init
zig build
```

Success!

`zig build run` gives

```
All your codebase are belong to us.
Run `zig build test` to run the tests.
```

and `zig build test --summary all` gives:

```
Build Summary: 5/5 steps succeeded; 3/3 tests passed
test success
├─ run test 1 passed 7ms MaxRSS:5M
│  └─ compile test Debug native cached 68ms MaxRSS:44M
└─ run test 2 passed 7ms MaxRSS:5M
   └─ compile test Debug native cached 67ms MaxRSS:44M
```

## Comparing with and without LLVM

But wait, how do I know it's actually using the self-hosted x86 backend?

Hopefully someone has a better way; I just took the longer route of forcing Zig to build with and without LLVM.

After reading the doc and some searching, I figured out that I could expose an extra option to `zig build` in my `build.zig` to set the corresponding flag for the executable, with only 2 edits:

```zig
    // EDIT 1
    const use_llvm = b.option(bool, "use_llvm", "Force use llvm or not") orelse false;

    const exe = b.addExecutable(.{
        .name = "zig_x86_64",
        .root_module = b.createModule(.{
            .root_source_file = b.path("src/main.zig"),
            .target = target,
            .optimize = optimize,
            .imports = &.{
                .{ .name = "zig_x86_64", .module = mod },
            },
        }),
        // EDIT 2
        .use_llvm = use_llvm,
    });
```

(Optional) I did the edits by running the following to install the Helix editor, so I could edit Zig code out of the box:

```bash
# https://docs.helix-editor.com/package-managers.html#ubuntu-ppa
add-apt-repository -y ppa:maveonair/helix-editor
apt install helix
# then fire up `hx build.zig` and use it mostly like in Vim
# I also installed zls by
# cd /tmp
# wget https://builds.zigtools.org/zls-linux-x86_64-0.15.0-dev.197+48fb941e.tar.xz
# tar xvf zls-linux-x86_64-0.15.0-dev.197+48fb941e.tar.xz
# cp -f zls /usr/local/bin/
# and checked that it works by
# hx --health zig
# so I can use `gd` to go to definitions!
```

Cool, now let's try building with LLVM:

```bash
rm -rf .zig-cache && time zig build -Duse_llvm=true
```

```
real    0m3.068s
user    0m3.610s
sys     0m0.363s
```

Then without (which should be the x86 self-hosted backend):

```bash
rm -rf .zig-cache && time zig build -Duse_llvm=false
```

```
real    0m2.112s
user    0m2.812s
sys     0m0.361s
```

Wow, it's indeed faster without LLVM! I've tested this a few times and getting consistent results. I'll also try this on more complex projects later, but it's so exciting that I just wanted to write a note for this.

### UPDATE 2025-06-15

I further tried using poop to get more metrics.

First, get and install `poop`:

```bash
cd /projects && git clone https://github.com/andrewrk/poop && cd poop && zig build && cp -f zig-out/bin/poop /usr/local/bin/
```

Then let's run the cold start builds again:

```bash
cd /projects/zig_x86_64

rm -rf .zig-cache* && poop "zig build -Duse_llvm=true --cache-dir .zig-cache-llvm" "zig build -Duse_llvm=false --cache-dir .zig-cache-nollvm"
```

Well, that doesn't work due to permission denied. And `--cap-add PERFMON` or even `--cap-add SYS_ADMIN` didn't work. Not even `--privileged`. See this issue.

Let's try `hyperfine` instead:

```bash
apt install -y hyperfine
```

Taking comments by `mlugg0` on Reddit into account, a few factors should be ruled out for a fairer comparison (with C):

1. rule out the build time for `build.zig`;
2. rule out the overhead of panic handler
3. squeeze a bit of performance at the cost of some safety by disabling C sanitization.

1 means building `build.zig` before the benchmark and after cleaning the cache (observing that `zig build --help` will build `build.zig` in order to get the options defined in the build script).

2 means to add the following to `main.zig`:

```zig
/// Don't print a fancy stack trace if there's a panic
pub const panic = std.debug.no_panic;
/// Don't print a fancy stack trace if there's a segfault
pub const std_options: std.Options = .{ .enable_segfault_handler = false };
```

3 means to pass `.sanitize_c = .off` to `root_module` in `build.zig`.

With

```bash
hyperfine --prepare "rm -rf .zig-cache* && zig build --help -Duse_llvm=true && zig build --help -Duse_llvm=false" "zig build -Duse_llvm=true" "zig build -Duse_llvm=false"
```

I got

```
Benchmark 1: zig build -Duse_llvm=true
  Time (mean ± σ):      1.392 s ±  0.052 s    [User: 1.287 s, System: 0.126 s]
  Range (min … max):    1.329 s …  1.473 s    10 runs

Benchmark 2: zig build -Duse_llvm=false
  Time (mean ± σ):     546.1 ms ±  13.6 ms    [User: 570.1 ms, System: 128.7 ms]
  Range (min … max):   532.9 ms … 575.9 ms    10 runs

Summary
  'zig build -Duse_llvm=false' ran
    2.55 ± 0.11 times faster than 'zig build -Duse_llvm=true'
```

which is indeed even faster!

> Discussions on /r/Zig and Bluesky

What's next for LLMs? [uts-016C]

This is in no way a professional prediction, just a summary of some ideas that popped into my head in a recent discussion with friends. Take it with a grain of salt.

Reasoning, agents, and multimodality are the top 3 research topics for LLMs, as far as I can tell. And for me, there is always the topic of efficiency, or scalability. And of course, beyond that, world models and intelligent robotics.

Verifiable reinforcement learning boosted reasoning performance, as explored by DeepSeek and others, but it requires many more output tokens, by a factor of almost 10. That would be 10x inference compute, and to solve the same problem in the same time, you would need another 10x compute. Such a 100x compute demand is not desirable.

I was wrong in 2023, about the inefficient inference of LLMs. I didn't like the quadratic scaling of the attention mechanism, but it turns out that there are more ways to work around the bottleneck than seeking linear attention such as RWKV and other alternatives. Inference time scaling, and MoE are both game changers.

So I would expect myself to be wrong again about the cost of reasoning. And there is preliminary research showing that reasoning is already inside the LLMs and can be further trained into them.

That is, reasoning should not happen at the output layer, but in intermediate layers. Writing down the reasoning process should merely serve as a memory of explored paths, to keep things on track, or to materialize the box to think out of.

The waste is not just on the output side; there is significant waste of input tokens on prompts that are not part of the instruction or the context, but guidance. Such guidance should not just be cached, but "compiled", in a more efficient and controllable manner, while retaining interpretability.

Prompt engineering and reasoning by output tokens are two sides of the same coin: we are reaching better results in an interpretable manner, but at much greater expense than necessary, and this will become obvious once we find the right principle. Of course, history relies on such detours to get back on track.

Agents are another layer of waste, on top of individual inferences. Agents fall into fruitless planning and exploration, failing to snap out of the infinite loop by acquiring the extra information or triggering the extra side effects that would break it. Their interactions with tools (e.g., MCP), agents (e.g., A2A), and humans (e.g., A2H) are all inefficient due to using natural language as the intermediate representation.

Imagine an alien species with a far more efficient language, like the one depicted in *Arrival*. Such a language could express complex concepts like tool invocation with parameters using just a single token - or perhaps a single multimodal token combining multiple meanings.

There should be special dense tokens, to represent tools or even ways to think (e.g., the MCP for "Sequential Thinking"). And why should a token just be an embedding of a high dimensional vector? Why isn't it a curve, a hypersurface, or a manifold?

Take some time to think about this, while showing the visualization of the Biology of a Large Language Model on the screen.

I believe using modalities other than human perceptions as the intermediate representation is the key to the next breakthrough. They would also represent the world model better.

At the end of the day, it all comes down to human-level efficiency to reach human-level intelligence. And the efficiency will come from using the right principle, which will also result in better performance.

And reasoning, agents, multimodality, efficiency, world models, and robotics would all be covered by this one integrated solution.

4 bugs caught me off guard recently [uts-016A]

And it's non-trivial to hunt their families down.

But yeah, that's pretty much my job description, that is, to set up mechanisms to eliminate similar problems from all systems in future.

They are from different kinds of systems and varying technical stacks, so it would be interesting to pin these specimens down for display.

How to hunt their families down is left as an exercise for the reader.

## Legacy configuration removal

The system in question is a proxy server that routes requests to different backends that are responsible for communicating with different third-party APIs. Two keys in the request are used to determine the backend: `party` and `service`, which indicate *who* does *what*, respectively. For the same pair of these keys, even if there is only one type of backend, there could be multiple instances in different availability zones (AZs) and regions.

There is a subsystem (let's call it the "Switch") to route requests based on the health of these instances. It's vital for the switch to always work, so the requests won't go to a bad instance that is already unresponsive.

The Switch will reuse the *static* configurations of the proxy that lists combinations of `party` and `service`, along with *dynamic* instance information. The static and dynamic information as a whole is written to a cache, whenever an arbiter detects unhealthy instances. The proxy will use the cache to change its routing dynamically.

There was a special party (let's call it "Bob") that requires 2 steps/services (namely "Pre" and "Main") to complete what other parties can do in 1 step ("Main"). These 2 steps are stateful in the sense that if an instance can't do "Main", requests should not be routed to "Pre" either.

So when the Switch reads the blob from the cache, a special check is in place to ensure "Bob" & "Pre" are in the combination list. If they are not, the Switch refuses to read and reports an error as a monitoring metric, one that never had a proper alerting threshold for a sudden increase in error rate, because it's an error that had never happened before and was missed when SREs set up the thresholds for important metrics.

10 years passed, and "Bob" can now also do it in one step, "Main". There are also many legacy combinations that are not used anymore. One day it became necessary to remove unused combinations, which included the combination "Bob" & "Pre".

Bang! The Switch rejected all new writes to the cache for 6 days and no one noticed, because no major outage happened; the error rate due to unhealthy instances was low enough to be ignored. The issue was only discovered when the same person was observing the Switch for a different deployment, and it took almost a day to figure out the root cause.

The interesting part? The same person who removed the legacy configuration had placed those special checks 10 years ago, and the reviewer of the removal had also reviewed the check 10 years ago. The check was just an unthoughtful precaution; recalling it and predicting its ramifications for the config removal didn't cross either of their minds, even during the initial stage of debugging.

## Too many concurrent slow queries

There is a batch processing system (let's call it the "Sink") that loads different kinds of data from asynchronous nodes of production database clusters, according to configured rules about data sources, conditions, how to process, etc. There are many rate limits in place to prevent the system from overwhelming the database clusters, such as limiting the number of concurrent tasks and capping the number of concurrent queries on the same database node. It's also assumed that the asynchronous nodes are not used for user-facing online queries, only by other batch processing systems.

The rate limits gradually converged to a balance that maximizes the throughput without overwhelming the nodes even during stress tests of high data volume.

There is also a separate monitoring metric that tracks slow queries over a certain threshold, and that threshold was also converging, so SREs could focus on the top issues.

One day, many different database nodes were reporting high load, only for a few seconds each, but continuously affecting an online service that happened to be using asynchronous nodes, as the service is considered less time-sensitive and thus tolerant of the delay that asynchronous nodes might have. SREs and DBAs were confused for quite a while, until a technical lead guessed it might be caused by the batch processing system Sink, which turned out to be true.

The developers of Sink were also confused, as the concurrent queries on the same node, and the slow queries were both within the limits.

The truth did come out in the end. There were 2 tasks configured in Sink, which get broken down into many sub-tasks for different time spans, and subsequently into queries for different shards of the same table across different asynchronous nodes. There was a bug in the task configuration that caused slow queries, and the bug was copy-pasted so it affected both tasks. Both tasks had been running slowly, but the database nodes could handle the load along with online queries, as the number of concurrent slow queries was well below the CPU cores the nodes could use.

On that day, both tasks were requested to be rerun for 2 time spans by the same person who introduced the bug, which caused the concurrent slow queries on the same node to exceed the core count, blocking the online queries. But as the queries were rate-limited and hit different nodes in a rotating manner, each node was only affected for a few seconds; the overall impact could only be observed by SREs monitoring the online services, and DBAs considered the phenomenon a normal load spike.

The lesson learned? Each measure was working individually, but they could not account for combinations of different issues, namely too many *concurrent* (and) *slow* queries, and only the combination could cause an impactful issue. And information about the different parts of the system (the online service, the database nodes, and Sink) was not shared across teams.

## OOM when retiring old worker processes

This story is about an SOA framework written in C++ (let's call it the "Framework") that spawns many worker processes to handle requests for a service; when a new version of the service is deployed, it gracefully retires the old worker processes and spawns new ones of the new version to handle requests. The Framework also integrates a few plugins to extend its functionality, and the plugins have their own lifecycle management, e.g. `init` and `fini`.

One of these plugins (let's call it "Charlie") starts an asynchronous thread for each worker process, and signals the termination of the thread in `fini`. It recommends an explicit call to `fini`, but `fini` was considered trivial enough to also be called in the destructor of the plugin. That's a bad practice in the eyes of any C++ expert, but it worked well for a long time, and there was no issue calling `fini` twice, once explicitly and once in the destructor.

Every few tens of thousands of deployments, there was an OOM ("out of memory") when retiring the old worker processes, but the process would be killed per cgroup thresholds on memory consumption, so only a few requests were impacted over time; eventually SREs concluded that the OOMs were not isolated events.

It was a nightmare to reproduce the issue without knowing the cause upfront, and the code looked innocent enough in review. Finally, enough data was collected from rare reproductions, and the clues pointed to the `fini` in the destructor of Charlie, but there is only the signaling in `fini`. Further investigation revealed that the asynchronous thread was sending a heartbeat request, which caused the OOM.

While sending the request, there was a call to a method using a local const static `std::array` holding some `std::string`s initialized from literals. That static variable had already been destroyed while the asynchronous thread was still alive; the thread would only be signaled to terminate later, by the `fini` in the destructor of Charlie. The destructor wasn't doing anything too non-trivial, but it ran too late, due to the undefined execution order of destructors.

Why would this cause an OOM? After destruction, the heap memory involved was still valid, as it was not released but kept for reuse by the allocator, so there was no SEGFAULT; but the memory backing the `size()` of the strings contained a garbage value, which happened to be a large number, and the strings were being appended to a `std::stringstream` to form the heartbeat request, causing endless memory allocation until the OOM.

The issue became more reproducible after switching to an allocator (`tcmalloc`) that retains the heap memory longer after destruction. It would have been caught by various sanitizers if we had integrated them in CI.

## Silently hanging tasks

It's again about the batch processing system Sink. Internally it uses an MQ to queue tasks, so it can retry failed tasks. To prevent aggressive retries, there is a backoff mechanism that exponentially increases the delay between retries, and the maximum number of retries is also limited.

One day, an incident caused all availability zones of the MQ to drop a certain percentage of network packets, causing a stable failure rate in task dispatching. If the retries had been performed at a regular rate, the aggregate delay of tasks would not have been too long.

Unfortunately, the backoff mechanism backfired, along with other rate-limiting mechanisms, eventually causing all tasks to hang indefinitely. Again, existing metrics were not able to pinpoint the issue, and the developers took some time to figure out the root cause. But as the incident had put all SREs in all-hands-on-deck mode, the hangs were noticed quickly from less sensitive monitoring metrics, resulting in a relatively timely fix.

A more sensitive monitoring metric was deployed to detect hanging tasks, and it caught another issue a few days later, which was handled swiftly before causing actual delay.

That issue was a silly bug that runs `remove()` for every element of a huge array, which overall takes `O(n^2)` time to finish, practically forever for such huge arrays. The removal would only be triggered when certain conditions were met, and the buggy removal was copy-pasted from some other utility package.

The bug should only have hung the one process that handled the task. Unfortunately, the dispatcher would detect the timed-out task, consider the process dead, and reassign the task to other processes, while the task was still occupying the old process, which was burying its head deep in the `remove()` calls and missing its own checkpoint for the timeout. The reassignment happened over and over again for each timeout, eventually causing all processes to be busy with the same task, and the `remove()` calls would almost never finish. The world literally stopped for that task.

With early detection of hanging tasks, developers were able to try fixes quickly. It should not have been easy to find the root cause, but the developers noticed long pauses between two log entries, and the removal loop was the only code in between when they went to add some logging there, so the obvious bug was immediately identified.

It's really lucky that the first, not-too-serious issue prompted deploying the more sensitive monitoring metric, so the second issue, which could have been much more fatal, was detected and fixed promptly, on the morning of the first day of a national holiday. Task reassignment was then limited, to prevent polluting more processes with a bad task.

## LLM's takeaways

Disclaimer: The above stories are slightly modified from the actual events, and they were manually written by me, not generated by an LLM.

I asked DeepSeek to write some takeaways as I was a bit tired after telling these stories. This is the version I'm happy with:

1. **Legacy safeguards breed hidden traps**
   Undocumented checks (Bug 1) and copy-pasted code (Bugs 2, 4) linger as systemic debt. Always audit dependencies before removal or reuse.

2. **Test failure *combinations*, not thresholds**
   Rate limits + slow queries (Bug 2), retries + packet loss (Bug 4): isolated safeguards fail when risks collide. Model interactions, not silos.

3. **Own lifecycle control explicitly**
   Destructor chaos (Bug 3) and hanging tasks (Bug 4) stem from assuming cleanup. Govern threads, memory, and resources with strict ownership.

4. **Instrument for why, not what**
   Missing alerts for config errors (Bug 1), slow queries (Bug 2), or hung processes (Bug 4) delay fixes. Metrics must expose root causes, not just symptoms.

**Core**: Systems fail in layers. Anticipate overlap, govern lifecycles ruthlessly, and monitor for causality—not just correlation.

Fragmented learning, no more [uts-0167]

Well, at least I hope so, for 2025.

I started keeping a learning diary on math and tech in September 2024.

Before that, I had been working on a series of math notes, in a Zen mode, for months. I annotated PDFs in Margin Note 4, and forged definitions and diagrams into Forester trees, which are bi-directionally linked notes that can be rendered into beautiful web pages and LaTeX PDFs, until I was satisfied. During that time, I didn't read news or unrelated books, and solely focused on the books and papers needed for polishing the notes. I held my breath until one note reached a certain stage, so I could move on to the next note.

It was a mental paradise.

Then came the learning diary idea. Since September, constant context switching had dominated me at work. My attention for learning had inevitably become fragmented. I started to check out interesting stuff from social media, still about math and tech; they are indeed potential elements of what I might want to work on. So I wished to log these readings, or just findings, daily, so they would not be lost in time.

The habit was well kept to the end of 2024, even when the load at work became unbearably distracting and exhausting. The breadcrumbs would be a gold mine for future me.

But it was a disaster for my mental state.

During that time, I managed to go back to a few open-source projects that I had been working on. Most of them have almost no users, but I still feel an obligation to keep them up to date. Usually I had just a little time for each of them, so after some chores I ran out of time and energy for new features, and moved on to chores for the next project. This partially reflected the mental state caused by my fragmented learning.

The world had become an endless curiosity for me; the potential to work on a new project inspired by an element just learned about was dazzling.

When I stood at the beginning of 2025, I realized that I had been almost fruitless in 2024. I didn't finish the notes from before September, although they are nice in their partially completed form. I didn't create anything whole after September. I enjoyed the rewards of the journey, but came to its end empty-handed.

I no longer log my learning daily in 2025, for various reasons. My work and life still consume a lot of my time at the beginning of the year, like always. I still log, but almost only weekly.

April is coming, and I hope to reach a Zen mode similar to the better half of last year. Hence, this post is poured out of me.

A customizable, hybrid approach to code & resource pipeline and embedding [uts-000X]

The following is written in Markdown, and was submitted as ~jonsterling/forester#87.

## TL;DR

This is the proposal I mentioned in #82. It's an informal RFC about what I wish to do about embedding both the code and the resource (primarily images) generated in forester output.

The code is there so that the resource generated is fully reproducible; the resource is there so that the consumer of the forester output (e.g. browser or LaTeX) can gracefully downgrade to using the resource instead. And the pipeline handles regeneration when the source code or dependencies (e.g. imported external files, used assets) change.

## Feasibility

The proposal is already feasible with the current forester implementation plus some external add-ons, so it might not bring too much complexity to immediate forester development. It can be viewed not as a feature request, but as a way to make use of existing features whose equivalents may or may not be included in future forester development, e.g. XML/XSLT etc. That said, it would be nice to have some degree of native support from forester to smooth things out. Actually, this proposal is inspired by how forester handles LaTeX diagram embedding, with visions of more diagram tools and beyond.

Now let's assume that feasibility is not a problem, and take a closer look at the approach.

## The customizable, hybrid Approach

The "customizable" part in the approach means the user can choose the kind of tools and pipelines (e.g. parameter customzation or post-processing) to generated the resource, as well as the format(s) of the resource. Currently, the only natively support tool is `latex`, with fixed parameters (e.g. no `--shell-escape`), and a pipeline to generate DVI then post-process it with `divsvgm`, and the format is SVG, with a naming convention of hashing the source code to guarantee browser-cache invalidation of the image whenever the source dode is changed. The pipeline also watch the source change and trigger the regeneration and copying related assets to the output, to an extent that it would be triggered by changes of upstream macro (as it would impact the source code) but not including some `.tex`/`.sty` files that's "imported" in the source code, but forester has no way of detecting that in the current implementation.

Imagine for a second that all of this can be customized: it would be a great productivity boost for users, but would it bring a ton of complexity into forester? Not necessarily. If we view forester as a kernel, then it could just provide a minimal infrastructure to allow all of this to happen, instead of handling all tools, pipelines, and formats by itself.

The "hybrid" part in the approach, at least in the web context, means the code can be executed both in browser (browser-side render, BSR for short, which might result in a more interactive experience, e.g. WebGL animations), and in server-side generation (SSR). The technical possibility thanks to that JS libraries that can be executed both in browser and in a JS runtime on server-side (e.g. node.js, deno), and that native libraries can be executed both on server, and in browser as WASM, that includes programs written in C, Rust, Go and more.

In the LaTeX context (or equivalently Typst or other authoring tools), the hybrid approach is about whether to execute the code inside the document environment to regenerate the diagram, e.g. to preserve the same font and style settings, or to just include the externally generated diagrams. The current forester-to-latex pipeline takes the former approach, which usually produces a more desirable, style-consistent result, while the latter may have the advantage of stability and efficiency, as a diagram could have unexpected errors or deformations during regeneration, and it could be time-consuming to regenerate too many diagrams when you are just authoring some text and expect to preview the result in real time.

Now it becomes clear that this hybrid approach facilitates both reproducibility and graceful downgrade, as long as there is a customizable pipeline to handle it.

## A teaser: authoring tool candidates

Let's get a bit more concrete about the tools that could be used in the customizable pipeline, and see how it opens up endless possibilities. Here are some candidates that I have experimented with:

- Rich text (including math, diagrams)
  - Markdown: an easy-to-write, source-code-readable format
    - both BSR and SSR capable
    - it plays well with forester markups, as fr XML nodes are transformed to HTML tags to be rendered as is by the markdown renderer
    - it can downgrade to plain text (but preserving rendered forester markups) when the markdown renderer is not loaded
    - experiment: test markdownit (via markdown-it )
    - it's at least very convenient for writing paragraph/link/list/footnote-heavy notes compared to forester markups
    - plus, users could write the same text for forester, or for external communication, such as this very proposal which is written in Markdown, submitted to forester ticket system, but also rendered in my forest
  - Typst
    - both BSR and SSR capable
    - it can be made to play with forester markups, to support links/cross-references/citations etc.
    - it renders to text-selectable SVG, indistinguishable from ordinary text
    - besides text and math, it can be used as a modern diagram authoring tool with decent scripting support
    - experiment: test typst (via typst.ts )
  - Syntax-highlighted code blocks
    - both BSR and SSR capable
    - experiment: test syntax highlighting (via shiki )
- Static diagrams
  - Penrose: creating diagrams with a declarative language, with separation of domain knowledge, layout constraints & styling, and concrete substances
    - both BSR and SSR capable
    - experiment: test penrose
  - Pikchr: an open-source text-to-diagram tool, the language is simple yet has great support for relative positioning and flexible path guiding
    - experiment: test pikchr
    - SSR first, BSR-capable thanks to WASM
  - D2: a modern diagram scripting language, preferred over PlantUML, Mermaid, and Graphviz, for its great layout support for complex diagrams and its declarative design.
    - no BSR experiments yet, but I have used its SSR with other static site generators
- Animated, interactive diagrams
  - WebGL
    - experiments:  , test interactions between ray-marching and 3D objects
    - still trying to work out an SSR pipeline, but combining `xvfb-run`, headless-gl, and three.js should make it possible, e.g. see this gist.
    - a less preferred alternative is to use the SVGRenderer addon from three.js, but its render result is visually worse than WebGLRenderer, per my experiment in test three.js to SVG.

For SSR to other output formats, users could use XSLT to transform the XML output so that downstream tools (e.g. LaTeX) consume either the source code or the resource produced by the SSR pipeline. If neither is consumable, the same content could be excluded, using a macro like `webonly` to surround the forester markup.

Note that these experiments have mostly been done for BSR only. SSR experiments have not been done for forester, as I have already learned the feasibility from my experiments with other static site generators. Without graceful downgrade to SSR output, pages will take a bit of time to load on first visit.

## Elements of an implementation external to forester

Currently, forester can:

- consume `.tree` files, and output `.xml` files, from which the final HTML are produced by XSLT on browser side, or by LaTeX on server side
- allowing users to define macros, which can emit HTML tags in the XML output (this could also be done with XML tags)
- forester 4.3 uses `fr:resource`, `fr:img`, `fr:resource-source` (with type and part attributes) for rendered SVG image, and the LaTeX source code, which is general enough to be used for other `type` of code that renders to images

To make it available to BSR, the users could:

- use XSLT to transform `fr:resource` tags to HTML tags with certain `class` attribute
- use CSS to render them to show the source code, or a loading animation, or a fallback image rendered by the SSR pipeline described below
- some JS can be loaded on demand to replace the HTML tag with a rendered, potentially interactive HTML element

To make a SSR pipeline external to forester, the users could:

- define macros accepting the source code to emit `fr:resource`, `fr:img`, `fr:resource-source` following the forester practice on LaTeX diagrams, but with a filename specified by the user, instead of generated from the source code hash
- use a file change watcher (e.g. watchexec) to check for changes of the XML, detect the presence of `fr:resource-source`, then render the source using external scripts that call the corresponding tool based on the `type` attribute, and generate an image file with the specified filename (see the sketch after this list)
- use a browser cache busting mechanism (e.g. use XSLT to generate a random number to be appended to the image URL) to reload the latest image whenever the source code changes, if the user refreshes the page
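
A minimal sketch of such a watcher, assuming a hypothetical `render-resources.sh` script that scans the XML for `fr:resource-source` nodes, dispatches on their `type` attribute, and writes the image files with the specified filenames (the `output` directory name is also an assumption):

```bash
# re-run the external renderer whenever forester's XML output changes
watchexec --watch output --exts xml -- ./render-resources.sh
```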

So far, BSR is implemented for the experiments mentioned above, and SSR could wait for the standardization of `fr:resource` and the resolution of this proposal, or just use any other XML tags in a similar spirit.

## Concluding remarks

The text above is organized so that a reader might gradually see the full vision of the proposal, then the feasibility, and the many possible paths, downgraded or even upgraded.

To reiterate, this is already feasible with the current forester implementation plus some external add-ons, and it will continue to work as long as the used features or their equivalents are available in future forester.

It might coincide with the direction of forester development (e.g. the on-going pipeline refactor might just provide the minimal native support needed for this approach to work more smoothly), or it might exceed what forester was originally designed for (thus should not pose a maintenance burden to forester). Hence this informal RFC.

Forester experiment [uts-000S]

I'm experimenting with using Jon Sterling's Forester for building Zettelkasten-style forests of evergreen notes, like his Notes on duploid theory (source), CAIMEO's The Rabbit Hole (source) or Trebor's forest (source).

I wish to use it to organize many definitions, theorems and discussions about the same mathematical concepts/topics.

My hope is to find these notes easier to reuse and refer to in blogs, papers, blueprints etc. Preferably, I would like to be able to author them in Forester markup, LaTeX, Markdown, Typst, and various diagram tools, freely and in a reproducible manner (see my post on this, and relevant experiments).

My math interests in 2024 [uts-001R]

For earlier posts, see Utensil's Blog powered by Hugo.

For even earlier posts, not in English, see Utensil's Tech Blog, powered by an old version of Middleman.