A level set of a real-valued function \(f\) (a.k.a. an iso-contour of a scalar field in 3D)
\[L_c(f)=\left \{\boldsymbol {x} \in X \mid f(\boldsymbol {x})=c \right \} \]
where \(c\) is a constant.
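For example, for \(f(\boldsymbol {x}) = x_1^2 + x_2^2 + x_3^2\) on \(X = \mathbb {R}^3\),
\[ L_1(f) = \left \{\boldsymbol {x} \in \mathbb {R}^3 \mid x_1^2 + x_2^2 + x_3^2 = 1 \right \} \]
is the unit sphere, and varying \(c > 0\) sweeps out concentric spheres of radius \(\sqrt {c}\).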
\(\partial \Omega \) as an implicit surface can be defined by a level set \(L_0(f)\)
where
\[ f(\boldsymbol {p})\begin {cases} \le 0 & \text { if } \boldsymbol {p} \in \Omega \\ > 0 & \text { if } \boldsymbol {p} \notin \Omega \end {cases} \]
Let \(\Omega \) be a subset of a metric space \(X\) with metric \(d\), and \(\partial \Omega \) be its boundary. The distance between a point \(\boldsymbol {p}\) of \(X\) and the subset \(\partial \Omega \) of \(X\) is defined as usual as
\[ d(\boldsymbol {p}, \partial \Omega )=\inf _{\boldsymbol {q} \in \partial \Omega } d(\boldsymbol {p}, \boldsymbol {q}) \]
where \(\inf \) denotes the infimum, i.e. the greatest lower bound.
The signed distance function or signed distance field (SDF) from a point \(\boldsymbol {p}\) of \(X\) to \(\Omega \) is defined by
\[ f(\boldsymbol {p})= \begin {cases} -d(\boldsymbol {p}, \partial \Omega ) & \text { if } \boldsymbol {p} \in \Omega \\ d(\boldsymbol {p}, \partial \Omega ) & \text { if } \boldsymbol {p} \notin \Omega \end {cases} \]
Simply put, an SDF gives the minimum possible distance from a point to the implicit surface defined by \(f(\boldsymbol {p})=0\), signed by which side of the surface the point lies on.
The convention adopted in this note is that the \(+\) and \(-\) signs indicate whether the point is outside or inside the surface, respectively, so that when a ray marches towards the surface from the outside, the distance is positive and decreases as the ray approaches the surface.
If \(X\) is also a normed space, i.e. a norm \(\lVert \boldsymbol {p}\rVert \) is defined for each point \(\boldsymbol {p} \in X\), e.g. in a Euclidean space \(\mathbb R^n\) where \(\boldsymbol {p}\) can be expressed in terms of basis vectors as \((p_1, \ldots , p_n)\), then the norm can be taken to be the Euclidean norm
\[ \lVert \boldsymbol {p}\rVert _2:=\sqrt {p_1^2+\cdots +p_n^2} \]
which is essentially the distance from the origin \(\boldsymbol {o}=(0, \ldots , 0)\) to the point \(\boldsymbol {p}\)
\[ \lVert \boldsymbol {p}\rVert =d(\boldsymbol {p}, \boldsymbol {o})=\lVert \boldsymbol {p} - \boldsymbol {o}\rVert \]
where \(d\) is the Euclidean distance function.
In the following, we will be working with Euclidean space \(\mathbb R^n\), which is both a metric space and a normed space. Most of the time, we will focus on \(\mathbb R^3\) for visualization purposes.
The SDF of a sphere with radius \(r\) centered at the origin \(\boldsymbol {o}\) is
\[ f(\boldsymbol {p})=\lVert \boldsymbol {p}\rVert -r \]
as depicted below:
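As a quick check of the sign convention, take \(r = 2\): for \(\boldsymbol {p} = (3, 4, 0)\), \(f(\boldsymbol {p}) = \sqrt {3^2 + 4^2} - 2 = 3 > 0\) (outside), while for \(\boldsymbol {p} = (0, 1, 0)\), \(f(\boldsymbol {p}) = 1 - 2 = -1 < 0\) (inside).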
SDFs of some geometric primitives in \(\mathbb R^3\) can be found in Appendix A: Distance to Natural Quadrics and Torus of [hart1996sphere].
Many more can be found in Inigo Quilez's article Distance functions (written in GLSL), which also includes an example that renders them by ray-marching.
A ray is a half-line that starts at a point and extends indefinitely in one direction.
In the context of ray marching etc., a ray could be a view ray, i.e. a ray that starts at the camera, extending through the screen, possibly intersecting with objects in the scene, or a secondary ray, i.e. a ray generated from the intersection of other rays, e.g. shadow rays, reflection rays, or refraction rays.
A ray anchored at origin \(\boldsymbol {r}_o\) in the direction of the unit vector \(\boldsymbol {r}_d\) can be defined parametrically by the ray equation
\[ \boldsymbol {r}(t)=\boldsymbol {r}_o+t \boldsymbol {r}_d \]
Plugging the ray equation \(r: \mathbb {R} \rightarrow \mathbb {R}^n\) into the function \(f: \mathbb {R}^n \rightarrow \mathbb {R}\) that defines the implicit surface produces the composite real function \(F: \mathbb {R} \rightarrow \mathbb {R}\) where \(F=f \circ \boldsymbol {r}\) such that the solutions to
\[ F(t)=0 \]
correspond to ray intersections with the implicit surface. These solutions are also called the roots of the function \(F\).
The ray-casting/ray-tracing/ray-marching algorithms simply apply one of the multitude of root finding methods to solve the ray intersection equation.
In general, ray-tracing typically uses closed-form geometric formulas to calculate the intersection point, while ray-marching marches along the ray and uses numerical methods to find the intersection point.
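For example, for the sphere, \(\lVert \boldsymbol {r}(t)\rVert = r\) squared becomes a quadratic in \(t\) (using \(\lVert \boldsymbol {r}_d\rVert = 1\)):
\[ t^2 + 2(\boldsymbol {r}_o \cdot \boldsymbol {r}_d)\,t + \lVert \boldsymbol {r}_o\rVert ^2 - r^2 = 0 \]
whose real roots, if any, are the intersection parameters in closed form (the ray-tracing style); ray-marching instead approaches the smallest positive root numerically.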
The term ray-casting was introduced concurrently with ray-tracing in the 1980s; nowadays it may refer to the general ray intersection process, or specifically to ray-marching with a fixed step size.
To render a scene, the ray marching algorithm marches a ray from an origin towards a direction. At each step, it advances a short (possibly varying) distance and checks whether a ray intersection occurs. This process continues until a stopping condition is met. The information obtained from the intersection is then used to determine the color of a pixel on the screen.
Example. Ray-marching for different types of rays [ag-001D]
A view ray has the origin at the camera, the direction is determined by the pixel on the screen that it passes through. It may intersect with a surface in the scene, and the information obtained from the intersection is used to determine the color of the corresponding pixel on the screen.
A secondary ray typically originates from a point on a surface and its direction is determined by the type of ray. For instance, a shadow ray is cast towards a light source to determine if the point is in shadow, while a reflection ray and a refraction ray follow directions based on optics. The information obtained from secondary rays helps determine the color of the corresponding pixel that the view ray passes through.
The ray marching (naïve) algorithm is to march a ray by a fixed step size, check if a ray intersection occurs at each step, until the ray reaches a maximum distance or step count.
Formally, the root found by ray marching (naïve) is the limit point of the sequence defined by the recurrence equation
\[t_{i+1} = t_i + \Delta t\]
where \(t_0 = 0\) and \(\Delta t\) is the fixed step size.
At each step, the ray marches to \(\boldsymbol {r}(t_i)\), which is usually denoted \(\boldsymbol {r}_i\).
The following renders a scene with a unit sphere at the origin, the camera at \((0,0,8)\) looking at the origin, through a screen of height 1.0 centered at distance 5.0 from the camera.
The pixel colors are adjusted so that they are brighter the closer the hit point is to the camera, as if there were a light source at the camera. Since the fixed step size of 0.1 is not small enough, the sphere is not rendered smoothly and clearly shows stepping artifacts.
/* The SDF for a sphere with radius r */
float sdSphere(vec3 p, float r)
{
    return length(p) - r;
}

/* The SDF for a scene with a unit sphere at the origin */
float sdScene(in vec3 pos) {
    return sdSphere(pos, 1.0);
}

#define EPSILON 0.01
#define T_MAX 10.0
#define DELTA_T 0.1
#define SDF_FUNCTION sdScene

/* Ray marching (naïve) */
float rayMarch(in vec3 ro, in vec3 rd) {
    float t = 0.0;
    while(t < T_MAX) {
        vec3 rt = ro + t * rd;
        float ft = SDF_FUNCTION(rt);
        if(ft < EPSILON) return t;
        t += DELTA_T;
    }
    return T_MAX;
}

/* The main function to render the scene per pixel */
void mainImage(out vec4 pixelColor, in vec2 pixelCoordinate) {
    // set up uv coordinates on the screen
    float aspect = iResolution.x/iResolution.y;
    vec2 uv = pixelCoordinate/iResolution.xy;
    uv -= vec2(0.5, 0.5);
    uv *= 2.0 * vec2(aspect, 1.0);
    // set up camera and view ray
    float observerDistance = 8.0;
    vec3 ro = vec3(0.0, 0.0, observerDistance);
    vec3 ta = ro - vec3(uv, 5.0);
    vec3 rd = normalize(ta - ro);
    // ray march
    float t = rayMarch(ro, rd);
    float hit = t != T_MAX ? 1.0 : 0.0;
    // adjust the color to look just bright enough
    float baseRatio = 0.65;
    // the closer, the brighter
    float distRatio = baseRatio + smoothstep(0.0, 1.0 - baseRatio, 1.0 - t / observerDistance);
    vec3 col = distRatio * vec3(1.0);
    // transparent if not hit
    pixelColor = vec4(col, hit);
}
simplified from zmcbeth's Raymarch 3.0
The point-to-set distance defines the distance from a point \(\boldsymbol {x} \in \mathbb {R}^3\) to a set \(A \subset \mathbb {R}^3\) as the distance from \(\boldsymbol {x}\) to the closest point in \(A\),
\[ d(\boldsymbol {x}, A)=\min _{\boldsymbol {y} \in A}\lVert \boldsymbol {x}-\boldsymbol {y}\rVert \]
A function \(f: \mathbb {R}^3 \rightarrow \mathbb {R}\) is a signed distance bound of its implicit surface \(f^{-1}(0)\) if and only if
\[ |f(\boldsymbol {x})| \leq d\left (\boldsymbol {x}, f^{-1}(0)\right ) \]
If equality holds, then \(f\) is a signed distance function.
A function \(f: \mathbb {R}^3 \rightarrow \mathbb {R}\) is Lipschitz over a domain \(D\) iff there exists a positive finite constant \(\lambda \), called a Lipschitz bound, such that for all \(\boldsymbol {x}, \boldsymbol {y} \in D\),
\[ |f(\boldsymbol {x})-f(\boldsymbol {y})| \leq \lambda \lVert \boldsymbol {x}-\boldsymbol {y}\rVert . \]
The Lipschitz constant, denoted \(\operatorname {Lip} f\), is the minimum \(\lambda \) satisfying the condition above.
Let \(f\) be Lipschitz with Lipschitz bound \(\lambda \geq \operatorname {Lip} f\). Then the function \(f / \lambda \) is a signed distance bound of its implicit surface.
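For example, the sphere SDF \(f(\boldsymbol {p}) = \lVert \boldsymbol {p}\rVert - r\) from above has \(\operatorname {Lip} f = 1\), by the reverse triangle inequality:
\[ |f(\boldsymbol {x})-f(\boldsymbol {y})| = \bigl |\lVert \boldsymbol {x}\rVert - \lVert \boldsymbol {y}\rVert \bigr | \leq \lVert \boldsymbol {x}-\boldsymbol {y}\rVert \]
so taking \(\lambda = 1\), it is its own signed distance bound.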
The sphere tracing algorithm marches a ray by an adaptive safe step size, namely the absolute value of the signed distance bound calculated by \(F/\lambda \) per Theorem [ag-000T]; the rest is the same as ray marching (naïve).
Formally, the root found by sphere tracing is the limit point of the sequence defined by the recurrence equation
\[t_{i+1} = t_i + \frac {|F(t_i)|}{\lambda } = t_i + \frac {|f(\boldsymbol {r}(t_i))|}{\lambda } \]
where \(F = f \circ \boldsymbol {r}\) is the ray intersection function, and \(f\) is the SDF.
Usually \(\lambda = 1\), and \(F(t_i)\) stays positive until the stopping condition \(F(t_i) < \epsilon \) is met, so the above can be simplified to
\[t_{i+1} = t_i + F(t_i) = t_i + f(\boldsymbol {r}(t_i)) \]
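As a minimal GLSL sketch (the name sphereTrace is ours; it reuses SDF_FUNCTION, EPSILON and T_MAX from the listing above and assumes \(\lambda = 1\)), the only change from rayMarch is the step size:
/* Sphere tracing: like rayMarch, but steps by the SDF value itself */
float sphereTrace(in vec3 ro, in vec3 rd) {
    float t = 0.0;
    while(t < T_MAX) {
        float ft = SDF_FUNCTION(ro + t * rd); // safe distance at r(t)
        if(ft < EPSILON) return t;            // stopping condition F(t) < epsilon
        t += ft;                              // adaptive safe step: t_{i+1} = t_i + F(t_i)
    }
    return T_MAX;
}
Substituting sphereTrace for rayMarch in mainImage above removes the stepping artifacts of the fixed-step version, since no step can overshoot the surface.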
Given a function \(F: \mathbb {R} \rightarrow \mathbb {R}\) with Lipschitz bound \(\lambda \geq \operatorname {Lip} F\), and an initial point \(t_0\), sphere tracing converges linearly to the smallest root greater than \(t_0\).
Theorem [ag-000U] means that sphere tracing requires the function to be Lipschitz, i.e. to have a known Lipschitz bound.
Not all functions are Lipschitz; e.g. harmonic functions are not globally Lipschitz, and many have no local Lipschitz bound even within small neighborhoods, thus requiring a new technique named Harnack tracing, developed in [gillespie2024ray].
objects \(\operatorname {Ob}({\cal C})\): \(O, X, Y, \dots \)
arrows \(\operatorname {Arr}({\cal C})\): \(f, g, h, \dots \), where for each arrow \(f\),
a pair of operations \(\operatorname {dom}\) and \(\operatorname {cod}\) assign a domain object \(X=\operatorname {dom}(f)\) and a codomain object \(Y=\operatorname {cod}(f)\) to \(f\),
thus \(f\) can be denoted by \[f : X \to Y\] or \[X \xrightarrow {f} Y\]
compositions: a composite arrow for each pair of composable arrows \(f\) and \(g\), denoted \(g \circ f\) or \( f \mathbin {\bullet } g \), which makes the diagram
commute (we say that \(f \mathbin {\bullet } g\) factors through \(Y\)),
an identity arrow for each object \(O\), denoted \(\mathit {1}_O : O \to O\)
satisfying:
associativity of composition: the diagram
commutes,
In the literature, composition of arrows in a category is most frequently denoted \(\circ \) or just by juxtaposition, i.e. for the diagram
\(g \circ f\) or just \(gf\) means \(g\) applied to the result of \(f\).
We use \(f \mathbin {\bullet } g\) throughout this note, so it can always be understood as \(\xrightarrow {f}\xrightarrow {g}\). This convention saves mental energy when reading the commuting diagrams, pasting diagrams and string diagrams which we employ heavily.
A category \({\cal C}\) is called locally small iff for any of its objects \(X, Y\), the collection of arrows from \(X\) to \(Y\) is a set, called a hom-set, denoted
\[\operatorname {Hom}_{{\cal C}}(X, Y)\]
A category is called small iff the collection of all its arrows is a set.
\(\operatorname {Hom}\) in \(\operatorname {Hom}_{{\cal C}}(X, Y)\) is short for homomorphism, since an arrow in category theory is a morphism, a generalization of homomorphisms between algebraic structures.
This notation could be unnecessarily verbose, so when there is no confusion, we follow [leinster2016basic] and [zhang2021type] to simply write \(X,Y \in \operatorname {Ob}({\cal C})\) as
\[X,Y \in {\cal C}\]
and \(f \in \operatorname {Hom}_{{\cal C}}(X, Y)\) as
\[f \in {\cal C}(X, Y)\]
In some other scenarios, when the category in question is clear (and it might be too verbose to write out, e.g. a functor category), we omit the subscript of the category and write just
\[\operatorname {Hom}(X, Y)\]
In general, the collections \(\operatorname {Ob}({\cal C})\) and \(\operatorname {Arr}({\cal C})\) are not necessarily sets, but classes. In that case, \(\operatorname {Hom}_{{\cal C}}(X, Y)\) is called a hom-class.
Later, we will also learn that \(\operatorname {Ob}\) and \(\operatorname {Arr}\) are representable functors.
A diagram in a category \({\cal C}\) is defined as a collection of objects and arrows that belong to the category \({\cal C}\) with specified operations \(\operatorname {dom}\) and \(\operatorname {cod}\).
A commuting diagram is a diagram in which any two arrows (including composites) with the same domain and codomain are equal.
For example, that the diagram
commutes means \(f \mathbin {\bullet } g = h \mathbin {\bullet } k\).
A string diagram represents categories as surfaces (2-dimensional), functors as wires (1-dimensional), natural transformations as blocks or just dots (0-dimensional).
String diagrams have the advantage of being able to represent objects, arrows, functors, and natural transformations from multiple categories, together with their vertical and horizontal compositions, and admit various topologically plausible calculational rules for proofs.
A category \({\cal C}\) is represented as a colored region:
Functors of type \(\mathbf {1} \to {\cal C}\) can be identified with objects of the category \({\cal C}\), where \(\mathbf {1}\) is the terminal category, so an object \(X \in \operatorname {Ob}({\cal C})\) can be represented as:
An arrow \(f : X \to Y\) is then a natural transformation between two of these functors, represented as:
as a special case, functors with the terminal category (i.e. constant object functors) as the domain are identified with the objects in the codomain category, thus are denoted like an object: \(X : \mathbf {1} \to {\cal C}, \mathrm {*} \mapsto X\)
\(\mathscr {I}\) is only used in the inclusion functor (note that this is letter "I")
we do not use \(\mathscr {J}\) and \(\mathscr {S}\) because they are visually ambiguous
lowercase Greek letters \(\alpha , \beta , \eta , \epsilon , \sigma \) denote natural transformations, their components are denoted by them with subscripts.
An arrow \(f : X \to Y\) is monic if the diagram
commutes, i.e. \(g_1 \mathbin {\bullet } f = g_2 \mathbin {\bullet } f \implies g_1 = g_2\), denoted \(f : X \rightarrowtail Y\).
"Monic" is short for "monomorphism", which is a generalization of the concept of injective (one-to-one) functions between sets.
An arrow \(f : X \to Y\) is epic if the diagram
commutes, i.e. \(f \mathbin {\bullet } g_1 = f \mathbin {\bullet } g_2 \implies g_1 = g_2\), denoted \(f : X \twoheadrightarrow Y\).
"Epic" is short for "epimorphism", which is a generalization of the concept of surjective (onto) functions between sets.
An arrow \(f : X \to Y\) is iso, or \(X\) and \(Y\) are isomorphic, denoted \(X \cong Y\), or \(X \xrightarrow {\sim } Y\), if the diagram
commutes, where \(!g\) means there exists a unique arrow \(g\), and \(g\) is called the inverse of \(f\), denoted \(f^{-1}\).
"Iso" is short for "isomorphism", which is a generalization of the concept of bijective (one-to-one and onto) functions.
An initial object in a category \({\cal C}\) is an object \(\mathrm {0} \in \operatorname {Ob}({\cal C})\) such that for any object \(X\) in \({\cal C}\), there exists a unique arrow \(\mathrm {0} \to X\). It's also called a universal object, or a free object.
A terminal object in a category \({\cal C}\) is an object \(\mathrm {1} \in \operatorname {Ob}({\cal C})\) such that for any object \(X\) in \({\cal C}\), there exists a unique arrow \(X \to \mathrm {1}\). It's also called a final object, or a bound object.
Diagrammatically,
A null object is an object which is both terminal and initial, confusingly, it's also called a zero object.
An element or a generalized element of \(X\) at stage \(S\) (or, of shape \(S\)) is an arrow \(x : S \to X\) in \({\cal C}\), also denoted \(x \in _{S} X\).
An arrow \(1 \to X\) is called a global element of \(X\), a.k.a. a point of \(X\).
An arrow \(S \to X\), if \(S\) is not isomorphic to \(1\), is called a local element of \(X\) at stage \(S\).
An arrow \(\mathit {1}_X : X \to X\) is called the generic element of \(X\).
In an element \(x : S \to X\), the object \(S\) is called a stage in order to express the intuition that it is a "place of view" on \(X\). In the same sense, \(S\) is also called a domain of variation, and \(X\) a variable element.
when the object is a set, a generalized element of \(X\) of shape \(\mathbb N\) is a sequence in the set \(X\)
when the object is a topological space, a generalized element of \(X\) of shape \(S^1\) is a loop
In the context of studying solutions to polynomial equations, we may also call it an \(S\)-valued point of \(X\), where \(S\) is the number system in which the solution is taken, e.g. for real, complex, and \(\operatorname {Spec} \mathbb F_p\)-valued solutions.
A subobject of any object is defined as an equivalence class of monic arrows into it.
The class of subobjects of an object \(X\) is denoted as
\[ \operatorname {Sub}(X):=\{[f] \mid \operatorname {cod}(f)=X \wedge f \text { is monic }\}. \]
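For example, in \(\mathbf {Set}\), two monics into \(X\) are equivalent exactly when they have the same image, so
\[ \operatorname {Sub}(X) \cong \{A \mid A \subseteq X\} \]
i.e. the subobjects of a set are its subsets.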
\(\mathbf {Set}\), the category of sets, consists of objects which are sets, and arrows which are functions between them. The axioms of composition, associativity and identity hold due to standard properties of sets and functions.
\(\mathbf {Set}\) has the initial object \(\varnothing \), the empty set, and the terminal object, \(\{*\}\), the singleton set.
Let \({\cal C}\) be a category. A subcategory \({\cal S}\) of \({\cal C}\) consists of a subclass \(\operatorname {Ob}({\cal S})\) of \(\operatorname {Ob}({\cal C})\) together with, for each \(S, S' \in \operatorname {Ob}({\cal S})\), a subclass \({\cal S}\left (S, S'\right )\) of \({\cal C}\left (S, S'\right )\), such that \({\cal S}\) is closed under composition and identities.
It is a full subcategory if \({\cal S}\left (S, S'\right )={\cal C}\left (S, S'\right )\) for all \(S, S' \in \operatorname {Ob}({\cal S})\).
A (covariant) functor \(\mathscr {F}: {\cal C} \to {\cal D}\) is given by the diagram
i.e. a map of objects and arrows between categories \({\cal C}\) and \({\cal D}\) that preserves the structure of the compositions and identities.
A functor \(\mathscr {F}\) is called a contravariant functor from \({\cal C}\) to \({\cal D}\), and denoted \(\mathscr {F} : {\cal C}^{op} \to {\cal D}\), if it obeys the definition given by the (covariant) functor for \({\cal C}\) replaced by \({\cal C}^{op}\), i.e. it's given by the diagram
i.e. a map of objects and arrows between categories \({\cal C}\) and \({\cal D}\) that reverses the structure of the arrows, compositions and identities.
For some expression \(E(X)\) containing \(X\), when we say \(E(X)\) is (covariant) functorial in \(X\), we mean that there exists a functor \(\mathscr {F}\) such that
for every \(f : X \to X'\).
Dually, we use the term contravariantly functorial in.
For simplicity, when there is no confusion, we use \(\bullet \) to represent corresponding objects, and omit the arrow names in the codomain of a functor, e.g.
A functor \(\mathscr {F}: {\cal C} \to {\cal D}\) is full iff for any pair of objects \(X, Y\) in \({\cal C}\) the induced map \(\mathscr {F}_{X, Y}: {\cal C}(X, Y) \to {\cal D}(\mathscr {F}(X), \mathscr {F}(Y))\) is surjective (onto). \(\mathscr {F}\) is faithful iff this map is injective (one-to-one).
A functor \(\mathscr {F}: {\cal C} \to {\cal D}\) is said to preserve a property \(\wp \) of arrows iff for every \(f \in \operatorname {Arr}({\cal C})\) that has property \(\wp \), \(\mathscr {F}(f) \in \operatorname {Arr}({\cal D})\) also has it. Dually, \(\mathscr {F}\) is said to reflect a property \(\wp \) iff for every \(f \in \operatorname {Arr}({\cal C})\) such that \(\mathscr {F}(f) \in \operatorname {Arr}({\cal D})\) has property \(\wp \), \(f\) has it too.
The identity functor \(\mathit {1}_{{\cal C}}: {\cal C} \to {\cal C}\) (denoted also by \( 1_{{\cal C}}: {\cal C} \to {\cal C}\)), defined by \(\mathit {1}_{{\cal C}}(X)=X\) and \(\mathit {1}_{{\cal C}}(f)=f\) for every \(X \in \operatorname {Ob}({\cal C})\) and every \(f \in \operatorname {Arr}({\cal C})\).
The constant functor \(\Delta _O: {\cal C} \to {\cal D}\) which assigns a fixed \(O \in \mathrm {Ob}({\cal D})\) to any object of \({\cal C}\) and \(\mathit {1}_O\), the identity arrow on \(O\), to any arrows from \({\cal C}\) :
with compositions and identities preserved in a trivial way.
A functor from the terminal category \(\mathbf {1}\) to a category \({\cal C}\) simply picks out an object of \({\cal C}\), called a constant object functor (which is a constant functor), denoted \(\Delta _X : \mathbf {1} \to {\cal C}\) for some \(X \in \operatorname {Ob}({\cal C})\), or simply denoted by the object, e.g. \(X\).
As special cases, constant object functor for initial and terminal objects are denoted by \(\mathrm {0}\) and \(\mathrm {1}\), respectively.
A unique functor is a functor from a category \({\cal C}\) to the terminal category \(\mathbf {1}\), uniquely determined by mapping all arrows in \({\cal C}\) to the identity arrow \(\mathit {1}_{\mathrm {*}}\) of the unique object \(\mathrm {*}\) in \(\mathbf {1}\).
This functor is often denoted by \(! : {\cal C} \to \mathbf {1}\).
Intuitively, the functor \(!\) acts to erase all information about the input.
Given a small category \({\cal J}\) and a category \({\cal C}\), the diagonal functor
\[\Delta _{{\cal J}}: {\cal C} \to [{\cal J}, {\cal C}]\]
maps each object \(X \in {\cal C}\) to the constant functor \(\Delta _{{\cal J}}(X): {\cal J} \to {\cal C}\), which in turn maps each object in \({\cal J}\) to \(X\), and all arrows in \({\cal J}\) to \(\mathit {1}_X\).
When \({\cal J}\) is clear in the context, we may write \(\Delta _{{\cal J}}(X)\) as \(\Delta _X\).
Particularly [kostecki2011introduction, 3.1, example 6], when \({\cal J}\) is a discrete category of two objects,
\(\Delta : {\cal C} \to {\cal C} \times {\cal C}, \Delta (X)=(X, X)\) and \(\Delta (f)=(f, f)\) for \(f: X \to X^{\prime }\)
\(\Delta _{{\cal J}}(X)\) is the same as \(! \mathbin {\bullet } X\), thus [nakahira2023diagrammatic, eq. 2.12]
\[\Delta _{{\cal J}} = \ ! \mathbin {\bullet } - \]
Whenever \({\cal S}\) is a subcategory of a category \({\cal C}\), there is an inclusion functor \(\mathscr {I} : {\cal S} \hookrightarrow {\cal C}\) defined by \(\mathscr {I}(S) = S\) and \(\mathscr {I}(f) = f\), i.e. it sends objects and arrows of \({\cal S}\) to themselves in category \({\cal C}\). It is automatically faithful, and it is full iff \({\cal S}\) is a full subcategory.
A universal arrow from \(X \in {\cal D}\) to \(\mathscr {F} : {\cal C} \to {\cal D}\) is a unique pair \((O , u)\) that makes the diagram
commute for any \(f \in {\cal D}\).
Conversely, a (co)universal arrow from \(\mathscr {F} : {\cal C} \to {\cal D}\) to \(X \in {\cal D}\) is a unique pair \((O , u)\) that makes the diagram
commute for any \(f \in {\cal D}\).
The term universal property is used to describe some property (usually some equality of some compositions, or a commuting diagram in general) that is satisfied for all (hence universal) relevant objects/arrows in the "world" (i.e. in the relevant categories), by a corresponding unique object/arrow.
It's usually spelled out like this:
In a context where (usually a given diagram), for all (some objects/arrows), there exists a unique (object/arrow) such that (some property).
For the diagram above, we also say that \(f\) uniquely factors through \(u\) along \(g\) [riehl2017category, 2.3.7].
A universal property is a property of some construction which boils down to (is manifestly equivalent to) the property that an associated object is an initial object of some (auxiliary) category [nlab2020univ].
Given categories and functors
the comma category \(\mathscr {F} \downarrow \mathscr {G}\) (or \((\mathscr {F} \Rightarrow \mathscr {G})\)) is the category given by objects \((X, h, Y)\) and arrows \((f, g)\) that makes the diagram
commute.
A universal arrow from \(X \in {\cal D}\) to \(\mathscr {F} : {\cal C} \to {\cal D}\) is an initial object in the comma category \(X \downarrow \mathscr {F}\).
Conversely, a (co)universal arrow from \(\mathscr {F} : {\cal C} \to {\cal D}\) to \(X \in {\cal D}\) is a terminal object in the comma category \(\mathscr {F} \downarrow X\).
Natural transformation and functor category [tt-001M]
Given categories and functors
the natural transformation \(\alpha : \mathscr {F} \to \mathscr {G}\), denoted
is a collection of arrows \(\left \{\alpha _X : \mathscr {F}(X) \to \mathscr {G}(X) \right \}_{X \in \operatorname {Ob}({\cal C})}\) in \({\cal D}\) which satisfies naturality, i.e. makes the diagram
commute for every arrow \(f : X \to X'\) in \({\cal C}\). The arrows \(\left \{\alpha _X\right \}_{X \in \operatorname {Ob}({\cal C})}\) are called the components of the natural transformation.
The collection of natural transformations from functors \(\mathscr {F}\) to \(\mathscr {G}\) is denoted \(\operatorname {Nat}(\mathscr {F}, \mathscr {G})\).
A functor \(\mathscr {F} : {\cal C} \to {\cal D}\) can be represented as an edge, commonly referred to as a wire:
Functors compose from left to right:
A natural transformation \(\alpha : \mathscr {F} \to \mathscr {F}'\) can be represented as a dot on the wire from top to bottom (the opposite direction of [marsden2014category], but the same as [sterling2023models]), connecting the two functors:
Natural transformations (for the same pair of categories) compose vertically from top to bottom:
Natural transformations (from different pairs of categories) compose horizontally from left to right:
The two ways of composing natural transformations are related by the interchange law:
i.e. \[(\alpha \mathbin {\bullet } \alpha ') \mathbin {\bullet } (\beta \mathbin {\bullet } \beta ') = (\alpha \mathbin {\bullet } \beta ) \mathbin {\bullet } (\alpha ' \mathbin {\bullet } \beta ')\]
The naturality in natural transformations is equivalent to the following equality:
where \(X\) and \(X'\) are objects in \({\cal C}\), understood as functors from the terminal category \(\mathit {1}\) to \({\cal C}\).
Since a string diagram is composed from top to bottom, left to right, we can read
as
\[(X \mathbin {\bullet } \mathscr {F}) \mathbin {\bullet } (\alpha _X) \mathbin {\bullet } (f \mathbin {\bullet } \mathscr {G})=(X' \mathbin {\bullet } \mathscr {G}) \quad {\large =} \quad (X \mathbin {\bullet } \mathscr {F}) \mathbin {\bullet } (f \mathbin {\bullet } \mathscr {F}) \mathbin {\bullet } (\alpha '_X)=(X' \mathbin {\bullet } \mathscr {G})\]
where each pair of parentheses corresponds to an overlay in the string diagram,
or with the notation in the opposite direction that is more familiar to most:
\[\mathscr {G}(f) \circ \alpha _X \circ \mathscr {F}(X) = \mathscr {G}(X') \quad {\large =} \quad \alpha '_X \circ \mathscr {F}(f) \circ \mathscr {F}(X) = \mathscr {G}(X')\]
Note that we read the wire from \(\mathscr {F}\) to \(\mathscr {G}\) as \(\mathscr {F}\) before the natural transformation, but as \(\mathscr {G}\) after the transformation.
Effectively, naturality says that the natural transformation and the arrow \(f\) “slide past each other”, and so we can draw them as two parallel wires to illustrate this.
The functor category from \({\cal C}\) to \({\cal D}\), denoted \([{\cal C}, {\cal D}]\) or \({\cal D}^{{\cal C}}\), is a category whose objects are functors from \({\cal C}\) to \({\cal D}\) and whose arrows are natural transformations between them, where composition is given by
and the identity is given by
One can think of a functor category \([{\cal C}, {\cal D}]\) or \({\cal D}^{{\cal C}}\) as a category of diagrams in \({\cal D}\) indexed (or labelled) by the objects of \({\cal C}\).
A natural isomorphism between functors from \({\cal C}\) to \({\cal D}\) is an isomorphism in the functor category \([{\cal C}, {\cal D}]\).
A category is a world of objects, all looking at one another. Each sees the world from a different viewpoint.
We may ask: what does an object see? Fixing an object, this can be described by the arrows from it; this corresponds to the covariantly representable functor.
We can also ask the dual question: how is an object seen? Fixing an object, this can be described by the arrows into it; this corresponds to the contravariantly representable functor.
For every locally small category \({\cal C}\), the covariant hom-functor, denoted
\[{\cal C}(X,-): {\cal C} \to \mathbf {Set} \]
is given by
Conversely, the contravariant hom-functor, denoted
\[{\cal C}(-,X): {\cal C}^{op} \to \mathbf {Set} \]
is given by
Further, the hom-bifunctor, denoted
\[{\cal C}(-,=): {\cal C}^{op} \times {\cal C} \to \mathbf {Set} \]
is defined as a contravariant hom-functor in the first argument and as a covariant hom-functor in the second argument.
We see \(-\) and \(=\) as placeholders for any object and its "associated arrow" (whose domain/codomain is the object, respectively) in the corresponding category. And we use boxes to mark the placeholder objects in diagrams.
A set-valued functor \(\mathscr {F}: {\cal C} \to \mathbf {Set}\) is called covariantly representable if for some \(X \in {\cal C}\),
\[\tau : \mathscr {F} \cong {\cal C}(X,-)\]
where \(\cong \) denotes a natural isomorphism.
Conversely, a set-valued functor \(\mathscr {G} : {\cal C}^{op} \to \mathbf {Set}\) is called contravariantly representable if for some \(X \in {\cal C}\),
\[\tau : \mathscr {G} \cong {\cal C}(-, X)\]
Such an object \(X\) is called a representing object for the functor \(\mathscr {F}\) or \(\mathscr {G}\), respectively.
The pair \((\tau , X)\) is called a representation of the functor \(\mathscr {F}\) (respectively, \(\mathscr {G}\) ).
A representation \((\tau , X)\) of a representable functor \(\mathscr {F}\) is a choice of an object \(X \in {\cal C}\) and an isomorphism \(\tau \) between the corresponding type of hom-functor and \(\mathscr {F}\).
Representable functors are sometimes just called representables. Only set-valued functors can be representable.
Let \({\cal C}\) be a locally small category. The covariant Yoneda embedding functor of \({\cal C}\) is the functor
\[ \mathscr {H}^{\bullet }: {\cal C}^{op} \to [{\cal C}, \mathbf {Set}] \]
defined on objects \(X\) by the covariant hom-functor on \(X\).
This functor embeds how every object in \({\cal C}\) sees the "world" of the category \({\cal C}\), i.e. the arrows from each object.
Conversely, the (contravariant) Yoneda embedding functor of \({\cal C}\) is the functor
\[ \mathscr {H}_{\bullet }: {\cal C} \to [{\cal C}^{op}, \mathbf {Set}] \]
defined on objects \(X\) by the contravariant hom-functor on \(X\).
This functor embeds how every object in \({\cal C}\) is "seen", i.e. arrows to each object.
\(\bullet \) is a placeholder for an object. \(\mathscr {H}^X\) and \(\mathscr {H}_X\) denote the corresponding Yoneda embedding functors applied to \(X\), and are called covariant/contravariant Yoneda functors, respectively.
When one speaks of the Yoneda (embedding) functor without specifying covariant or contravariant, it means the contravariant one, because it's the one used in the Yoneda lemma.
Let \({\cal C}\) be a locally small category. Then
\[ \operatorname {Nat}(\mathscr {H}_X, \mathscr {F}) \cong \mathscr {F}(X) \]
naturally in \(X \in {\cal C}\) and \(\mathscr {F} \in \left [{\cal C}^{\mathrm {op}}, \mathbf {Set} \right ]\), where \(\mathscr {H}_X\) is the (contravariant) Yoneda embedding functor on \(X\), and Nat denotes all the natural transformations between the two functors.
Diagrammatically, \(\operatorname {Nat}(\mathscr {H}_X, \mathscr {F})\) is
and it's also denoted by \(\left [{\cal C}^{op}, \mathbf {Set}\right ]\left (\mathscr {H}_X, \mathscr {F}\right )\) in the sense of \(\operatorname {Hom}_{\left [{\cal C}^{op}, \mathbf {Set}\right ]}\left (\mathscr {H}_X, \mathscr {F}\right )\) where \(\left [{\cal C}^{op}, \mathbf {Set}\right ]\) is a functor category.
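For example, taking \(\mathscr {F} = \mathscr {H}_Y\) for some \(Y \in {\cal C}\), the lemma specializes to
\[ \operatorname {Nat}(\mathscr {H}_X, \mathscr {H}_Y) \cong \mathscr {H}_Y(X) = {\cal C}(X, Y) \]
which says exactly that the Yoneda embedding is full and faithful.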
To understand an object it suffices to understand all its relationships with other things.
This is similar to the seventeenth-century philosopher Spinoza's idea that what a body is (its “essence”) is inseparable from all the ways that the body can affect (causally influence) and be affected (causally influenced) by other bodies.
The idea of Yoneda is that we can be assured that if a robot wants to learn whether some object \(X\) is the same thing as object \(Y\), it will suffice for it to learn whether
\[{\cal C}(-, X) \cong {\cal C}(-, Y)\]
or, dually,
\[{\cal C}(X, -) \cong {\cal C}(Y, -)\]
i.e. whether all the ways of probing \(X\) with objects of its environment amount to the same as all the ways of probing \(Y\) with objects of its environment.
Given a small category \({\cal C}\), there is a functor \(\operatorname {Ob} : \mathbf {Cat} \to \mathbf {Set}\) that sends \({\cal C}\) to its set of objects where \(\mathbf {Cat}\) is the category of small categories. Thus,
\[ \mathscr {H}^{\mathrm {1}}({\cal C}) \cong \operatorname {Ob}({\cal C}) \]
where \(\mathscr {H}\) is a Yoneda embedding functor.
This isomorphism is natural in \({\cal C}\); hence \(\operatorname {Ob} \cong \mathbf {Cat}(\mathrm {1}, -)\)
where \(\mathbf {Cat}(\mathrm {1}, -)\) is a covariant hom-functor.
The functor \(\operatorname {Ob}\) is representable. Similarly, the functor \(\operatorname {Arr} : \mathbf {Cat} \to \mathbf {Set}\) sending a small category to its set of arrows is representable.
The Yoneda lemma says that for any \(X \in {\cal C}\) and presheaf \(\mathscr {F}\) on \({\cal C}\),
a natural transformation \(\mathscr {H}_X \to \mathscr {F}\), i.e. a generalized element of \(\mathscr {F}\) of shape \(\mathscr {H}_X\), is the same thing as an element of \(\mathscr {F}(X)\).
What kind of presheaves are already "built in" to the category \({\cal C}\)?
The answer given by the Yoneda lemma is that the Yoneda embedding \(\mathscr {H}_{\bullet }: {\cal C} \to [{\cal C}^{op}, \mathbf {Set}]\) embeds \({\cal C}\) into its own presheaf category.
In mathematics at large, the word "embedding" is used (sometimes informally) to mean a map \(i: X \to Y\) that makes \(X\) isomorphic to its image in \(Y\), i.e. \(X \cong i(X)\). [leinster2016basic, 1.3.19] tells us that in category theory, a full and faithful functor \(\mathscr {I}: X \to Y\) can reasonably be called an embedding, as it makes \(X\) equivalent to a full subcategory of \(Y\).
So, \({\cal C}\) is equivalent to the full subcategory of the presheaf category \([{\cal C}^{op}, \mathbf {Set}]\) whose objects are the representables.
The functor category of contravariant set-valued functors \([{\cal C}^{op}, \mathbf {Set}]\) is called the category of presheaves or varying sets; its objects are contravariant functors \({\cal C}^{op} \to \mathbf {Set}\). It may be regarded as a category of diagrams in \(\mathbf {Set}\) indexed contravariantly by the objects of \({\cal C}\).
By definition, objects of \({\cal C}\) play the role of stages, marking the "positions" (in passive view) or "movements" (in active view) of the varying set \(\mathscr {F}: {\cal C}^{op} \to \mathbf {Set}\). For every \(X\) in \({\cal C}^{op}\), the set \(\mathscr {F}(X)\) is a set of elements of \(\mathscr {F}\) at stage \(X\).
An arrow \(f: Y \to X\) in \({\cal C}\) induces a transition arrow \(\mathscr {F}(f): \mathscr {F}(X) \to \mathscr {F}(Y)\) between the varying set \(\mathscr {F}\) at stage \(X\) and the varying set \(\mathscr {F}\) at stage \(Y\).
Let \({\cal C}\) be a category and \({\cal J}\) a small category. A functor \(\mathscr {D} : {\cal J} \to {\cal C}\) is called a diagram in \({\cal C}\) of shape \({\cal J}\).
\({\cal J}\) is also called the indexing category of the diagram, and we say that \(\mathscr {D}\) is a diagram indexed by \({\cal J}\) [rosiak2022sheaf, example 35]. \({\cal J}\) is also called the template.
An equalizer of a shape \(E(f, g)\) is a fork \((E, \iota )\) over it, such that, for any fork \((\mathrm {-}, d)\) over the shape, the diagram
commutes (i.e. any arrow \(d : \mathrm {-} \to X\) must uniquely factor through \(E\)).
For simplicity, we refer to the equalizer of a shape \(E(f, g)\) as \(\operatorname {Eq}(f, g)\), and \(\iota \) is the canonical inclusion.
We say that a category \({\cal C}\) has equalizers iff every shape \(E(f, g)\) in \({\cal C}\) has an equalizer.
An equalizer in a category is a generalization of the subset consisting of the elements of a given set on which two given functions are equal, formally:
For any two arrows \(f, g: X \to Y\), their equalizing set \(E \subseteq X\) is defined as
\[ E:=\{e \mid e \in X \wedge f(e)=g(e)\} \]
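For instance, for \(f(x) = x^2\) and \(g(x) = x + 2\) on \(X = \mathbb {R}\),
\[ E = \{x \in \mathbb {R} \mid x^2 = x + 2\} = \{-1, 2\} \]
and \(\iota \) is the inclusion \(E \hookrightarrow \mathbb {R}\).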
A pullback of a shape \(P(f, g)\) is an object \(X \times _O Y\) in \({\cal C}\) together with arrows \(p_X\) and \(p_Y\), called projections, such that, for any object \(\mathrm {-}\) and arrows \(h\) and \(k\), the diagram
commutes.
We say that a category \({\cal C}\) has pullbacks iff every shape \(P(f, g)\) in \({\cal C}\) has a pullback in \({\cal C}\).
A pullback is also called a fiber product.
The square
is called the pullback square of \(f\) and \(g\). The object \(X \times _O Y\) in \({\cal C}\) is called the fiber product object.
Pullbacks can be pasted together, i.e. for diagram
given that the right-hand square is a pullback, the left-hand square is a pullback if and only if the outer rectangle is a pullback.
A binary product of objects \(X\) and \(Y\) is an object \(X \times Y\) in \({\cal C}\) together with arrows \(p_X\) and \(p_Y\), called projections, such that, for any object \(\mathrm {-}\) and arrows \(h\) and \(k\), the diagram
commutes.
We say that a category \({\cal C}\) has binary products iff every pair \(X, Y\) in \({\cal C}\) has a binary product \(X \times Y\) in \({\cal C}\).
When there is no confusion, we simply call binary products products.
The pushout equivalent of the fiber product object in pullback is the fiber coproduct object, denoted \(X +_{O} Y\), and the pushout equivalent of projections in pullback are injections, denoted \(i_X\) and \(i_Y\), respectively. The unique arrow of a pushout is denoted \([f, g]\).
The binary coproduct equivalent of the binary product object in binary product is the binary coproduct object, denoted \(X + Y\), and the binary coproduct equivalent of projections in binary product are injections, denoted \(i_X\) and \(i_Y\), respectively. The unique arrow of a binary coproduct is denoted \([f, g]\).
In any category with binary products the objects \(X \times (Y \times Z)\) and \((X \times Y) \times Z\) are isomorphic. In any category with binary coproducts the objects \(X+(Y+Z)\) and \((X+Y)+Z\) are isomorphic.
This allows us to consider \(n\)-fold products \(X_1 \times \dots \times X_n\) and \(n\)-fold coproducts \(X_1 + \dots + X_n\) of objects of a given category.
Adjointness is about the relationships between categories. Representability is a property of set-valued functors. Limits are about what goes on inside a category.
Whenever you meet a method for taking some objects and arrows in a category and constructing a new object out of them, there is a good chance that you are looking at either a limit or a colimit.
Let \({\cal C}\) be a category, \({\cal J}\) a small category, and \(\mathscr {D} : {\cal J} \to {\cal C}\) a diagram in \({\cal C}\) of shape \({\cal J}\).
A cone on \(\mathscr {D}\) is an object \(V \in {\cal C}\) (the vertex of the cone) together with a family
\[ \left (V \xrightarrow {\pi _J} \mathscr {D}(J)\right )_{J \in {\cal J}} \]
of arrows in \({\cal C}\) such that for all arrows \(J \to J'\) in \({\cal J}\), the diagram
commutes.
The family of arrows consists of the components of a natural transformation \(\pi : \Delta _V \to \mathscr {D}\), i.e. from the constant functor (which assigns the same object \(V\) to every object \(J_i\) in \({\cal J}\)) to the diagram functor \(\mathscr {D}\).
For simplicity, we refer to a cone by "a cone \((V, \pi )\) on \(\mathscr {D}\)".
A cone \((V, \pi )\) on \(\mathscr {D} : {\cal J} \to {\cal C}\) is called a limit of \(\mathscr {D}\), denoted
\[\lim \mathscr {D}\]
if the diagram
commutes for every cone \((V', \pi ')\) on \(\mathscr {D}\).
The arrows \(\pi _J\) are called the projections of the limit.
Other terms for a limit are limiting cone and universal cone.
the limit of \(\mathscr {D} : \mathbf {\emptyset } \to {\cal C} \), where \(\mathbf {\emptyset }\) is the empty category:
i.e. the terminal object \(\mathrm {1}\) in \({\cal C}\). In particular, for \({\cal C} = \mathbf {Set}\) we have
\[\mathrm {-} \xrightarrow {!} V = \lim \mathscr {D} = \{*\} \]
In other words, given \(\mathscr {D}^{op} : {\cal J}^{op} \to {\cal C}^{op}\), a cocone on \(\mathscr {D}\) is a cone on \(\mathscr {D}^{op}\), and a colimit of \(\mathscr {D}\) is a limit of \(\mathscr {D}^{op}\) [leinster2016basic, 5.2.1].
The arrows \(\pi _J\) are called the coprojections of the colimit.
In the same way, one can show that coequalizers, coproducts, pushouts and initial objects are examples of colimits.
We say that a category \({\cal C}\) has (finite) limits or is (finitely) complete if every diagram \(\mathscr {D} : {\cal J} \to {\cal C}\), where \({\cal J}\) is a (finite) category, has a limit.
A category \({\cal C}\) is called left exact iff it is finitely complete.
Dually to Definition [tt-002F], when every diagram \(\mathscr {D} : {\cal J} \to {\cal C}\), where \({\cal J}\) is a (finite) category, has a colimit, it is said that the category \({\cal C}\) has (finite) colimits or is (finitely) cocomplete.
A category is called right exact iff it is finitely cocomplete.
A category \({\cal C}\) is (finitely) complete if it has a terminal object, equalizers and (finite) products, or if it has a terminal object and (finite) pullbacks.
Dually, a category \({\cal C}\) is (finitely) cocomplete if it has an initial object, coequalizers and (finite) coproducts, or if it has an initial object and (finite) pushouts.
The category \(\mathbf {Preord}\) consists of objects which are preorders and of arrows which are order-preserving functions.
The category \(\mathbf {Poset}\) consists of objects which are posets and of arrows which are order-preserving functions between posets, that is, the maps \(T: P \to P'\) such that
\[ p \leq q \Rightarrow T(p) \leq T(q) \]
Any preorder \((P, \leq )\) or poset \((P, \leq )\) can be considered as a category whose objects are the elements of the set \(P\) and whose arrows are defined by \(p \to q \Longleftrightarrow p \leq q\).
An example that is a directed poset category but not a total order is:
where each pair of nodes has a common upper bound (thus satisfying directedness), but there is no path between the two nodes on the center, thus violating comparability.
One can see immediately that this is a preorder, because length-0 paths give reflexivity and concatenation of paths gives transitivity. To see that it is a partial order, we only note that there are no loops.
To see that it is directed, we note that every pair of nodes from one side or both sides has the central node as an upper bound, thus satisfying directedness.
But this partial order is not a total order because there is no path (in either direction) between some nodes, thus violating comparability.
Let \({\cal J}\) be a directed poset and \(\mathscr {F} : {\cal J} \to {\cal C}\) be a contravariant functor. The limit of \(\mathscr {F}\) is called an inverse limit or projective limit, and is denoted \(\lim \limits _{\leftarrow {\cal J}} \mathscr {F}\) or simply \(\lim \limits _{\longleftarrow } \mathscr {F}\).
Let \({\cal J}\) be a directed poset and \(\mathscr {F}: {\cal J} \to {\cal C}\) be a covariant functor. The colimit of \(\mathscr {F}\) is called a direct limit (some call it a directed limit) or inductive limit, and is denoted \(\lim \limits _{\to {\cal J}} \mathscr {F}\), or simply \(\lim \limits _{\longrightarrow } \mathscr {F}\).
A functor \(\mathscr {F}: {\cal C} \to {\cal D}\) preserves (all) limits and is called left exact iff it sends all limits in \({\cal C}\) into limits in \({\cal D}\).
Dually, a functor \(\mathscr {F}: {\cal C} \to {\cal D}\) preserves (all) colimits and is called right exact iff it sends all colimits in \({\cal C}\) into colimits in \({\cal D}\).
One can see from the table that, in general, limits have the direction "back" / "into" (where "back", "left", "inverse" are directionally consistent), and colimits have the opposite: "forward" / "out of".
This might help to memorize the directions in these concepts without disorientation.
Given functors
we say \(\mathscr {L}\) and \(\mathscr {R}\) are a pair of adjoint functors, or together called an adjunction between them, \(\mathscr {L}\) is called left adjoint to \(\mathscr {R}\), and \(\mathscr {R}\) is called right adjoint to \(\mathscr {L}\), denoted
\[\mathscr {L} \dashv \mathscr {R} : {\cal C} \rightleftarrows {\cal D}\]
or
iff there exists a natural isomorphism \(\sigma \) between the following two hom-bifunctors:
\[ {\cal D}(\mathscr {L}(-), =) \cong {\cal C}(-, \mathscr {R}(=)) \]
diagrammatically,
The components of the natural isomorphism \(\sigma \) are isomorphisms
\[ \sigma _{XY} : {\cal D}(\mathscr {L}(X), Y) \cong {\cal C}(X, \mathscr {R}(Y)) \]
An adjunction \(\mathscr {L} \dashv \mathscr {R}\) means arrows \(\mathscr {L}(X) \to Y\) are essentially the same thing as arrows \(X \to \mathscr {R}(Y)\) for any \(X \in {\cal C}\) and \(Y \in {\cal D}\).
This means the diagram
commutes for any arrow \(f: \mathscr {L}(X) \to Y\) in \({\cal D}\).
The above can also be diagrammatically denoted by the transposition diagram
\[ \begin {array}{ccccccc} X' & \xrightarrow {x} & X & \xrightarrow {\sigma _{X Y}(f)} & \mathscr {R}(Y) & \xrightarrow {\mathscr {R}(y)} & \mathscr {R}\left (Y'\right ) \\ \hline \mathscr {L}\left (X'\right ) & \xrightarrow {\mathscr {L}(x)} & \mathscr {L}(X) & \xrightarrow {f} & Y & \xrightarrow {y} & Y' \end {array} \]
or simply,
\[ \frac {X \to \mathscr {R}(Y) \quad ({\cal C})}{\mathscr {L}(X) \to Y \quad ({\cal D})} \]
An adjunction is a concept that describes the relationship between two functors that are weakly inverse to each other [nakahira2023diagrammatic, sec. 4].
By "weakly inverse", we don't mean that applying one after the other gives the identity functor, but in a sense similar to eroding (i.e. enhancing holes) and dilating (i.e. filling holes) an image, applying them in different order yeilds upper/lower "bounds" of the original image [rosiak2022sheaf, sec. 7.1].
The covariant hom-functor \({\cal C}(X, -)\), denoted \(-^X\), can be represented in string diagrams as
where the dotted circle denotes any arrows with the domain \(X\) and codomain \(-\).
The contravariant hom-functor \({\cal C}(-, X)\), denoted \(X^-\), can be represented in a similar manner.
The hom-bifunctor \({\cal C}(-,=)\), also denoted \(=^-\), can be represented as
where the dotted circle denotes any arrows with the domain \(-\) and codomain \(=\).
The natural isomorphism
\[{\cal D}(\mathscr {L}(-), =) \cong {\cal C}(-, \mathscr {R}(=))\]
in adjunction can be represented as
Given an adjunction \(\mathscr {L} \dashv \mathscr {R}: {\cal C} \rightleftarrows {\cal D}\), there exists \(f^{\sharp }\) and \(g^{\flat }\) such that the diagrams
commute for any arrows \(f: X \to \mathscr {R}(Y)\) in \({\cal C}\) and \(g: \mathscr {L}(X) \to Y\) in \({\cal D}\).
\(f^{\sharp }\) is called the left transpose of \(f\). \(g^{\flat }\) is called the right transpose of \(g\).
Other possible terms are left/right adjunct of each other, and mates [nlab2023adjunct].
Given an adjunction \(\mathscr {L} \dashv \mathscr {R}: {\cal C} \rightleftarrows {\cal D}\), we may obtain two endofunctors \( \mathscr {L} \mathbin {\bullet } \mathscr {R} : {\cal C} \to {\cal C}\) and \( \mathscr {R} \mathbin {\bullet } \mathscr {L} : {\cal D} \to {\cal D}\) that make the diagram
commute; in the posetal case (e.g. a Galois connection, cf. the erosion/dilation analogy above) this means they are both idempotent, i.e. applying \(\mathscr {L} \mathbin {\bullet } \mathscr {R}\) any number of times yields the same result as applying it once, and similarly for \(\mathscr {R} \mathbin {\bullet } \mathscr {L}\).
Given an adjunction \(\mathscr {L} \dashv \mathscr {R}: {\cal C} \rightleftarrows {\cal D}\), the natural transformation
\[\eta : \mathit {1}_{{\cal C}} \to \mathscr {L} \mathbin {\bullet } \mathscr {R} \]
is called the unit of the adjunction, and
\[\epsilon : \mathscr {R} \mathbin {\bullet } \mathscr {L} \to \mathit {1}_{{\cal D}}\]
is called the counit.
We call an arrow
\[\eta _X: X \to (\mathscr {L} \mathbin {\bullet } \mathscr {R})(X)\]
a unit over \(X\), and
\[\epsilon _Y: (\mathscr {R} \mathbin {\bullet } \mathscr {L})(Y) \to Y\]
a counit over \(Y\). They are components of the natural transformations \(\eta \) and \(\epsilon \), respectively.
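For the currying adjunction \((-) \times A \dashv (-)^A\) in \(\mathbf {Set}\) mentioned above, the unit and counit are
\[ \eta _X : X \to (X \times A)^A, \quad x \mapsto (a \mapsto (x, a)) \]
\[ \epsilon _Y : Y^A \times A \to Y, \quad (g, a) \mapsto g(a) \]
the latter being exactly the evaluation arrow that reappears for exponentials below.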
The unit \(\eta \) and counit \(\epsilon \) of an adjunction \(\mathscr {L} \dashv \mathscr {R}: {\cal C} \rightleftarrows {\cal D}\) are universal, i.e. the diagram
commutes for any \(f \in {\cal C}\), and the diagram
commutes for any \(g \in {\cal D}\).
Given an adjunction \(\mathscr {L} \dashv \mathscr {R}: {\cal C} \rightleftarrows {\cal D}\), the diagrams
commute.
Note that \(\mathscr {L}\) in \(\epsilon _\mathscr {L}\) is a subscript: \(\epsilon _\mathscr {L}\) denotes the whiskered natural transformation with components \(\epsilon _{\mathscr {L}(X)}\) for \(X \in {\cal C}\). Similarly for \(\eta _\mathscr {R}\).
Continuing from Notation [tt-001U], the triangle identities can be represented in string diagrams as follows, and are then called the snake identities (or zig-zag identities):
where
are the unit and counit of the adjunction, respectively.
Following Notation [tt-001G], recall that a string diagram is composed from top to bottom, left to right, so we can read the left snake
in snake identities as
\[ \mathscr {L} \xmapsto {(\eta \mathbin {\bullet } \mathscr {L}) \mathbin {\bullet } (\mathscr {L} \mathbin {\bullet } \epsilon )} \mathscr {L}\]
where each pair of parentheses corresponds to an overlay in the string diagram, and this is equivalent to
in triangle identities.
Given an adjunction
with unit \(\eta \) and counit \(\epsilon \), the diagrams
\[f=\eta _X \mathbin {\bullet } \mathscr {R}(f^{\sharp })\]
and
\[f^{\sharp }=\mathscr {L}(f) \mathbin {\bullet } \epsilon _Y\]
commute.
The isomorphisms \(\sigma _{XY}\) and \(\tau _{XY}\) that are the components of the natural isomorphisms in the adjunction \(\mathscr {L} \dashv \mathscr {R} : {\cal C} \rightleftarrows {\cal D}\) are related to the unit and counit of the adjunction:
\[\begin {aligned} \sigma _{XY}(f) & = \eta _X \mathbin {\bullet } \mathscr {R}(f^{\sharp })\\ \tau _{XY}(g) & = \mathscr {L}(g^{\flat }) \mathbin {\bullet } \epsilon _Y \end {aligned} \]
and they are inverses of each other
\[\sigma _{XY} = \tau _{YX}^{-1}\]
For the left adjoint, from the universality of \(\eta \) it follows that there exists a unique isomorphism between any two left adjoints. It remains to show the naturality of this isomorphism, which is left as an exercise. The proof for the right adjoint follows by duality.
Given categories and functors
there is a one-to-one correspondence between the adjunction \(\mathscr {L} \dashv \mathscr {R}\) and the pairs of natural transformations \(\eta \) and \(\epsilon \) satisfying the triangle identities.
From Lemma [tt-001Z], it follows that every adjunction between \(\mathscr {L}\) and \(\mathscr {R}\) gives rise to a pair of transformations \(\eta \) and \(\epsilon \) satisfying the triangle identities.
To show that there exists a unique adjunction for \(\eta \) and \(\epsilon \): the uniqueness follows from Lemma [tt-001Y], and the existence uses a construction in the spirit of Definition [tt-001X].
Given categories and functors
there is a one-to-one correspondence between:
the adjunction \(\mathscr {L} \dashv \mathscr {R}\)
natural transformations \(\eta : \mathit {1}_{{\cal C}} \to \mathscr {L} \mathbin {\bullet } \mathscr {R}\) such that \(\eta _X\) is initial in the comma category \(X \Rightarrow \mathscr {R}\) for every \(X \in {\cal C}\)
Let \({\cal C}\) be a locally small category and \(\mathscr {F} : {\cal C}^{op} \to \mathbf {Set}\). Then a representation of \(\mathscr {F}\) consists of a pair \((X, u)\) such that the diagram
commutes.
Pairs \((Y, y)\) with \(Y \in {\cal C}\) and \(y \in \mathscr {F}(Y)\) in Corollary [tt-0035] are sometimes called elements of the presheaf \(\mathscr {F}\).
An element \(u\) satisfying condition in Corollary [tt-0035] is sometimes called a universal element of \(\mathscr {F}\). So, Corollary [tt-0035] says that a representation of a presheaf \(\mathscr {F}\) amounts to a universal element of \(\mathscr {F}\).
Now, given a diagram \(\mathscr {D}: {\cal J} \to {\cal C}\) and an object \(V \in {\cal C}\), a cone on \(\mathscr {D}\) with vertex \(V\) is simply a natural transformation from the diagonal functor \(\Delta _V\) to the diagram \(\mathscr {D}\).
Writing \(\operatorname {Cone}(V, \mathscr {D})\) for the set of cones on \(\mathscr {D}\) with vertex \(V\), we therefore have
\[ \operatorname {Cone}(V, \mathscr {D})=[{\cal J}, {\cal C}] (\Delta _V, \mathscr {D}) . \]
Thus, \(\operatorname {Cone}(V, \mathscr {D})\) is functorial in \(V\) (contravariantly) and \(\mathscr {D}\) (covariantly).
Let \({\cal J}\) be a small category, \({\cal C}\) a category, and \(\mathscr {D}: {\cal J} \to {\cal C}\) a diagram. Then there is a one-to-one correspondence between
limit cones on \(\mathscr {D}\)
representations of the functor \(\operatorname {Cone}(-, \mathscr {D})\)
with the representing objects being the limit objects (i.e. the vertices) of \(\mathscr {D}\).
Briefly put: a limit \((V, \pi )\) of \(\mathscr {D}\) is a representation of \([{\cal J}, {\cal C}] (\Delta _{-}, \mathscr {D})\).
Diagrammatically,
It implies that
\[\operatorname {Cone}(\mathrm {-}, \mathscr {D}) \cong {\cal C}\left (\mathrm {-}, \lim \limits _{\leftarrow {\cal J}} \mathscr {D} \right )\]
for any \(\mathrm {-} \in {\cal C}\).
It follows from Lemma [tt-003Y] and [leinster2016basic, 6.2.1] that
\[\operatorname {Cone}(X, \mathscr {D}) \cong \lim \limits _{\leftarrow {\cal J}} {\cal C}(X, \mathscr {D})\]
naturally in \(X\) and \(\mathscr {D}\).
Let \({\cal I}\) and \({\cal J}\) be small categories. Let \({\cal C}\) be a locally small category with limits of shape \({\cal I}\) and of shape \({\cal J}\).
Initial and terminal objects can be described as adjoints. Let \({\cal C}\) be a category. There exist the unique functor \(! : {\cal C} \to \mathbf {1}\), and a constant object functor \(X : \mathbf {1} \to {\cal C}\) for each object \(X\).
A left adjoint to \(!\) is exactly an initial object of \({\cal C}\):
\[ \mathrm {0} \dashv \ ! : \mathbf {1} \rightleftarrows {\cal C} \]
Similarly, a right adjoint to \(!\) is exactly a terminal object of \({\cal C}\):
\[ ! \dashv \mathrm {1} : {\cal C} \rightleftarrows \mathbf {1} \]
In both cases, being an adjunction gives an isomorphism for each object \(X\). One side of the isomorphism is \(\mathbf {1}(\mathrm {*}, \mathrm {*})\), which contains only \(\mathit {1}_{\mathrm {*}}\); the other side is \({\cal C}(\mathrm {0}, X)\) or \({\cal C}(X, \mathrm {1})\), respectively. The isomorphism thus establishes the uniqueness of the arrows (from \(\mathrm {0}\) or to \(\mathrm {1}\)) for each object, and the initial or terminal object exists precisely when the corresponding adjoint exists.
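In equations, the two adjunctions assert
\[ {\cal C}(\mathrm {0}, X) \cong \mathbf {1}(\mathrm {*}, \mathrm {*}) = \{\mathit {1}_{\mathrm {*}}\} \qquad \text {and} \qquad {\cal C}(X, \mathrm {1}) \cong \mathbf {1}(\mathrm {*}, \mathrm {*}) = \{\mathit {1}_{\mathrm {*}}\}, \]
so each hom-set is a singleton, which is exactly initiality of \(\mathrm {0}\) and terminality of \(\mathrm {1}\).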
Let \({\cal S}\) be a small category, \({\cal C}\) a locally small category. For each \(X \in {\cal S}\), there is a functor
\[ \begin {array}{lccc} \operatorname {ev}_X: & [{\cal S}, {\cal C}] & \to & {\cal C} \\ & \mathscr {F} & \mapsto & \mathscr {F}(X) \end {array} \]
called evaluation at \(X\).
Given a diagram \(\mathscr {D}: {\cal J} \to [{\cal S}, {\cal C}]\), we have for each \(X \in {\cal S}\) a functor
\[ \begin {array}{lccc} \mathscr {D} \mathbin {\bullet } \operatorname {ev}_X : & {\cal J} & \to & {\cal C} \\ & J & \mapsto & \mathscr {D}(J)(X) \end {array} \]
We write \(\mathscr {D} \mathbin {\bullet } \operatorname {ev}_X\) as \(\mathscr {D}(-)(X)\).
If \({\cal C}\) is a category with binary products, we can define for every \(X\) the cartesian product functor \(X \times (-)\) : \({\cal C} \rightarrow {\cal C}\), with the following action:
\[ \begin {aligned} & (X \times (-))(Y)=X \times Y \\ & (X \times (-))(f)=\operatorname {id}_X \times f \end {aligned} \]
If for \(X \in {\cal C}\), a cartesian product functor \(X \times (-)\) has a right adjoint, it is called an exponential or the exponential object, denoted \((-)^X\).
Explicitly, \(X \times (-) \dashv (-)^X\) means there is a natural isomorphism of bifunctors \(\operatorname {Hom}(X \times (-),-) \cong \operatorname {Hom}(-, (-)^X)\), i.e. for any arrow \(f: X \times Y \to Z\) there is a unique arrow \(f^b: Y \to Z^X\), which is the transpose of the adjunction, called exponential transpose.
The arrow \(\operatorname {ev} : X \times (-)^X \to (-)\) is called the evaluation arrow of the exponential, and is the counit of the adjunction.
Diagrammatically, the diagram
commutes.
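Concretely, the commuting diagram characterizes the exponential transpose: for every \(f: X \times Y \to Z\),
\[ \operatorname {ev} \circ (\operatorname {id}_X \times f^b) = f . \]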
We say that category \({\cal C}\) has exponentials if for any \(X \in {\cal C}\), there exists an exponential \((-)^X\).
The power object \(P(X)\) of an object \(X\) in a cartesian closed category \({\cal C}\) with subobject classifier \(\Omega \) is defined as the exponential object \(\Omega ^X\).
If \(\Omega ^X\) exists for any \(X\) in \({\cal C}\), we say that \({\cal C}\) has power objects.
Let \({\cal C}\) be a category and \(\mathscr {X} : {\cal C}^{op} \to \mathbf {Set}\) a presheaf on \({\cal C}\). The category of elements \({\cal E}(\mathscr {X})\) of \(\mathscr {X}\) is the category in which the objects are pairs \((X, x)\) with \(X \in {\cal C}\) and \(x \in \mathscr {X}(X)\), and the arrows \((X, x) \to (Y, y)\) are the arrows \(f : X \to Y\) of \({\cal C}\) such that \(\mathscr {X}(f)(y) = x\).
There is a projection functor \(\mathscr {P}: {\cal E}(\mathscr {X}) \rightarrow {\cal C}\) defined by \(\mathscr {P}(X, x) = X\) and \(\mathscr {P}(f) = f\).
The Yoneda lemma implies that for a presheaf \(\mathscr {X}\), the generalized elements of \(\mathscr {X}\) of representable shape correspond to objects of the category of elements.
Let \({\cal C}\) be a small category and \(\mathscr {X}\) a presheaf on
\({\cal C}\). Then \(\mathscr {X}\) is the colimit of the diagram
\[ {\cal E}(\mathscr {X}) \xrightarrow {\mathscr {P}} {\cal C} \xrightarrow {H_{\bullet }} \left [{\cal C}^{op}, \mathbf {Set} \right ] \]
in \(\left [{\cal C}^{op}, \mathbf {Set} \right ]\), i.e.
\[\mathscr {X} \cong \lim \limits _{\rightarrow {\cal E}(\mathscr {X})}\left (\mathscr {P} \mathbin {\bullet } H_{\bullet } \right )\]
where \({\cal E}(\mathscr {X})\) is the category of elements, \(\mathscr {P}\) the projection functor in it, and \(H_{\bullet }\) the (contravariant) Yoneda embedding.
This theorem states that every presheaf is a colimit of representables in a canonical way. It is secretly dual to the Yoneda lemma. This becomes apparent if one expresses both in suitably lofty categorical language (that of ends, or that of bimodules).
A subobject classifier is an object \(\Omega \) in \({\cal C}\), together with an arrow \(\top : \mathbf {1} \to \Omega \), called the true arrow, such that for each monic arrow \(m: Y \rightarrowtail X\) there is a unique arrow \(\chi : X \to \Omega \), called the characteristic arrow of \(m\) (or of \(Y\)), such that the diagram
is a pullback, where \(!\) is the unique arrow \(Y \to \mathbf {1}\).
\(\Omega \) is also called a generalized truth-value object.
The arrow \((! \mathbin {\bullet } \top ) : Y \stackrel {!}{\to } \mathbf {1} \stackrel {\top }{\rightarrowtail } \Omega \) is often denoted as \(\top _Y: Y \to \Omega \).
In any category \({\cal C}\) with a subobject classifier \(\Omega \),
\[ \operatorname {Sub}(X) \cong {\cal C}(X, \Omega ) \]
i.e. the class of subobjects of an object \(X\) in \({\cal C}\) is isomorphic to the class of arrows from \(X\) to \(\Omega \).
The contravariant power object functor \(\mathscr {P}: {\cal C}^{op} \to {\cal C}\) is given by
\[\mathscr {P}: X \mapsto \Omega ^X\] for \(X \in \mathrm {Ob}({\cal C})\)
and, for \(f: X \to Y\) in \({\cal C}\), \(\mathscr {P}(f): \Omega ^Y \to \Omega ^X\) is given (in \(\mathbf {Set}\)) by
\[\mathscr {P}(f)(S)=\{x \in X \mid f(x) \in S\}\] for \(S \subseteq Y\).
When the power object is defined in a cartesian closed category, we have
\[ \frac {X \times Y \to \Omega }{Y \to \Omega ^X} \]
thus for every category with power objects
\[ \operatorname {Hom}(X \times Y, \Omega ) \cong \operatorname {Hom}\left (Y, \Omega ^X\right ) \]
This equation, together with Lemma [tt-004K] written in the form \(\operatorname {Sub}(X \times Y) \cong \operatorname {Hom}(X \times Y, \Omega )\), gives the isomorphism
\[ \operatorname {Sub}(X \times Y) \cong \operatorname {Hom}(Y, \mathscr {P}(X)) \]
The name "topos" originates from the Greek word "τoπoς", meaning a place, as topos could mean a place of geometry, and at the same time as a place of logic.
Following the ancient Greek naming convention, the plural of topos is topoi, but people also use toposes. We use them interchangeably.
Every space gives rise to a topos (namely, the category of sheaves on it).
Topological properties of the space can be reinterpreted in a useful way as categorical properties of its associated topos.
The elementary topos was introduced by Lawvere and Tierney to generalize \(\mathbf {Set}\). A topos can be regarded as a 'universe of sets' [leinster2016basic, 6.3.20]. \(\mathbf {Set}\) is the most basic example of a topos, and every topos shares enough features with \(\mathbf {Set}\) that
anything you really really needed to do in the category of sets can be done in any topos.
A topos can allow the interpretation of a higher-order logic. In particular, objects can be seen as collections of elements of a given type, subobjects are viewed as propositions. Products and coproducts are interpreted as conjunction and disjunction respectively. For an introduction, see [pitts2001categorical].
If \({\cal E}_1\) and \({\cal E}_2\) are toposes, then a geometric morphism \(\mathscr {G}: {\cal E}_1 \to {\cal E}_2\) is defined as a pair of adjoint functors \(\mathscr {G}^* \dashv \mathscr {G}_*\) between \({\cal E}_1\) and \({\cal E}_2\), such that the left adjoint \(\mathscr {G}^*\) preserves finite limits (i.e. is left exact); being adjoints, \(\mathscr {G}^*\) automatically preserves all colimits and \(\mathscr {G}_*\) preserves all limits.
The category of toposes and their geometric morphisms is denoted \(\mathbf {Topoi}\).
This section introduces the algebraic environment of Clifford Algebra,
covering vector spaces, groups, algebras, representations, modules, multilinear algebras,
quadratic forms, filtrations and graded algebras.
The material in this section should be familiar to most readers, but it is worth reading
through to become acquainted with the notation and terminology used here,
as well as with their counterparts in Lean, which usually require some additional treatment, both
mathematically and technically (and probably would in other formal proof verification systems as well).
Details can be found in the references in the corresponding sections, or you may use the L∃∀N button to check the corresponding \(\textsf {Mathlib}\) documentation and Lean 4 source code.
In this document, we unify the informal mathematical language for a definition to:
Let \(\texttt {X}\) be a \(\texttt {concept X}\).
A \(\texttt {concept Z}\) is a set/pair/triple/tuple \((Z, \mathtt {op}, ...)\), satisfying:
\(\texttt {Z}\) is a \(\texttt {concept Y}\) over \(\texttt {X}\) under \(\texttt {op}\) .
\(\texttt {formula}\) for all \(\texttt {elements in Z}\) ( \(\texttt {property}\) ).
for each \(\texttt {element}\) in \(\texttt {concept X}\),
there exists \(\texttt {element}\) such that
\(\texttt {formula}\) for all \(\texttt {elements in concept Z}\).
\(\texttt {op}\) is called \(\texttt {op name}\), for all \(\texttt {elements in Z}\), we have
\(\texttt {formula}\)
\(\texttt {formula}\)
( \(\texttt {property}\) ).
By default, \(\texttt {X}\) is a set, \(\texttt {op}\) is a binary operation on \(\texttt {X}\).
It then follows that \(1\), the identity element, is unique, and that for each \(g \in G\) the inverse \(g^{-1}\) is unique.
A group \(G\) is abelian, or commutative, if \(g * h = h * g\) for all \(g, h \in G\).
In the literature, the binary operation of a group is usually called a product. It is denoted by juxtaposition \(g h\), and is understood to be a mapping
\((g, h) \mapsto g * h\) from \(G \times G\) to \(G\).
Here we use an explicit symbol for the operation. It can be denoted multiplicatively as \(*\) in Group
or additively as \(+\) in AddGroup, where the identity element will be denoted by \(1\) or \(0\), respectively.
Sometimes we use notations with subscripts (e.g. \(*_G\), \(1_G\)) to indicate which structure they belong to.
\(\textsf {Mathlib}\) encodes the mapping \(G \times G \to G\) as \(G \to G \to G\), which is understood as \(G \to (G \to G)\):
it sends \(g \in G\) to the mapping that sends \(h \in G\) to \(g * h \in G\).
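A minimal sketch in Lean 4 (with \(\textsf {Mathlib}\)) illustrating the curried encoding:
import Mathlib.Algebra.Group.Defs

-- Partially applying the curried operation `Mul.mul : G → G → G` to `g`
-- yields the map sending `h` to `g * h`.
example {G : Type*} [Group G] (g : G) : G → G := Mul.mul g

-- Applying both arguments recovers `g * h` definitionally.
example {G : Type*} [Group G] (g h : G) : Mul.mul g h = g * h := rfl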
The notation for scalar multiplication is generalized as heterogeneous scalar multiplication HSMul in \(\textsf {Mathlib}\):
\[ \bullet : \alpha \to \beta \to \gamma \]
where \(\alpha \), \(\beta \), \(\gamma \) are different types.
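For example (a minimal sketch; the instance for \(\mathbb {N}\) acting on \(\mathbb {Z}\) is found by \(\textsf {Mathlib}\)):
import Mathlib

-- `•` elaborates through `HSMul.hSMul : α → β → γ`; here a natural number
-- scalar acts on an integer, with `α = ℕ`, `β = γ = ℤ`.
example (n : ℕ) (x : ℤ) : ℤ := n • x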
For generality, \(\textsf {Mathlib}\) uses Module throughout for vector spaces,
particularly, for a vector space \(V\), it's usually declared as
/- Let `K` be a division ring. A vector space `V` over `K` is declared as a
   module over `K`, where being a module requires `V` to be a commutative
   group under `+`. -/
variable (K V : Type*) [DivisionRing K] [AddCommGroup V] [Module K V]
Definitions and theorems about vector spaces can mostly be found under Mathlib.LinearAlgebra, e.g. LinearIndependent.
A submodule \(N\) of \(M\) is a module \(N\) such that every element of \(N\) is also an element of \(M\), with the module operations inherited from \(M\).
If \(M\) is a vector space then \(N\) is called a subspace.
Let \((\alpha , +_\alpha , *_\alpha )\) and \((\beta , +_\beta , *_\beta )\) be rings.
A ring homomorphism from \(\alpha \) to \(\beta \) is a map \(f : \alpha \to _{+*} \beta \) such that
\(f(x +_{\alpha } y) = f(x) +_{\beta } f(y)\) for each \(x,y \in \alpha \).
\(f(x *_{\alpha } y) = f(x) *_{\beta } f(y)\) for each \(x,y \in \alpha \).
\(f(1_{\alpha }) = 1_{\beta }\) (preservation of the multiplicative identity).
Following the literature, for \(r \in R\),
we usually write the image of \(r\) under \(\mathit {1}_A : R \to _{+*} A\) as the product \(r \mathit {1}_A\), when it is not omitted altogether,
while in \(\textsf {Mathlib}\) it is written as a call to
algebraMap _ _ r,
which is defined to be Algebra.toRingHom r.
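A minimal sketch of the corresponding \(\textsf {Mathlib}\) usage:
import Mathlib

-- `algebraMap R A : R →+* A` is the canonical ring homomorphism of an
-- `R`-algebra `A`; `algebraMap R A r` is the formal counterpart of `r 1_A`.
example (R A : Type*) [CommSemiring R] [Semiring A] [Algebra R A] (r : R) : A :=
  algebraMap R A r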
What is simply called an algebra here is actually an associative algebra with identity, a.k.a. an associative unital algebra. See
the red herring principle
for more about this naming phenomenon, particularly the example of (possibly) nonassociative algebras.
Let \(A\) and \(B\) be \(R\)-algebras. \(\mathit {1}_A\) and \(\mathit {1}_B\) are ring homomorphisms from \(R\) to \(A\) and \(B\), respectively.
An algebra homomorphism from \(A\) to \(B\) is a map \(f : A \to _{a} B\) such that
\(f\) is a ring homomorphism
\(f(\mathit {1}_{A}(r)) = \mathit {1}_{B}(r)\) for each \(r \in R\)
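In \(\textsf {Mathlib}\), condition (2) is the field AlgHom.commutes of the bundled homomorphism type A →ₐ[R] B, as in this minimal sketch:
import Mathlib

-- An `R`-algebra homomorphism commutes with the canonical maps from `R`.
example (R A B : Type*) [CommSemiring R] [Semiring A] [Semiring B]
    [Algebra R A] [Algebra R B] (f : A →ₐ[R] B) (r : R) :
    f (algebraMap R A r) = algebraMap R B r :=
  f.commutes r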
Let \(R\) be a (not necessarily commutative) ring, and \(r\) an arbitrary relation on \(R\).
The ring quotient of \(R\) by \(r\)
is the quotient of \(R\) by the strengthened equivalence relation generated by \(r\),
such that for all \(a, b, c\) in \(R\):
\(a + c \sim b + c\) if \(a \sim b\)
\(a * c \sim b * c\) if \(a \sim b\)
\(a * b \sim a * c\) if \(b \sim c\)
i.e. the equivalence relation is compatible with the ring operations \(+\) and \(*\).
As ideals haven't been formalized for the non-commutative case, \(\textsf {Mathlib}\) uses RingQuot in places
where the quotient of non-commutative rings by ideal is needed.
The universal properties of the quotient are proven, and should be used instead of the definition that is subject to change.
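A minimal sketch of the interface:
import Mathlib

-- `RingQuot r` is the ring quotient by the closure of an arbitrary relation
-- `r`; `RingQuot.mkRingHom r` is the quotient map, and `RingQuot.lift`
-- packages the universal property.
example (R : Type*) [Ring R] (r : R → R → Prop) : R →+* RingQuot r :=
  RingQuot.mkRingHom r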
Let \(X\) be an arbitrary set.
The free \(R\)-algebra on \(X\) (or "generated by \(X\)"), named \(A\), is the ring quotient of the following inductively constructed set \(A_{\mathtt {pre}}\):
every \(x\) in \(X\) gives an element of \(A_{\mathtt {pre}}\), i.e. there is a map \(X \to A_{\mathtt {pre}}\).
every \(r\) in \(R\) gives an element of \(A_{\mathtt {pre}}\), i.e. there is a map \(R \to A_{\mathtt {pre}}\).
for all \(a, b\) in \(A_{\mathtt {pre}}\), \(a + b\) is in \(A_{\mathtt {pre}}\).
for all \(a, b\) in \(A_{\mathtt {pre}}\), \(a * b\) is in \(A_{\mathtt {pre}}\).
by the equivalence relation that makes \(A\) an \(R\)-algebra, namely:
there exists a ring homomorphism from \(R\) to \(A_{\mathtt {pre}}\), denoted \(R \to _{+*} A_{\mathtt {pre}}\).
\(A\) is a commutative group under \(+\).
\(A\) is a monoid under \(*\).
left and right distributivity under \(*\) over \(+\).
\(0 * a \sim a * 0 \sim 0\).
for all \(a, b, c\) in \(A\), if \(a \sim b\), we have
\(a + c \sim b + c\)
\(c + a \sim c + b\)
\(a * c \sim b * c\)
\(c * a \sim c * b\)
(compatibility with the ring operations \(+\) and \(*\))
Similar to Remark [ca-0019], what we defined here is the free (associative, unital) \(R\)-algebra on \(X\); it can be denoted \(R\langle X \rangle \),
expressing that it's freely generated by \(R\) and \(X\), where \(X\) is the set of generators.
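In \(\textsf {Mathlib}\) this is FreeAlgebra R X; a minimal sketch:
import Mathlib

-- `FreeAlgebra.ι R : X → FreeAlgebra R X` includes the generators;
-- `FreeAlgebra.lift` gives the universal property of `R⟨X⟩`.
example (R X : Type*) [CommSemiring R] (x : X) : FreeAlgebra R X :=
  FreeAlgebra.ι R x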
Let \(R, S\) be rings, \(M\) an \(R\)-module, \(N\) an \(S\)-module.
A linear map from \(M\) to \(N\) is a function \(f : M \to _{l} N\) over a ring homomorphism \(\sigma : R \to _{+*} S\), satisfying:
\(f(x + y) = f(x) + f(y)\) for all \(x, y \in M\).
\(f(r \bullet x) = \sigma (r) \bullet f(x)\) for all \(r \in R\), \(x \in M\).
The set of all linear maps from \(M\) to \(M'\) is denoted \(\operatorname {Lin}(M, M')\),
and \(\operatorname {Lin}(M)\) stands for the maps from \(M\) to itself.
The elements of \(\operatorname {Lin}(M)\) are endomorphisms of \(M\).
Let \(A\) be the free \(R\)-algebra generated by the module \(M\), and let \(\iota : M \to A\) denote the map from \(M\) to \(A\).
The tensor algebra over \(M\) (or "of \(M\)"), denoted \(T\), is the ring quotient of the free \(R\)-algebra generated by \(M\),
by the equivalence relation satisfying:
for all \(a, b\) in \(M\), \(\iota (a + b) \sim \iota (a) + \iota (b)\).
for all \(r\) in \(R\), \(a\) in \(M\), \(\iota (r \bullet a) \sim r * \iota (a)\).
i.e. making the inclusion of \(M\) an \(R\)-linear map.
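In \(\textsf {Mathlib}\) this is TensorAlgebra R M; the canonical inclusion is bundled as a linear map, so the two relations above hold by construction. A minimal sketch:
import Mathlib

-- `TensorAlgebra.ι R : M →ₗ[R] TensorAlgebra R M`; additivity is the first
-- relation in the definition above.
example (R M : Type*) [CommRing R] [AddCommGroup M] [Module R M] (a b : M) :
    TensorAlgebra.ι R (a + b) = TensorAlgebra.ι R a + TensorAlgebra.ι R b :=
  map_add (TensorAlgebra.ι R) a b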
The definition above is equivalent to the following definition in literature (e.g. [jadczyk2019notes, 1.7]):
Let \(M\) be a module over \(R\). An algebra \(T\) is called a tensor algebra over \(M\) (or "of \(M\)")
if it satisfies the following universal properties:
\(T\) is an algebra containing \(M\) as a submodule, and it is generated by \(M\),
Every linear mapping \(\lambda \) of \(M\) into an algebra \(A\) over \(R\) can be extended to an algebra homomorphism of \(T\) into \(A\).
Let \(R\) be a commutative ring, \(M\) an \(R\)-module.
A quadratic form \(Q\) over \(M\) is a map \(Q : M \to R\), satisfying:
\( Q(a \bullet x) = a * a * Q(x)\) for all \(a \in R, x \in M\).
there exists a companion bilinear form \(B : M \to M \to R\), such that \(Q(x + y) = Q(x) + Q(y) + B(x, y)\) for all \(x, y \in M\).
In [jadczyk2019notes], the bilinear form is denoted \(\Phi \), and called the polar form associated with the quadratic form \(Q\), or simply the polar form of \(Q\).
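In \(\textsf {Mathlib}\) this is QuadraticForm R M, and the companion bilinear form is recovered as the polar form (QuadraticMap.polar is the name as of current \(\textsf {Mathlib}\); older versions used QuadraticForm.polar). A minimal sketch:
import Mathlib

-- The polar form `B(x, y) = Q(x + y) - Q(x) - Q(y)` holds by definition.
example (R M : Type*) [CommRing R] [AddCommGroup M] [Module R M]
    (Q : QuadraticForm R M) (x y : M) :
    QuadraticMap.polar Q x y = Q (x + y) - Q x - Q y :=
  rfl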
Let \(M\) be a module over a commutative ring \(R\), equipped with a quadratic form \(Q: M \to R\).
Let \(\iota : M \to _{l[R]} T(M)\) be the canonical \(R\)-linear map for the tensor algebra \(T(M)\).
Let \(\mathit {1} : R \to _{+*} T(M)\) be the canonical map from \(R\) to \(T(M)\), as a ring homomorphism.
A Clifford algebra over \(Q\), denoted \(\mathcal {C}\kern -2pt\ell (Q)\), is
the ring quotient of the tensor algebra \(T(M)\)
by the equivalence relation satisfying \(\iota (m)^2 \sim \mathit {1}(Q(m))\) for all \(m \in M\).
The natural quotient map is denoted \(\pi : T(M) \to \mathcal {C}\kern -2pt\ell (Q)\) in some of the literature,
or \(\pi _\Phi \)/\(\pi _Q\) to emphasize the bilinear form \(\Phi \) or the quadratic form \(Q\), respectively.
In the literature, \(M\) is often written \(V\), and the quotient is taken by the two-sided ideal \(I_Q\) generated from the set
\(\{ v \otimes v - Q(v) \vert v \in V \}\). See also Clifford algebra [pr-spin].
As of writing, \(\textsf {Mathlib}\) does not have direct support for two-sided ideals,
but it does support the equivalent operation of taking the ring quotient by a suitable closure of
a relation like \(v \otimes v \sim Q(v)\).
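A minimal sketch of the defining relation as exposed by \(\textsf {Mathlib}\):
import Mathlib

-- In `CliffordAlgebra Q`, the square of an embedded vector is the scalar `Q m`.
example (R M : Type*) [CommRing R] [AddCommGroup M] [Module R M]
    (Q : QuadraticForm R M) (m : M) :
    CliffordAlgebra.ι Q m * CliffordAlgebra.ι Q m
      = algebraMap R (CliffordAlgebra Q) (Q m) :=
  CliffordAlgebra.ι_sq_scalar Q m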
This definition and what follows in \(\textsf {Mathlib}\) is initially presented in [wieser2022formalizing],
some further developments are based on [grinberg2016clifford], and in turn based on [bourbaki2007algebra]
which is in French and never translated to English.
We denote the canonical \(R\)-linear map to the Clifford algebra by \(\iota : M \to _{l[R]} \mathcal {C}\kern -2pt\ell (Q)\).
It's denoted \(i_\Phi \) or just \(i\) in some of the literature.
Given a linear map \(f : M \to _{l[R]} A\) into an \(R\)-algebra \(A\),
satisfying \(f(m)^2 = Q(m)\) for all \(m \in M\)
(such an \(f\) is called Clifford),
the canonical lift of \(f\) is defined to be an algebra homomorphism from \(\mathcal {C}\kern -2pt\ell (Q)\) to \(A\),
denoted \(\operatorname {lift} f : \mathcal {C}\kern -2pt\ell (Q) \to _{a} A\).
Given a Clifford map \(f : M \to _{l[R]} A\) and \(F = \operatorname {lift} f\) (denoted \(\bar {f}\) in some of the literature), we have:
\(F \circ \iota = f\), i.e. the following diagram commutes:
\(\operatorname {lift}\) is unique, i.e. given \(G : \mathcal {C}\kern -2pt\ell (Q) \to _{a} A\), we have:
\[ G \circ \iota = f \iff G = \operatorname {lift} f\]
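In \(\textsf {Mathlib}\), CliffordAlgebra.lift bundles this universal property as an equivalence between Clifford maps and algebra homomorphisms; a minimal sketch:
import Mathlib

-- A Clifford map `f` (i.e. `f m * f m = algebraMap R A (Q m)`) lifts to an
-- algebra homomorphism out of the Clifford algebra.
example (R M A : Type*) [CommRing R] [AddCommGroup M] [Module R M]
    [Ring A] [Algebra R A] (Q : QuadraticForm R M)
    (f : M →ₗ[R] A) (hf : ∀ m, f m * f m = algebraMap R A (Q m)) :
    CliffordAlgebra Q →ₐ[R] A :=
  CliffordAlgebra.lift Q ⟨f, hf⟩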
Same as the previous section, let \(M\) be a module over a commutative ring \(R\), equipped with a quadratic form \(Q: M \to R\).
We also use \(m\) or \(m_1, m_2, \dots \) to denote elements of \(M\), i.e. vectors, and \(x, y, z\) to denote elements of \(\mathcal {C}\kern -2pt\ell (Q)\).
Grade reversion, intuitively, is reversing the multiplication order of basis vectors.
Formally, it's an algebra homomorphism \(\tau : \mathcal {C}\kern -2pt\ell (Q) \to _{a} \mathcal {C}\kern -2pt\ell (Q)^{\mathtt {op}}\), satisfying:
\(\tau (m_1 m_2) = \tau (m_2) \tau (m_1)\)
\(\tau \circ \tau = \operatorname {id}\)
\(\tau (\iota (m)) = \iota (m)\)
That is, the following diagram commutes:
It's called the anti-involution \(\tau \) in [jadczyk2019notes], the canonical anti-automorphism in [gallier2008clifford],
and also the transpose/transposition in some of the literature, following the usage for tensor algebras or matrices.
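In \(\textsf {Mathlib}\), grade reversion is CliffordAlgebra.reverse (names as in the Conjugation file at the time of writing); a minimal sketch of property (3), fixing vectors:
import Mathlib

-- `reverse` fixes embedded vectors: `reverse (ι Q m) = ι Q m`.
example (R M : Type*) [CommRing R] [AddCommGroup M] [Module R M]
    (Q : QuadraticForm R M) (m : M) :
    CliffordAlgebra.reverse (CliffordAlgebra.ι Q m) = CliffordAlgebra.ι Q m := by
  simp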
Clifford conjugate is an algebra anti-automorphism \({*} : \mathcal {C}\kern -2pt\ell (Q) \to \mathcal {C}\kern -2pt\ell (Q)\) (equivalently, an algebra homomorphism into \(\mathcal {C}\kern -2pt\ell (Q)^{\mathtt {op}}\)),
denoted \(x^{*}\) (or even \(x^\dagger \), \(x^v\) in some of the literature),
defined to be:
\[ x^{*} = \operatorname {reverse}(\operatorname {involute}(x)) = \tau (\alpha (x)) \]
for all \(x \in \mathcal {C}\kern -2pt\ell (Q)\), satisfying
(as a \(*\)-ring):
\((x + y)^{*} = (x)^{*} + (y)^{*}\)
\((x y)^{*} = (y)^{*} (x)^{*}\)
\({*} \circ {*} = \operatorname {id}\)
\(1^{*} = 1\)
and (as a \(*\)-algebra):
\[ (r x)^{*} = r' x^{*} \]
for all \(r \in R\), \(x, y \in \mathcal {C}\kern -2pt\ell (Q)\) where \('\) is the involution of the commutative \(*\)-ring \(R\).
Note: In our current formalization in \(\textsf {Mathlib}\), the application of the involution on \(r\) is ignored,
as there appears to be nothing in the literature that advocates doing this.
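In \(\textsf {Mathlib}\), this is the star operation of CliffordAlgebra Q (a minimal sketch; the equation holds by definition in the formalization described in [wieser2022formalizing]):
import Mathlib

-- `star` on the Clifford algebra is Clifford conjugation:
-- reversion composed with involution.
example (R M : Type*) [CommRing R] [AddCommGroup M] [Module R M]
    (Q : QuadraticForm R M) (x : CliffordAlgebra Q) :
    star x = CliffordAlgebra.reverse (CliffordAlgebra.involute x) :=
  rfl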
This survey is built on my notes from working through Eric Wieser's MathOverflow question Definition of a spin group, in support of our PR to Lean 4's Mathlib about Spin groups.
Let \(M\) be a module over a commutative ring \(R\), equipped with a quadratic form \(Q: M \to R\).
A Clifford algebra over \(Q\) is
\[ \mathcal {C}\kern -2pt\ell (Q) \equiv T(M)/I_Q \]
where \(T(M)\) is the tensor algebra of \(M\), \(I_Q\) is the two-sided ideal generated from the set
\[ \{ m \otimes m - Q(m) \mid m \in M \}. \]
We denote the canonical linear map \(M \to \mathcal {C}\kern -2pt\ell (Q)\) as \(\iota _Q\).
The Lipschitz-Clifford group is defined as the subgroup closure of all the invertible elements of the form \(\iota _Q(m)\),
\[ \Gamma \equiv \left \{ x_1 \ldots x_k \in \mathcal {C}\kern -2pt\ell ^{\times }(Q) \mid x_i \in V \right \} \]
where
\[ \mathcal {C}\kern -2pt\ell ^{\times }(Q) \equiv \left \{ x \in \mathcal {C}\kern -2pt\ell (Q) \mid \exists x^{-1} \in \mathcal {C}\kern -2pt\ell (Q), x^{-1} x = x x^{-1}=1\right \} \]
is the group of units (i.e. invertible elements) of \(\mathcal {C}\kern -2pt\ell (Q)\),
\[ V \equiv \left \{ \iota _Q(m) \in \mathcal {C}\kern -2pt\ell (Q) \mid m \in M \right \} \]
is the subset \(V\) of \(\mathcal {C}\kern -2pt\ell (Q)\) consisting of elements of the form \(\iota _Q(m)\).
The Pin group is defined as
\[ \operatorname {Pin}(Q) \equiv \Gamma \sqcap \operatorname {U}(\mathcal {C}\kern -2pt\ell (Q)) \]
where \(\sqcap \) is the infimum (or greatest lower bound, or meet), and the infimum of two submonoids is just their intersection \(\cap \),
\[ \operatorname {U}(S) \equiv \left \{ x \in S \mid x^{*} * x = x * x^{*} = 1 \right \} \]
are the unitary elements of the Clifford algebra as a \(*\)-monoid, and we have defined the star operation of the Clifford algebra as the Clifford conjugate [wieser2022formalizing], denoted \(\bar {x}\).
This definition is equivalent to the following:
\[ \operatorname {Pin}(Q) \equiv \left \{ x \in \Gamma \mid \operatorname {N}(x) = 1 \right \} \]
where
\[ \operatorname {N}(x) \equiv x \bar {x}. \]
The Spin group is defined as
\[ \operatorname {Spin}(Q) \equiv \operatorname {Pin}(Q) \sqcap \mathcal {C}\kern -2pt\ell ^{+}(Q) \]
where \(\mathcal {C}\kern -2pt\ell ^{+}(Q)\) is the even subalgebra of the Clifford algebra.
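A Lean sketch of these definitions, assuming \(\textsf {Mathlib}\)'s Clifford algebra API (the primed names lipschitzGroup' and pinGroup' are hypothetical; \(\textsf {Mathlib}\)'s own versions live in Mathlib.LinearAlgebra.CliffordAlgebra.SpinGroup and may differ in detail):
import Mathlib

open CliffordAlgebra

variable {R M : Type*} [CommRing R] [AddCommGroup M] [Module R M]

/-- Sketch: the Lipschitz group as the closure, inside the group of units,
of the units whose value is `ι Q m` for some `m : M`. -/
def lipschitzGroup' (Q : QuadraticForm R M) : Subgroup (CliffordAlgebra Q)ˣ :=
  Subgroup.closure {u | ∃ m : M, (u : CliffordAlgebra Q) = ι Q m}

/-- Sketch: the Pin group as the image of the Lipschitz group in the algebra,
meet the unitary elements (`star` is Clifford conjugation); the Spin group
would further meet the even subalgebra. -/
def pinGroup' (Q : QuadraticForm R M) : Submonoid (CliffordAlgebra Q) :=
  (lipschitzGroup' Q).toSubmonoid.map (Units.coeHom _) ⊓ unitary _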
Definitions coming from different sources are simply quoted here with minimal modifications, to include immediate prerequisites, and omit some discussions or theorems.
They are not classified, ordered, or pruned by similarity.
Let \(V\) be a vector space over the commutative field \(k\) and suppose \(q\) is a quadratic form on \(V\).
We now consider the multiplicative group of units in the Clifford algebra \(C \ell (V, q)\) associated to \(V\), which is defined to be the subset
\[ C \ell ^{\times }(V, q) \equiv \left \{\varphi \in C \ell (V, q): \exists \varphi ^{-1} \text { with } \varphi ^{-1} \varphi =\varphi \varphi ^{-1}=1\right \} \]
This group contains all elements \(v \in V\) with \(q(v) \neq 0\).
The group of units always acts naturally as automorphisms of the algebra. That is, there is a homomorphism
\[ \mathrm {Ad}: \mathrm {C} \ell ^{\times }(V, q) \longrightarrow \operatorname {Aut}(\mathrm {C} \ell (V, q)) \]
called the adjoint representation, which is given by
\[ \operatorname {Ad}_{\varphi }(x) \equiv \varphi x \varphi ^{-1} \]
The Pin group of \((V, q)\) is the subgroup \(\operatorname {Pin}(V, q)\) of \(\mathrm {P}(V, q)\) generated by the elements \(v \in V\) with \(q(v)= \pm 1\).
The associated Spin group of \((V, q)\) is defined by
\[ \operatorname {Spin}(V, q)=\operatorname {Pin}(V, q) \cap C \ell ^0(V, q) . \]
The pin group \(\operatorname {Pin}_V(K)\) is the subgroup of the Lipschitz group \(\Gamma \) of elements of spinor norm 1, and similarly the spin group \(\operatorname {Spin}_V(K)\) is the subgroup of elements of Dickson invariant 0 in \(\operatorname {Pin}_V(K)\).
A versor refers to a Clifford monomial composed of invertible vectors. It is called a rotor, or spinor, if the number of vectors is even. It is called a unit versor if its magnitude is 1.
All versors in \(\mathcal {C L}\left (\mathcal {V}^n\right )\) form a group under the geometric product, called the versor group, also known as the Clifford group, or Lipschitz group. All rotors form a subgroup, called the rotor group. All unit versors form a subgroup, called the pin group, and all unit rotors form a subgroup, called the spin group, denoted by \(\operatorname {Spin}\left (\mathcal {V}^n\right )\).
The Clifford group \(\Gamma _{\mathrm {p}, \mathrm {q}}\) of a Clifford algebra \(\mathcal {C}_{p, q}\) is defined as
\[ \Gamma _{\mathrm {p}, \mathrm {q}}:=\left \{s \in \mathcal {C}_{p, q} \mid \forall x \in \mathbb {R}_{p, q}: s x \hat {s}^{-1} \in \mathbb {R}_{p, q}\right \} . \]
From that definition we get immediately
\[ \Gamma _{\mathrm {p}, \mathrm {q}} \times \mathbb {R}_{p, q} \rightarrow \mathbb {R}_{p, q} ; \quad (s, x) \mapsto s x \hat {s}^{-1} \]
as the operation of the Clifford group \(\Gamma _{\mathrm {p}, \mathrm {q}}\) on \(\mathbb {R}_{p, q}\).
\(\Gamma _{\mathrm {p}, \mathrm {q}}\) is a multiple cover of the orthogonal group \(O(p, q)\). However, it is still unnecessarily large. Therefore, we first reduce \(\Gamma _{\mathrm {p}, \mathrm {q}}\) to a two-fold cover of \(O(p, q)\) by defining the so-called Pin group
\[ \operatorname {Pin}(\mathrm {p}, \mathrm {q}):=\left \{s \in \Gamma _{\mathrm {p}, \mathrm {q}} \mid s \tilde {s}= \pm 1\right \} . \]
The even elements of \(\operatorname {Pin}(p, q)\) form the spin group
\[ \operatorname {Spin}(\mathrm {p}, \mathrm {q}):=\operatorname {Pin}(\mathrm {p}, \mathrm {q}) \cap \mathcal {C}_{p, q}^{+} \]
which is a double cover of the special orthogonal group \(S O(p, q)\). Finally, those elements of \(\operatorname {Spin}(\mathrm {p}, \mathrm {q})\) with Clifford norm equal 1 form a further subgroup
\[ \operatorname {Spin}_{+}(\mathrm {p}, \mathrm {q}):=\{s \in \operatorname {Spin}(\mathrm {p}, \mathrm {q}) \mid s \tilde {s}=1\} \]
that covers \(\mathrm {SO}_{+}(p, q)\) twice. Thereby, \(\mathrm {SO}_{+}(p, q)\) is the connected component of the identity of \(O(p, q)\).
A versor is a multivector that can be expressed as the geometric product of a number of non-null 1-vectors. That is, a versor \(\boldsymbol {V}\) can be written as \(\boldsymbol {V}=\prod _{i=1}^k \boldsymbol {n}_i\), where \(\left \{\boldsymbol {n}_1, \ldots , \boldsymbol {n}_k\right \} \subset \mathbb {G}_{p, q}^{\varnothing 1}\) with \(k \in \mathbb {N}^{+}\), is a set of not necessarily linearly independent vectors.
The subset of versors of \(\mathbb {G}_{p, q}\) together with the geometric product, forms a group, the Clifford group, denoted by \(\mathfrak {G}_{p, q}\).
A versor \(\boldsymbol {V} \in \mathfrak {G}_{p, q}\) is called unitary if \(\boldsymbol {V}^{-1}=\tilde {\boldsymbol {V}}\), i.e. \(\boldsymbol {V} \widetilde {\boldsymbol {V}}=+1\).
The set of unitary versors of \(\mathfrak {G}_{p, q}\) forms a subgroup \(\mathfrak {P}_{p, q}\) of the Clifford group \(\mathfrak {G}_{p, q}\), called the pin group.
A versor \(\boldsymbol {V} \in \mathfrak {G}_{p, q}\) is called a spinor if it is unitary \((\boldsymbol {V} \tilde {\boldsymbol {V}}=1)\) and can be expressed as the geometric product of an even number of 1-vectors. This implies that a spinor is a linear combination of blades of even grade.
The set of spinors of \(\mathfrak {G}_{p, q}\) forms a subgroup of the pin group \(\mathfrak {P}_{p, q}\), called the spin group, which is denoted by \(\mathfrak {S}_{p, q}\).
We define the Clifford group \(\Gamma =\Gamma (q)\) to be the group of all invertible elements \(u \in \mathrm {Cl}(q)\) which have the property that \(uyu^{-1}\) is in \(M\) whenever \(y\) is in \(M\). We define \(\Gamma (q)^{ \pm }\) as the intersection of \(\Gamma (q)\) and \(\mathrm {Cl}(q)_{ \pm }\).
For every element \(u \in \Gamma (q)\) we define the spinor norm \(N(u)\) by the formula
\[ N(u)=\tau (u) u, \]
where \(\tau \) is the main involution of the Clifford algebra \(\mathrm {Cl}(q)\).
The following groups are called spin groups:
\[ \begin {aligned} & \operatorname {Pin}(q):=\left \{s \in \Gamma (q)^{+} \cup \Gamma (q)^{-}: N(s)= \pm 1\right \} \\ & \operatorname {Spin}(q):=\left \{s \in \Gamma (q)^{+}: N(s)= \pm 1\right \} \\ & \operatorname {Spin}^{+}(q):=\left \{s \in \Gamma (q)^{+}: N(s)=+1\right \} . \end {aligned} \]
Suppose that \((E, q)\) is a regular quadratic space. We consider the action of \(\mathcal {G}(E, q)\) on \(\mathcal {A}(E, q)\) by adjoint conjugation. We set
\[ \mathrm {Ad}'_g(a)=g a g^{-1}, \]
for \(g \in \mathcal {G}(E, q)\) and \(a \in \mathcal {A}(E, q)\).
We restrict attention to those elements of \(\mathcal {G}(E, q)\) which stabilize \(E\). The Clifford group \(\Gamma =\Gamma (E, q)\) is defined as
\[ \left \{g \in \mathcal {G}(E, q): \mathrm {Ad}'_g(x) \in E \text { for } x \in E\right \} . \]
If \(g \in \Gamma (E, q)\), we set \(\alpha (g)(x)=\mathrm {Ad}'_g(x)\). Then \(\alpha (g) \in GL(E)\), and \(\alpha \) is a homomorphism of \(\Gamma \) into \(GL(E)\). \(\alpha \) is called the graded vector representation of \(\Gamma \).
It is customary to scale the elements of \(\Gamma (E, q)\); we set
\[ \begin {aligned} \operatorname {Pin}(E, q) & =\{g \in \Gamma (E, q): \Delta (g)= \pm 1\}, \\ \Gamma _1(E, q) & =\{g \in \Gamma (E, q): \Delta (g)=1\} . \end {aligned} \]
If \((E, q)\) is a Euclidean space, then \(\operatorname {Pin}(E, q)=\Gamma _1(E, q)\); otherwise, \(\Gamma _1(E, q)\) is a subgroup of \(\operatorname {Pin}(E, q)\) of index 2.
We have a short exact sequence
\[ 1 \longrightarrow D_2 \xrightarrow {\subseteq } \operatorname {Pin}(E, q) \xrightarrow {\alpha } O(E, q) \longrightarrow 1 ; \]
\(\operatorname {Pin}(E, q)\) is a double cover of \(O(E, q)\).
In fact there is more interest in the subgroup \(\operatorname {Spin}(E, q)\) of \(\operatorname {Pin}(E, q)\) consisting of products of an even number of unit vectors in \(E\). Thus \(\operatorname {Spin}(E, q)=\operatorname {Pin}(E, q) \cap \mathcal {A}^{+}(E, q)\) and
\[ \operatorname {Spin}(E, q)=\left \{g \in \mathcal {A}^{+}(E, q): g E=E g \text { and } \Delta (g)= \pm 1\right \} . \]
If \(x, y\) are unit vectors in \(E\) then \(\alpha (x y)=\alpha (x) \alpha (y) \in S O(E, q)\), so that \(\alpha (\operatorname {Spin}(E, q)) \subseteq S O(E, q)\). Conversely, every element of \(S O(E, q)\) is the product of an even number of simple reflections, and so \(S O(E, q) \subseteq \alpha \left (\operatorname {Spin}(E, q)\right )\). Thus \(\alpha \left (\operatorname {Spin}(E, q)\right )=S O(E, q)\), and we have a short exact sequence.
\[ 1 \longrightarrow D_2 \xrightarrow {\subseteq } \operatorname {Spin}(E, q) \xrightarrow {\alpha } S O(E, q) \longrightarrow 1 ; \]
\(\operatorname {Spin}(E, q)\) is a double cover of \(S O(E, q)\).
Note also that if \(a \in \operatorname {Spin}(E, q)\) and \(x \in E\) then \(\alpha (a)(x)=a x a^{-1}\); conjugation and adjoint conjugation by elements of \(\operatorname {Spin}(E, q)\) are the same.
Recall that \(\Pi : \mathrm {Cl}(V) \rightarrow \mathrm {Cl}(V), x \mapsto (-1)^{|x|} x\) denotes the parity automorphism of the Clifford algebra. Let \(\mathrm {Cl}(V)^{\times }\) be the group of invertible elements in \(\mathrm {Cl}(V)\).
The Clifford group \(\Gamma (V)\) is the subgroup of \(\mathrm {Cl}(V)^{\times }\), consisting of all \(x \in \mathrm {Cl}(V)^{\times }\) such that \(A_x(v):=\Pi (x) v x^{-1} \in V\) for all \(v \in V \subset \mathrm {Cl}(V)\).
Hence, by definition the Clifford group comes with a natural representation, \(\Gamma (V) \rightarrow \mathrm {GL}(V), x \mapsto A_x\). Let \(S \Gamma (V)=\Gamma (V) \cap \mathrm {Cl}^{\overline {0}}(V)^{\times }\) denote the special Clifford group.
The canonical representation of the Clifford group takes values in \(\mathrm {O}(V)\), and defines an exact sequence,
\[ 1 \longrightarrow \mathbb {K}^{\times } \longrightarrow \Gamma (V) \longrightarrow \mathrm {O}(V) \longrightarrow 1 . \]
It restricts to a similar exact sequence for the special Clifford group,
\[ 1 \longrightarrow \mathbb {K}^{\times } \longrightarrow S \Gamma (V) \longrightarrow \mathrm {SO}(V) \longrightarrow 1 . \]
The elements of \(\Gamma (V)\) are all products \(x=v_1 \cdots v_k\) where \(v_1, \ldots , v_k \in V\) are non-isotropic. \(S \Gamma (V)\) consists of similar products, with \(k\) even. The corresponding element \(A_x\) is a product of reflections:
\[ A_{v_1 \cdots v_k}=R_{v_1} \cdots R_{v_k} . \]
Suppose \(\mathbb {K}=\mathbb {R}\). The Pin group \(\operatorname {Pin}(V)\) is the preimage of \(\{1,-1\}\) under the norm homomorphism \(N: \Gamma (V) \rightarrow \mathbb {K}^{\times }\). Its intersection with \(S \Gamma (V)\) is called the Spin group, and is denoted \(\operatorname {Spin}(V)\).
Since \(N(\lambda )=\lambda ^2\) for \(\lambda \in \mathbb {K}^{\times }\), the only scalars in \(\operatorname {Pin}(V)\) are \(\pm 1\). Hence, the exact sequence for the Clifford group restricts to an exact sequence,
\[ 1 \longrightarrow \mathbb {Z}_2 \longrightarrow \operatorname {Pin}(V) \longrightarrow \mathrm {O}(V) \longrightarrow 1, \]
so that \(\operatorname {Pin}(V)\) is a double cover of \(\mathrm {O}(V)\). Similarly,
\[ 1 \longrightarrow \mathbb {Z}_2 \longrightarrow \operatorname {Spin}(V) \longrightarrow \mathrm {SO}(V) \longrightarrow 1, \]
defines a double cover of \(\mathrm {SO}(V)\). Elements in \(\operatorname {Pin}(V)\) are products \(x=\) \(v_1 \cdots v_k\) with \(B\left (v_i, v_i\right )= \pm 1\). The group \(\operatorname {Spin}(V)\) consists of similar products, with \(k\) even.
The "group of units" in \(C l_{p, q}\), denoted \(C l_{p, q}^{\times }\), is the group of all invertible elements.
An important subgroup of \(C l^{\times }(V, q)\) is the group \(P(V, q)\) generated by elements \(v \in V\) with \(q(v) \neq 0\). Quotienting out by constants, we obtain the Pin group. Specifically, \(\operatorname {Pin}(V, q)\) (or \(\operatorname {Pin}_{p, q}\) ) is the group generated by elements \(v \in V\) with \(q(v)= \pm 1\). Further define the spin groups to be
\[ \operatorname {Spin}(V, q)=\operatorname {Pin}(V, q) \cap C l^0(V, q) . \]
There are several equivalent possible ways to go about defining the \(\operatorname {Spin}(n)\) groups as groups of invertible elements in the Clifford algebra.
1. One can define \(\operatorname {Spin}(n)\) in terms of invertible elements \(\tilde {g}\) of \(C_{\text {even }}(n)\) that leave the space \(V=\mathbf {R}^n\) invariant under conjugation
\[ \tilde {g} V \tilde {g}^{-1} \subset V \]
2. One can show that, for \(v, w \in V\),
\[ v \mapsto v-2 \frac {Q(v, w)}{Q(w, w)} w=-w v w / Q(w, w)=-w v w^{-1} \]
is reflection in the hyperplane perpendicular to \(w\). Then \(\operatorname {Pin}(n)\) is defined as the group generated by such reflections with \(\|w\|^2=1\). \(\operatorname {Spin}(n)\) is the subgroup of \(\operatorname {Pin}(n)\) of even elements. Any rotation can be implemented as an even number of reflections (the Cartan–Dieudonné theorem).
3. One can define the Lie algebra of \(\operatorname {Spin}(n)\) in terms of quadratic elements of the Clifford algebra.
The space of quadratic vectors in \(\mathrm {Cl}\) is the Lie algebra of \(\mathrm {SO}(n)\). The corresponding Lie group, called the Spin group \(\operatorname {Spin}(Q)\), is the set of invertible elements \(x \in \mathrm {Cl}\) that preserve \(V\) under \(v \mapsto x v x^{-1}\). Clearly this map is in \(\mathrm {SO}(V, Q)\) since it preserves the quadratic form \(Q\), and is a two-fold cover with kernel \(\pm 1\).
The Pin group \(\operatorname {Pin}(V)\) of \((V, Q)\) is the subgroup of (the group of units of) \(C \ell (V)\) generated by \(v \in \mathrm {V}\) with \(\mathrm {Q}(v)= \pm 1\). In other words, every element of \(\operatorname {Pin}(\mathrm {V})\) is of the form \(u_1 \cdots u_r\) where \(u_i \in \mathrm {V}\) and \(\mathrm {Q}\left (u_i\right )= \pm 1\). We will write \(\operatorname {Pin}(s, t)\) for \(\operatorname {Pin}\left (\mathbb {R}^{s, t}\right )\) and \(\operatorname {Pin}(n)\) for \(\operatorname {Pin}(n, 0)\).
The spin group of \((V, Q)\) is the intersection
\[ \operatorname {Spin}(V)=\operatorname {Pin}(V) \cap C \ell (V)_0 . \]
Equivalently, it consists of elements \(u_1 \cdots u_{2 p}\), where \(u_i \in \mathrm {V}\) and \(\mathrm {Q}\left (u_i\right )= \pm 1\). We will write \(\operatorname {Spin}(s, t)\) for \(\operatorname {Spin}\left (\mathbb {R}^{s, t}\right )\) and \(\operatorname {Spin}(n)\) for \(\operatorname {Spin}(n, 0)\).
We identify the following groups embedded in \(\mathcal {G}\) :
\[ \begin {aligned} \mathcal {G}^{\times } &:=\{x \in \mathcal {G}: \exists y \in \mathcal {G}: x y=y x=1\} & \text { the group of all invertible elements } \\ \tilde {\Gamma } &:=\left \{x \in \mathcal {G}^{\times }: x^{\star } V x^{-1} \subseteq V\right \} & \text { the Lipschitz group } \\ \Gamma &:=\left \{v_1 v_2 \ldots v_k \in \mathcal {G}: v_i \in V^{\times }\right \} & \text { the versor group } \\ \text { Pin } &:=\left \{x \in \Gamma : x x^{\dagger } \in \{-1,1\}\right \} & \text { the group of unit versors } \\ \text { Spin } &:=\operatorname {Pin} \cap \mathcal {G}^{+} & \text { the group of even unit versors } \\ \operatorname {Spin}^{+} &:=\left \{x \in \operatorname {Spin}: x x^{\dagger }=1\right \} & \text { the rotor group } \\ \end {aligned} \]
In the above, \(V^{\times }:=\left \{v \in V: v^2 \neq 0\right \}\) denotes the set of all invertible vectors.
The Clifford group \(\Gamma (p, q)\) is the (multiplicative) group generated by invertible 1-vectors in \(R^{p, q}\).
The Pin group \(\operatorname {Pin}(p, q)\).
\[ \operatorname {Pin}(p, q)=\{g \in \Gamma (p, q): g \widetilde {g}= \pm 1\} . \]
So if \(g \in \operatorname {Pin}(p, q), \widetilde {g}\) is a scalar multiple of \(g^{-1}\). This is not true for arbitrary elements of \(\Gamma (p, q)\).
The Spin group \(\operatorname {Spin}(p, q)\). This is the subgroup of \(\operatorname {Pin}(p, q)\) consisting of even elements only, i.e.
\[ \operatorname {Spin}(p, q)=\operatorname {Pin}(p, q) \cap C l^{+}(p, q) . \]
The \(\operatorname {Spin}\) group \(\operatorname {Spin}(p, q)\) has the further subgroup
\[ \operatorname {Spin}^{\dagger }(p, q)=\{g \in \operatorname {Spin}(p, q): g \widetilde {g}=+1\} . \]
\(\operatorname {Pin}(p, q)\), \(\operatorname {Spin}(p, q)\) and \(\operatorname {Spin}^{\dagger }(p, q)\) are respectively the two-fold covering groups of \(O(p, q), S O(p, q)\) and \(S O^{\dagger }(p, q)\) (where \(S O^{\dagger }(p, q)\) is the connected component of \(S O(p, q)\) ).
The Spin group \(\operatorname {Spin}(F, \rho )\) of \(C l(F, \rho )\) is the subset of \(C l(F, \rho )\) whose elements can be written as the product \(g=u_1 \cdot \ldots \cdot u_{2 p}\) of an even number of vectors of \(F\) of norm \(\left \langle u_k, u_k\right \rangle =1\).
As a consequence: \(\langle g, g\rangle =1\), \(g^t \cdot g=1\) and \(\operatorname {Spin}(F, \rho ) \subset O(C l)\).
The scalars \(\pm 1\) belong to the Spin group. The identity is \(+1\). \(\operatorname {Spin}(F, \rho )\) is a connected Lie group.
A versor refers to a Clifford monomial (product expression) composed of invertible vectors. It is called a rotor, or spinor, if the number of vectors is even. It is called a unit versor if its magnitude is 1.
Every versor \(A=a_1 \ldots a_r, \quad a_1, \ldots , a_r \in \mathbb {R}^2, r \in \mathbb {N}\) has an inverse
\[ A^{-1}=a_r^{-1} \ldots a_1^{-1}=a_r \ldots a_1 /\left (a_1^2 \ldots a_r^2\right ), \]
such that
\[ A A^{-1}=A^{-1} A=1 . \]
This makes the set of all versors in \(C l(2,0)\) a group, the so called Lipschitz group with symbol \(\Gamma (2,0)\), also called Clifford group or versor group. Versor transformations apply via outermorphisms to all elements of a Clifford algebra. It is the group of all reflections and rotations of \(\mathbb {R}^2\).
The normalized subgroup of versors is called pin group
\[ \operatorname {Pin}(2,0)=\{A \in \Gamma (2,0) \mid A \widetilde {A}= \pm 1\} . \]
In the case of \(C l(2,0)\) we have
\[ \begin {aligned} & \operatorname {Pin}(2,0) \\ & =\left \{a \in \mathbb {R}^2 \mid a^2=1\right \} \cup \left \{A \mid A=\cos \varphi +e_{12} \sin \varphi , \varphi \in \mathbb {R}\right \} . \end {aligned} \]
The pin group has an even subgroup, called spin group
\[ \operatorname {Spin}(2,0)=\operatorname {Pin}(2,0) \cap C l^{+}(2,0) . \]
The spin group has in general a spin plus subgroup
\[\operatorname {Spin}_{+}(2,0)=\{A \in \operatorname {Spin}(2,0) \mid A \widetilde {A}=+1\}.\]
We continue to let \(F\) be a field of characteristic not 2 and \(M\) a quadratic space over \(F\).
Recall that \(\gamma : M \rightarrow C(M)\) is injective and that there is a unique involution - on \(C(M)\) taking \(\gamma x\) to \(\gamma x\) for all \(x\). Consider \(M\) to be a subset of \(C(M)\) via \(\gamma \), and define the group \(\operatorname {Spin}(M)\) by
\[ \operatorname {Spin}(M)=\left \{c \in C_0(M)^{\times } \mid c M c^{-1}=M, c \bar {c}=1_C\right \}, \]
where \(C_0(M)^{\times }\) is the group of invertible elements of the ring \(C_0(M)\). The isometries from \(M\) onto \(M\) constitute the orthogonal group \(O(M)\) and \(S O(M)\) is the subgroup of elements of determinant 1. For \(c\) in \(\operatorname {Spin}(M)\), define
\[ \pi c: M \rightarrow M \]
by \(\pi c(x)=c x c^{-1}\). This provides a homomorphism
\[ \pi : \operatorname {Spin}(M) \rightarrow S O(M) . \]
By a theorem of Cartan and Dieudonné, any element \(\sigma \) in \(O(M)\) is a product \(\sigma =\tau _{y_1} \cdots \tau _{y_k}\) of hyperplane reflections \(\tau _{y_i}\). The assignment \(\Theta (\sigma )=\) \(q\left (y_1\right ) \cdots q\left (y_k\right )\left (F^{\times }\right )^2\) defines the spinor norm homomorphism
\[ \Theta : S O(M) \rightarrow F^{\times } /\left (F^{\times }\right )^2 . \]
Let \(g\) be an invertible element of a universal Clifford algebra \(A\) such that, for each \(x \in X, g x \widehat {g}^{-1} \in X\). Then the map
\[ \rho _{X, g}: x \mapsto g x \widehat {g}^{-1} \]
is an orthogonal automorphism of \(X\).
The element \(g\) will be said to induce or represent the orthogonal transformation \(\rho _{X, g}\) and the set of all such elements \(g\) will be denoted by \(\Gamma (X)\) or simply by \(\Gamma \).
The subset \(\Gamma \) is a subgroup of \(A\).
The group \(\Gamma \) is called the Clifford group (or Lipschitz group) for \(X\) in the Clifford algebra \(A\). Since the universal algebra \(A\) is uniquely defined up to isomorphism, \(\Gamma \) is also uniquely defined up to isomorphism.
An element \(g\) of \(\Gamma (X)\) represents a rotation of \(X\) if and only if \(g\) is the product of an even number of elements of \(X\). The set of such elements will be denoted by \(\Gamma ^0=\Gamma ^0(X)\). An element \(g\) of \(\Gamma \) represents an anti-rotation of \(X\) if and only if \(g\) is the product of an odd number of elements of \(X\). The set of such elements will be denoted by \(\Gamma ^1=\Gamma ^1(X)\). Clearly, \(\Gamma ^0=\Gamma \cap A^0\) is a subgroup of \(\Gamma \), while \(\Gamma ^1=\Gamma \cap A^1\).
The Clifford group \(\Gamma (X)\) of a quadratic space \(X\) is larger than is necessary if our interest is in representing orthogonal transformations of \(X\). Use of a quadratic norm \(N\) on the Clifford algebra \(A\) leads to the definition of subgroups of \(\Gamma \) that are less redundant for this purpose. This quadratic norm \(N: A \rightarrow A\) is defined by the formula
\[ N(a)=a^{-} a, \text { for any } a \in A, \]
For \(X\) and \(\Gamma =\Gamma (X)\) as above we now define
\[ \operatorname {Pin} X=\{g \in \Gamma :|N(g)|=1\} \text { and } \operatorname {Spin} X=\left \{g \in \Gamma ^0:|N(g)|=1\right \} \text {. } \]
Let \(V\) be an inner product space. We denote by \(\triangle V\) the standard Clifford algebra \((\wedge V,+, \triangle )\) defined by the Clifford product \(\triangle \) on the space of multivectors in \(V\).
Let \(V\) be an inner product space. The Clifford cone of \(V\) is the multiplicative group \(\widehat {\triangle } V \subset \triangle V\) generated by nonsingular vectors, that is, vectors \(v\) such that \(\langle v\rangle ^2 \neq 0\). More precisely, \(q \in \widehat {\triangle } V\) if there are finitely many nonsingular vectors \(v_1, \ldots , v_k \in V\) such that
\[ q=v_1 \triangle \cdots \triangle v_k . \]
Let \(w \in \triangle V\). Then \(w \in \widehat {\triangle } V\) if and only if \(w\) is invertible and
\(\widehat {w} v w^{-1} \in V\) for all \(v \in V\).
In this case \(w\) can be written as a product of at most \(\operatorname {dim} V\) nonsingular vectors, and \(\bar {w} w=w \bar {w} \in \mathbf {R} \backslash \{0\}\).
Let \(V\) be an inner product space. Define the orthogonal, special orthogonal, pin, and spin groups
\[ \begin {aligned} \mathrm {O}(V) & :=\{\text { isometries } T: V \rightarrow V\} \subset \mathcal {L}(V), \\ \mathrm {SO}(V) & :=\{T \in \mathrm {O}(V) ; \operatorname {det} T=+1\} \subset \mathcal {L}(V), \\ \operatorname {Pin}(V) & :=\left \{q \in \widehat {\triangle } V ;\langle q\rangle ^2= \pm 1\right \} \subset \triangle V, \\ \operatorname {Spin}(V) & :=\left \{q \in \operatorname {Pin}(V) ; q \in \triangle ^{\mathrm {ev}} V\right \} \subset \triangle ^{\mathrm {ev}} V . \end {aligned} \]
We call \(T \in \operatorname {SO}(V)\) a rotation and we call \(q \in \operatorname {Spin}(V)\) a rotor.
Motivation E. 39 (The problem of generalizing the definition of the Spin group). For a positive definite quadratic form \(\mathfrak {q}\) on the real vector space \(V=\mathbb {R}^n\) with \(n \geq 3\) the \(\operatorname {Spin}\) group \(\operatorname {Spin}(n)\) is defined via the kernel of the Spinor norm (=extended quadratic form on \(\mathrm {Cl}(V, \mathfrak {q})\) ) restricted to the special Clifford group \(\Gamma ^{[0]}(V, \mathfrak {q})\) :
\[ \operatorname {Spin}(n):=\operatorname {ker}\left (\overline {\mathfrak {q}}: \Gamma ^{[0]}(V, \mathfrak {q}) \rightarrow \mathbb {R}^{\times }\right )=\left \{w \in \Gamma ^{[0]}(V, \mathfrak {q}) \mid \overline {\mathfrak {q}}(w)=1\right \}=\left .\overline {\mathfrak {q}}\right |_{\Gamma ^{[0]}(V, \mathfrak {q})} ^{-1}(1) . \]
\(\operatorname {Spin}(n)\) is thus a normal subgroup of the special Clifford group \(\Gamma ^{[0]}(V, \mathfrak {q})\), and, as it turns out, a double cover of the special orthogonal group \(\mathrm {SO}(n)\) via the twisted conjugation \(\rho \). The latter can be summarized by the short exact sequence:
\[ 1 \longrightarrow \{ \pm 1\} \xrightarrow {\text { incl }} \operatorname {Spin}(n) \xrightarrow {\rho } \mathrm {SO}(n) \longrightarrow 1 . \]
We intend to generalize this in several directions: 1. from Spin to Pin group, 2. from \(\mathbb {R}^n\) to vector spaces \(V\) over general fields \(\mathbb {F}\) with \(\operatorname {ch}(\mathbb {F}) \neq 2\), 3. from non-degenerate to degenerate quadratic forms \(\mathfrak {q}\), 4. from positive (semi-)definite to non-definite quadratic forms \(\mathfrak {q}\). This comes with several challenges and ambiguities.
Definition E.40 (The real Pin group and the real Spin group). Let \(V\) be a finite dimensional \(\mathbb {R}\)-vector space, \(\operatorname {dim} V=n<\infty \), and \(\mathfrak {q}\) a (possibly degenerate) quadratic form on \(V\). We define the (real) Pin group and (real) Spin group, resp., of \((V, \mathfrak {q})\) as the following subquotients of the Clifford group \(\Gamma (V, \mathfrak {q})\) and its even parity part \(\Gamma ^{[0]}(V, \mathfrak {q})\), resp.:
\[ \begin {aligned} \operatorname {Pin}(V, \mathfrak {q}) & :=\{x \in \Gamma (V, \mathfrak {q}) \mid \overline {\mathfrak {q}}(x) \in \{ \pm 1\}\} / \bigwedge ^{[*]}(\mathcal {R}) \\ \operatorname {Spin}(V, \mathfrak {q}) & :=\left \{x \in \Gamma ^{[0]}(V, \mathfrak {q}) \mid \overline {\mathfrak {q}}(x) \in \{ \pm 1\}\right \} / \bigwedge ^{[*]}(\mathcal {R}) \end {aligned} \]
Corollary E.41. Let \((V, \mathfrak {q})\) be a finite dimensional quadratic vector space over \(\mathbb {R}\). Then the twisted conjugation induces a well-defined and surjective group homomorphism onto the group of radical preserving orthogonal automorphisms of \((V, \mathfrak {q})\) :
\[ \rho : \operatorname {Pin}(V, \mathfrak {q}) \rightarrow \mathrm {O}_{\mathcal {R}}(V, \mathfrak {q}), \]
with kernel:
\[ \operatorname {ker}\left (\rho : \operatorname {Pin}(V, \mathfrak {q}) \rightarrow \mathrm {O}_{\mathcal {R}}(V, \mathfrak {q})\right )=\{ \pm 1\} . \]
Correspondingly, for the \(\operatorname {Spin}(V, \mathfrak {q})\) group. In short, we have short exact sequences:
\[ \begin {aligned} & 1 \longrightarrow \{ \pm 1\} \xrightarrow {\text { incl }} \operatorname {Pin}(V, \mathfrak {q}) \xrightarrow {\rho } \mathrm {O}_{\mathcal {R}}(V, \mathfrak {q}) \longrightarrow 1, \\ & 1 \longrightarrow \{ \pm 1\} \xrightarrow {\text { incl }} \operatorname {Spin}(V, \mathfrak {q}) \xrightarrow {\rho } \operatorname {SO}_{\mathcal {R}}(V, \mathfrak {q}) \longrightarrow 1 . \end {aligned} \]
Every Clifford algebra \(\mathrm {Cl}(\Phi )\) possesses a canonical anti-automorphism \(t: \mathrm {Cl}(\Phi ) \rightarrow \mathrm {Cl}(\Phi )\) satisfying the properties
\[ t(x y)=t(y) t(x), \quad t \circ t=\mathrm {id}, \quad \text { and } \quad t(i(v))=i(v), \]
for all \(x, y \in \mathrm {Cl}(\Phi )\) and all \(v \in V\). Furthermore, such an anti-automorphism is unique.
Every Clifford algebra \(\mathrm {Cl}(\Phi )\) has a unique canonical automorphism \(\alpha : \mathrm {Cl}(\Phi ) \rightarrow \mathrm {Cl}(\Phi )\) satisfying the properties
\[ \alpha \circ \alpha =\mathrm {id}, \quad \text { and } \quad \alpha (i(v))=-i(v), \]
for all \(v \in V\).
First, we define conjugation on a Clifford algebra \(\mathrm {Cl}(\Phi )\) as the map
\[ x \mapsto \bar {x}=t(\alpha (x)) \text { for all } x \in \mathrm {Cl}(\Phi ) . \]
Given a finite dimensional vector space \(V\) and a quadratic form \(\Phi \) on \(V\), the Clifford group of \(\Phi \) is the group
\[ \Gamma (\Phi )=\left \{x \in \mathrm {Cl}(\Phi )^* \mid \alpha (x) v x^{-1} \in V \quad \text { for all } v \in V\right \} . \]
The map \(N: \mathrm {Cl}(\Phi ) \rightarrow \mathrm {Cl}(\Phi )\) given by
\[ N(x)=x \bar {x} \]
is called the norm of \(\mathrm {Cl}(\Phi )\).
We also define the group \(\Gamma ^{+}(\Phi )\), called the special Clifford group, by
\[ \Gamma ^{+}(\Phi )=\Gamma (\Phi ) \cap \mathrm {Cl}^0(\Phi ) . \]
We define the pinor group \(\operatorname {Pin}(p, q)\) as the group
\[ \operatorname {Pin}(p, q)=\left \{x \in \Gamma _{p, q} \mid N(x)= \pm 1\right \}, \]
and the spinor group \(\operatorname {Spin}(p, q)\) as \(\operatorname {Pin}(p, q) \cap \Gamma _{p, q}^{+}\).
The restriction of \(\rho : \Gamma _{p, q} \rightarrow \mathbf {GL}(n)\) to the pinor group \(\operatorname {Pin}(p, q)\) is a surjective homomorphism \(\rho : \operatorname {Pin}(p, q) \rightarrow \mathbf {O}(p, q)\) whose kernel is \(\{-1,1\}\), and the restriction of \(\rho \) to the spinor group \(\operatorname {Spin}(p, q)\) is a surjective homomorphism \(\rho : \operatorname {Spin}(p, q) \rightarrow \mathbf {SO}(p, q)\) whose kernel is \(\{-1,1\}\).
Remark: According to Atiyah, Bott and Shapiro, the use of the name \(\operatorname {Pin}(k)\) is a joke due to Jean-Pierre Serre (Atiyah, Bott and Shapiro Clifford modules, page 1).
Instead of defining the spin group as the set of products of certain elements of \(V\), it will be convenient to start with a more abstract definition. Set
\[ \operatorname {Spin}(Q)=\left \{x \in C(Q)^{\text {even }}: x \cdot x^*=1 \text { and } x \cdot V \cdot x^* \subset V\right \} \text {. } \]
We see from this definition that \(\operatorname {Spin}(Q)\) forms a closed subgroup of the group of units in the (even) Clifford algebra. Any \(x\) in \(\operatorname {Spin}(Q)\) determines an endomorphism \(\rho (x)\) of \(V\) by
\[ \rho (x)(v)=x \cdot v \cdot x^*, \quad v \in V . \]
Define a larger subgroup, this time of the multiplicative group of \(C(Q)\), by
\[ \operatorname {Pin}(Q)=\left \{x \in C(Q): x \cdot x^*=1 \text { and } x \cdot V \cdot x^* \subset V\right \}, \]
and define a homomorphism
\[ \rho : \operatorname {Pin}(Q) \rightarrow \mathrm {O}(Q), \quad \rho (x)(v)=\alpha (x) \cdot v \cdot x^*, \]
where \(\alpha : C(Q) \rightarrow C(Q)\) is the main involution.
The pin group \(\operatorname {Pin}(V)\) is a subgroup of \(\mathrm {Cl}(V)\)'s Clifford group consisting of all elements of the form
\[ v_1 v_2 \cdots v_k \]
where each \(v_i \in V\) is of unit length: \(q\left (v_i\right )=1\).
The spin group is then defined as
\[ \operatorname {Spin}(V)=\operatorname {Pin}(V) \cap \mathrm {Cl}^{\text {even }}, \]
where \(\mathrm {Cl}^{\text {even }}=\mathrm {Cl}^0 \oplus \mathrm {Cl}^2 \oplus \mathrm {Cl}^4 \oplus \cdots \) is the subspace generated by elements that are the product of an even number of vectors. That is, \(\operatorname {Spin}(V)\) consists of all elements of \(\operatorname {Pin}(V)\), given above, with the restriction to \(k\) being an even number. The restriction to the even subspace is key to the formation of two-component (Weyl) spinors, constructed below.
The Pin group \(\operatorname {Pin}(V ; q)\) of a quadratic vector space, is the subgroup of the group of units in the Clifford algebra \(\mathrm {Cl}(V, q)\)
\[ \operatorname {Pin}(V, q) \hookrightarrow \operatorname {GL}_1(\mathrm {Cl}(V, q)) \]
on those elements which are multiples \(v_1 \cdots v_n\) of elements \(v_i \in V\) with \(q\left (v_i\right )=1\).
The Spin group \(\operatorname {Spin}(V, q)\) is the further subgroup of \(\operatorname {Pin}(V ; q)\) on those elements which are even number multiples \(v_1 \cdots v_{2 k}\) of elements \(v_i \in V\) with \(q\left (v_i\right )=1\).
The subset of \(\mathcal {C} \ell _{p, q, r}\) defined by
\[ S_{p, q, r}=\left \{s \gamma _1 \ldots \gamma _{p+q}(1+G) \mid s \in \operatorname {Spin}(p, q), \gamma _i=1+e_i \sum _{l=1}^r c_{i l} f_l, G \in \Lambda (f)\right \} \]
is a group under the Clifford multiplication where \(\left \langle \cdot , \cdot \right \rangle \) is a symmetric bilinear form on \(\mathbb {R}^{p+q+r}\), \(\left \{e_1, \ldots , e_p, e_{p+1}, \ldots , e_{p+q}, f_1, \ldots , f_r\right \}\) is the algebra basis for the degenerate Clifford algebra \(\mathcal {C} \ell _{p, q, r}=\mathcal {C} \ell \left (\mathbb {R}^n,\langle \rangle \right )\), \(1 \leq i \leq p+q, c_{i l} \in \mathbb {R}\), and \(\Lambda (f)\) is defined by
\[ \Lambda (f)=\operatorname {Span}\left \{f_{k_1} \ldots f_{k_j} \mid 1 \leq k_1<k_2<\ldots <k_j \leq r\right \} . \]
Translation of Bourbaki on Clifford algebras [uts-000C]
This is a manual translation of a portion of Chapitre 9 Formes sesquilinéaires et formes quadratiques of Bourbaki's Éléments de Mathématique[bourbaki2007algebra] from French to English.
Such a translation is not available elsewhere, to the best of our knowledge. We claim no rights to the original text, and all mistakes in the translation are our own. The translation is for educational purposes only.
So far, we have only translated the portion that is related to Clifford algebras, namely § 9. Clifford Algebra. The translation is still in progress.
In this paragraph, we will assume the ring \(A\) to be commutative. We will designate by \(Q\) a quadratic form over the \(A\)-module \(E\), and by \(\Phi \) the associated bilinear form (§ 3, \(n^{\circ }\) 4).
🇫🇷
Dans ce paragraphe, nous supposerons l'anneau \(A\) commutatif. Nous désignerons par Q une forme quadratique sur le \(A\)-module \(E\), et par \(\Phi \) la forme bilinéaire associée (§ 3, \(n^{\circ }\) 4).
1. Definition and universal property of Clifford Algebra 🇫🇷1. Définition et propriété universelle de l'algèbre de Clifford.[#314]
Definition 1. - We call the Clifford algebra of \(Q\), denoted \(C(Q)\), the quotient algebra of the tensor algebra \(T(E)\) of the module \(E\) by the two-sided ideal, denoted \(I(Q)\), generated by the elements of the form \(x \otimes x - Q(x) . 1 \quad (x \in E)\).
🇫🇷
Définition 1. - On appelle algèbre de Clifford de \(Q\) et on note \(C(Q)\) l'algèbre quotient de l'algèbre tensorielle \(T(E)\) du module \(E\) par l'idéal bilatère (noté \(I(Q)\)) engendré par les éléments de la forme \(x \otimes x - Q(x) . 1 \quad (x \in E)\).
We will denote by \(\rho _{\tiny Q}\) (or simply \(\rho \) when there is no risk of confusion) the mapping of \(E\) into \(C(Q)\) composed of the canonical mapping of \(E\) into \(T(E)\) and of the canonical mapping \(\sigma \) of \(T(E)\) onto \(C(Q)\); the mapping \(\rho _{\tiny Q}\) is said to be canonical.
🇫🇷
Nous noterons \(\rho _{e}\) (ou simplement \(\rho \) quand aucune confusion n'est à craindre) l'application de E dans \(C(Q)\) composée de l'application canonique de \(E\) dans \(T(E)\) et de l'application canonique \(\sigma \) de \(T(E)\) sur \(C(Q)\); l'application \(\rho _{\tiny Q}\) est dite canonique .
Note that \(C(Q)\) is generated by \(\rho _{\tiny Q}(E)\), and that, for \(x \in E\), we have
\[\rho (x)^{2} = Q(x) . 1 ; \tag{1}\]
hence, replacing \(x\) by \(x + y\) (\(x, y\) in \(E\)):
\[\rho (x) \rho (y) + \rho (y) \rho (x) = \Phi (x, y) . 1 \tag{2}\]
🇫🇷
Remarquons que \(C(Q)\) est engendrée par \(\rho _{\tiny Q}(E)\), et que, pour \(x \in E\), on a
\[\rho (x)^{2} = Q(x) . 1 ; \tag{1}\]
d'où, en remplaçant \(x\) par \(x + y\) (\(x, y\) dans \(E\)):
\[\rho (x) \rho (y) + \rho (y) \rho (x) = \Phi (x, y) . 1 \tag{2}\]
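(Translator's note: in particular, when \(x\) and \(y\) are orthogonal, i.e. \(\Phi (x, y)=0\), formula (2) becomes the anticommutation rule
\[ \rho (x) \rho (y) = -\rho (y) \rho (x) . \])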
Example: If \(E\) admits a basis consisting of a single element \(e\), then \(T(E)\) is isomorphic to the polynomial algebra \(A[X]\), and \(C(Q)\) is a quadratic extension of \(A\) with basis \((1, u)\), where \(u = \rho (e)\) satisfies \(u^{2} = Q(e)\).
🇫🇷
Exemple. Si \(E\) admet une base composée d'un seul élément \(e\), \(T(E)\) est isomorphe à l'algèbre de polynômes \(A[X]\), et \(C(Q)\) est une extension quadratique de \(A\), ayant pour base \((1, u)\), où \(u\) est l'élément \(u = \rho (e)\) et vérifie \(u^{2} = Q(e)\).
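(Translator's note: for instance, taking \(A=\mathbb R\), \(E=\mathbb R e\) and \(Q(\xi e)=-\xi ^2\) gives
\[ C(Q) \cong \mathbb C, \quad u \leftrightarrow i, \]
since \(u^{2} = Q(e) = -1\).)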
We denote by \(T^{h}\) the \(h\)-th tensor power \(\bigotimes \limits ^{h} E\) in \(T(E)\), and let \(T^{+}\) (resp. \(T^{-}\)) be the sum of the \(T^{h}\) for \(h\) even (resp. odd).
🇫🇷
Notons \(T^{h}\) la puissance tensorielle \(h\)-ème \(\bigotimes \limits ^{h} E\) dans \(T(E)\), et soit \(T^{+}\) (resp. \(T^{-}\)) la somme des \(T^{h}\) pour \(h\) pair (resp. impair).
Since \(T(E)\) is a direct sum of \(T^{+}\) and \(T^{-}\), and \(I(Q)\) is generated by elements of \(T^{+}\), \(I(Q)\) is a direct sum of \(T^{+} \cap I(Q)\) and \(T^{-} \cap I(Q)\), and \(C(Q)\) is a direct sum of the two submodules \(C^{+}(Q) = \sigma (T^{+})\) and \(C^{-}(Q) = \sigma (T^{-})\) (also denoted \(C^{+}\) and \(C^{-}\)). The elements of \(C^{+}\) (resp. \(C^{-}\)) are called even (resp. odd).
🇫🇷
Comme \(T(E)\) est somme directe de \(T^{+}\) et \(T^{-}\), et \(I(Q)\) est engendré par des éléments de \(T^{+}\), \(I(Q)\) est somme directe de \(T^{+} \cap I(Q)\) et \(T^{-} \cap I(Q)\), et \(C(Q)\) est somme directe des deux sous-modules \(C^{+}(Q) = \sigma (T^{+})\) et \(C^{-}(Q) = \sigma (T^{-})\) (que l'on note aussi \(C^{+}\) et \(C^{-}\)). Les éléments de \(C^{+}\) seront dits pairs (resp. impairs).
We have the relations
\[C^{+} C^{+} \subset C^{+}, \quad C^{+} C^{-} \subset C^{-}, \quad C^{-} C^{+} \subset C^{-}, \quad C^{-} C^{-} \subset C^{+}. \tag{3}\]
In particular \(C^{+}\) is a subalgebra of \(C(Q)\).
🇫🇷
On a les relations
\[C^{+} C^{+} \subset C^{+}, \quad C^{+} C^{-} \subset C^{-}, \quad C^{-} C^{+} \subset C^{-}, \quad C^{-} C^{-} \subset C^{+}. \tag{3}\]
En particulier \(C^{+}\) est une sous-algèbre de \(C(Q)\).
Proposition 1. - Let \(f\) be a linear map of \(E\) into an algebra \(D\) over \(A\) such that \(f(x)^{2} = Q(x) . 1\) for all \(x \in E\). There is one and only one homomorphism \(\bar {f}\) of \(C(Q)\) into \(D\) such that \(f = \bar {f} \circ \rho _{\tiny Q}\).
🇫🇷
Proposition 1. - Soit \(f\) une application linéaire de \(E\) dans une algèbre \(D\) sur \(A\) telle que \(f(x)^{2} = Q(x) . 1\) pour tout \(x \in E\). Il existe un homomorphisme \(\bar {f}\) et un seul de \(C(Q)\) dans \(D\) tel que \(f = \bar {f} \circ \rho _{\tiny Q}\).
The uniqueness of \(\bar {f}\) results from the fact that \(C(Q)\) is generated by \(\rho _{\tiny Q}(E)\). Let \(h\) be the unique homomorphism of \(T(E)\) into \(D\) which extends \(f\) (\(h\) is defined by \(h(x_{1} \otimes \cdots \otimes x_{n}) = f(x_{1}) \ldots f(x_{n})\)). We have
\[h(x \otimes x - Q(x) . 1) = (f(x)^{2} - Q(x)) . 1 = 0 , \notag\]
hence \(h\) vanishes on \(I(Q)\) and, by passage to the quotient, defines the desired homomorphism \(\bar {f}\).
🇫🇷
L'unicité de \(\bar {f}\) résulte de ce que \(C(Q)\) est engendrée par \(\rho _{\tiny Q}(E)\). Soit \(h\) l'unique homomorphisme de \(T(E)\) dans \(D\) qui prolonge \(f\) (\(h\) est défini par \(h(x_{1} \otimes \cdots \otimes x_{n}) = f(x_{1}) \ldots f(x_{n})\)). On a
\[h(x \otimes x - Q(x) . 1) = (f(x)^{2} - Q(x)) . 1 = 0 , \notag\]
et par suite \(h\) s'annule sur \(I(Q)\) et définit par passage au quotient l'homomorphisme \(\bar {f}\) cherché.
Prop. 1 expresses that \(C(Q)\) is the solution of a universal mapping problem ( Ens., chap. IV, § 3, \(n^{\circ }\) 1 ).
🇫🇷
La prop. 1 exprime que \(C(Q)\) est solution d'un problème d'application universelle ( Ens., chap. IV, § 3, \(n^{\circ }\) 1 ).
Let us take in particular for \(D\) the opposite algebra of \(C(Q)\) and for \(f\) the mapping \(\rho \); prop. 1 implies that there is one and only one anti-automorphism \(\beta \) of \(C(Q)\) whose restriction to \(\rho (E)\) is the identity; it is called the main anti-automorphism of \(C(Q)\). It is clear that \(\beta ^{2} = 1\).
🇫🇷
Prenons en particulier pour \(D\) l'algèbre opposée de \(C(Q)\) et pour \(f\) l'application \(\rho \); la prop. 1 entraine qu'il existe un antiautomorphisme \(\beta \) et un seul de \(C(Q)\) dont la restriction à \(\rho (E)\) soit l'identité ; on l'appelle l'antiautomorphisme principal de \(C(Q)\). Il est clair que \(\beta ^{2} = 1\).
On the other hand, let \(Q'\) be a quadratic form over an \(A\)-module \(E'\), and \(f\) a linear map of \(E\) into \(E'\) such that \(Q' \circ f = Q\). We have \(\rho _{\tiny Q'}(f(x))^{2} = Q'(f(x)) . 1 = Q(x) . 1\), and consequently there is one and only one homomorphism \(C(f)\) of \(C(Q)\) into \(C(Q')\) such that \(C(f) \circ \rho _{\tiny Q} = \rho _{\tiny Q'} \circ f\).
🇫🇷
D'autre part, soient \(Q'\) une forme quadratique sur un \(A\)-module \(E'\), et \(f\) une application linéaire de \(E\) dans \(E'\) telle que \(Q' \circ f = Q\). On a \(\rho _{\tiny Q'}(f(x))^{2} = Q'(f(x)) . 1 = Q(x) . 1\), et par suite il existe un homomorphisme \(C(f)\) et un seul de \(C(Q)\) dans \(C(Q')\) tel que \(C(f) \circ \rho _{\tiny Q} = \rho _{\tiny Q'} \circ f\).
If \(f\) is the identity, \(C(f)\) is the identity; if \(Q''\) is a quadratic form over an \(A\)-module \(E''\), and \(g\) a linear map of \(E'\) into \(E''\) such that \(Q'' \circ g = Q'\), we have \(C(g \circ f) = C(g) \circ C(f)\). When \(E'\) is a submodule of \(E\) and \(f\) is the canonical injection of \(E'\) into \(E\) (so that \(Q'\) is the restriction of \(Q\) to \(E'\)), we say that \(C(f)\) is the canonical homomorphism of \(C(Q')\) into \(C(Q)\).
🇫🇷
Si \(f\) est l'identité, \(C(f)\) est l'identité ; si \(Q''\) est une forme quadratique sur un \(A\)-module \(E''\), et \(g\) une application linéaire de \(E'\) dans \(E''\) telle que \(Q'' \circ g = Q'\), on a \(C(g \circ f) = C(g) \circ C(f)\). Lorsque \(E'\) est un sous-module de \(E\) et \(f\) l'injection canonique de \(E'\) dans \(E\) (de sorte que \(Q'\) est la restriction de \(Q\) à \(E'\)), on dit que \(C(f)\) est l'homomorphisme canonique de \(C(Q')\) dans \(C(Q)\).
Let us take in particular \(Q' = Q\) and for \(f\) the mapping \(x \to -x\); we see that there is an automorphism \(\alpha \) and only one of \(C(Q)\) such that \(\alpha \circ \rho = -\rho \); it is called the main automorphism of \(C(Q)\). It is clear that \(\alpha ^{2} = 1\), and that the restriction of \(\alpha \) to \(C^{+}\) (resp. \(C^{-}\)) is the identity (resp. the mapping \(u \to -u\)).
🇫🇷
Prenons en particulier \(Q' = Q\) et pour \(f\) l'application \(x \to -x\); on voit qu'il existe un automorphisme \(\alpha \) et un seul de \(C(Q)\) tel que \(\alpha \circ \rho = -\rho \); on l'appelle l'automorphisme principal de \(C(Q)\). Il est clair que \(\alpha ^{2} = 1\), et que la restriction de \(\alpha \) à \(C^{+}\) (resp. \(C^{-}\)) est l'identité (resp. l'application \(u \to -u\)).
Proposition 2. - Let \(A'\) be a commutative ring, \(\varphi \) a homomorphism of \(A\) into \(A'\), \(Q'\) the quadratic form over \(E' = A' \otimes _{ A } E\) induced from \(Q\) by extension of scalars (§ 3, \(n^{\circ }\) 4, prop. 3 ). There is one and only one isomorphism \(j\) of the algebra \(A' \otimes _{ A } C ( Q )\) onto \(C(Q')\) such that \(j (1 \otimes \rho _{\tiny Q}(x) )=\rho _{\tiny Q'}(1 \otimes x)\) for all \(x \in E\).
🇫🇷
Proposition 2. - Soient \(A'\) un anneau commutatif, \(\varphi \) un homomorphisme de \(A\) dans \(A'\), \(Q'\) la forme quadratique sur \(E'= A' \otimes _{ A } E\) déduite de \(Q\) par extension des scalaires (§ 3, \(n^{\circ }\) 4, prop. 3 ). Il existe un isomorphisme \(j\) et un seul de l'algèbre \(A' \otimes _{ A } C ( Q )\) sur \(C(Q')\) tel que \(j (1 \otimes \rho _{\tiny Q}(x) )=\rho _{\tiny Q'}(1 \otimes x)\) pour tout \(x \in E\).
It is sufficient to show that the algebra \(C'\) = \(A' \otimes C(Q)\) and the mapping \(1 \otimes \rho _{\tiny Q}\) of \(E'\) into \(C'\) form a solution to the same problem of universal (mapping) property as \(C(Q')\) and \(\rho _{\tiny Q'}\).
🇫🇷
Il suffit de démontrer que l'algèbre \(C'\) = \(A' \otimes C(Q)\) et l'application \(1 \otimes \rho _{\tiny Q}\) de \(E'\) dans \(C'\) forment une solution du même problème d'application universelle que \(C(Q')\) et \(\rho _{\tiny Q'}\).
Now, let \(D'\) be an algebra over \(A'\), and \(f'\) an \(A'\)-linear map of \(E'\) into \(D'\) such that \(f'(x')^{2} = Q'(x') . 1\) for all \(x' \in E'\). The mapping \(g: x \to f'(1 \otimes x)\) of \(E\) into \(D'\) (considered as an \(A\)-module via the homomorphism \(\varphi \)) is \(A\)-linear, and we have \(g(x)^{2} = Q'(1 \otimes x) . 1 = Q(x) . 1\) for all \(x \in E\). There is thus one and only one \(A\)-homomorphism \(\bar {g}\) of \(C(Q)\) into \(D'\) such that \(\bar {g}(\rho _{\tiny Q}(x)) = f'(1 \otimes x)\). Therefore, there is one and only one \(A'\)-homomorphism \(\bar {f'}\) of \(C'\) into \(D'\) such that \(\bar {f'}(1 \otimes \rho _{\tiny Q }(x)) = f'(1 \otimes x)\) for all \(x \in E\); by linearity, it follows that \(\bar {f'}((1 \otimes \rho _{\tiny Q})(x')) = f'(x')\) for all \(x' \in E'\). Q.E.D.
🇫🇷
Or, soient \(D'\) une algèbre sur \(A'\), et \(f'\) une application \(A'\)-linéaire de \(E'\) dans \(D'\) telle que \(f'(x')^{2} = Q'(x') . 1\) pour tout \(x' \in E'\). L'application \(g: x \to f'(1 \otimes x)\) de \(E\) dans \(D'\) (considéré comme \(A\)-module grâce à l'homomorphisme \(\varphi \)) est \(A\)-linéaire, et on a \(g(x)^{2} = Q'(1 \otimes x) . 1 = Q(x) . 1\) pour tout \(x \in E\). Il existe donc un \(A\)-homomorphisme \(\bar {g}\) et un seul de \(C(Q)\) dans \(D'\) tel que \(\bar {g}(\rho _{\tiny Q}(x)) = f'(1 \otimes x)\). Par suite, il existe un \(A'\)-homomorphisme \(\bar {f'}\) et un seul de \(C'\) dans \(D'\) tel que \(\bar {f'}(1 \otimes \rho _{\tiny Q }(x)) = f'(1 \otimes x)\) pour tout \(x \in E\); par linéarité, il en résulte que \(\bar {f'}((1 \otimes \rho _{\tiny Q})(x')) = f'(x')\) pour tout \(x' \in E'\). CQFD.
2. Some operations in tensor algebra 🇫🇷2. Quelques opérations dans l'algebre tensorielle.[#315]
In this section we will designate by \(e_{x}\) (\(x \in E\)) the linear mapping \(u \rightarrow x \otimes u\) of the tensor algebra \(T( E )\) into itself.
🇫🇷
Dans ce \(n^{\circ }\) nous désignerons par \(e_{x}\) (\(x \in E\)) l'application linéaire \(u \rightarrow x \otimes u\) de l'algèbre tensorielle \(T( E )\) dans elle-même.
Lemma 1. - Let \(f\) be an element of the dual \(E^{*}\) of \(E\). There exists one and only one linear mapping \(i_{f}\) of \(T ( E )\) into itself such that
\[i_{f}(1)=0 \tag{4}\]
\[i_{f} \circ e_{x} + e_{x} \circ i_{f} = f(x) . I \text {\quad for all \,} x \in E \tag{5}\]
(where \(I\) denotes the identity mapping). The mapping \(f \rightarrow i_{f}\) from \(E^{*}\) into \(\mathscr {L}(T(E))\) is linear. We have \(i_{f}\left (T^{n}\right ) \subset T^{n-1}\), \(\left (i_{f}\right )^{2}=0\), and \(i_{f} \circ i_{g}+i_{g} \circ i_{f}=0\) for \(f, g\) in \(E ^{*}\). The mapping \(i_{f}\) vanishes on the subalgebra of \(T ( E )\) generated by the kernel of \(f\). The ideal \(I(Q)\) is stable under \(i_{f}\); by passage to the quotient, \(i_{f}\) thus defines a linear mapping (again denoted \(i_{f}\)) of \(C(Q)\) into itself.
🇫🇷
Lemme 1. - Soit \(f\) un élément du dual \(E^{*}\) de \(E\). Il existe une
application linéaire \(i_{f}\) et une seule de \(T ( E )\) dans elle-même telle que
\[i_{f}(1)=0 \tag{4}\]
\[i_{f} \circ e_{x} + e_{x} \circ i_{f} = f(x) . I \text {\quad pour tout \,} x \in E \tag{5}\]
(où \(I\) désigne l'application identique). L'application \(f \rightarrow i_{f}\) de \(E^{*}\) dans \(\mathscr {L}(T(E))\) est linéaire. On a \(i_{f}\left (T^{n}\right ) \subset T^{n-1},\left (i_{f}\right )^{2}=0,\) et \(i_{f} \circ i_{g}+i_{g} \circ i_{f}=0\)
pour \(f, g\) dans \(E ^{*}\). L'application \(i_{f}\) est nulle sur la sous-algèbre de
\(T ( E )\) engendrée par le noyau de \(f\). L'idéal \(I(Q)\) est stable par \(i_{f}\); par passage au quotient, \(i_{f}\) définit donc une application linéaire (notée encore \(i_{f}\)) de \(C(Q)\) dans elle-même.
My recent learning has become too scattered, and I have long wished to write notes on LMs (language models), so I hope focusing on this will keep me on track.
I've settled on the prefix lm instead of llm, as I find phrases like "small LLMs" self-contradictory, and I hope LM will also cover multimodal variants, such as VLMs (visual language models).
The interest doesn't stop there: I haven't properly studied diffusion models yet, and I wish to explore their integration with LMs, or at least the perspective diffusion models offer on LMs.
As my interests in the LM field are quite diverse, I will organize the notes into several series rather than one. I've reserved lm-0002 through lm-0005 for root notes covering these topics. The areas that come to mind include:
the math behind LMs, going back to ML
various optimization methods of LMs, including GPU programming
post-training techniques
application of GA in LMs
alternative architectures of LMs, including RWKV and architectures related to diffusion models
Individual notes will be written as I read related papers and books, and may be referenced across these root notes.
I have written scattered notes about LMs previously; here I list them for easier reference.
This is also an experiment to see if modern VLMs could better convert screenshots to Forester math formula markup.
In Transformers: from self-attention to performance optimizations, I was focused on visualizing transformer architectures using subscript-free tensor notations ("Named Tensor Notation" [chiang2021named]). However, mathematical concepts can often be expressed succinctly once you develop the ability to visualize them mentally. In this series of notes, I'll focus more on the mathematical foundations rather than visualizations or introductory explanations, having progressed beyond that stage after reading hundreds of LM papers.
In a sense, group algebras are the source of all you need to know about representation theory.
The primary reference is [james2001representations] for understanding FG-modules, group algebras, presentations of groups, Clifford theory (the standard method of constructing representations and characters of semi-direct products; see [woit2017quantum] and "3.6 Clifford theory" in [lux2010representations]), Schur indices, etc. We also need to check [lux2010representations] for its introduction to GAP, and we should pay close attention to the progress of GAP-LEAN. [sims1994computation] might also be interesting in a similar manner to [lux2010representations], but with emphasis on presentations of groups.
See also group algebra on nlab, particularly that "A group algebra is in particular a Hopf algebra and a \(G\)-graded algebra."
The related Zulip thread is here, and I have preliminary explorations and experiments in Lean here.
This interest originates from reading Robert A. Wilson's work [wilson2024discrete]. The ultimate goal is to understand the group algebra of the binary tetrahedral group (\(Q_8 \rtimes Z_3\)), then the three-dimensional complex reflection group (\(G_{27} \rtimes Q_8 \rtimes Z_3\)), a.k.a. the triple cover of the Hessian group, which can be interpreted as a finite analogue of the complete gauge group \(U(1) \times SU(2) \times SU(3)\).
Algebraic geometry has developed in waves, each with its own language and point of view; see [hartshorne1977graduate, p. xiv] for some discussion. The older language is closer to the geometric intuition, while the newer language develops powerful techniques to solve problems in great generality, but the study of the latter is considered tedious, or even depressing, by many when not accompanied by tangible applications [grothendieck1964elements, p. 12].
We will use [kriz2021introduction] as a holistic guide to the organization of materials, which has done a great job of having the minimal prerequisites, being self-contained, and covering most of the topics that concern us, including the geometric motivation.
For a similar purpose, we use the formalization papers [bordg2022simple] and [buzzard2022schemes] to guide the path, at least the part towards schemes, and their counterparts in the Mathlib of Lean 4.
We also need to tap into the language of Stacks in a modern setting, as treated in [khan2023lectures], with preliminaries on \(\infty \)-categories and derived categories.
A ring, usually denoted \(R\), is always assumed to be a commutative ring (i.e. \(a+b=b+a\) and \(a b=b a\) for all \( a, b \in R\)) with 1 (i.e. the multiplicative identity element, or called multiplicative unit).
\(1 \neq 0\) is not required, where \(0\) is the additive neutral element. If \(1=0\), then \(R\) must be the zero ring (also called the trivial ring), which consists of a single element and is denoted \(\{0\}\).
Subrings must have the same unit, and ring homomorphisms are always required to map \(1\) to \(1\).
A field, usually denoted \(K\), is a commutative ring with \(1\), where every nonzero element has a multiplicative inverse (thus division can be defined).
A formal variable is an arbitrary symbol that is used to represent some mathematical object, and assumes nothing about the value or nature of the object.
A formal expression is a mathematical expression built from formal variables, assuming nothing except that the objects the variables stand for support the operations used in the expression.
A formal expression can be evaluated by replacing the formal variables with actual mathematical objects that have the operations defined in the expression.
When there is no ambiguity, we may omit the word "formal" and simply say variables or an expression.
A monomial in \(n\) formal variables \(x_1, \ldots , x_n\), denoted \(x^\alpha \), is a formal expression of the form
\[ x_1^{\alpha _1} x_2^{\alpha _2} \cdots x_n^{\alpha _n} \]
where \(n \in \mathbb N\), \(\alpha =\left (\alpha _1, \ldots , \alpha _n\right )\) is an \(n\)-tuple of nonnegative integers.
The total degree of the monomial is denoted \(|\alpha |=\alpha _1+\cdots +\alpha _n\).
A polynomial \(f\) over a ring \(R\) in \(n\) variables is a finite linear combination (with coefficients \(a_\alpha \) in \(R\) ) of monomials, i.e. the formal expression of the form
\[ f=\sum _\alpha a_\alpha x^\alpha , \quad a_\alpha \in R, \]
The set of all polynomials in \(x_1, \ldots , x_n\) with coefficients in \(R\) is denoted \(R\left [x_1, \ldots , x_n\right ]\).
\(a_\alpha x^\alpha \) is called a term of \(f\) if \(a_\alpha \neq 0\).
The total degree of \(f \neq 0\), denoted \(\operatorname {deg}(f)\), is the maximum \(|\alpha |\) such that the coefficient \(a_\alpha \) is nonzero. The total degree of the zero polynomial is undefined.
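For example, over \(R=\mathbb Z\),
\[ f = x_1^3 x_2 + x_1 x_2 x_3 + x_3^2 \]
has terms of total degree \(4\), \(3\) and \(2\), so \(\operatorname {deg}(f)=4\).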
Under addition and multiplication, \(R\left [x_1, \ldots , x_n\right ]\) satisfies all axioms of a commutative ring, and for this reason we will refer to \(R\left [x_1, \ldots , x_n\right ]\) as a polynomial ring.
The (\(n\)-dimensional) affine space over a field \(K\), denoted \(\mathbb {A}_K^n\), is
\[ \left \{\left (c_1, \ldots , c_n\right ): c_i \in K \text { for } i=1, \ldots , n\right \} \]
which is just \(K^n\) as a set, without its additional structure as a \(K\)-vector space and a ring.
We'll often use the term affine \(n\)-space to indicate the dimension. Particularly, an affine 1-space is called an affine line, an affine 2-space is an affine plane.
Let \(S \subset K\left [x_1, \ldots , x_n\right ]\) be a set of polynomials. The zero locus (or zero set) of \(S\) is
\[ V(S):=\left \{x \in \mathbb {A}_K^n: f(x)=0 \text { for all } f \in S\right \} \subset \mathbb {A}_K^n \]
An affine algebraic variety over \(K\) is a subset of \(\mathbb {A}_K^n\) of this form. It's usually simply called an affine variety over \(K\), or an affine \(K\)-variety.
If \(S=\left \{f_1, \ldots , f_k\right \}\) is a finite set, \(V(S)\) can be written as \(V\left (f_1, \ldots , f_k\right )\).
Obviously, it is the set of all solutions of the system of polynomial equations \(f_1\left (x_1, \ldots , x_n\right )=\cdots =f_k\left (x_1, \ldots , x_n\right )=0\) [cox1997ideals, 1.2.1], denoted \(\operatorname {Sol}(S;K)\) [dolgachev2013introduction, p. 1].
The prefix "affine" of affine variety is used to distinguish it from a projective variety. Affine varieties arise from arbitrary polynomials, while projective varieties arise from systems of homogeneous polynomials, i.e. linear combinations of monomials of fixed degree.
Since affine varieties are the general case, they are sometimes simply called varieties.
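For example, in \(\mathbb {A}_{\mathbb R}^2\), \(V\left (x^2+y^2-1\right )\) is the unit circle, while
\[ V\left (x^2+y^2-1,\; y-x\right )=\left \{\pm \left (\tfrac {1}{\sqrt {2}}, \tfrac {1}{\sqrt {2}}\right )\right \} \]
is the solution set of the corresponding system of two equations.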
To do: figure out its relation to the affine space of geometry, which preserves parallelism and ratios of lengths for parallel line segments, but not distances or measures of angles.
Of course, this structure is much weaker than that of e.g. a normed space or an inner product space. It does not allow one to introduce the concepts of length, distance or angle. Therefore it is clear that a geometry based on this structure cannot be metric. However, the bracket can be regarded as a volume form. Volume measurements are used e.g. in the analysis of chaotic systems and strange attractors.
A standard Peano space is a Peano space over the linear space \(V\) of dimension \(n\) whose bracket has the additional property that for every vector \(x \in V\) there exist vectors \(x_2, \ldots , x_n\) such that
\[ \left [x, x_2, \ldots , x_n\right ] \neq 0 . \]
In such a space the length of the bracket, i.e. the number of entries, equals the dimension of the space, and conversely. We will be concerned here with standard Peano spaces only.
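The motivating example, as I understand it, is \(V=\mathbb {R}^n\) with the bracket given by the determinant of coordinates
\[ \left [x_1, \ldots , x_n\right ] := \det \left (x_1, \ldots , x_n\right ), \]
i.e. the signed volume of the parallelepiped spanned by \(x_1, \ldots , x_n\); this is a standard Peano space, since every nonzero \(x\) can be completed to a basis, which makes the bracket nonzero.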
I decided to widen my horizon about browsers. After browsing a few subreddits, I have chosen these 3 browsers:
- Arc
- Chromium based
- the team has moved on to Dia, an AI browser
- but people still miss the UX of Arc, it seems to be one of a kind, so I have to know
- unlikely to be open-source
- Mac/iOS/Windows, no Linux
- Orion
- WebKit based, this engine is the best hope for degoogle
- will be open-source
- not mature yet
- Mac/iOS, Linux WIP, no Windows/Android
- Floorp
- Firefox based
- open-source
- Mac/Windows/Linux, no mobile
Just to note down that it's not straightforward to get perf working.
Here's what worked finally:
First, the container needs to be started with proper capabilities:
```bash
docker run --rm -it --platform linux/amd64 --cap-add PERFMON ubuntu:jammy bash
```
Secondly, some flags need to be set from the VM:
```bash
# this would start a shell in the VM, outside of containers
colima ssh
# don't know why, sudo doesn't work, but it works in a root shell
# -1 or 2 both work
sudo bash -c 'echo -1 > /proc/sys/kernel/perf_event_paranoid'
```
Lastly, install perf in the container:
```bash
apt install -y linux-tools-$(uname -r) linux-cloud-tools-$(uname -r)
```
Then the following works:
```bash
perf record -g echo 1
perf report -g graph,0.5,caller
```
Note also that you can't pass `"echo 1"` as a single quoted argument, because perf will fail to find a file named "echo 1".
Learned this while working on Trying Zig's self-hosted x86 backend on Apple Silicon and before submitting this issue about being unable to run `poop` in `colima` containers.
Follow-up:
It's important to check whether certain perf counters actually work:
```
sleep 10 &
perf stat -p $! -e "cpu-cycles,instructions,cache-references,cache-misses,branch-misses"
```
On a machine that supports corresponding counters, it should give something like:
```
Performance counter stats for process id '101011':
511488 cpu-cycles
280194 instructions # 0.55 insn per cycle
16193 cache-references
6161 cache-misses # 38.047 % of all cache refs
8730 branch-misses
10.009515156 seconds time elapsed
```
On an unsupported machine, the output would be like:
```
Performance counter stats for process id '552':
<not supported> cpu-cycles
<not supported> instructions
<not supported> cache-references
<not supported> cache-misses
<not supported> branch-misses
6.491105847 seconds time elapsed
```
Recently, a MacBookPro and a Mac Mini, both M4 Pro, have become my primary productivity tools. Along with them, I have 3 monitors, plus the built-in one of the MacBookPro, 4 screens in total. On top of that, I have the logi MX Keys keyboard and MX Master 3S suite, and an Apple Magic Trackpad.
It turns out that I need to come up with a productive setup for them.
## TL;DR as an ASCII art
```
+---------+
| |
+-----------------+ +-----------------+ | Tall |
| Reference | | Primary | | |
| Display | | Display | | Display |
+-----------------+ +-----------------+ | |
| |
+------+ +------------+ +---------+
| Mini | | MBP |
+------+ +------------+
| [Keyboard] |
| [Pad] |
+------------+
[ MX Keys ] [MX 3S]
[Pad]
```
## Screen arrangement
Above the screen of the MacBookPro, I have my primary display, which serves as the main workspace for coding, browsing, etc. The second display is placed to its left at the same height, hosting reference material. Both are horizontal and wide.
The third display (I like to call it the "tall display") is placed on the right, vertically, with its middle aligned with the primary display. It hosts IM, PDFs, AI assistants, or anything tall, including vertically stacked windows that require less width.
## Mac-display correspondence
The Mac Mini is placed under the second display; it's connected to the first 2 displays, and to all of them if desired.
MacBookPro is connected to the tall display, and optionally also to the primary display.
## Keyboard and mouse
I could just use the keyboard and trackpad of MacBookPro, and they can travel to the nearby Mac Mini.
As my muscle memory is already tuned to the MacBookPro's keyboard, MX Keys is a good fit: its keys are flat and the key travel is short, similar to the MacBookPro's.
I'm so used to using the trackpad to scroll, zoom, switch between spaces, and move the cursor during coding that I would either put MX Keys on top of the MacBookPro's keyboard so I can reach its trackpad effortlessly, or use MX Keys with the Magic Trackpad beneath it, in the same relative position as the MacBookPro's keyboard and trackpad.
When I work with diagrams, slides, or UIs that require meticulous mouse control, I would use the MX Master 3S mouse. Its side wheel is configured to send `Ctrl+Left` and `Ctrl+Right`, and side buttons are configured to send `Ctrl+Up` and `Ctrl+Down`, both serving space switching.
The logi keyboard and mouse support switching among 3 devices, so I can switch between the 2 Macs. Unfortunately, switching one doesn't make the other follow, so I rely on Mac's native support for keyboard and mouse sharing instead; the arrangement goes `2->1->MacBookPro->Tall`, so there is minimal extra travel between 1 and the tall display.
One might notice that such a setup is highly redundant, but it provides me with maximum flexibility and productivity:
- In coding mode, I could have both hands on the keyboard, with the left hand sometimes on the trackpad
- In browsing mode, my left hand is on the trackpad, and the right hand is on the mouse
- In diagramming mode, both hands are also on the keyboard, with the right hand sometimes on the mouse
## Thunderbolt bridge
I also set up a Thunderbolt bridge by connecting the 2 Macs with a Thunderbolt cable and following this guide. This creates a secure and fast connection between them for file transfer, screen sharing, distributed AI inference, etc.
Such a bridge could also be used with MX Flow, but that has a noticeable delay each time the cursor moves between the 2 Macs, so I didn't use it.
I wish to note down my exploration of solutions for organizing a massive number of (mostly binary media) files.
## Wishlist
- Performant
- it should be able to handle a huge number of files, with cached metadata, thumbnails, etc., so I can browse their contents quickly
- batch operations on files on the NAS should be as performant as on local files, e.g. using a client-server architecture, or using rsync to compute diffs
- Data integrity
- it should support versioning, deduplication, etc. to ensure data integrity, so I can mess with the files (mostly moving around and renaming) without worrying about losing them or their metadata such as timestamps, and this should be storage-efficient
- Semi-auto tagging
- I should be able to tag files manually, in batch, and set rules to tag files automatically based on metadata or content (using ML models)
- preferably, tags could have their own hierarchy
- tags should not be stored in filenames or file metadata, but in a separate database
- Open-source, freemium or affordable one-time purchase
- in any case, I should not be locked in, and the data should be exportable so I can script against it
## Candidate solutions
- git-annex
- Pros
- it maintains a versioned record of files, and can even track offline media
- Cons
- it only has access to the file contents when they are present at the registered location
- even with the web UI, it's not really user-friendly
- DataLad
- it builds on git-annex, but is still a CLI
- kopia
- Pros
- backup, versioning, deduplication, encryption, compression, error correction etc.
- it can mount or restore any snapshot to a directory
- there is official support for syncing the repo, making it reliable to backup the backup
- Cons
- can't really check the history of a file
- relies on other tools for detecting file location changes
- loses the time-added metadata, preserving only the modification time
- Commander One
- dual-pane file manager, trying out
- `dua -i`
- it can analyze disk usage recursively, interactively
- I can mark files for deletion
- jw
- Pros
- it can calculate hashes of files in deep directories really quickly
- I use it to check the integrity of files after copying
- Cons
- it doesn't detect file renaming/moving
- VeraCrypt
- Pros
- useful if you just want to encrypt files in a container
- Cons
- it's inconvenient to mount in docker
- Garage
- Pros
- it's an S3-compatible storage
- it works with git-annex and kopia
- macFUSE
- needed to mount various filesystems
- photo management
- HomeGallery
- librephotos
- digiKam
- PhotoPrism
- Immich
- Lychee
- asset management
- Eagle 4
- Pixcall
- Billfish
This note covers some modern C++ features that I found interesting while reading Modern C++ Programming.
Features standardized in C++23, or implemented for C++26, are in scope.
This note serves as a holistic view and a reminder.
The problem with using C++...is that there’s already a strong tendency in the language to require you to know everything before you can do anything -- Larry Wall
- C++14 digit separators, e.g. `1'000'000`
- C++14 `auto` for function return types, e.g. `auto f() { return 1; }`
- C++20 `auto` for function input, equivalent to templates but less expensive at compile-time
- C++20 `<utility>` for safely comparing signed/unsigned types
- Detecting wraparound for unsigned integral types is not trivial
- C++23 `<stdfloat>` fixed-width floating-point types: `std::float128_t`, `std::float16_t`, `std::bfloat16_t`
- `inf` and `nan` for floating-point types
- Floating-point Arithmetic Properties
- Catastrophic Cancellation
- A fixed epsilon "looks small" but can be too large when the numbers being compared are themselves very small
- `areFloatNearlyEqual`
- compensation algorithms like Kahan summation, Dekker's FastTwoSum, Rump's AccSum (see the sketch below)
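A minimal sketch of Kahan summation for my own reference; `kahan_sum` is my naming, not from the course:

```cpp
#include <iostream>
#include <vector>

// Compensated (Kahan) summation: `c` accumulates the low-order bits
// that plain summation loses when adding a small term to a large sum.
double kahan_sum(const std::vector<double>& xs) {
    double sum = 0.0, c = 0.0;  // c: running compensation
    for (double x : xs) {
        double y = x - c;       // re-inject previously lost low-order bits
        double t = sum + y;     // big + small: low bits of y may be lost here
        c = (t - sum) - y;      // algebraically 0; captures the lost bits
        sum = t;
    }
    return sum;
}

int main() {
    std::vector<double> xs(10'000'000, 0.1);  // 0.1 is not exact in binary
    std::cout << kahan_sum(xs) << '\n';       // much closer to 1e6 than a naive sum
}
```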
- C++11 Dynamic memory 2D allocation/deallocation: `new int[3][4]` and `delete[]`
- placement `new (buffer)`; cleanup needs an explicit destructor call `x->~A()` plus `delete[] buffer` for the underlying buffer
- `new (std::nothrow)`
- C++20 Designated Initializer List: `struct A { int x, y; }; A a{.x = 1, .y = 2};`
- Pointer arithmetic rule: `address(ptr + i) = address(ptr) + (sizeof(T) * i)`
- C++11 `constexpr`: _can_ be evaluated at compile-time
- variable: always evaluated at compile-time
- function
- evaluated at compile-time as long as all its arguments are evaluated at compile-time
- always evaluated at run-time if
- contain run-time functions
- contain references to run-time global variables
- C++20 `consteval`: guarantees compile-time evaluation
- A run-time value always produces a compile error
- C++20 `constinit`: guarantees initialization at compile-time
- A run-time initialization value always produces a compile error
- The value of a variable _can_ change during the execution
- C++17 `if constexpr`: compile-time branching
- C++20 `std::is_constant_evaluated()`
- C++23 `if consteval`
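A minimal sketch (mine, not from the course) contrasting `constexpr`, `consteval`, and `constinit`:

```cpp
#include <type_traits>

constexpr int square(int x) { return x * x; }    // compile-time if possible
consteval int cube(int x) { return x * x * x; }  // compile-time, always

// constinit: the initializer must be constant, but the variable stays mutable.
constinit int counter = square(4);

constexpr int where() {
    if (std::is_constant_evaluated())  // C++20; C++23 spelling: if consteval
        return 1;
    return 0;
}

int main() {
    constexpr int a = square(5);  // evaluated at compile-time
    int n = a;
    int b = square(n);            // fine: falls back to run-time
    int c = cube(3);              // fine: 3 is a constant expression
    // int d = cube(n);           // error: consteval needs compile-time args
    static_assert(where() == 1);  // constant evaluation takes the first branch
    counter += b + c;             // constinit variables can change at run-time
    return counter;
}
```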
- C++20 `std::bit_cast`
- `=delete` can be used to prevent calling the wrong overload
- C++20 `[[nodiscard("reason")]]`
- This issues a warning if the return value of a function is discarded (not handled), which is good for ensuring error handling.
- C++17 for the entire class/struct
- C++20 for constructors
- C++23 for lambdas
- The compiler injects `operator()` into the code of the destination function and then compiles the routine. Operator inlining is the standard behavior
- C++11 lambda expression: capture clause
- `[]` captures nothing
- `[&]` captures all variables by reference
- `[=]` captures all variables by value
- `[=, &y]` captures `y` by reference and all other variables by value (with the `=` default, extra captures must be by reference)
- C++17 `[*this]` captures the current object by value (copy); `[this]` captures the `this` pointer
- C++14 `[x=x]`, `[&x=x]` capture current object member `x` by value/reference
- C++20 deprecates `[=]` capturing `this` pointer by value
- C++14 `[](auto value)`, `[](int i = 6)`
- A function can return a lambda (dynamic dispatch is also possible)
- C++17 `constexpr` lambda, C++20 `consteval` lambda, `mutable`
- syntax: post specifier
- C++20 `template` and `requires` clause for lambda
auto lambda = []<typename T>(T value)
    requires std::is_arithmetic_v<T> {  // needs <type_traits>
    return value * 2;
};
auto v = lambda(3.4); // v: 6.8 (double)
struct A {} a;
// auto w = lambda(a); // compile error: A is not arithmetic
- When Preprocessors are Necessary
- Conditional compiling
- [Abseil Platform Check Macros](https://abseil.io/docs/cpp/platforms/macros)
- Mixing different languages
- Complex name replacing
- [Are We Macro free Yet, CppCon2019](https://github.com/CppCon/CppCon2019/blob/master/Presentations/are_we_macrofree_yet/are_we_macrofree_yet__zhihao_yuan__cppcon_2019.pdf)
- C++20 `<source_location>`
- replaces `__FILE__`, `__LINE__`, C++11 `__func__`
- still: non-standard `__PRETTY_FUNCTION__`
- C++17 [`__has_include`](https://en.cppreference.com/w/cpp/preprocessor/include)
- stringizing macro operator `#`, e.g. `#string` expands to `"string"`
- token pasting macro operator `##`, e.g. `x##y` expands to `xy`
- `#error`, C++23 `#warning`
- non-portable `pragma`, C++11 `_Pragma`
- C++11 `__VA_ARGS__`
- C++11 In-class non-static data members initialization (NSDMI)
- allows initializing the data members where they are declared
- a user-defined constructor can be used to override their default values
- C++11 uniform initialization `{}`
- in function arguments and return values
- C++11 delegate constructor
- C++11 `=default`, `=delete`
- `using` for type aliasing
- e.g. partial specialization for templates
- C++11 `override`
- ensures that the function is virtual and is overriding a virtual function from a base class
- C++11 `final`
- prevents a virtual function from being overridden in a derived class
- `dynamic_cast`
- upcast: `dynamic_cast<Base*>(derived)`
- C++23 multidimensional subscript operator `[]`
- C++23 static `[]` and `()` operators
- C++11 `explicit` for conversion operators
- `std::swap`
- operators preserve precedence and short-circuit properties
- binary operators should be implemented as friend methods
- layouts
- aggregate
- initializable with `{}`
- trivially copyable
- can be copied with `memcpy`
- standard layout
- just like a struct or a union
- POD
- trivially copyable + standard layout
- C++11/C++20 `<type_traits>` to check layout
- C++17 automatic deduction of non-type template parameters, with `auto`
- non-type template parameters
- C++20 a class literal type
- `constexpr` assignable
- public, non-mutable
- for all base classes and non-static data members
- pointer, reference, and pointer to member
- function, e.g. `decltype(f)`
- C++26 assertion with text formatting
- e.g. `static_assert(sizeof(T) != 4, std::format("test1 with sizeof(T)={}", sizeof(T)));`
- `decltype((expr))` yields an lvalue reference to the type of `expr` when `expr` is an lvalue
- C++14 `auto` for function template return types
- replaces e.g. `decltype(T{} + R{})`
- `<type_traits>`
- type query
- type manipulation
- C++17 CTAD
- automatic deduction of class template arguments in constructor calls (CTAD)
- template deduction guide
- map constructor parameter types to class template parameters
- e.g. `MyString(char const*) -> MyString<std::string>;`
- doesn’t work within the class scope
- C++20 alias template deduction
- use `std::enable_if` to make use of SFINAE
- helper alias: `_t` for `::type`, `_v` for `::value`
- using `std::enable_if_t` for the return type prevents `auto` deduction
- variadic template
- captures a parameter pack of arguments, which hold an arbitrary number of values or types
- ellipsis `...`
- prefix an identifier to introduce/capture
- suffix the identifier to expand
- `sizeof...(args)` to get the number of arguments
- C++17 fold expressions (see the sketch below)
`(other op ... op pack)`
- C++14 for lambdas
- C++20 for concepts
- [C++20 idioms for parameter packs](https://www.scs.stanford.edu/~dm/blog/param-pack.html)
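A minimal sketch (mine) of a variadic template using fold expressions:

```cpp
#include <iostream>

// Binary left fold with an init value: (init op ... op pack).
template <typename... Ts>
auto sum(Ts... xs) {
    return (0 + ... + xs);  // ((0 + x1) + x2) + ...
}

// Unary right fold over the comma operator, expanding the pack.
template <typename... Ts>
void print_all(const Ts&... xs) {
    ((std::cout << xs << ' '), ...);
    std::cout << "(" << sizeof...(xs) << " args)\n";
}

int main() {
    std::cout << sum(1, 2, 3.5) << '\n';  // 6.5
    print_all("pi", 3.14, 42);            // pi 3.14 42 (3 args)
}
```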
- C++20 concepts
- `concept [name] = [compile-time boolean expression];`
- `requires` clause: `requires [compile-time boolean expression or Concept]`
- `requires` expression:
#include <concepts>
requires [(arguments)] {
    [SFINAE constraint]; // or
    requires [predicate];
} -> bool
template<typename T>
concept MyConcept = requires (T a, T b) { // First case: SFINAE constraints
    a + b;            // Req. 1 - supports the add operator
    a[0];             // Req. 2 - supports the subscript operator
    a.x;              // Req. 3 - has an "x" data member
    a.f();            // Req. 4 - has an "f" member function
    typename T::type; // Req. 5 - has a "type" member type
    {*a + 1} -> std::convertible_to<float>; // Req. 6 - can be dereferenced,
                                            // and the sum with an integer
                                            // is convertible to float
    {a * a} -> std::same_as<int>;           // Req. 7 - "a * a" must be valid
                                            // and the result type is "int"
};
- common linker errors
- multiple definitions
- undefined reference
- C++20 modules
- a module is a set of source code files that are compiled independently of the translation units that import them
- `module module.name;`
- `import module.name;`
- `export`
- `export { ... }`
- `export module module.name;`
- `export import module.name;`
- global module fragment: include header files in a module interface
- `module;`
- private module fragment: include header files in a module implementation
- `module :private;`
- legacy headers can be directly imported with import
- all declarations are implicitly exported and attached to the global module (fragment)
- a module can be organized in isolated module partitions
- namespace
- anonymous namespace
- visible only in the translation unit, internal linkage
- inline namespace
- can be used to version a library
- C++17 attributes on namespaces
- tools
- `c++filt`, `ldd`, `nm`, `objdump`, `readelf`
- “Common” Project Organization
- [C++ Core Guidelines](https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines)
- Naming style conventions can be also enforced by using tools like `clang-tidy`'s [readability-identifier-naming](https://clang.llvm.org/extra/clang-tidy/checks/readability/identifier-naming.html#readability-identifier-naming)
- Use braced direct-list-initialization or copy-initialization for setting default data member values. Avoid initialization in constructors if possible
struct A {
    int x = 3;   // copy-initialization
    int y { 3 }; // direct-list-initialization (best option)
};
- C++20 Ranges
- operate on elements of data structures uniformly
- pipe operator `|`, `|=`
- `std::views::*` for lazy views; the eager `actions::*` are from range-v3, not the standard library
- range factories (e.g. `std::views::iota`) produce views that generate their elements directly, rather than adapting an underlying range
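A minimal sketch (mine) of a lazy ranges pipeline and a range factory:

```cpp
#include <iostream>
#include <ranges>
#include <vector>

int main() {
    std::vector<int> v{1, 2, 3, 4, 5, 6, 7, 8};

    // Lazy pipeline: nothing runs until iteration.
    auto evens_squared =
        v | std::views::filter([](int x) { return x % 2 == 0; })
          | std::views::transform([](int x) { return x * x; });
    for (int x : evens_squared) std::cout << x << ' ';  // 4 16 36 64
    std::cout << '\n';

    // A range factory: generates elements with no backing container.
    for (int x : std::views::iota(0, 4)) std::cout << x << ' ';  // 0 1 2 3
    std::cout << '\n';
}
```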
- value categories: an expression is an lvalue or an rvalue; an rvalue is either a *p*ure rvalue (prvalue) or an e*x*piring value (xvalue); a *g*eneralized lvalue (glvalue) is an lvalue or an xvalue
- `std::move` indicates that an object is no longer needed and can be moved
- C++11 reference collapsing rules
- C++11 perfect forwarding with `std::forward`: allows preserving argument value category and const/volatile modifiers (see the sketch below)
- copy elision for return value (C++17 guarantees some cases)
- C++23 decay copy with `auto(x)`
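A minimal sketch (mine) of perfect forwarding; `emplace_logged` is a made-up helper name:

```cpp
#include <iostream>
#include <string>
#include <utility>
#include <vector>

// Args&& is a forwarding reference; std::forward preserves each argument's
// value category (lvalue stays lvalue, rvalue stays rvalue) and cv-qualifiers.
template <typename Container, typename... Args>
void emplace_logged(Container& c, Args&&... args) {
    std::cout << "emplacing\n";
    c.emplace_back(std::forward<Args>(args)...);
}

int main() {
    std::vector<std::string> v;
    std::string s = "hello";
    emplace_logged(v, s);             // forwarded as lvalue: copies s
    emplace_logged(v, std::move(s));  // forwarded as rvalue: moves s
    emplace_logged(v, 5, 'x');        // forwarded to string(5, 'x')
}
```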
- C++23 `std::expected`
- return a value or an `std::unexpected`
- the user can choose how to handle the error: check return value, throw an exception, or monadic operations
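A minimal sketch (mine) of `std::expected` with monadic operations; `parse_positive` is a made-up example:

```cpp
#include <expected>  // C++23
#include <iostream>
#include <string>

std::expected<int, std::string> parse_positive(int x) {
    if (x <= 0) return std::unexpected("not positive");
    return x;
}

int main() {
    auto r = parse_positive(-3)
                 .transform([](int v) { return v * 2; })  // skipped on error
                 .or_else([](const std::string& e) -> std::expected<int, std::string> {
                     std::cout << "error: " << e << '\n';  // handle the error
                     return 0;                             // recover with a fallback
                 });
    std::cout << *r << '\n';  // 0
}
```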
- future
- `std::async(f, args)`
- launch policy: `std::launch::async`, `std::launch::deferred`
- `.get()` or `.wait()`
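A minimal sketch (mine) of `std::async` with the two launch policies; `slow_square` is a made-up task:

```cpp
#include <future>
#include <iostream>

int slow_square(int x) { return x * x; }

int main() {
    // async: may run on a new thread right away.
    std::future<int> f = std::async(std::launch::async, slow_square, 7);
    // deferred: runs lazily, on the first .get()/.wait().
    std::future<int> g = std::async(std::launch::deferred, slow_square, 8);

    g.wait();                                // the deferred task runs here
    std::cout << f.get() + g.get() << '\n';  // 49 + 64 = 113
}
```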
Setup:
- Use SketchyBar
- Use `just prep-sbar` to setup SketchyBar
- Use `just prep-skhd` to setup some extra key bindings
- `Shift + Alt + space`: toggle SketchyBar
- Modify `~/.config/sketchybar/bar.lua` as follows:
```lua
display = "all",
sticky = true,
```
for better multi-monitor support, and make the bar sticky.
Unfortunately, SketchyBar actively hides itself for fullscreen spaces; see this issue for more info.
So, remember to use `Cmd + Ctrl + F` to toggle fullscreen mode.
Setup:
- Use Amethyst
- sensible default
- its configuration seems less confusing than yabai's
- Use `just prep-tile` to install Amethyst
Keys:
- `Option + Shift + Space` to cycle through layouts
- I prefer only BSP and wide
- `Option + Shift + J/K` to cycle through windows
- `Option + Shift + H/L` to resize windows
Alternative setup:
- Use yabai
- requires skhd for key bindings
- Use `just prep-yabai` to install yabai
- I intentionally choose not to disable SIP
Keys:
- Space manipulation won't work with SIP enabled
- keys to switch between windows are confusing
- place a window on another window to split with it; there are visual hints for the split layout when overlapping different areas
- `Shift + Alt + s`: change how windows are split; I prefer Kitty's own layout management
- `Ctrl + Alt + g`: toggle gaps between windows; I prefer some gap for normal apps but no gap for terminals
Setup:
- Use Kitty
- Use `just prep-kitty` to install Kitty
Keys:
- Font size
- `Cmd +` to increase font size
- `Cmd -` to decrease font size
- OS Windows
- `Cmd + N` to open a new OS window with the same current directory
- Tabs
- `Cmd + T` to open a new tab with the same current directory
- `Cmd + number` to switch to tab number
- particularly useful for tiled windows, as SketchyBar will hide the tab bar
- `Cmd + w` to close the current tab
- Kitty Windows
- `Cmd + Enter` to open a new window with the same current directory
- close the current Kitty window
- `Ctrl + Shift + w`
- or run `exit`
- `Ctrl + Shift + l` to switch to next layout
- `Cmd + r` to resize window, e.g. `t` to make it taller
- Kitty
- `Cmd + q` to quit kitty
- `Cmd + ,` to open kitty config file
- with `Ctrl` to reload config
Here are some useful keys that I like using in NeoVim, some of them require corresponding plugins.
I use
- LunarVim
- it has some pre-installed plugins
- Task `prep-lvim` in my justfile is how I installed LunarVim. Task `lvim` will synchronize the configurations, change the working directory, and open LunarVim for a specified project.
- my LunarVim configuration
- LunarVim specific keys are marked with 🌕
- AstroVim
- seems to be a better starting point
- see astrocommunity for configuring plugin packs
- I've ported my plugins to it, except for forester
- AstroVim specific keys are marked with 🌌
- LazyVim
- it has some pre-installed plugins
- similarly, see task `prep-lazyvim` and `lazyvim` in my justfile
- LazyVim specific keys are marked with 💤
- NvChad
- it's a lightweight configuration of ~900 LOC
- see tasks `prep-chad` and `chad` in my justfile
- NvChad specific keys are marked with 📦
- For all Vim flavors, the following apply:
- shared configurations
- shared plugins
The `Esc` key is used to escape to normal mode, to use Vim's powerful motions. It's the most frequently used key in Vim.
- I have configured CapsLock to mean Esc
- on Mac, go to `System Preferences > Keyboard > Modifier Keys > Caps Lock Key`
- I have configured `jk` in insert mode to mean Esc
- this is a common practice, but I don't seem to need it much
The leader key is used to trigger many key mappings in normal mode.
The default leader key is a single space, represented by `<leader>` below.
🌕 The default local leader key is not set. I've configured it to be two consecutive spaces. `,` is also a common choice.
💤 The default local leader key is set to `\\`. I didn't change it.
With the plugin `whichkey` installed, press `<leader>` and wait to see some useful mappings.
The most frequently used key mappings are `<leader>` + one character. There are also many two-character mappings, with the first character indicating a category.
> To generate a truly random string, place a new user in front of Vim and ask them to exit
Most of the time, one can use `Esc` + `:qa!` to force quit Vim, with potential loss of unsaved changes.
In most plugin UIs,
- it can be quit with a single `q`, or `Esc` + `:q`; when in despair, try also `Ctrl+c`
- it will usually enter normal mode first, even in terminals; a single `i` is required to enter insert mode, to actually type something
- it might provide visual hints on how to navigate, usually a single number or a character in brackets
- `g?` or `?` might bring up help on key mappings.
Recording is a powerful feature of Vim, but one can accidentally trigger it with a stray `q`, so one simply needs to remember to press `q` again to quit recording upon seeing something like `recording @a` in the status bar.
Other first aid key mappings or commands available:
- `:map` to see all key mappings
- `:help key-notation`
- in case it's not immediately clear that
- `C` means `Ctrl`, `S` means `Shift`
- `M` means `Alt`/`Option`
- `D` means `Cmd` on Mac
- use `:Lazy` to check on plugins
- use `:Mason` to check on LSPs
- 🌕
- `<leader>+;` to open the dashboard for frequently used actions
- `<leader>+Lr` to reload the config
- `<leader>+li` to see LSP information
- 📦
- `<leader>+ch` to open cheatsheet
- Theming
- being dropped into an unfamiliar theme warrants first aid
- `:colorscheme ` and tab to select a theme
- 🌕 `<leader>+sp` to search/select and preview themes
- my favorite is `railscasts`
- 💤 defaults to `tokyonight`
- 📦 defaults to `onedark`
- Resume session
- `<leader>+rs` to resume the last session
- `<leader>+rS` to select a session to resume
- Explore and find files
- `<leader>+e` to toggle file explorer on the left
- `x` to cut, `c` to copy, `p` to paste
- `a` to create a file, `d` to delete, `r` to rename
- `H` to toggle hidden files, `I` to toggle git ignored files
- `Y` to copy relative path, `gy` to copy absolute path
- `s` to open in system file explorer
- `g?` for help, `q` to quit, `R` to refresh
- `hjkl` to move up/down, collapse/expand
- it will automatically follow the current file
- 💤 and 📦 use NeoTree
- can't be toggled, need `q` to quit
- `?` for help
- 🌕
- `<leader>+sr` to fuzzy find recent files
- `<leader>+sf` to fuzzy find files
- `<leader>+st` to grep text in files
- `<leader>+bf` to fuzzy find buffers
- 💤
- `<leader>+fr` to fuzzy find recent files
- `<leader>+ff` to fuzzy find files, or simply double space
- `<leader>+/` to grep text in files
- Manipulate buffers and windows
- use `L` for going to next buffer, `H` for previous buffer
- use `<C-hjkl>` to move between windows
- use `-`/`=` to resize, or with `C-` for vertical resizing
- Jump between files
- `gd`: go to definition
- `Ctrl`+click also works, note that it's not cmd+click on mac
- `gl`: show diagnostics under cursor
- `<C+o>`: go back, `<C+i>`: go forward
`:jumps` to see the jump history
- `gf`: go to file
- `K` : hover, backspace to cancel hover
- `Shift`+click on URL to open in browser
- In-file outline
- `<leader>+o` to toggle the outline on the right
- I have configured it to autojump and autoexpand
- Folding
- the following need to be run inside the fold
- otherwise they will work on the upper level
- `za` to toggle fold
- `zA` to toggle all folds
- `zO` or `zC` to open/close folds recursively
- e.g. `zCV` then `<leader>+/` is useful to toggle comments for a folded range
- Open file
- `:e` to open a file, with tab completion, but it's better to use the fuzzy finder above
- `:vs` to open a file (or the current buffer) on the right
- `:sp` to open a file (or the current buffer) on the bottom
- Save & quit
- `:bd` (i.e. buffer delete) to close the current buffer/tab
- won't close if there are unsaved changes, `:bd!` to force close
- In VSCode Vim mode, should stick to `cmd+s` for saving, `cmd+w` for closing a tab
- 💤
- `:w` or `Ctrl+s` to save
- `<leader>+bd` to close a buffer with confirmation on unsaved changes
- 🌕 prefer to use
- `<leader>+w` to save
- `<leader>+c` to close a buffer with confirmation on unsaved changes
- `ZZ` to save and quit
- Repetition
- prefix with a number to repeat a motion multiple times, e.g. `3j`, `3w` for the impatient
- `.` to repeat the last edit motion, but not move motions
- Move motions
- On Mac, combine with the touchpad to be even more efficient
- `gg` for beginning, `G` for end
- `zz` to center the current line, `zt` to top, `zb` to bottom
- prefixing with a number works on the corresponding line
- `w`/`b` to move forward/backward by word
- `W`/`B` to move by space-separated words
- `e` works like `w` but moves to the end of the word
- `ge` works like `b` but moves to the end of the previous word
- `h`/`j` are for left/down; `k`/`l` are for up/right
- pointer finger on `j`, then muscle memory for `jkl`, but `h` is a little difficult to reach
- `0` to move to the beginning of the line
- I don't like `^` and `$` because they are difficult to reach
- but in the case of `d$`, it's worth it to reach for `$`
- `f` + character to move to the next such character on the same line
- `;` and `,` to move to next or previous
- `t` works like `f` but moves to one character before, just like `i` for `a`
- e.g.
- `ct)` is useful for change until the next `)`
- `vt)p` is useful to select until the next `)` and paste over the selection
- `F` and `T` work like `f` and `t` but move backward
- `}` to the next paragraph
- Edit motions
- to enter INSERT mode
- `i` for before the cursor
- `I` to insert at the first non-whitespace character of the line
- `0i` to insert at the very beginning of the line
- `a` for after the cursor
- `A` to append at the end of the line
- `c` (i.e. change) to replace by motion, e.g. `cw` to replace a word
- `R` to replace by character
- `u` to undo, surprisingly `Ctrl+r` to redo
- Text objects
- `i` for inside, `a` for around
- works after the action, before the motion
- e.g. `ciw` to *c*hange *i*nside the *w*ord
- note that `cw` only changes the part of the word after the cursor, but `ciw` changes the whole word
- `a` also affects the surrounding whitespace and punctuation
- with `mini.ai` extending the text objects, we have a more general way to describe them
- `b` for brackets
- `q` for quotes
- `c` for class
- `f` for function bodies
- `a` for arguments
- `?` to specify left and right delimiters interactively
- Cut/copy/paste
- `x` to immediately delete the character under the cursor
- `d`+direction to delete a character in that direction
- prefer to use number + `dj`/`dk` to delete multiple lines
- `dw`/`dd` to delete a word/line
- all the above will also cut the deleted text
- `yw` to copy the current word, `yy` to copy the current line
- `p` to paste after, `P` to paste before
- for pasting into a pair of brackets/quotes etc., move to the opening one (e.g. use `Esc` + `b`), then `p` or `Cmd+v`
- `yyp` to duplicate a line
- `<leader>+s+R` to search the registers for cut/copied text
- on Mac, `Cmd`+`c`/`v`/`x` also works regardless of which mode Vim is in
- but it seems not to work reliably across different instances of Vim, especially inside various plugins
- `y` usually guarantees the text is copied to the system clipboard, then use it in other applications
- Search
- `/` to search forward, `n`/`N` to go to the next/previous match
- `?` to search backward
- `*` to search the word under the cursor
- remember to press `Enter` after typing the search term
- remember to treat the search as a regex
- i.e. to escape `\`, `()`, `[]`, `{}`, `^`, `$`, `*`, `+`, `?`, `|`, `.` etc.
- use `<leader>+h` to clear the search highlight
- `gn` to select the next match, `gN` to select the previous match
- repeating this extends the selection to the following matches
- prefix with an operator to operate on the selected text
- Replace
- use `:%s/before/after/g` to replace all occurrences; remember the `%`, otherwise it will only look for the pattern in the current line
- `<leader>+ss` to toggle the search and replace panel
- `<localleader>+r` to replace all occurrences
- manually: use `gf` to go to each file, make the edits, then `Ctrl+o` to go back and continue with the next file
- Select text
- use `V` to enter line visual mode
- this is the most frequently used visual mode, to select multiple lines
- lowercase `v` can be used for character visual mode, but I find it less useful
- `Ctrl+v` to enter block visual mode, i.e. select by column
- Indentation & comment
- use `<<`/`>>` to indent/unindent the current line
- use a single `<`/`>` to unindent/indent the selected text
- 🌕
- `<leader>+/` to toggle comment
- works for both current line and selected text
- 💤
- `gc` to toggle comment for selected text
- `gcc` to comment/uncomment a line
- Multiple-cursor editing (⧉)
- use `/xyz` + enter to search for `xyz`, then `cgn` to change it to something else, `Esc`, then
- use `.` to do the same for next occurrences
- use `n` to skip, and `u` to undo
- use `Ctrl+v` and direction keys (e.g. `j` or `5j`) to vertically select
- use `c` to change
- use `I` to prepend
- use `A` to append
- use `d` to delete
- type the new text, then `Esc`, and it will be applied to all lines of the selection
- use `gv` to reselect the last selection
- Visual selection with mouse/touchpad
- motion-compatible selections
- dragging works like `v`
- triple click + scroll works like `V`
- `alt/option` + drag works like `Ctrl+v`
- select and
- ✅ delete by `d`/`x`
- ✅ cut/copy/paste
- ✅ comment/uncomment
- ❌ typing
- ✅ use one of `i`, `a`, `c` etc., then start typing
- ❌ indent/unindent with `Tab`/`Shift+Tab`
- ✅ indent/unindent with `<`/`>`
- vertical select and
- ✅ delete
- ✅ paste
- ❌ typing
- ✅ use one of `I`, `A`, `c`, then start typing
- Recording
- `qa` starts recording user's inputs, `q` quits recording, `@a` replays the recording, `a` could be any other letter
- it could serve as ad hoc key mappings
- Diagnostics
- by default, diagnostics are shown as virtual text, I've disabled it for now
- `<leader>+l` with `j`/`k` to move between diagnostics
- `<leader>+ld` to show diagnostics in the current buffer
- `<leader>lw` can search for diagnostics in workspace
- `Trouble` can be used to show diagnostics in a separate panel
- `<leader>+xx` to open the trouble panel at the bottom
- use `<leader>+xX` to see only the current buffer's diagnostics
- click on diagnostics to jump to the corresponding location
- `<leader>+lq` to open the quickfix list
- `<leader>+la` to open the code action list
It's very important to stay in the flow and in Zen mode as much as possible. Split windows for terminals break the flow, so I prefer float terminals.
- Float terminals
- 🌕 `Ctrl+\`to toggle a float terminal
- 🌕 number + `Ctrl`+backslash to toggle the n-th float terminal
- 💤 same, but with `Ctrl+/`
- it's insert mode on first launch
- but it usually enters normal mode when it loses focus
- Split terminals
- `:term` to open a terminal
- it's normal mode by default, `i` to start typing commands
- `:sp term://zsh` to open a terminal running `zsh` at the bottom
- 📦
- use `<leader>+h` to open a new horizontal split terminal
- `:q` to hide terminal
- `<leader>+pt` to pick hidden terminal
- Working inside the terminal
- `exit` will exit the terminal and close the window associated with it
- `Ctrl`+`c` can be used for interrupting the terminal
- git
- install and use `lazygit`
- `<leader>+gg` to open lazygit, `q` to quit
- on a file/directory
- space to stage/unstage, `i` to ignore, `d` to discard
- `c` to commit, `P` to push, `p` to pull
- `r` on a commit to reword
- in editor, `<leader>+gs` to stage the current hunk, `<leader>+gr` to revert the current hunk
- usually lazygit is good enough, one can also use `:DiffviewOpen` to inspect diff and handle merge conflicts
- Github Actions
- `<leader>+ga` to open the Github Actions split window
- deprecated: `<leader>+gh` because it conflicts with 💤
- `gj` to open the job under cursor, `gr` for the run, `gw` for the workflow file
- `<leader>+gq` to close the split window
- Octo
- `:Octo <Tab>` to operate on issues and PRs
- it's interesting but I've disabled it for now, in favor of using the web interface
- Auto-complete
- with Github Copilot
- `:Copilot setup` to authenticate etc.
- `Tab` to accept suggestions
- `Shift+Tab` to use the normal Tab
- with `blink.cmp`
- I've configured it not to trigger by default, in favor of Github Copilot
- toggle by `Ctrl+q`
- it can also be triggered by `down` in insert mode
- up/down to cycle through suggestions, `Enter` to accept
- the above is set up so all these key mappings are explicit, intuitive, and won't interfere with `<Tab>` for Github Copilot
- with `nvim-cmp` (deprecated)
- `Tab` to cycle through suggestions from `cmp`, reverse cycle with `Shift+Tab`
- continue typing to select a suggestion, which is better than hitting `Enter`
- `Ctrl+e` to reject suggestions, and `Tab` to take suggestions from Github Copilot
- LLM Chats
- Github Copilot
- `<leader>+aa` to toggle the chat, `<leader>+al` to clear chat history
- `<leader>+a` + a character
- for selected code block, **e**xplain, **r**eview, **o**ptimize, **R**efactor, **t**est
- **d**ebug, **f**ix, co**m**mit log, **n**aming etc.
- `?` to select the model
- `C-s` to send the prompt
- `:help copilot` to learn more
- Aichat
- `aichat` with a little setup at `~/Application Support/aichat/config.yaml`
- `.` + tab to see available commands
- Aider
- it's great for multiple-file code generation
- I've created an `aider` shim to run it in a virtual environment with preferred configurations
- `/` + tab to see available commands
- mostly it's a combination of `/add`, `/web`
- use `/token` to check token usage, use `/clear`, `/drop` and check `/map` to save token usage
- Code Companion
- `<localleader>+a` to toggle the inline chat (works also for selection)
- `<localleader>+z` to toggle the chat sidebar
- Parrot
- `<leader>+pr` to rewrite the selection
- `<leader>+pp` to implement the selection
- `<leader>+pn` to start a new chat, optionally with the selection
- select and `:Prt` + `<Tab>` to select advanced prompts
- `:PrtInfo` to inspect configurations
- `:PrtModel` to select the model
- I've configured it to use a buffer to input the prompt, escape twice to send it
- it seems unable to stop generation sometimes, use `:PrtChatStop` to stop
- Avante (deprecated for now)
- `<leader>+aa` to open the chat, use `a` to accept individual suggestions, or `A` to accept all
- select code and `<leader>+ae` to modify code
- it's interesting but I've disabled it for now, in favor of Github Copilot
- Spellcheck
- to use NeoVim's built-in spellcheck
- use `:set spell`
- `]s`/`[s` to move between misspelled words
- `zg` marks a word as good, `zw` marks a word as wrong
- `zug`/`zuw` to undo
- `z=` to see suggestions
- Conform
- format on save using the LSP's default formatter
- `<leader>+lf` to format the current file
- showkeys
- `<leader>+kk` to show keys screencaster on top-right corner
- symbols
- `<leader>+se` to fuzzy search symbols, e.g. emoji, LaTeX
- Lean
- set local leader to " " (tentative)
- `<localleader>+i` to toggle Lean Infoview
- `<localleader>+r` to restart Lean server for the current file
- Forester
- `<localleader>+c` to create a new tree and open it
- `<localleader>+b` to browse and search trees by title
- `gf` to go to the tree under the cursor
I wish to be able to call rust libraries that are WASM compatible to use it for both BSR and SSR.
Here is my initial survey:
- wasm-pack and wasm-bindgen are the two main tools for building Rust and binding it to JavaScript; for more info, see awesome-rust-and-webassembly (a quick sketch of the flow follows this list)
- I wish to write one-file Rust and specify the dependencies in the file; rust-script is similar in spirit, but it's for server-side execution.
- I wish to find prior art on doing this with bun; I found bun-ffi-rust and hyperimport, but again, they're for server-side execution.
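As a sanity check of the BSR side, the standard wasm-pack flow looks roughly like this (a sketch based on wasm-pack's documented commands; `hello-wasm` is a hypothetical project name):
```bash
# Scaffold a Rust -> WASM library and build a JS-consumable package
cargo install wasm-pack
wasm-pack new hello-wasm      # generates a template crate wired up with wasm-bindgen
cd hello-wasm
wasm-pack build --target web  # emits pkg/ with the .wasm plus JS glue for browsers
```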
Some links for rendering implicit surfaces [uts-000G]
One could use GlslEditor (which provides widgets to adjust colors, 1-3D numeric values, etc.), GlslCanvas, or glslViewer to play with the shaders. They are created by the authors of The Book of Shaders.
stackgl provides many packages for use in GLSL. LYGIA is a similar effort targeting a wider range of languages (GLSL/HLSL/MSL/WGSL/CUDA), and it also has great integration with many tools, including ObservableJS.
> Discuss on [/r/git](https://www.reddit.com/r/git/comments/1lvbzo8/my_day_1_with_jj_jujutsu/) and [Bluesky](https://bsky.app/profile/iutensil.bsky.social/post/3ltnz23tp2k2z)
## TL;DR
I became productive with [jj](https://github.com/jj-vcs/jj) (Jujutsu, *dauntless* version control) on day 1. This is my story, along with my mindset changes, and delicious recipes. Scroll down to the end for a short list of when to use what command/alias.
## Why try `jj`?
Well, *dauntless* is not the official slogan, but I really need it to describe a liberating feeling much stronger than "fearless".
Git is an intimidating black box for some, but I don't fear it in my daily work, as I'm skilled enough to easily recover work if I messed up.
`jj`, on the other hand, gives me the ultimate power in the river of time:
> In `jj`, I can go anywhere in history, do anything, be done with it, and history will reshape itself accordingly.
I couldn't have known this when I first read [Steve's Jujutsu Tutorial](https://steveklabnik.github.io/jujutsu-tutorial/) months ago; it's well written, but still a haystack in which to find the needle that would click with me.
[jujutsu on tangled](https://blog.tangled.sh/blog/stacking) intrigued me, by telling me "Jujutsu is built around structuring your work into meaningful commits", and showing how it helps contributors to iterate on changes with dependencies on other changes, individually reviewed, reworked, and merged. But what if I just use it on my personal repo, and don't care much about clean history?
[A Newbie's First Contribution to (Rust for) Linux: Git gud](https://blog.buenzli.dev/rust-for-linux-first-contrib/#git-gud) shows the author's Git workflow to contribute to Linux kernel with a lot of fast moving forks. The configuration for `rebase` and `rerere` seems really solving the roadblocks of contributing to a thriving project. But the author said: "Jujutsu is much better at this kind of workflow and what I was actually using most of the time. I first discovered this workflow while using Jujutsu and only later found out you could do something similar with Git." A refreshing remark! Maybe I do need native and friendly support for more advanced features, even just to alleviate my mind.
Eventually, it was pksunkara's [Git experts should try Jujutsu](https://pksunkara.com/thoughts/git-experts-should-try-jujutsu/), his side-by-side workflow comparisons with Git, and his [`jj` configuration full of aliases](https://gist.github.com/pksunkara/622bc04242d402c4e43c7328234fd01c) that gave me the first boost, so I finally decided to spend a day with `jj`. I'm definitely no Git expert, but those aliases really look tempting; what if I could pick my few favorites?
I started the journey with a minimal goal. I just wanted to recover my Git workflow in `jj`, as `jj` coexists with a local Git repo, and I need `jj` to work with my GitHub (or any code forge I might migrate to in future). I learned much more, and they are much easier than I thought.
## Protect my main branch
First thing, I went to my GitHub settings and [protected my main branch](https://docs.github.com/en/repositories/configuring-branches-and-merges-in-your-repository/managing-protected-branches/about-protected-branches), to at least forbid force push, in case (my misuse of) `jj` messes up the remote. ([Here](https://ricky-lim.github.io/blog/protect-your-main-branch-in-github/) is how to do even better from CLI.)
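For reference, here's a minimal sketch of forbidding force pushes from the CLI with `gh` (GitHub's REST "update branch protection" endpoint requires all four top-level keys; `OWNER/REPO` are placeholders):
```bash
gh api --method PUT "repos/OWNER/REPO/branches/main/protection" --input - <<'EOF'
{
  "required_status_checks": null,
  "enforce_admins": true,
  "required_pull_request_reviews": null,
  "restrictions": null,
  "allow_force_pushes": false
}
EOF
```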
## Install and configure `jj`
I use [just](https://github.com/casey/just) to write code snippets as tasks. Installing `jj` on a Mac and preparing the configuration is simply:
```make
prep-jj:
which jj || brew install jj
mkdir -p ~/.config/jj
cp -f dotfiles/.config/jj/config.toml ~/.config/jj/config.toml
jj config set --user user.name "$(git config --get user.name)"
jj config set --user user.email "$(git config --get user.email)"
jj config list --no-pager
```
The [`dotfiles/.config/jj/config.toml`](https://github.com/utensil/forest/blob/main/dotfiles/.config/jj/config.toml) is an adapted version from pksunkara's, and all changed places are annotated with `[NOTE]` comments and inspiring links. By copying to the right path, it becomes the user-wide configuration.
`jj config set --user` will write additional, more personal information into the same file (assuming Git already has these configured), so those lines come later. Finally, the configuration is listed for inspection.
Every time I've updated the configuration, or I'm on a new machine, I'll run `just prep-jj` again, so I'm all set again. Your way to manage code snippets might be different, but it all boils down to similar few lines.
You might want to change the line `editor = 'hx'` back to your favorite editor; mine is now [Helix](https://github.com/helix-editor/helix).
It also assumes that you have installed [delta](https://github.com/dandavison/delta), a syntax-highlighting pager for git, diff, grep, and blame output. It's eye candy, and makes diffs much clearer.
## Initialize the Git repo
`jj` can coexist, or officially, *colocate* with a local Git repo. I place all my local Git repos under `~/projects/` so my just task looks like this:
```make
init-jj PROJ:
#!/usr/bin/env zsh
cd ~/projects/{{PROJ}}
jj status || jj git init --colocate
```
`jj status` will detect if it's already a `jj` repo; otherwise, we initialize `jj` to colocate with Git. I try to make all my tasks reentrant, so I can easily tweak and rerun them.
Actually, pksunkara already aliases this to `jj init`, but here we go with the original command, to avoid lock-in. Later on, though, I'll use more aliases, and you may consult [the configuration](https://github.com/utensil/forest/blob/main/dotfiles/.config/jj/config.toml) for the full commands.
## Thrown into limbo
Once `jj` is initialized, we are no longer on any branch of Git. If we go back to `git` or `lazygit`, we'll see that we are in a detached HEAD state.
I don't know about you, but my feeling was like floating in the air, with no solid ground under my feet.
I felt much better when I learned that what I commit as *change*s in `jj` will still be reflected in Git as commits, and commits can still find their parents.
We can go back and edit a *change* and then commit again; it becomes a new Git commit, but its identity as the original *change* remains the same. So changes can still find their parent changes.
Actually, a branch in Git is just a reference that always points to the latest commit of the branch, so its history can be traced via parents.
`jj` has the *bookmark* for pointing to a *change*. Unlike in Git, when you commit more changes on top of it, the *bookmark* doesn't move to the latest change automatically; however, *we* can move it.
With these 2 powerful concepts, we are lifted off the ground and start flying among changes, with bookmarks marking our path.
## The working copy
OK, we are not on any branch, so where are we?
`jj log` (or simply `jj l`, or even just `jj`) gave me something like this at the top:
```
@ rrptyzpt 0f23c873 (empty) (no description set)
◆ vmzsxwzu f800c4ed My last commit on the main branch
```
That's a lot to take in!
`f800c4ed` and `My last commit on the main branch` can be recognized as my last Git commit hash and message.
`vmzsxwzu` is the *change id* assigned to it, along with a diamond ◆ indicating that it's immutable: It's written on the stars by Git, so we can't edit it.
But what's the commit-like stuff on top of it? I didn't like it, and even tried to `jj abandon` it, but it keeps coming back with a new *change id* and a new commit hash.
Finally, I accepted it as the "working copy", indicated by `@`. Unlike in Git, where I'd be on HEAD with some local changes to be staged, in `jj` I'm already on the future commit, as a working copy. That's why we sometimes need `@-`, which points to the parent of `@`, the last committed *change*.
`(empty)` means I haven't made any edits to this *change*. `(no description set)` means I haven't described what I'm going to do for this change (via `jj desc`).
I'd feel much better if it were just `(no description)`, as the `set` makes me feel guilty for not providing a description ahead of time. And it turns out that I didn't need to! Even after I'm done with it and have moved on (`jj` keeps the change in its history, like automatically doing `git stash`), if I have trouble recognizing the change without a description, I can always use `jj d` to inspect the diff.
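For the record, a description can always be added after the fact; a tiny sketch (`jj desc` is short for `jj describe`):
```bash
jj desc -m "Tweak just tasks for jj"  # describe the working copy, whenever the intent is clear
jj d                                  # or just inspect what the change actually did
```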
`jj s` (status) shows it more verbosely:
```
The working copy has no changes.
Working copy  (@) : rrptyzpt 0f23c873 (empty) (no description set)
Parent commit (@-): vmzsxwzu f800c4ed main | My last commit on the main branch
```
The change id of my working copy begins with `rr` (bold and colored, indicating we can refer to it uniquely with `rr`).
It's a bit annoying to enter and quit the pager, and to be unable to see the latest changes, so I've configured `jj` to use no pager and to only occupy about 2/3 of the screen.
There are also aliases: `jj lp` shows private (i.e. not pushed) changes, and `jj lg` shows all changes, including elided changes that you just pushed.
## Commit
I was working in a repo to which I'll also add my just tasks and configuration for `jj`. So I edited the justfile in my editor Helix, and saved it to disk.
The alias `jj d` shows the diff, which confirms that `jj` has seen our edits.
We can now commit the change. `jj commit` will commit everything, opening the configured editor for commit message (just save and quit to submit, quit without saving to cancel).
Aliases like `jj ci` will also fire up a built-in diff editor to interactively select which files, or even which parts (called sections), to commit. Arrow keys fold/unfold files/sections, space selects, `c` commits, and `q` quits. (I couldn't figure out how to commit until I tried `c`; there is no visual hint at all, unless you use the mouse to click on the menus, but I didn't realize I could do that on day 1, so I tried every possible key combination to trigger the menu, with no luck.)
I often find myself repeatedly using `jj ci` to review, select, and commit, splitting my local edits into a few independent changes.
With alias `jj cim <message>`, we can directly supply the commit message from CLI without opening an editor, then select which parts to commit.
## New
After committing, we're again on an empty change with no description, so we can continue making new edits.
But if, after working for a while, I suddenly want to discard my (uncommitted) edits, I can do `jj n rr` to start over from the last change. The uncommitted edits will still be there in `jj` as a change, but the local files will be reset to the state when we committed `rr`.
Here `rr` could be any other change id, or any bookmark, or any *revset* in general. We'll learn more about *revset*s later; for now, we just need to know that a change id, a bookmark, and the `@` and `@-` we've seen so far are all *revset*s.
`jj n` (new) effectively spins off a new history line from the parent change. We can go along the line for a few changes, and spin off from anywhere again, without having to create a named branch.
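Putting it together, a sketch of this spin-off flow, using commands from this post (`rr` and `vm` are the change ids above):
```bash
jj n rr             # start a new change on top of rr, local files reset to it
# ...edit files...
jj cm "Try idea A"  # commit everything with a message (alias from the config)
jj n vm             # spin off another history line from vm, no branch needed
```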
## Edit
In Git, I seldom amend a commit; usually I just use `reword` in lazygit to amend the commit message, or `reset --mixed` (`gm` in lazygit) to discard the history after a commit, then stage and commit it in a better way again.
In `jj`, if I don't like a change, I could go back and edit it:
```
jj e <revset>
```
will auto-stash any dirty edits, and reset the local files to the state of that `<revset>` (read: change). I can then happily make any edits, or commit it with alternative selections of files or sections.
We don't need to commit explicitly: when we use `jj n <revset>` or `jj e <revset>` to jump anywhere else, the local edits are automatically committed to the original change.
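So the whole edit round-trip is just (a sketch with a placeholder `<revset>`):
```bash
jj e <revset>   # jump back; local files now mirror that change
# ...fix things...
jj n @          # move on; the fixes are auto-committed into the edited change
```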
## Evolution of a change
Wait, what if I messed up when editing a commit, and left it, can I go back?
At that point, it's beyond what `jj undo` can easily do, especially as we might be confused about what we did and didn't do.
The first thing we can do is to inspect how a change has evolved over time by
```
jj evolog -r <revset>
```
It looks like this:
```
❯ jj evolog -r uo
○ uoxoknux 92fb8d3a (no description set)-- operation a0693515ab31 (2025-07-09 11:54:16) snapshot working copy
○ uoxoknux hidden 1f73cdfd (no description set)-- operation 79813357f450 (2025-07-09 11:52:58) snapshot working copy
○ uoxoknux hidden c2bf8f4f (empty) (no description set)-- operation a22c21ce9fc1 (2025-07-09 11:52:32) new empty commit
```
We can inspect the diff of each hidden commit by its commit hash via `jj dr <revset>` (alias for `diff -r`), where `<revset>` should be the commit hash, to disambiguate.
Then we can use either `jj n` or `jj e` on the commit hash of our choosing, to continue work.
Note that the change id will have become ambiguous, so we need to `jj ad` (abandon) one of the commit hashes, so we can continue to refer to the change id.
It looks like this for me:
<details>
<summary>click to expand</summary>
<pre>
❯ jj n 1f73
Working copy (@) now at: unyrkpru 003d7750 (empty) (no description set)
Parent commit (@-) : uoxoknux?? 1f73cdfd (no description set)
Added 0 files, modified 1 files, removed 0 files
❯ jj e 1f73
Working copy (@) now at: uoxoknux?? 1f73cdfd (no description set)
Parent commit (@-) : mxowtxvr 68b37035 (no description set)
❯ jj evolog -r uo
Error: Revset `uo` resolved to more than one revision
Hint: The revset `uo` resolved to these revisions:
uoxoknux?? 92fb8d3a (no description set)
uoxoknux?? 1f73cdfd (no description set)
Hint: Some of these commits have the same change id. Abandon the unneeded commits with `jj abandon <commit_id>`.
❯ jj ad 92f
Abandoned 1 commits:
uoxoknux?? 92fb8d3a (no description set)
Rebased 1 descendant commits onto parents of abandoned commits
</pre>
</details>
We may also use `jj op diff --op <operationid>` to inspect the meta changes of an operation (it has `--from`/`--to` variants, so it also works on a range of operations).
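A sketch of how that looks in practice (the operation id is the one from the `jj evolog` output above):
```bash
jj op log                     # list operations, newest first
jj op diff --op a0693515ab31  # inspect what this operation changed
```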
## PR with an ad hoc branch
> see also the official doc [Working with GitHub: Using a generated bookmark name](https://jj-vcs.github.io/jj/latest/github/#using-a-generated-bookmark-name)
If I like my changes so far on the current history line, I can open an unnamed PR by first committing the changes, then using the alias
```
jj prw
```
where `prw` stands for PR working copy.
Per our config, it will push to a branch named `jj/<changeid>` on origin, then open a PR via `gh` (GitHub's official CLI for interacting with issues, PRs, etc.). It will ask for a title (defaulting to the description of the first change) and a body; we can just press enter all the way to submit, and change anything later.
Personally, I think such a tiny ad hoc PR is a good way to record the proposed changes, even for a personal repo. But if pushing to a branch is all you need, the alias `jj pscw` (push changes of working copy) will do.
After merging the PR, we need
```
jj git fetch
```
to let both Git and `jj` learn about the changes.
Here is how `jj l` looked after I opened the change "Make a start on trying jj" as PR `#1`, merged it on GitHub, then fetched the changes:
```
◆ rzyozkll 929fe190 main Make a start on trying jj (#1)
│
◆ vmzsxwzu f800c4ed My last commit on the main branch
```
`gh pr view #1 -w` can open the PR in the browser.
If we have made edits to the change (but not new changes on top of it, because those would be pushed to their own `jj/<changeid>` branch), e.g. with `jj e` or `jj sqar` (introduced below), we can use
```
jj pscw
```
where `pscw` stands for pushing local changes of working copy.
The result looks like this:
```
Changes to push to origin:
Move sideways bookmark jj/stwxmxoqkyym from 33c46b992efc to 4d3c2a60e0d2
remote: Resolving deltas: 100% (3/3), completed with 3 local objects.
```
If we have made other changes on top of that change, we can treat `jj/<changeid>` as a named branch, as described below (i.e. `jj bs jj/<changeid> -r <latest changeid>` and `jj psb jj/<changeid>`).
## PR with a named branch
This would best resemble a usual GitHub workflow.
First we create a local bookmark:
```
jj bs <bookmark> -r <revset>
```
`<revset>` could be just a change id; `@-` can come in handy to refer to the last committed change.
`<bookmark>` is the name of the bookmark, just like a normal branch name. And `bs` is the alias for `bookmark set`.
We can see if the bookmark is correctly placed from `jj l`. If we don't like it, we can either `jj undo` or `jj bd <bookmark>` (bookmark delete).
Then we can PR that bookmark:
```
jj prb <bookmark>
```
This pushes to a branch named `<bookmark>` on GitHub, then uses `gh` to open a PR for it.
It looks like working on a branch in Git, but after we commit more changes, we need to call `jj bs <bookmark> -r <revset>` to set the bookmark to the desired `<revset>`, then
```
jj psb <bookmark>
```
Note that the alias begins with `ps` instead of `pr`, meaning "push bookmark".
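The whole named-branch loop, using only the aliases introduced above (`my-feature` is a made-up bookmark name):
```bash
jj bs my-feature -r @-   # put a bookmark on the last committed change
jj prb my-feature        # push it and open a PR via gh
# ...commit more changes...
jj bs my-feature -r @-   # move the bookmark forward
jj psb my-feature        # push the updated bookmark to the PR branch
```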
## Fetching
What if we committed some changes on GitHub, like accepting a review suggestion? We need to use `jj git fetch` to make `jj` know about the changes. We'll see something like
```
bookmark: our-name-of-bookmark@origin [updated] tracked
```
`origin` is the remote on GitHub. `[updated]` means `jj` noticed the changes on the remote; `tracked` means the local bookmark is also automatically updated.
If it's not tracked, we can use
```
jj bt our-name-of-bookmark@origin
```
to track it.
Here is how `jj l` looks after I PR'd a few changes this way, then fetched:
```
@ nozpulso fd589b2a (empty) (no description set)
○ kpxkrxvm a22ed20a configure-jj* Add frequently used jj command alias
○ tvvoymuu 6808370c Keybindings for builtin diff editor
○ ylnlzpvx b20ac2d1 Adapt jj config from pksunkara
○ rnkttumw f6296b0a Improve configs in `just init-jj`
◆ rzyozkll 929fe190 main Make a start on trying jj (#1)
◆ vmzsxwzu f800c4ed My last commit on the main branch
```
## Rebase
But I didn't get such a clean history from the start. I'd begun with a few fixup commits, and one of the commits mixed together different changes. Initially, I pushed it with an ad hoc branch and accepted a few review suggestions on GitHub.
After opening the PR, I noticed a weird non-empty commit `xm`:
```
○ ...more changes...
◆ rzyozkll 929fe190 main Make a start on trying jj (#1)
│ ○ xmnykxym 4dcb41d9 (no description set)
├─┘
◆ vmzsxwzu f800c4ed My last commit on the main branch
```
After inspecting the diff, I believed its changes had been absorbed into `rz` (`#1`), so I rebased `xm` on top of `rz` to see if there was any diff left.
```
❯ jj rb -s xm -d rz
Rebased 1 commits to destination
```
where `rb` stands for `rebase`.
I just need to remember that I'll be rebasing source (`-s`) onto destination (`-d`).
Now I have
```
○ ...more changes...
│ ○ xmnykxym 2f622327 (empty) I think this is what absorbed into #1
├─┘
◆ rzyozkll 929fe190 main Make a start on trying jj (#1)
```
Nice, it's indeed empty, so I can be assured no work is lost.
It's an extremely simple use case, but I'm happy with being able to transplant a parallel change to be on top of another change, without much hassle.
## Squash
There are some other complications:
```
@ ...more changes...
│ ○ zquplvzx 2b572336 configure-jj Update dotfiles/.config/jj/config.toml
│ ○ nmuntxon 746cb300 Fix typo found by copilot
├─┘
○ ...some changes in between...
○ ylnlzpvx b20ac2d1 Adapt jj config from pksunkara
○ rnkttumw f6296b0a Improve configs in `just init-jj`
```
I want to absorb typo fixes `nm` and `zq` into `yl`, how do I do that?
It's as simple as correctly specifying `from` and `to`:
```
❯ jj sqa --from zq --to y
Rebased 3 descendant commits
Working copy (@) now at: srnwoqzq c831d68c (empty) (no description set)
Parent commit (@-) : kpxkrxvm 7d911838 jj/vzzzxqvnptov* | Add frequently used jj command alias
Added 0 files, modified 1 files, removed 0 files
```
where `sqa` is short for squash. Note from the output that any changes after them will be properly rebased.
There is also a shortcut if I just want to absorb a change into its parent: `jj sqar <revset>`.
I keep using the word "absorb", but `jj absorb` does a completely different thing: it splits a change and lets other changes absorb it, by "moving each change to the closest mutable ancestor where the corresponding lines were modified last". That sounds more like "dissolve" to me.
## Summary
I could go on and try splitting a change, but the above is enough for day 1.
This doesn't demonstrate the full power of `jj`, but imagining how the same things would be done in Git, I'd definitely rather spend the time with `jj`.
I'll wrap up with a summary of the commands/aliases that would be enough for daily work:
### Getting started
| alias | when to use |
|-------|-------------|
| `jj init` | init `jj` in a local Git repo |
### Show
| alias | when to use |
|-------|-------------|
| `l` | show log, `jj` works better |
| `lp` | show private log |
| `lg` | show all log, even elided changes |
| `s` | show status |
| `w` | show status and diff |
| `d` | show the diff of current change |
| `dr <revset>` | show the diff of revision `<revset>` |
### Edits
| alias | when to use |
|-------|-------------|
| `n <revset>` | create a new change based on `<revset>` |
| `cm <message>` | commit everything with `<message>` |
| `ci` | commit the change, choose interactively, write commit message in `$EDITOR` |
| `e <revset>` | edit any `<revset>` after commit |
`<revset>` could be a short change id or a bookmark, and defaults to `@` (the working copy).
After editing, just go anywhere else (with `e` or `n`) and the changes will be committed; an explicit commit also works.
### Working with remote branch
| alias | when to use |
|-------|-------------|
| `git fetch` | will update local bookmark from remote branch |
| `bt <branch>@origin` | track remote branch |
### PR with named branch
| alias | when to use |
|-------|-------------|
| `bs <bookmark> -r <revset>` | move bookmark `<bookmark>` to `<revset>` |
| `psb <bookmark>` | push bookmark to remote branch named `<bookmark>` |
| `prb <bookmark>` | create a PR from `<bookmark>` to the default branch |
### PR with ad hoc branch
| alias | when to use |
|-------|-------------|
| `prw` | open (committed) working copy as a PR |
| `pscw` | push changes of working copy to remote branch named `jj/<changeid>` |
### History cleanup
| alias | when to use |
|-------|-------------|
| `undo` | undo last jj operation when messed up |
| `rb -s <src> -d <dst>` | rebase `<src>` onto `<dst>` |
| `sqar <revset>` | squash `<revset>` into its parent |
| `sqa --from <src> --to <dst>` | squash `<src>` into `<dst>` |
| `ad <revset>` | abandon a change |
| `evolog <revset>` | show how a change evolved over time |
| `op log` | show log of operations on jj |
| `bd <bookmark>` | delete a bookmark |
These aliases can be found in this [config file](https://github.com/utensil/forest/blob/main/dotfiles/.config/jj/config.toml) which is based on pksunkara's [gist](https://gist.github.com/pksunkara/622bc04242d402c4e43c7328234fd01c).
Trying Zig's self-hosted x86 backend on Apple Silicon [uts-016K]
> TL;DR: I used `colima` to run an x86_64 Docker container (Ubuntu) on Apple Silicon, to quickly test `zig build` with the LLVM backend and with the self-hosted backend, because the news is both exciting and concerning (for missing all the goodies from LLVM).
After Self-Hosted x86 Backend is Now Default in Debug Mode dropped, I immediately wanted to try it out, but I only have an Apple Silicon machine (a Mac Mini M4 Pro).
## Running an x86 container on Apple Silicon
According to Intel-on-ARM and ARM-on-Intel, I'm supposed to be able to run x86_64 Zig using `lima` with Apple's native virtualization framework and Rosetta. After some fiddling and searching, I realized that I could just use `colima` to run an x86_64 container on an ARM64 VM, which is also backed by `lima`.
OK, let's get started:
First, install `colima` and prepare it properly:
```bash
# we need `docker` CLI as the client
which docker || (brew install docker; brew link docker)
# (optional) while we are at it, get `docker-compose` and `kubectl` too
which docker-compose || brew install docker-compose
which kubectl || brew install kubectl
# install colima
which colima || brew install colima
# this is to prevent other docker daemons from interfering
docker context use default
```
Next, let's start colima with Apple's native virtualization framework and rosetta:
```bash
colima start --vm-type=vz --vz-rosetta
```
I had already used colima before, but without these flags; there was a warning saying they were ignored, so I had to delete the existing colima VM and start over.
(Warning: the following command will also DELETE all existing images! So I commented it out to prevent accidental execution.)
```bash
# colima delete
```
Now, we can pull an x86_64 Docker container with Ubuntu:
```bash
# assuming `docker login` has been done already
docker pull --platform linux/amd64 ubuntu:jammy
```
and start it (`--rm` means the container is removed after it exits, so we'll lose any changes made inside; remove this option if you want to keep the container):
```bash
docker run --rm -it --platform linux/amd64 ubuntu:jammy bash
```
Inside the container, let's confirm that we are indeed running x86_64:
```bash
uname -m
```
Cool, I'm seeing `x86_64`!
## Running `zig build`
Now, we can install Zig and try it out:
```bash
# we need a few basic utils
apt update
apt install -y wget xz-utils software-properties-common git
# Download the corresponding Zig version with self-hosted x86 backend
wget https://ziglang.org/builds/zig-x86_64-linux-0.15.0-dev.769+4d7980645.tar.xz
tar xvf zig-x86_64-linux-0.15.0-dev.769+4d7980645.tar.xz
# Make it available in PATH
export PATH=/zig-x86_64-linux-0.15.0-dev.769+4d7980645/:$PATH
# Verify its version and that it runs
zig version
# got: 0.15.0-dev.769+4d7980645
```
Let's create a simple Zig project to test building it:
```bash
mkdir projects
cd projects
mkdir zig_x86_64
cd zig_x86_64
zig init
zig build
```
Success!
`zig build run` gives
```
All your codebase are belong to us.
Run `zig build test` to run the tests.
```
and `zig build test --summary all` gives:
```
Build Summary: 5/5 steps succeeded; 3/3 tests passed
test success
├─ run test 1 passed 7ms MaxRSS:5M
│ └─ compile test Debug native cached 68ms MaxRSS:44M
└─ run test 2 passed 7ms MaxRSS:5M
└─ compile test Debug native cached 67ms MaxRSS:44M
```
## Comparing with and without LLVM
But wait, how do I know it's actually using the self-hosted x86 backend?
Hopefully someone has a better way; I just took the longer route of forcing Zig to build with and without LLVM.
After reading the docs and some searching, I figured out that I could expose an extra option to `zig build` in my `build.zig` to set the corresponding flag for the executable, with only 2 edits:
```zig
// EDIT 1
const use_llvm = b.option(bool, "use_llvm", "Force use llvm or not") orelse false;
const exe = b.addExecutable(.{
.name = "zig_x86_64",
.root_module = b.createModule(.{
.root_source_file = b.path("src/main.zig"),
.target = target,
.optimize = optimize,
.imports = &.{
.{ .name = "zig_x86_64", .module = mod },
},
}),
// EDIT 2
.use_llvm = use_llvm,
});
```
(Optional) I did the edits by running the following to install the Helix editor, so I could edit Zig code out of the box:
```bash
# https://docs.helix-editor.com/package-managers.html#ubuntu-ppa
add-apt-repository -y ppa:maveonair/helix-editor
apt install helix
# then fire up `hx build.zig` and use it mostly like in Vim
# I also installed zls by
# cd /tmp
# wget https://builds.zigtools.org/zls-linux-x86_64-0.15.0-dev.197+48fb941e.tar.xz
# tar xvf zls-linux-x86_64-0.15.0-dev.197+48fb941e.tar.xz
# cp -f zls /usr/local/bin/
# and checked that it works by
# hx --health zig
# so I can use `gd` to go to definitions!
```
Cool, now let's try building with LLVM:
```bash
rm -rf .zig-cache && time zig build -Duse_llvm=true
```
```
real 0m3.068s
user 0m3.610s
sys 0m0.363s
```
Then without LLVM (which should use the self-hosted x86 backend):
```bash
rm -rf .zig-cache && time zig build -Duse_llvm=false
```
```
real 0m2.112s
user 0m2.812s
sys 0m0.361s
```
Wow, it's indeed faster without LLVM! I've tested this a few times and got consistent results. I'll also try this on more complex projects later, but it's so exciting that I just wanted to write a note about it.
### UPDATE 2025-06-15
I further tried using poop to get more metrics.
First, get and install `poop`:
```bash
cd /projects && git clone https://github.com/andrewrk/poop && cd poop && zig build && cp -f zig-out/bin/poop /usr/local/bin/
```
Then let's run the cold start builds again:
```bash
cd /projects/zig_x86_64
rm -rf .zig-cache* && poop "zig build -Duse_llvm=true --cache-dir .zig-cache-llvm" "zig build -Duse_llvm=false --cache-dir .zig-cache-nollvm"
```
Well, that doesn't work, due to permission denied. `--cap-add PERFMON`, or even `--cap-add SYS_ADMIN`, didn't work; not even `--privileged`. See this issue.
Let's try `hyperfine` instead:
```bash
apt install -y hyperfine
```
Taking comments by `mlugg0` on reddit into account, a few factors should be ruled out for a fairer comparison (with C):
1. rule out the build time for `build.zig`;
2. rule out the overhead of panic handler
3. squeeze a bit of performance at the cost of some safety by disabling C sanitization.
1 means building `build.zig` before the benchmark and after cleaning the cache (observing that `zig build --help` will build `build.zig` in order to get the options defined in the build script).
2 means to add the following to `main.zig`:
```zig
/// Don't print a fancy stack trace if there's a panic
pub const panic = std.debug.no_panic;
/// Don't print a fancy stack trace if there's a segfault
pub const std_options: std.Options = .{ .enable_segfault_handler = false };
```
3 means to pass `.sanitize_c = .off` to `root_module` in `build.zig`.
With
```bash
hyperfine --prepare "rm -rf .zig-cache* && zig build --help -Duse_llvm=true && zig build --help -Duse_llvm=false" "zig build -Duse_llvm=true" "zig build -Duse_llvm=false"
```
I got
```
Benchmark 1: zig build -Duse_llvm=true
Time (mean ± σ): 1.392 s ± 0.052 s [User: 1.287 s, System: 0.126 s]
Range (min … max): 1.329 s … 1.473 s 10 runs
Benchmark 2: zig build -Duse_llvm=false
Time (mean ± σ): 546.1 ms ± 13.6 ms [User: 570.1 ms, System: 128.7 ms]
Range (min … max): 532.9 ms … 575.9 ms 10 runs
Summary
'zig build -Duse_llvm=false' ran
2.55 ± 0.11 times faster than 'zig build -Duse_llvm=true'
```
which is indeed even faster!
Discussions on /r/Zig and Bluesky.
This is in no way a professional prediction; I'm just summarizing some ideas that popped into my head in a recent discussion with friends. Take this with a grain of salt.
Reasoning, agents, and multimodality are the top 3 research topics for LLMs, as far as I can tell. For me, there is always also the topic of efficiency, or scalability. And of course, beyond that, world models and intelligent robotics.
Verifiable reinforcement learning boosted reasoning performance, as explored by DeepSeek and others, but it requires far more output tokens, by a factor of almost 10. That would be 10x inference compute, and to solve the same problem in the same time, you would need yet another 10x compute. Such a 100x compute demand is not desirable.
I was wrong in 2023 about the inefficient inference of LLMs. I didn't like the quadratic scaling of the attention mechanism, but it turns out that there are more ways to work around the bottleneck than seeking linear attention such as RWKV and other alternatives. Inference-time scaling and MoE are both game changers.
So I would expect myself to be wrong again about the cost of reasoning. And there is preliminary research showing that reasoning already happens inside the model and can be further trained into LLMs.
That is, reasoning should not happen at the output layer, but in intermediate layers. Writing down the reasoning process should merely serve as a memory of explored paths, to keep things on track, or to materialize the box to think out of.
The waste is not just on the output side: there is significant waste of input tokens on prompts that are not part of the instruction or the context, but guidance. Such guidance should not just be cached, but "compiled", in a more efficient and controllable manner, while retaining interpretability.
Prompt engineering and reasoning via output tokens are two sides of the same coin: we reach better results in an interpretable manner, but at a much higher cost than necessary, and this will become obvious once we find the right principle. Of course, history relies on such detours to get back on track.
Agents are another layer of waste, on top of individual inferences. Agents fall into fruitless planning and exploration, failing to snap out of infinite loops by acquiring extra information or triggering extra side effects. Their interactions with tools (e.g., MCP), agents (e.g., A2A), and humans (e.g., A2H) are all inefficient due to using natural language as the intermediate representation.
Imagine an alien species with a far more efficient language, like the one depicted in *Arrival*. Such a language could express complex concepts, like tool invocation with parameters, using just a single token, or perhaps a single multimodal token combining multiple meanings.
There should be special dense tokens, to represent tools or even ways to think (e.g., the MCP for "Sequential Thinking"). And why should a token just be an embedding of a high dimensional vector? Why isn't it a curve, a hypersurface, or a manifold?
Take some time to think about this, while showing the visualization of the Biology of a Large Language Model on the screen.
I believe using modalities other than human perceptions as the intermediate representation is the key to the next breakthrough. They would also represent world models better.
At the end of the day, it all comes down to human-level efficiency to reach human-level intelligence. And the efficiency will come from using the right principle, which will also result in better performance.
And reasoning, agents, multimodality, efficiency, world models, and robotics are all covered by this one integrated solution.
And it's non-trivial to hunt their families down.
But yeah, that's pretty much my job description, that is, to set up mechanisms to eliminate similar problems from all systems in future.
They are from different kinds of systems and varying technical stacks, so it would be interesting to pin these specimens down for display.
How to hunt their families down is left as an exercise for the readers.
## Legacy configuration removal
The system in question is a proxy server that routes requests to different backends that are responsible for communicating with different third-party APIs. Two keys in the request are used to determine the backend: `party` and `service`, which indicate *who* does *what*, respectively. For the same pair of these keys, even if there is only one type of backend, there could be multiple instances in different availability zones (AZs) and regions.
There is a subsystem (let's call it the "Switch") that routes requests based on the health of these instances. It's vital for the Switch to always work, so that requests won't go to a bad instance that is already unresponsive.
The Switch reuses the *static* configuration of the proxy, which lists the combinations of `party` and `service`, along with *dynamic* instance information. The static and dynamic information as a whole is written to a cache whenever an arbiter detects unhealthy instances. The proxy uses the cache to change its routing dynamically.
There was a special party (let's call it "Bob") that required 2 steps/services (namely "Pre" and "Main") to complete what other parties could do in 1 step ("Main"). These 2 steps are stateful in the sense that if an instance can't do "Main", requests should not be routed to "Pre" either.
So when the Switch reads the blob from the cache, a special check is in place to ensure "Bob" & "Pre" are in the combination list; if they are not, the Switch refuses to read, and reports an error as a monitoring metric. That metric never got a proper alerting threshold for a sudden increase in error rate, because it's an error that had never happened before, and it was missed when SREs set up the thresholds for important metrics.
10 years passed, and "Bob" can now also do it in one step, "Main". There are also many legacy combinations that are not used anymore. One day it became necessary to remove the unused combinations, which included the combination "Bob" & "Pre".
Bang! The Switch rejected all new writes to the cache for 6 days, and no one noticed, because no major outage happened: the error rate due to unhealthy instances was low enough to be ignored. The issue was only discovered when the same person was observing the Switch for a different deployment, and it took almost a day to figure out the root cause.
The interesting part? The same person who removed the legacy configuration had placed those special checks 10 years ago, and the reviewer of the removal had also reviewed the check 10 years ago. The check was just an unthoughtful precaution; recalling the check and predicting its ramifications for config removal didn't cross the mind of either of them, even during the initial stage of debugging.
## Too many concurrent slow queries
There is a batch processing system (let's call it the "Sink") that loads different kinds of data from asynchronous nodes of production database clusters, according to configured rules about data sources, conditions, how to process the data, etc. There are many rate limits in place to prevent the system from overwhelming the database clusters, such as a limit on the number of concurrent tasks, and a threshold on the number of concurrent queries on the same database node. It's also an assumption that the asynchronous nodes are not used for user-facing online queries, but only by other batch processing systems.
The rate limits gradually converged to a balance that maximizes the throughput without overwhelming the nodes even during stress tests of high data volume.
There is also a separate monitoring metric that tracks slow queries over a certain threshold, and that threshold was also converging, so SREs could focus on the top issues.
One day, many different database nodes were reporting high load, each for only a few seconds, but continuously affecting an online service that happened to be using asynchronous nodes, as the service was considered less time-sensitive and thus tolerant of the delay that asynchronous nodes might have. SREs and DBAs were confused for quite a while, until a technical lead guessed it might be caused by the batch processing system Sink, which turned out to be true.
The developers of Sink were also confused, as the concurrent queries on the same node and the slow queries were both within their limits.
The truth did come out in the end. There were 2 tasks configured in Sink, which would be broken down into many sub-tasks for different time spans, and subsequently into queries for different shards of the same table across different asynchronous nodes. There was a bug in the task configuration that caused slow queries, and the bug had been copy-pasted, so it affected both tasks. Both tasks were running slowly, but the database nodes could handle the load along with online queries, as the concurrent slow queries were well below the number of CPU cores the nodes could use.
On that day, both tasks were requested to be rerun for 2 time spans, by the same person who had introduced the bug, which caused the concurrent slow queries on the same node to exceed the core count, blocking the online queries. But as the queries were rate-limited, they hit different nodes in a rotating manner, so each node was affected for only a few seconds; the overall impact could only be observed by SREs monitoring the online services, and DBAs considered the phenomenon a normal load spike.
The lesson learned? Each measure worked individually, but they could not account for combinations of different issues, namely too many *concurrent* (and) *slow* queries; only the combination could cause an impactful issue. And information about the different parts of the system (the online service, the database nodes, and Sink) was not shared across teams.
## OOM when retiring old worker processes
This story is about an SOA framework written in C++ (let's call it the "Framework") that spawns many worker processes to handle requests for a service; when a new version of the service is deployed, it gracefully retires the old worker processes and spawns new ones of the new version to handle requests. The framework also integrates a few plugins to extend its functionality, and the plugins have their own lifecycle management, e.g. `init` and `fini`.
One of these plugins (let's call it "Charlie") starts an asynchronous thread for each worker process, and signals the termination of the thread in `fini`. It recommends an explicit call to `fini`, but `fini` was considered trivial enough to also be called in the destructor of the plugin. That's a bad practice in the eyes of any C++ expert, but it worked well for a long time; there was also no issue with calling `fini` twice, once explicitly and once in the destructor.
Once in every few ten-thousand deployments, there was an OOM ("out of memory") when retiring the old worker processes, but the process would be killed per the cgroup thresholds on memory consumption, so only a few requests were impacted over time; eventually SREs concluded that the OOMs were not isolated events.
It was a nightmare to reproduce the issue without knowing the cause upfront, and the code looked innocent enough in review. Finally, enough data was collected from rare reproductions, and the clues pointed to the `fini` in the destructor of Charlie, but there is only the signaling in `fini`. Further investigation revealed that the asynchronous thread was sending a heartbeat request, which caused the OOM.
While sending the request, there was a call to a method using a local const static `std::array` holding some `std::string`s initialized from literals. That static variable had already been destroyed while the asynchronous thread was still alive; the thread would only be signaled to terminate later, by the `fini` in the destructor of Charlie. The destructor was doing nothing too non-trivial, but it ran too late, due to the undefined execution order of destructors.
Why would this cause an OOM? After destruction, the heap memory involved was still accessible, as it was not released but reused by the allocator, so there was no SEGFAULT; but the memory holding the `size()` of the strings contained a garbage value, which happened to be a large number, and the strings were appended to a `std::stringstream` to form the heartbeat request, causing endless memory allocation until the OOM.
The issue became more reproducible after using an allocator (`tcmalloc`) that retains heap memory longer after destruction. It would have been caught by various sanitizers, had we integrated them into CI.
## Silently hanging tasks
It's again about the batch processing system Sink. Internally it uses an MQ to queue tasks, so it can retry failed tasks. To prevent aggressive retries, there is a backoff mechanism that will exponentially increase the delay between retries, and the maximum number of retries is also limited.
One day, an incident caused all availability zones of the MQ to drop a certain percentage of network packets, causing a stable failure rate in task dispatching. If the retries had been performed at a regular rate, the aggregate delay of tasks would not have been too long.
Unfortunately, the backoff mechanism backfired, along with other rate-limiting mechanisms, eventually causing all tasks to hang indefinitely. Again, existing metrics were not able to pinpoint the issue, and the developers took some time to figure out the root cause. But as the incident had put all SREs in all-hands-on-deck mode, the hang was noticed quickly from less sensitive monitoring metrics, resulting in a relatively timely fix.
A more sensitive monitoring metric was deployed to detect hanging tasks, and it caught another issue a few days later, which was handled swiftly before causing actual delay.
This issue was a silly bug that ran `remove()` for every element of a huge array, which overall takes `O(n^2)` time to finish; that's almost forever for such huge arrays. The removal would only be triggered when certain conditions were met, and the buggy removal had been copy-pasted from some other utility package.
The bug should only have hung the one process handling the task. Unfortunately, the dispatcher would detect timed-out tasks, consider the process dead, and reassign the task to other processes, while the task was still occupying the old process, which was burying its head deep in the `remove()` calls, missing its own checkpoint for the timeout. The re-assignment happened over and over again for each timeout, eventually causing all processes to be busy with the same task, whose `remove()` calls would almost never finish. The world literally stopped for that task.
With early detection of hanging tasks, developers were able to try fixes quickly. It would not have been easy to find the root cause, but the developers noticed long pauses between two log entries, and the removal loop was the only code there when they wanted to add some logging between those entries, so the obvious bug was immediately identified.
It's really lucky that the first issue, which was not too serious, helped get the more sensitive monitoring metric deployed, so the second issue, which could have been much more fatal, was detected and fixed promptly, on the morning of the first day of a national holiday. Task re-assignment was then limited, to prevent polluting more processes with a bad task.
## LLM's takeaways
Disclaimer: the above stories are slightly modified from the actual events, and they were manually written by me, not generated by an LLM.
I asked DeepSeek to write some takeaways, as I was a bit tired after telling these stories. This is the version I'm happy with:
1. **Legacy safeguards breed hidden traps**
Undocumented checks (Bug 1) and copy-pasted code (Bugs 2, 4) linger as systemic debt. Always audit dependencies before removal or reuse.
2. **Test failure *combinations*, not thresholds**
Rate limits + slow queries (Bug 2), retries + packet loss (Bug 4): isolated safeguards fail when risks collide. Model interactions, not silos.
3. **Own lifecycle control explicitly**
Destructor chaos (Bug 3) and hanging tasks (Bug 4) stem from assuming cleanup. Govern threads, memory, and resources with strict ownership.
4. **Instrument for why, not what**
Missing alerts for config errors (Bug 1), slow queries (Bug 2), or hung processes (Bug 4) delay fixes. Metrics must expose root causes, not just symptoms.
**Core**: Systems fail in layers. Anticipate overlap, govern lifecycles ruthlessly, and monitor for causality—not just correlation.
Well, at least I hope so, for 2025.
I have been keeping a learning diary on math and tech since September 2024.
Before that, I was working on a series of math notes, in a Zen mode, for months. I annotated PDFs in Margin Note 4, and forged definitions and diagrams into Forester trees, which are bi-directionally linked notes that can be rendered into beautiful web pages and LaTeX PDFs, until I was satisfied. During that time, I didn't read news or unrelated books, and solely focused on the books and papers needed for polishing the notes. I held my breath until a note reached a certain stage, so I might move on to the next one.
It was a mental paradise.
Then came the learning diary idea. Since September, constant context switching had dominated me at work. My attention for learning had inevitably become fragmented. I started to check out interesting stuff from social media, still about math and tech; they are indeed potential elements of what I might want to work on. So I wished to log these readings, or just findings, daily, so they would not be lost in time.
The habit was well kept to the end of 2024, even when the load at work had become unbearably distracting and exhausting. The breadcrumbs would be a gold mine for future me.
But it was a disaster to my mental state.
During that time, I managed to go back to a few open-source projects that I had been working on. Most of them have almost no users, but I still feel obligated to keep them up to date. Usually I had only a little time for each of them, so after some chores, I ran out of time and energy for new features, and moved on to chores for the next project. This partially reflected the mental state caused by my fragmented learning.
The world had become an endless curiosity for me; the potential to work on a new project inspired by an element just learned about was dazzling.
When I stood at the beginning of 2025, I realized that I had been almost fruitless in 2024. I didn't finish the notes before September, although they are nice in their partially completed form. I didn't create anything whole after September. I enjoyed the reward of the journey, but came to its end empty-handed.
I no longer log my learning daily in 2025, for various reasons. My work and life still consume a lot of my time at the beginning of the year, like always. I still log, but mostly just weekly.
April is coming, and I hope to reach a Zen mode similar to that of the better half of last year. Hence, this post poured out of me.
A customizable, hybrid approach to code & resource pipeline and embedding [uts-000X]
## TL;DR
This is the proposal I mentioned in #82. It's an informal RFC about what I wish to do about embedding both the code and the resource (primarily images) generated in forester output.
The code is there so that the generated resource is fully reproducible; the resource is there so that the consumer of the forester output (e.g. a browser or LaTeX) can gracefully downgrade to using the resource instead. And the pipeline handles regeneration when the source code or its dependencies (e.g. imported external files, used assets) change.
## Feasibility
The proposal is already feasible with the current forester implementation plus some external add-ons, so it might not bring too much complexity to immediate forester development. It can be viewed not as a feature request, but as a way to make use of existing features whose equivalents may or may not be included in future forester development, e.g. XML/XSLT etc. That said, it would be nice to have some degree of native support from forester to smooth things out. Actually, this proposal is inspired by how forester handles the embedding of LaTeX diagrams, with visions of more diagram tools and beyond.
Now let's assume that feasibility is not a problem, and take a closer look at the approach.
## The customizable, hybrid approach
The "customizable" part in the approach means the user can choose the kind of tools and pipelines (e.g. parameter customzation or post-processing) to generated the resource, as well as the format(s) of the resource. Currently, the only natively support tool is `latex`, with fixed parameters (e.g. no `--shell-escape`), and a pipeline to generate DVI then post-process it with `divsvgm`, and the format is SVG, with a naming convention of hashing the source code to guarantee browser-cache invalidation of the image whenever the source dode is changed. The pipeline also watch the source change and trigger the regeneration and copying related assets to the output, to an extent that it would be triggered by changes of upstream macro (as it would impact the source code) but not including some `.tex`/`.sty` files that's "imported" in the source code, but forester has no way of detecting that in the current implementation.
Imagine all these can be customized for a second, it would be a great productivity boost for the users, but would this bring a ton of complexity into forester? Not neccessarily. If we view forester as a kernel, then it could just provide a minimal infrastructure to allow all these to happen, instead of handling all tools, pipelins, and formats by itself.
The "hybrid" part in the approach, at least in the web context, means the code can be executed both in browser (browser-side render, BSR for short, which might result in a more interactive experience, e.g. WebGL animations), and in server-side generation (SSR). The technical possibility thanks to that JS libraries that can be executed both in browser and in a JS runtime on server-side (e.g. node.js, deno), and that native libraries can be executed both on server, and in browser as WASM, that includes programs written in C, Rust, Go and more.
In the LaTeX context (or equivalently Typst or other authoring tools), the hybrid approach is the choice between executing the code inside the document environment to regenerate the diagram, e.g. to preserve the same font and style settings, and just including the externally generated diagram. The current forester-to-latex takes the former approach, which usually yields a more desirable, style-consistent result, while the latter has the advantage of stability and efficiency: a diagram could hit unexpected errors or deformations during regeneration, and regenerating many diagrams is time-consuming when you are just authoring some text and expect to preview the result in real time.
Now it becomes clear that this hybrid approach facilitates both reproducibility and graceful degradation, as long as there is a customizable pipeline to handle it.
## A teaser: authoring tool candidates
Let's get a bit more concrete about the tools that could be used in the customizable pipeline, and see how it opens up endless possibilities. Here are some candidates that I have experimented with:
- Rich text (including math, diagrams)
- Markdown: an easy-to-write format whose source code stays readable
- both BSR and SSR capable
- it plays well with forester markups, as fr XML nodes are transformed to HTML tags to be rendered as-is by the markdown renderer
- it can degrade to plain text (while preserving rendered forester markups) when the markdown renderer is not loaded
- experiment: test markdownit (via markdown-it)
- it's at least very convenient for writing paragraph-, link-, list-, and footnote-heavy notes compared to forester markups
- plus, users could write the same text for forester or for external communication, such as this very proposal, which is written in Markdown, submitted to the forester ticket system, but also rendered in my forest
- Typst
- both BSR and SSR capable
- it can be made to play well with forester markups, to support links/cross-references/citations etc.
- it renders to text-selectable SVG, indistinguishable from ordinary text
- besides text and math, it can be used as a modern diagram authoring tool with decent scripting support
- experiment: test typst (via typst.ts)
- Syntax-highlighted code blocks
- both BSR and SSR capable
- experiment: test syntax highlighting (via shiki); a minimal sketch follows after this list
- Static diagrams
- Penrose: creating diagrams with a declarative language that separates domain knowledge, layout constraints & styling, and concrete substances
- both BSR and SSR capable
- experiment: test penrose
- Pikchr: an open-source text-to-diagram tool; the language is simple yet has great support for relative positioning and flexible path guiding
- experiment: test pikchr
- SSR-first, but BSR-capable thanks to WASM
- D2: a modern diagram scripting language, preferred over PlantUML, Mermaid, and Graphviz for its great layout support for complex diagrams and its declarative design.
- no BSR experiments yet, but I have used its SSR with other static site generators
- Animated, interactive diagrams
- WebGL
- experiments: test implicit surface shader 1, test interactions between ray-marching and 3D objects
- still trying to work out an SSR pipeline, but combining `xvfb-run`, headless-gl, and three.js should make it possible; e.g. see this gist.
- a less preferred alternative is to use the SVGRenderer addon from three.js, but its render result is visually worse than WebGLRenderer's, per my experiment in test three.js to SVG.
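For the syntax-highlighting item above, a minimal sketch using shiki's `codeToHtml` (assuming shiki v1; everything around the call is hypothetical). The same function can back either side of the hybrid pipeline, since shiki runs in both the browser and node:

```ts
// highlight.ts: one highlighting function for both BSR and SSR.
import { codeToHtml } from "shiki";

export async function highlight(code: string, lang: string): Promise<string> {
  // Returns an HTML string with inline styles; theme choice is arbitrary here.
  return codeToHtml(code, { lang, theme: "github-light" });
}
```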
For SSR to other output formats, users could use XSLT to transform the XML output so that downstream tools (e.g. LaTeX) consume either the source code or the resource produced by the SSR pipeline. If neither is consumable, the same content can be excluded by surrounding the forester markup with a macro like `webonly`.
Note that these experiments are mostly done for BSR only. SSR experiments were not done for forester, as I have already learned the feasibility from my experiments with other static site generators. Without the SSR graceful-degradation path, pages will take a bit of time to load on first visit.
## Elements of an implementation external to forester
Currently, forester can:
- consume `.tree` files and output `.xml` files, from which the final HTML is produced by XSLT on the browser side, or by LaTeX on the server side
- allow users to define macros, which can emit HTML tags in the XML output (this could also be done with XML tags)
- forester 4.3 uses `fr:resource`, `fr:img`, and `fr:resource-source` (with `type` and `part` attributes) for the rendered SVG image and the LaTeX source code, which is general enough to be reused for other `type`s of code that render to images
To make it available to BSR, users could:
- use XSLT to transform `fr:resource` tags to HTML tags with a certain `class` attribute
- use CSS to render them to show the source code, a loading animation, or a fallback image rendered by the SSR pipeline described below
- load some JS on demand to replace the HTML tag with a rendered, potentially interactive HTML element, as sketched below
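A minimal sketch of that last step, with hypothetical class and attribute names (the `.fr-resource` class, `data-type` attribute, and `./renderers/` layout are all stand-ins for whatever the XSLT step emits):

```ts
// hydrate.ts: lazily upgrade SSR fallbacks into interactive BSR renders.
async function hydrateAll(): Promise<void> {
  for (const el of document.querySelectorAll<HTMLElement>(".fr-resource")) {
    const type = el.dataset.type; // e.g. "typst", "pikchr"
    const source = el.querySelector("script[type='text/plain']")?.textContent;
    if (!type || !source) continue;
    // Load the renderer for this type on demand, then swap in the result.
    const { render } = await import(`./renderers/${type}.js`);
    el.innerHTML = await render(source);
  }
}
hydrateAll();
```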
To build an SSR pipeline external to forester, users could:
- define macros accepting the source code that emit `fr:resource`, `fr:img`, and `fr:resource-source` following the forester practice for LaTeX diagrams, but with a filename specified by the user instead of generated from the source-code hash
- use a file-change watcher (e.g. watchexec) to detect changes to the XML, look for `fr:resource-source` elements, then render each source by calling the corresponding tool according to its `type` attribute, generating an image file with the specified filename (a sketch follows after this list)
- use a browser cache-busting mechanism (e.g. use XSLT to append a random number to the image URL) so the latest image is reloaded whenever the source code changes and the user refreshes the page
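A sketch of the watcher half, assuming `watchexec` re-runs a small Node script whenever the XML output changes. The tag and attribute names follow the forester 4.3 practice described above; the regex, the fixed output filename, and the pikchr dispatch are all simplifications for illustration (a real implementation would use an XML parser and take the filename from the macro):

```ts
// ssr-render.ts: run as `watchexec -e xml -- node ssr-render.js out/page.xml`
import { readFileSync, writeFileSync } from "node:fs";
import { execFileSync } from "node:child_process";

const xml = readFileSync(process.argv[2], "utf8");
const re =
  /<fr:resource-source type="([^"]+)"[^>]*>([\s\S]*?)<\/fr:resource-source>/g;
for (const [, type, source] of xml.matchAll(re)) {
  if (type === "pikchr") {
    writeFileSync("diagram.pikchr", source);
    // pikchr emits bare SVG on stdout with --svg-only.
    const svg = execFileSync("pikchr", ["--svg-only", "diagram.pikchr"]);
    writeFileSync("diagram.svg", svg);
  }
}
```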
So far, BSR is implemented for the experiments mentioned above, and SSR could wait for the standardization of `fr:resource` and the resolution of this proposal, or just use any other XML tags in a similar spirit.
## Concluding remarks
The text above is organized so that a reader can gradually see the full vision of the proposal, then its feasibility, and the many possible paths, degraded or even upgraded.
To reiterate, this is already feasible with the current forester implementation plus some external add-ons, and will continue to work as long as the features used, or their equivalents, remain available in future forester.
It might coincide with the direction of forester development (e.g. the ongoing pipeline refactor might provide exactly the minimal native support needed for this approach to work more smoothly), or it might exceed what forester was originally designed for (and thus should not pose a maintenance burden on forester). Hence this informal RFC.
I wish to use it to organize many definitions, theorems and discussions about the same mathematical concepts/topics.
My hope is to find these notes easier to reuse and refer to in blogs, papers, blueprints, etc. Preferably, I would like to be able to author them in Forester markup, LaTeX, Markdown, Typst, and various diagram tools, freely and in a reproducible manner (see my post on this, and the relevant experiments).
I wish this post to be a continuously updated list of my math interests in 2024, with proper citations to the literature, as I keep wandering in the math wonderland and don't want to be lost in it without breadcrumbs.
Some interests that have older origins will gradually be moved to the corresponding posts for earlier years.
I also hope certain interests will develop into research projects, leaving only a brief summary and a link here.
Each interest should have one or a few central questions, and one or a few references to the literature.
This part of my interests is about small-scale formalization of mathematical concepts and theorems, for learning and for potential PRs to Lean's Mathlib. Each should focus on one reference that is well organized and convenient to convert into a blueprint.
The Matrix Cookbook (November 15, 2012) [petersen2008matrix] covers many useful results about matrices, and Eric Wieser's project lean-matrix-cookbook aims to give one-liner proofs (with references to the counterparts in Mathlib) for all of them.
The project is far from complete, and it would be great to claim a small portion of the results I'm interested in and contribute to it. I also wish to figure out the GA counterparts of the same portion.
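To give the flavor, here is a one-liner of the kind the project collects, for the transpose-of-a-product identity. This is a sketch against current Mathlib, not taken from lean-matrix-cookbook itself:

```lean
import Mathlib

open Matrix

-- (A * B)ᵀ = Bᵀ * Aᵀ, proved by pointing at its Mathlib counterpart.
example {R m n p : Type*} [CommSemiring R] [Fintype n]
    (A : Matrix m n R) (B : Matrix n p R) :
    (A * B)ᵀ = Bᵀ * Aᵀ :=
  transpose_mul A B
```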
My previous interest in matrices arose from Steven De Keninck's work on GALM [de2019geometric]; since that paper, I have been interested in GA approaches that have practical advantages over traditional matrix-based methods. Notably, the paper also discusses the link between degenerate Clifford algebras and dual numbers / automatic differentiation. A more recent inspiration might be his new work LookMaNoMatrices.
TODO: decide which results are interesting and feasible for me to formalize.
I wish to pursue the topic of Matrix/Tensor further; see [taylor2024introduction] and [randy2023matrix]. The former also led me to Einsums in C++. For the latter, I'm thinking of HepLean.SpaceTime.CliffordAlgebra.
> In a sense, group algebras are the source of all you need to know about representation theory.
The primary reference is [james2001representations] for understanding FG-modules, group algebras, presentations of groups, Clifford theory (the standard method of constructing representations and characters of semi-direct products; see [woit2017quantum] and "3.6 Clifford theory" in [lux2010representations]), Schur indices, etc. We also need to check [lux2010representations] for its introduction to GAP, and we should pay close attention to the progress of GAP-LEAN. [sims1994computation] might also be interesting in a manner similar to [lux2010representations], but with emphasis on presentations of groups.
See also group algebra on nlab, particularly that "A group algebra is in particular a Hopf algebra and a \(G\)-graded algebra."
The related Zulip thread is here, and I have preliminary explorations and experiments in Lean here.
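In Mathlib the group algebra is available as `MonoidAlgebra`; here is a toy statement of the flavor these explorations start from (the choice of ℚ and ℤ/3 is purely illustrative, not from the actual experiments):

```lean
import Mathlib

-- The group algebra ℚ[G] for G = ℤ/3 (written multiplicatively) is a ring,
example : Ring (MonoidAlgebra ℚ (Multiplicative (ZMod 3))) := inferInstance

-- and single-point functions multiply like the underlying group elements.
example (g h : Multiplicative (ZMod 3)) :
    MonoidAlgebra.single g (1 : ℚ) * MonoidAlgebra.single h 1 =
      MonoidAlgebra.single (g * h) 1 := by
  simp [MonoidAlgebra.single_mul_single]
```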
This interest originates from reading Robert A. Wilson's work [wilson2024discrete]. The ultimate goal is to understand the group algebra of the binary tetrahedral group (\(Q_8 \rtimes Z_3\)), then that of the three-dimensional complex reflection group (\(G_{27} \rtimes Q_8 \rtimes Z_3\)), a.k.a. the triple cover of the Hessian group, which can be interpreted as a finite analogue of the complete gauge group \(U(1) \times SU(2) \times SU(3)\).
Recently I have read some meta-level dependent type theory (Typst source). It might be time to re-read the `leantt` paper, and to start reading the `lean4lean` paper/source.
The author `ice1000` has a strong interest in QIIT (Quotient Inductive-Inductive Types) and QIIR (Quotient Inductive-Inductive Recursion); he has implemented `overlap` in Aya with termination checking and confluence checking.
Aya's philosophy is that the kernel may include pattern matching, at the cost (at present) of not generating or translating eliminators. In principle this is feasible: without `overlap`, it could be implemented via the "theory of signatures"; with `overlap`, it needs "coherent and concurrent elimination for initial algebras", which I find fascinating, and I have read [yeasin2011initial].
The author also has a great article on TT & Cat [zhang2021type]. His recommendation of [gundry2013tutorial] is also worth reading.
In the process of learning Topos, I wish to have a better understanding of Logic. Particularly, the Curry–Howard–Lambek correspondence, Propositional truncation, and Paraconsistent logic are on the plate.
Although remotely related, I'll place Introduction to Formal Reasoning (COMP2065) in this section so I won't lose track of it.
Type theories and logic have their origins in philosophy. Lacan drew inspiration from the work of the mathematician and philosopher of science Georges Canguilhem, and of the philosopher of mathematics Alain Badiou. Had he lived to see the rise of proof assistants, he might have been interested in the formalization of his theories.
There is a project on GitHub called Lacan-Mathemes which visualizes some core concepts of Lacan in TikZ. It would be interesting to start there and look for proper objects to formalize in Lean. Another interesting exercise would be to visualize them in Typst.
Here I would like to quote a generated response from Claude 1.2 Instant, which is quite optimistic:
> Here are a few thoughts on formalizing aspects of Lacanian psychoanalytic theory using theorem provers:
>
> - Lacan's structural theory of the psyche could potentially be formalized using logic. The tripartite structure of the Real, Symbolic, and Imaginary could be modeled as formal domains or ontologies with defined relationships between them.
> - Key concepts like the mirror stage, the Name-of-the-Father, the objet petit a, etc. could be defined as logical predicates or functions operating within this structural framework. For example, one could define predicates like "inMirrorStage(subject)" or "desiresObjetPetitA(subject, object)".
> - Relations like the split between desire and drive, the tension between the Symbolic order and the Real, the misrecognition of the Imaginary could be expressed through logical rules and inferences between concepts.
> - The dynamics of psychoanalytic concepts like repression, sublimation, transference could be modeled as state transitions or logical transformations within the structured system.
> - Specific psychoanalytic theories like the stages of psychosexual development or the topology of the fantasy could be axiomatized and theorems deduced from the axioms.
>
> However, capturing the open-ended, non-deterministic nature of unconscious processes and the ambiguity/contingency of signification would be challenging and may require non-classical logics or probabilistic approaches.
>
> So in summary, while difficult, certain aspects of Lacanian theory seem amenable to formalization using tools from logic, ontology, and knowledge representation. Careful design would be needed to address theory's complexity.
I lately came across a paper formalizing Kant [achourioti2011formalization], which might be interesting.
Generalized Clifford Algebra and related mathematical branches [uts-001Z]
A Clifford algebra is a \(Z_2\)-graded algebra and a filtered algebra; its associated graded algebra is the exterior algebra.
It may be thought of as a quantization (cf. quantum group) of the exterior algebra, in the same way that the Weyl algebra is a quantization of the symmetric algebra. Specifically, for \(V\) an inner product space, the symbol map constitutes an isomorphism of the underlying super vector spaces of the Clifford algebra with the exterior algebra on \(V\), and one may understand the Clifford algebra as the quantization of the Grassmann algebra induced by the inner product regarded as an odd symplectic form.
Weyl algebras and Clifford algebras admit a further structure of a *-algebra and can be unified as the even and odd terms of a superalgebra, as discussed in CCR and CAR algebras.
A Clifford module is a representation of a Clifford algebra.
A generalized Clifford algebra (GCA) can also refer to an associative algebra constructed using a form of higher degree instead of a quadratic form, e.g.
> For \(q_1, q_2, \ldots , q_m \in \mathbb {k}^*\), the generalized Clifford algebra \(C^{(n)}\left (q_1, q_2, \ldots , q_m\right )\) is a unital associative algebra generated by \(e_1, e_2, \ldots , e_m\) subject to the relations
> \[ e_i^n=q_i \mathbf {1}, \quad e_i e_j=\omega e_j e_i, \quad \forall j \lt i . \]
>
> It is easy to see that \(C^{(n)}\left (q_1, q_2, \ldots , q_m\right )\) is \(\mathbb {Z}_n\)-graded where the degree of \(e_i\) is \(\overline {1}\), the generator of \(\mathbb {Z}_n\). [cheng2019new]
In [cheng2019new], note also that "Clifford algebras are weak Hopf algebras in some symmetric tensor categories" while "generalized Clifford algebras are weak Hopf algebras in some suitable braided linear categories of graded vector spaces", as well as that "the Clifford process is a powerful technique to construct larger dimensional Clifford algebras from known ones."
TODO: add papers linking Hopf algebras and Clifford algebras together, learned from the adjoint discord here.
Clifford algebras can be obtained by twisting of group algebras [albuquerque2002clifford], where twisted group algebras are studied in [conlon1964twisted], [edwards1969twisted], [edwards1969twisted2].
There exist isomorphisms between certain Clifford algebras and NDAs (normed division algebras) over \(\mathbb {R}\).
Variants of Clifford algebras whose generators are idempotent or nilpotent can also be considered. Zeon algebras ("nil-Clifford algebras") have proven useful in enumeration problems on graphs where certain configurations are forbidden, such as the enumeration of matchings and self-avoiding walks. The idempotent property of the generators of idem-Clifford algebras can be used to avoid redundant information when enumerating certain graph and hypergraph structures. See [ewing2022zeon].
Clifford algebras are also closely related to universal enveloping algebras (see [figueroa2010spin], and "The universal enveloping algebra of a Lie algebra is the analogue of the usual group algebra of a group." from group algebra on nlab).
Great discussions of the limitations and generalizations of Clifford algebras can be found in John C. Baez's [baez2002octonions]. In particular, note the Cayley-Dickson construction, Bott periodicity, matrix algebras, triality, and \(\mathbb {R}\) as a real commutative associative nicely normed ∗-algebra. Also see Anthony Lasenby's work on the embedding of octonions in the Clifford geometric algebra of space-time, STA (\(\mathop {\mathcal {C}\ell }(1, 3)\)) [lasenby2024some].
Note also Kingdon algebras: alternative Clifford-like algebras over vector spaces equipped with a symmetric bilinear form [depies2024octonions].
A categorical view of Clifford Algebra is discussed in [figueroa2010spin].
A Clifford algebra can be categorified: "An Clifford algebra over a vector space is defined to be the Koszul dual to an abelian fully weak Lie-algebra", where "Fully weak Lie-algebras are Koszul dual to differential graded Clifford algebras." See also Higher Clifford Algebras.
The Duffin–Kemmer–Petiau algebra is
\[\frac {T(V)}{I(v\otimes w\otimes v - g(v,w)v)}\]
in the same way that the Clifford algebra is
\[\frac {T(V)}{I(v\otimes v - g(v,v))}.\]
See [fernandes2022clifford], which embeds the DKP algebra in a Clifford algebra using projectors.
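As an aside, Mathlib implements exactly the quotient presentation above as `CliffordAlgebra`, with the defining relation exposed as `ι_sq_scalar`. A minimal sketch against current Mathlib:

```lean
import Mathlib

variable {R M : Type*} [CommRing R] [AddCommGroup M] [Module R M]

-- The defining relation v * v = Q(v) · 1 inside Mathlib's Clifford algebra,
-- where ι embeds the module M into the algebra.
example (Q : QuadraticForm R M) (v : M) :
    CliffordAlgebra.ι Q v * CliffordAlgebra.ι Q v =
      algebraMap R (CliffordAlgebra Q) (Q v) :=
  CliffordAlgebra.ι_sq_scalar Q v
```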
My preferred reference is [izhikevich2007dynamical] for its applications to neuroscience and its various excellent diagrams, though it is not a mathematically rigorous treatment of bifurcation theory.
My first impression of sheaves is that they are useful for local-to-global applications, "which ask for global solutions to problems whose hypotheses are local".
Roughly speaking, a sheaf requires some gluing conditions (the axioms "Locality" and "Gluing") so that local data can be collated compatibly into a global algebraic structure that varies continuously over local covering domains (the "sections" of the sheaf).
To do so, a sheaf in general, as defined in the category-theoretical language, needs
- a topological space (or a site in general), denoted \(X\) (or \(\mathcal {C}\) for a site)
- a category, sometimes denoted \(\mathcal {D}\), meaning "data category", whose objects are algebraic structures and morphisms are structure-preserving maps
and builds (the gluing conditions) on a \(\mathcal {D}\)-valued presheaf over \(X\) (or \(\mathcal {C}\)), denoted \(\mathcal {F}\) (from its French name "faisceau"), which is essentially a contravariant functor \(\mathcal {F}: \mathcal {C}^{op} \to \mathcal {D}\), but a concept with an attitude; its morphisms are restriction maps between open sets in \(X\) (or, for a site, between objects related by the pretopology \(\mathcal {J}\) on \(\mathcal {C}\); in the topological case, \(\mathcal {J}\) is the pretopology on \(\operatorname {Open}(X)\) that specifies when a family of open sets is a cover).
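Spelled out for a topological space \(X\), with \(U \subseteq X\) open and \(\{U_i\}\) an open cover of \(U\), the two axioms read
\[ \begin {aligned} &\text {Locality: } && s, t \in \mathcal {F}(U) \text { and } s|_{U_i}=t|_{U_i} \text { for all } i \implies s=t \\ &\text {Gluing: } && s_i \in \mathcal {F}(U_i) \text { with } s_i|_{U_i \cap U_j}=s_j|_{U_i \cap U_j} \text { for all } i, j \implies \exists \, s \in \mathcal {F}(U) \text { with } s|_{U_i}=s_i \end {aligned} \]
so compatible local sections determine a unique global section.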
Its latest application to deep learning, Thomas Gebhart's thesis [gebhart2023sheaf], sees a sheaf over a topological space as a data structure "which defines rules for associating data to a space so that local agreements in data assignment align with a coherent global representation", thus a generalization of both:
- relational learning, which aims to "combine symbolic, potentially hand-engineered prior knowledge with the tabula rasa representational flexibility of deep learning to achieve a synthetic model family which can be defined with respect to symbolic knowledge priors about the data domain"
- geometric deep learning, which "provides a group-theoretic approach to reasoning about and encoding domain-symmetry invariance or equivariance within machine learning models",
"providing a mathematical framework for characterizing the interplay between the topological information embedded within a domain and the representations of data learned by machine learning models".
My prior interest in geometric deep learning, particularly group-equivariant neural networks, and my belief in symbolism form the background of my interest in sheaf representation learning.
Notably, this thesis treats the discrete case of sheaves, the cellular sheaf, whose
- topological space is a cell complex, which is "a topological generalization of a graph, with set inclusion and intersection given by the incidence relationships among cells in the complex", thus "admitting a computable, linear-algebraic representation".
- data category is \(\mathtt {FVect}\), the category of finite-dimensional vector spaces over a field \(\mathbb {F}\). This is a common choice of data category in machine learning applications: a model-free approach with a massive parameter space and flexible representational capacity, but one that inherits fundamental limitations, e.g. data inefficiency, generalization failure, and interpretability issues.
For more details, see also Thomas Gebhart's talk Sheaves for AI: Graph Representation Learning through Sheaf Theory.
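To make the linear-algebraic representation concrete (this is the standard construction from the cellular sheaf literature, not a quotation from the thesis): for a cellular sheaf \(\mathcal {F}\) on a graph with restriction maps \(\mathcal {F}_{v \trianglelefteq e}: \mathcal {F}(v) \to \mathcal {F}(e)\), the coboundary and sheaf Laplacian are
\[ (\delta x)_e = \mathcal {F}_{v \trianglelefteq e}\, x_v - \mathcal {F}_{u \trianglelefteq e}\, x_u, \qquad L_{\mathcal {F}} = \delta ^{\top } \delta \]
for \(x \in \bigoplus _v \mathcal {F}(v)\) and oriented edges \(e = (u, v)\); the global sections are exactly \(\ker L_{\mathcal {F}}\).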
Its application to physics has the potential to formulate differential geometry in a more general setting, without assuming the existence of a locally Euclidean space as manifolds do. It is believed that this approach can overcome some difficulties in quantum field theory and even quantum gravity, because locally there might be no concept of a metric space at all [mallios2015differential].
Note that there are computer algebra systems that can do sheaf cohomology etc., e.g. Macaulay2 and OSCAR.
Lately I became aware of work on mathematical models of consciousness, namely Integrated Information Theory [albantakis2023integrated] and the information theory of individuality [krakauer2020information].