Learning diary: 2025-06-28

#context #datafusion #duckdb #game #git #gpu #news #os #physics #proof
- A Newbie's First Contribution to (Rust for) Linux
    - Raspberry Pi has a fork of kernel
    - it's possible to write out-of-tree kernel module in Rust
    - kconfig has great dependency complexity
    - git gud
        - fast pace and numerous active forks make patches quickly outdated and dependent branches hard to manage
        - pick a primary base branch and use `--squash` merges for other branches
        - keep commits on parallel branches logically independent
        - with some rebase and rerere config, use `git rebase -i <common base>` to modify the entire history tree
    - also found aha to turn ANSI colored output into HTML (from Beautiful Terminal Output in Markdown)
    - Rust in the Linux kernel: part 2
- #agent
    - Context engineering
    - definitions of "agents" collected by Simon Willison
        - the most precise one I think (from An Introduction to Google’s Approach to AI Agent Security)
            - AI systems designed to perceive their environment, make decisions, and take autonomous actions to achieve user-defined goals.
        - most liked by Simon Willison
            - an LLM wrecking its environment in a loop
        - mine combining a few
            - a model autonomously acting in its environment by using tools in a loop
    - MCP: An (Accidentally) Universal Plugin System
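A minimal sketch of the "tools in a loop" definition above, assuming a scripted stand-in for the model. `run_agent`, `scripted_model`, and the `add` tool are hypothetical names made up for illustration, not any framework's API.

```python
def run_agent(model, tools, goal, max_steps=5):
    """Let `model` pick tool calls in a loop until it decides to finish."""
    history = [("goal", goal)]            # everything the model has observed so far
    for _ in range(max_steps):
        action = model(history)           # the model decides the next step
        if action["type"] == "finish":
            return action["answer"], history
        tool = tools[action["tool"]]      # act on the environment through a tool
        observation = tool(*action["args"])
        history.append((action["tool"], observation))
    raise RuntimeError("step budget exhausted")


def scripted_model(history):
    """Hypothetical stand-in for an LLM: call `add` once, then report the result."""
    last_kind, last_value = history[-1]
    if last_kind == "goal":
        return {"type": "tool", "tool": "add", "args": (2, 3)}
    return {"type": "finish", "answer": last_value}


tools = {"add": lambda a, b: a + b}
answer, history = run_agent(scripted_model, tools, "add 2 and 3")
print(answer)
```

The step budget is a design choice of this sketch: an autonomous loop needs some bound so that a model that never finishes cannot run forever.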
- Programming as Theory Building: Why Senior Developers Are More Valuable (on HN)
    - theory building without a mentor
    - Architectural Decision Records
        - ADR template
            - from An ADR for Use Markdown Architectural Decision Records
        - anti-patterns of ADRs
            - objectivity
                - Fairy Tale (aka Wishful Thinking): A shallow justification is given, for instance only pros but no cons
                - Sales Pitch: exaggerations and bragging
                - Free Lunch Coupon (aka Candy Bar): consequences are ignored accidentally or hidden deliberately
                - Dummy Alternative: made up options to make the preferred option shine and give the impression that multiple alternatives have been evaluated
            - time
                - Sprint (aka Rush): only one option is considered; only short-term effects are discussed
                - Tunnel Vision: only a local, isolated context is considered, e.g. developmental qualities are covered, but the consequences for operations and maintenance are not
                - Maze: the discussion derails and centers on details that are not relevant in the given context
            - record size and content nature
                - Blueprint or Policy in Disguise: the amount of detail provided and/or a rather commanding, authoritative voice makes the ADR read like a blueprint or policy
                - Mega-ADR: too many details are stuffed into a single ADR
                - Novel and epic: like Blueprint or Mega-ADR, but written in a casual, jovial tone
            - magic tricks
                - non-existing or misleading context
                - problem-solution mismatch
                - pseudo-accuracy
        - Writing a good design document
- #ocaml
    - First thoughts on Rust vs OCaml
        - pattern matching can't penetrate `Box`, `Arc` etc.
        - a red flag: `Rc`/`Arc` and sync/async end up requiring different libraries, e.g. `im` and `im-rc`
        - im: immutable data structures for Rust
        - PureScript
    - Writing a Game Boy Emulator in OCaml
        - key design of an emulator
        - a medium-sized project in OCaml using some of its advanced features
        - now runs on WASM via `js_of_ocaml`’s WASM support
    - Learn OCaml
        - practical ocaml
        - the background story of OxCaml: Jane Street’s sneaky retention tactic
    - Identity and behaviour
        - functional programming naturally separates them
    - Why I chose OCaml as my primary language #ocaml
        - Why Lean 4 replaced OCaml as my Primary Language
    - Type-level programming for safer resource management (in Haskell)
    - A parser and interpreter for a very small language (in Haskell)
    - OCaml Programming: Correct and Efficient and Beautiful (on HN)
    - Words about Arrays and Tables (on HN) (in Haskell)
- #formal
    - Multi-Stage Programming with Splice Variables
        - an interactive demonstration in TS; the original implementation is in Agda (examples)
        - typed meta-programming, to make code generation predictable and safe
        - provides precise control over the generation process and seamlessly scales to advanced features like code pattern matching and rewriting
        - the type system automatically tracks variable dependencies, ensuring that generated code is always well-formed, properly scoped, and type-checks correctly
- I deleted my second brain
    - Niklas Luhmann's Zettelkasten Archive
    - Johnny.Decimal: A system to organise your life
    - Howm: Write fragmentarily and read collectively
        - for emacs, but it might be interesting to me for a TUI for forester in Helix?
    - rule of three
- Notes on Epistemic Collapse
    - Epistemic collapse in science means losing shared reality due to unreliable info sources.
    - In physics, it's seen in tribalism: string theory tribes dominated despite failures, with research driven by trends over evidence.
    - Particle theory lacks experimental input, relying on "hot topics" under oligarchs, now facing funding collapse.
    - This shows how loss of truth-seeking harms scientific progress.
- #perf
    - Linux Performance Analysis in 60 seconds (2015) (on HN)
    - Announcing Sniffnet v1.4: it’s 2X faster than Wireshark at processing PCAP files
        - by focusing on extracting only the most relevant fields from the packets’ headers
    - Avoiding PostgreSQL Pitfalls: The Hidden Cost of Failing Inserts
        - `ON CONFLICT DO NOTHING` came to the rescue
    - Toys/Lag: Jerk Monitor (on HN)
    - The Surprising gRPC Client Bottleneck in Low-Latency Networks (on HN)
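A minimal sketch of the `ON CONFLICT DO NOTHING` behaviour mentioned under the PostgreSQL item above. It uses Python's built-in `sqlite3` as a stand-in (an assumption, since SQLite accepts the same clause); the `users` table is hypothetical, and the sketch shows only the behavioural difference, not the PostgreSQL-side cost of failing inserts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (email TEXT PRIMARY KEY)")  # hypothetical table
conn.execute("INSERT INTO users VALUES ('a@example.com')")

# Plain INSERT: the duplicate key fails and the caller must handle the error.
try:
    conn.execute("INSERT INTO users VALUES ('a@example.com')")
    plain_insert_failed = False
except sqlite3.IntegrityError:
    plain_insert_failed = True

# ON CONFLICT DO NOTHING: the duplicate is skipped without raising.
conn.execute(
    "INSERT INTO users VALUES ('a@example.com') ON CONFLICT(email) DO NOTHING"
)

total = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
print(plain_insert_failed, total)
```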
- Run Coverage on Tests
    - a few subtle cases where a test is not run, caught by coverage
    - on lobste.rs, there is another subtle case where assertions are not run due to `with pytest.raises(ValueError):`
- Sirius: A GPU-native SQL engine
    - plugs into existing databases via the standard Substrait query format, requiring no query rewrites or major system changes
    - based on
        - cuDF, a Python GPU DataFrame library (built on the Apache Arrow columnar memory format) for loading, joining, aggregating, filtering, and otherwise manipulating data
        - substrait, a cross platform way to express data transformation, relational algebra, standardized record expression and plans
    - supports DuckDB, will support Doris and DataFusion
- Structuring Arrays with Algebraic Shapes
    - Untyped programming invites primitives with excessive flexibility
    - resorting to dependent types (e.g. Futhark) makes types difficult to check, let alone infer; the programmer is given the burden of proof
    - a novel calculus, Star, with a type system that provides useful and expressive types, while also admitting type inference
    - use structural record and variant types for array indices and shapes (algebraic shapes), to reduce the annotation burden on the programmer
    - in the process of implementing a prototype, including type inference
    - on lobste.rs, doug-moen points out that it supports only the pointful array programming style, not the point-free style used in industry
- SymbolicAI: A neuro-symbolic perspective on LLMs (on HN)
    - a very interesting DSL to do semantic map lambdas, compare with different contexts for nuanced evaluation, combine facts and rules, and more
    - LOTUS: A Query Engine For Processing Data with LLMs
- Theoretical Analysis of Positional Encodings in Transformer Models (on HN)
    - ALiBi effectively extrapolates to longer sequences by imposing a monotonic distance bias
    - Wavelet-based encodings provide strong extrapolation, matching or surpassing ALiBi, due to exponential decay of high-frequency components beyond Nmax
    - the novel wavelet PE is a promising candidate for transformer-based tasks requiring extrapolation, combining strong theoretical properties with practical performance
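A minimal sketch of the "monotonic distance bias" that ALiBi adds to attention scores, assuming the causal formulation and the geometric slope schedule from the ALiBi paper (defined there for head counts that are powers of two). The helper names `alibi_slopes` and `alibi_bias` are made up here, and the wavelet encoding from the analysed paper is not covered.

```python
def alibi_slopes(num_heads):
    """Per-head slopes: the geometric sequence 2^(-8/n), 2^(-16/n), ..."""
    return [2 ** (-8 * (h + 1) / num_heads) for h in range(num_heads)]


def alibi_bias(seq_len, slope):
    """Causal bias matrix: bias[i][j] = slope * (j - i) for j <= i, else -inf."""
    return [
        [slope * (j - i) if j <= i else float("-inf") for j in range(seq_len)]
        for i in range(seq_len)
    ]


slopes = alibi_slopes(8)
bias = alibi_bias(4, slopes[0])
print(bias[3])  # penalty grows linearly with distance from the query position
```

Because the bias depends only on the distance `i - j`, the same rule applies unchanged to positions beyond the training length, which is the intuition behind the extrapolation claim.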
- ZeQLplus: Terminal SQLite Database Browser (on HN)
    - written in V
    - UI might be inspired by Harlequin (in Python, supports more databases)
- omarchy: Opinionated Arch/Hyprland Setup
- learn about Dark Archive
- Researchers develop a battery cathode material that does it all #sci