Learning diary › Year 2025 › July, 2025 › 2025-07-29 [2025-07-29]

- #zig #perf
    - Zig profiling on Apple Silicon | Bugsik
        - mstange/samply
            - a command line sampling CPU profiler which uses the Firefox profiler as its UI
            - for macOS, Linux, and Windows
        - verte-zerg/poop at kperf-macos
            - a fork of poop to use kperf under Mac
        - wolfpld/tracy: Frame profiler
            - A real time, nanosecond resolution, remote telemetry, hybrid frame and sampling profiler for games and other applications
            - C, C++, Lua, Python, Fortran, Rust, Zig, C#, OCaml, Odin, GPU
            - Windows, Linux, Android, FreeBSD, WSL, OSX, iOS, QNX
    - Generic Programming and anytype - Docs - Ziggit
    - Using Zig allocator for C libraries (Alignment question)
        - `Allocator.rawAlloc`/`.rawFree`
        - `@alignOf(std.c.max_align_t)`
    - ANDRVV/zprof: 🧮 Cross-allocator profiler for Zig
        - battle-tested in raptodb/rapto: transposition-heuristic storage, low memory footprint and high-performance querying
    - Writing memory efficient C structs (on HN)
    - 10^19712 years to exhaust u65535
        - u65535 is the integer type with the highest max value supported in Zig (from Zig doc)
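
The scale behind the u65535 note can be checked with a few lines of arithmetic. This is a minimal sketch, not the linked post's own calculation: the increment rate of one billion per second and the helper names `decimal_digits` and `log10_years_to_exhaust` are assumptions made here for illustration.

```python
import math

# Largest value representable by Zig's u65535 (all 65535 bits set).
U65535_MAX = 2**65535 - 1


def decimal_digits(n: int) -> int:
    """Count decimal digits of a positive integer via log10.

    This avoids str(n), which recent Python versions limit to 4300 digits
    by default. The float rounding is safe here because the value is not
    within rounding distance of an exact power of ten.
    """
    return math.floor(math.log10(n)) + 1


def log10_years_to_exhaust(increments_per_second: float) -> float:
    """Return log10 of the years needed to count from 0 to U65535_MAX.

    increments_per_second is an assumed, hypothetical counting rate.
    """
    seconds_per_year = 365.25 * 24 * 3600
    return math.log10(U65535_MAX) - math.log10(increments_per_second * seconds_per_year)


if __name__ == "__main__":
    print("decimal digits:", decimal_digits(U65535_MAX))
    print("log10(years) at 1e9 increments/s:", round(log10_years_to_exhaust(1e9), 1))
```

Working in logarithms keeps everything in ordinary floats; changing the assumed rate only shifts the exponent by a few units, which is why the order of magnitude barely depends on it.
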
- #physics
    - Background independence
- #crypto
    - Fintech dystopia (on HN)
- #perf
    - Giving Benchmarks a Boat
        - TPC-C is a benchmark for transactional databases that models an order system with goods spanning many warehouses
        - Each warehouse comes with some set amount of data to be managed and some amount of query traffic that adheres to some specific distribution laid out in the spec
            - they can account for
                - what's the proportion of hot vs. cold data
                - what's the ratio of transactions-per-minute to bytes stored
                - what's the proportion of read-only queries, vs. simple, point writes, vs. complex read-write transactions
                - what proportion of transactions go cross-shard
        - The metric: "we sustained the workload at X warehouses with Y hardware at Z cost."
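
The workload knobs and the headline metric from the TPC-C notes can be written down as two small records. This is only a sketch under assumptions: the class names, field names, and every number are hypothetical placeholders, not values from the TPC-C specification or from the linked post.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkloadProfile:
    """Per-warehouse knobs a benchmark spec can pin down (placeholder fields)."""

    hot_data_fraction: float      # proportion of hot vs. cold data
    tpm_per_gib: float            # transactions-per-minute relative to bytes stored
    read_only_fraction: float     # share of read-only queries
    point_write_fraction: float   # share of simple point writes
    complex_rw_fraction: float    # share of complex read-write transactions
    cross_shard_fraction: float   # share of transactions that go cross-shard

    def is_consistent(self) -> bool:
        """The three transaction classes should account for the whole mix."""
        mix = (
            self.read_only_fraction
            + self.point_write_fraction
            + self.complex_rw_fraction
        )
        fractions_ok = all(
            0.0 <= f <= 1.0
            for f in (self.hot_data_fraction, self.cross_shard_fraction)
        )
        return fractions_ok and math.isclose(mix, 1.0)


@dataclass(frozen=True)
class ScaleResult:
    """'We sustained the workload at X warehouses with Y hardware at Z cost.'"""

    warehouses: int
    hardware: str
    cost_usd: float

    def cost_per_warehouse(self) -> float:
        return self.cost_usd / self.warehouses

    def summary(self) -> str:
        return (
            f"sustained the workload at {self.warehouses} warehouses "
            f"with {self.hardware} at ${self.cost_usd:,.2f}"
        )
```

Because each warehouse carries a fixed amount of data and traffic, the warehouse count is the single scale factor, and dividing cost by it gives a number that can be compared across systems.
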
- #movie
    - Nothing to watch – Experimental gallery visualizing 50k film posters
- #agent
    - Show HN: A GitHub Action that quizzes you on a pull request (on HN) #aicr

        > PR Quiz uses AI to generate a quiz from a pull request and blocks you from merging until the quiz is passed.

    - Principles for production AI agents (on HN)
        - context
            - modern LLMs just need direct detailed context, no tricks, but clarity and lack of contradictions
            - structuring context so that the system part is large and static and the user part is small and dynamic works great
            - provide the bare minimum of knowledge in the first place, and the option to fetch more context if needed via tools
            - context compaction tools can help keep logs and other artifacts from the feedback loop from bloating the context
        - tools
            - good tools typically operate on a similar level of granularity, and have a limited number of strictly typed parameters
            - idempotency is highly recommended to avoid state management issues
            - LLMs are very likely to misuse your loopholes, and that’s why you don’t want to have any loopholes
                - vs. human API users, who are more capable of reading between the lines, can navigate complex docs, and find workarounds
            - designing an agent to write some DSL (domain-specific language) code with actions rather than calling tools one by one is a great idea
        - feedback loop
            - actor-critic approach: where an actor decides on actions and a critic evaluates them
                - allow actors to be creative, and critics to be strict
            - error recovery
                - a chain of bad fixes is not fixable anymore - just discard and try again
            - agents try alternatives when the explicitly requested tool call fails because crucial ingredients are missing
                - tell them to ask for these ingredients before trying alternatives
            - baseline agent + logs + LLM analysis = insights for improvements
    - Q&A: Combining Math and LLMs
        - Raw math/algos and LLM are both powerful tools in your modeling toolkit
        - If you’re missing either tool from your toolkit, then that will severely hamstring your modeling abilities and many problems will remain inaccessible to you.
    - Show HN: Companies use AI to take your calls. I built AI to make them for you
        - top comment on HN introduced a great use case for an LLM-backed reception
            - local plumbing business
                - understood and repeated back the client's request in a slightly different, more professional way for confirmation
                - asked a few more smart follow-ups
                - the plumber called back and jumped straight into solutions, pricing, and his availability
            - prefer to talk to an LLM, if the issue can be quickly triaged to the right human who actually understands the situation
            - synchronous interaction by bot
                - can perform first level troubleshooting, ask for clarification, begin to form a plan and get your buy-in
                - vs. fire-and-forget email form
                    - incomplete reports, missing information, people who have no idea what they're talking about
    - Show HN: Terminal-Bench-RL: Training Long-Horizon Terminal Agents with RL (on HN)
        - Custom scaffolding (system prompt and tools) using Qwen3-32B achieved 13.75%
        - The author has built an RL system, but it has not been used for anything due to cost limitations.
    - Stop selling “unlimited”, when you mean “until we change our minds”
        - the playbook
            - launch with generous/unlimited limits
            - build user dependency
            - add caps targeting "less than 5%"
            - frame as "sustainability" or "fairness"
        - that "5%"
            - power users with deep workflow integration
            - early adopters who took platform risks
            - team influencers who drive organizational adoption
            - $200/month Claude Max subscribers doing serious work
        - Tokens are getting more expensive
        - Claude Code weekly rate limits
    - The Rise of Vibeinsecurity
        - a story about how devs lost their jobs over vibe apps, and vibe apps are hacked by vibe hackers immediately after deployment
        - this is also an ad for a conference called HackAIcon
    - It's rude to show AI output to people
        - whenever you propagate AI output, you're at risk of intentionally or unintentionally legitimizing it with your good name, providing it with a fake proof-of-thought
    - Agentic Coding Things That Didn’t Work
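
Several of the principles noted under "Principles for production AI agents" (a few strictly typed tool parameters, idempotent tools, a creative actor with a strict critic, and discarding a bad attempt instead of patching it) can be sketched in a few lines. This is an illustrative sketch only: `WriteFileArgs`, `FileTool`, and `run_with_critic` are hypothetical names, and the in-memory dictionary stands in for a real workspace.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class WriteFileArgs:
    """A small, strictly typed parameter set for one tool."""

    path: str
    content: str


class FileTool:
    """Hypothetical tool backed by an in-memory dict instead of a real disk."""

    def __init__(self) -> None:
        self.files: Dict[str, str] = {}

    def write_file(self, args: WriteFileArgs) -> str:
        # Reject malformed input early so there are no loopholes to misuse.
        if not isinstance(args.path, str) or not args.path:
            raise ValueError("path must be a non-empty string")
        if not isinstance(args.content, str):
            raise ValueError("content must be a string")
        # Idempotent: repeating the same call leaves the same final state.
        changed = self.files.get(args.path) != args.content
        self.files[args.path] = args.content
        return "written" if changed else "unchanged"


def run_with_critic(
    actor: Callable[[int], T],
    critic: Callable[[T], bool],
    max_attempts: int = 3,
) -> Optional[T]:
    """Ask the actor for fresh candidates until the critic accepts one.

    A rejected candidate is thrown away entirely; the next attempt starts
    from scratch rather than stacking fixes on a bad result.
    """
    for attempt in range(max_attempts):
        candidate = actor(attempt)
        if critic(candidate):
            return candidate
    return None
```

A retrying agent can call `write_file` twice with the same arguments without corrupting state, which is what makes the discard-and-retry loop safe to run.
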
- #sec #privacy
    - Tea app leak worsens with second database exposing user chats
- #idea
    - Why Try When Others Could Do Better?
        - Even the smartest, highest-agency people in the world are severely bandwidth constrained and don’t get around to doing most of the things they have the potential to do.
        - Compound that over and over again in some niche and you get hundreds, thousands, millions of miles ahead of the fastest runners who aren’t running down that niche.
    - Pragmatism in Programming Proverbs
        - many interesting quotes
    - Objects should shut up (on HN)
        - I'm also deeply annoyed by unnecessary alarms
    - Don’t Let Architecture Astronauts Scare You #system
    - I'm rebelling against the algorithm
    - Maybe writing speed actually is a bottleneck for programming
- #rust #metaprogramming
    - Advanced Rust macros with derive-deftly
    - rules_derive: deriving using macro_rules