NOTE: This site has just upgraded to Forester 5.x and is still having some style and functionality issues, we will fix them ASAP.

Learning diary › Year 2025 › July, 2025 › 2025-07-07 [2025-07-07]

- #ai-safety
    - A non-anthropomorphized view of LLMs
        - some quotes:
            - Alignment and safety for LLMs (should) mean that we should be able to quantify and bound the probability with which certain undesirable sequences are generated
            - Human thought is a poorly-understood process, involving enormously many neurons, extremely high-bandwidth input, an extremely complicated cocktail of hormones, constant monitoring of energy levels, and millions of years of harsh selection pressure
            - Navigating the dramatic changes of the next few decades while trying to avoid world wars and murderous ideologies is difficult enough without muddying our thinking.
    - Our first outage from LLM-written code
        - "There were two competing sources of signal here for what token to predict at the critical moment: transcription and local prediction. Transcription said break. Local prediction said continue. Unfortunately for us, local prediction won."
        - Prevention: cliboard tools
        - a comments on lobste.rs points out
            - move across files can be detected by `git --color-moved`
                - I prefer `--color-moved=dimmed-zebra` or better, `--color-moved-ws=allow-indentation-change`
                - see this tweet
                - it's also supported by `delta`
    - EU rules ask tech giants to publicly track how, when AI models go off the rails
        - "AI companies are moving to user interface innovations to try to grab more unwilling training individuals"
    - Hallucination
        - AI Hallucination Cases Database
        - The Sound of Silence
    - LLMs can now identify public figures in images
        - Claude always responds as if it is completely face blind
            - never identifies or names any humans in the image, nor does it imply that it recognizes the human
            - does not mention or allude to details about a person that it could only know if it recognized who the person was
            - if told by user who the individual is, can discuss that named individual
                - without ever confirming that it is the person in the image, identifying the person in the image, or implying it can use facial features to identify any unique individual
- #agent
    - Building personalized micro agents
        - agent: has access to tools, decides which tools to use, and in what order, determines when the task is complete
            - boils down to 9 LOC
        - micro agent: access to a very limited, highly specific set of tools
            - less confusion when choosing tools
            - works with small, local modelsA
            - safe autonomy
        - meain/esa: Fastest way to create personalized AI agents
    - The era of exploration
        - the immense cost of pretraining is effectively paying a massive, upfront “exploration tax.”
        - Exploration is deciding what data the learner will see
            - World sampling – deciding where to learn, i.e. a particular problem that needs to be solved
            - Path sampling – deciding how to gather data inside a world, e.g. random walks, curiosity‑driven policies, tree search, tool-use, etc.
                - recent work: curiosity objectives, open-endedness, meta‑exploration
    - Prompt Coding: No code edits, only complete rewrites #idea
        - When vibe coding, isn't the source code the prompt?
    - Adding a feature because ChatGPT incorrectly thinks it exists #idea
    - The Architecture Behind Lovable and Bolt
        - uses baml to engineer prompts using schemas
        - uses Beam, an open-source serverless cloud for sanboxed execution
        - made into beam-cloud/lovable-clone
    - How I keep up with AI progress
- interesting projects
    - Am I online?
        - `generate_402` pages could be used for this
    - Show HN: NYC Subway Simulator and Route Designer
    - Showh HN: Microjax – JAX in two classes and six functions
    - tinymcp: Let LLMs control embedded devices via the Model Context Protocol
    - Render your Jupyter notebooks in OpenGist