NOTE: This site has just upgraded to Forester 5.x and is still having some style and functionality issues, we will fix them ASAP.

Learning diary › Year 2025 › May, 2025 › 2025-05-02 [2025-05-02]

#formal #os #web
- DeepSeek-prover-V2: Advancing formal mathematical reasoning via reinforcement learning for subgoal decomposition[ren2025deepseekproverv2]
    - A survey on post-training of large language models[tie2025survey]
        - notes on LM could be based on this survey and the following papers related to r1
    - 100 days after DeepSeek-R1: A survey on replication studies and more directions for reasoning language models[zhang2025days]
        - Deepseek-r1: Incentivizing reasoning capability in llms via reinforcement learning[guo2025deepseek]
            - should revisit
    - found critics of r1/GRPO
        - Understanding r1-zero-like training: A critical perspective[liu2025understanding]
        - Does reinforcement learning really incentivize reasoning capacity in LLMs beyond the base model?[yue2025does]
    - Kimina-prover preview: Towards large formal reasoning models with reinforcement learning[wang2025kimina]
- found A Survey of Interactive Generative Video
- Polishing your typography with line height units
- Solving Sudoku with Algebraic Geometry and Computer Algebra : A C Programming Approach