My recent learning has become too scattered, and I have long wanted to write notes on LMs (language models), so I hope concentrating on this topic will keep me focused.
I've settled on the prefix lm
instead of llm
as I find phrases like "small LLMs" contradictory, and I hope LM will also cover multimodal variants, such as VLMs (visual language models).
My interest doesn't stop there: I haven't properly studied diffusion models yet, and I wish to explore their integration with LMs, or at least how diffusion models relate to LMs.
As my interests in the LM field are quite diverse, I will organize the notes into several series rather than one. I've reserved lm-0002 through lm-0005 for root notes covering these topics. The areas that come to mind include:
- the math behind LMs, starting from ML
- various optimization methods for LMs, including GPU programming
- post-training techniques
- application of GA in LMs
- alternative LM architectures, including RWKV and approaches related to diffusion models
Individual notes will be written as I read related papers and books, and may be referenced across these root notes.
I have written scattered notes about LMs before; I list them here for easier reference.
This is also an experiment to see if modern VLMs could better convert screenshots to Forester math formula markup.
In Transformers: from self-attention to performance optimizations, I focused on visualizing transformer architectures using a subscript-free tensor notation ("Named Tensor Notation" [chiang2021named]). However, mathematical concepts can often be expressed succinctly once you develop the ability to visualize them mentally. In this series of notes, I'll focus more on mathematical foundations than on visualizations or introductory explanations, having progressed beyond that stage after reading hundreds of LM papers.
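As a quick reminder of what that notation looks like (paraphrased from memory; see [chiang2021named] for the exact conventions, and note that the axis names here are only the paper's running example), attention can be written without subscripted indices as

$$\operatorname{Attention}(Q, K, V) = \underset{\mathsf{seq}}{\operatorname{softmax}}\!\left(\frac{Q \underset{\mathsf{key}}{\odot} K}{\sqrt{|\mathsf{key}|}}\right) \underset{\mathsf{seq}}{\odot} V,$$

where $\underset{\mathsf{key}}{\odot}$ contracts the $\mathsf{key}$ axis and the $\mathsf{seq}$ annotation marks the axis that softmax normalizes over.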