TinyLLM: Learning a Small Student from Multiple Large Language Models
Few-shot LLM paper reading, 100-paper plan (3/100)
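Local optima and saddle points both sit where the gradient is zero. The first step is to find such a point, and the second is to decide whether it is a local optimum or a saddle point, which the Hessian matrix settles mathematically. Finally, to let the loss function's updates escape a saddle point, there are two options: use the Hessian matrix, or use momentum.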
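The Hessian test can be sketched in a few lines. This is a minimal illustration, not code from the post: it assumes a two-parameter loss, so the Hessian is a symmetric 2x2 matrix whose eigenvalues have a closed form, and the helper names (`symmetric_2x2_eigenvalues`, `classify_critical_point`, `negative_curvature_direction`) are hypothetical. All eigenvalues positive means a local minimum, all negative a local maximum, and mixed signs a saddle point; the eigenvector of a negative eigenvalue is a direction along which the loss decreases to second order.

```python
import math


def symmetric_2x2_eigenvalues(h):
    """Return the eigenvalues (ascending) of a symmetric matrix [[a, b], [b, c]]."""
    (a, b), (_, c) = h
    mean = (a + c) / 2.0
    radius = math.hypot((a - c) / 2.0, b)
    return mean - radius, mean + radius


def classify_critical_point(h, tol=1e-12):
    """Classify a zero-gradient point from the signs of its Hessian eigenvalues."""
    lo, hi = symmetric_2x2_eigenvalues(h)
    if lo > tol:
        return "local minimum"
    if hi < -tol:
        return "local maximum"
    if lo < -tol and hi > tol:
        return "saddle point"
    # At least one eigenvalue is (numerically) zero: the second-order test cannot decide.
    return "inconclusive"


def negative_curvature_direction(h):
    """Unit eigenvector of the smallest eigenvalue (an escape direction at a saddle)."""
    (a, b), (_, c) = h
    lo, _ = symmetric_2x2_eigenvalues(h)
    if b == 0:
        # Diagonal Hessian: the smaller diagonal entry marks the axis to follow.
        return (1.0, 0.0) if a <= c else (0.0, 1.0)
    # Solves (a - lo) * v1 + b * v2 = 0, the first row of (H - lo * I) v = 0.
    v1, v2 = b, lo - a
    norm = math.hypot(v1, v2)
    return v1 / norm, v2 / norm


# f(x, y) = x^2 - y^2 has zero gradient at the origin with Hessian [[2, 0], [0, -2]].
saddle_hessian = [[2.0, 0.0], [0.0, -2.0]]
kind = classify_critical_point(saddle_hessian)
escape = negative_curvature_direction(saddle_hessian)
```

For the example Hessian, the point is classified as a saddle and the escape direction is the y axis, the direction in which `x^2 - y^2` curves downward. Momentum, the other option mentioned above, needs no Hessian at all: the velocity accumulated from earlier steps keeps the parameters moving even where the current gradient is close to zero.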