As discussed in the issue, I have tried this before: if `detach()` is removed, the automatic `backward()` call fails. In my opinion, the historical centroid should also take part in back-propagation. If it is detached from the graph, how can it still work?
PyTorch moving average computation creates inplace operation
I also found the similar Q&A above. Does `detach()`, as applied in the moving average, work the same way as the Truncated Back-Propagation Through Time (TBPTT) algorithm mentioned in a comment on that Q&A? If so, how can the moving average still work, given that the historical average is detached from the gradient calculation? It seems to act as just a constant value in the computation.
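To make the question concrete, here is a minimal sketch of the pattern I am asking about. It is not the code from the issue: `w`, `momentum`, the shapes, and the squared-norm loss are all made up for illustration. As far as I can tell, the detached history enters each update as a constant, and the gradient reaches the weights only through the current batch term, which looks like truncating back-propagation to a single step.

```python
import torch

torch.manual_seed(0)

momentum = 0.9                               # hypothetical decay factor
w = torch.randn(4, 3, requires_grad=True)    # hypothetical learnable weights
opt = torch.optim.SGD([w], lr=0.1)
centroid = torch.zeros(3)                    # running (history) centroid

grad_norms = []
for step in range(3):
    x = torch.randn(8, 4)                    # stand-in for a mini-batch
    batch_centroid = (x @ w).mean(dim=0)     # depends on w, so it carries a graph

    # The history is cut from the old graph and treated as a constant;
    # only batch_centroid connects this update to w.
    centroid = momentum * centroid.detach() + (1 - momentum) * batch_centroid

    loss = centroid.pow(2).sum()             # made-up loss on the running centroid
    opt.zero_grad()
    loss.backward()                          # backward through the current step only
    grad_norms.append(w.grad.norm().item())
    opt.step()
```

In this sketch `w.grad` is non-zero at every step, so training still receives a signal even though the history contributes no gradient. The value of the history still shifts `centroid`, and therefore the loss, but it is not differentiated through.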