A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2019; you can also visit <a rel="external noopener" href="https://www.jstage.jst.go.jp/article/sicetr1965/5/4/5_4_378/_pdf">the original URL</a>. The file type is <code>application/pdf</code>.
<i title="The Society of Instrument and Control Engineers">
<a target="_blank" rel="noopener" href="https://fatcat.wiki/container/2phyh6hw5vckxnkzptkaisrmki" style="color: black;">Transactions of the Society of Instrument and Control Engineers</a>
Many control systems in technical processes contain unknown elements that are generally time-varying. This is usually due to changes in the environment around the controlled objects, arising from the time variation of physical parameters (such as temperature, pressure, and source voltage). It is therefore natural to regard these elements as unknown functions of such measurable parameters, rather than as functions of time, and to learn these functions on-line. In some control design problems,<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.9746/sicetr1965.5.378">doi:10.9746/sicetr1965.5.378</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/y74stq2ubfeftdptxy2yewahxi">fatcat:y74stq2ubfeftdptxy2yewahxi</a> </span>
it is also desirable to know analytically how the system characteristics depend on such parameters. From these points of view, this paper synthesizes a learning control policy for a linear discrete-time system, corrupted by additive noise, that contains unknown functions of measurable physical parameters. It is assumed that the states of the system are observed in subintervals of each control stage, to obtain the information needed for on-line function learning. In this setting, because of the additive system noise, the ordinary stochastic approximation method for function learning is not applicable. A modified stochastic approximation method, which ensures convergence of the on-line unknown-function learning over the entire state space, is therefore developed and used to synthesize a suboptimal learning control policy via a direct application of the dynamic programming method. Another function learning method, which reuses stored sample information by learning the sampling probability characteristics, is also presented. As examples, digital-computer simulation results are shown for a two-dimensional control system.
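The abstract's on-line function learning builds on stochastic approximation. The paper's modified method and its convergence conditions are not reproduced here; as a minimal illustration of the ordinary approach it departs from, the classic Robbins–Monro averaging iteration estimates an unknown function value from noisy observations (the example function, parameter value, and noise level below are hypothetical):

```python
import random

def robbins_monro(observe, n_steps=5000, theta0=0.0):
    """Estimate the mean of noisy observations by stochastic approximation.

    Iterates theta_{n+1} = theta_n + a_n * (y_n - theta_n) with step sizes
    a_n = 1/n, which for this simple scheme reduces to a running average
    and converges to E[y] under standard conditions.
    """
    theta = theta0
    for n in range(1, n_steps + 1):
        y = observe()             # one noisy sample of the unknown quantity
        theta += (y - theta) / n  # averaging step with gain 1/n
    return theta

# Example: learn the value f(p) = p**2 at a fixed measurable parameter
# p = 1.5 from observations corrupted by additive Gaussian noise.
random.seed(0)
p = 1.5
estimate = robbins_monro(lambda: p**2 + random.gauss(0.0, 0.5))
```

After 5000 samples the estimate is close to the true value 2.25; the paper's contribution is precisely that this plain scheme fails under the additive system noise considered there, motivating the modified method.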
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20190503115944/https://www.jstage.jst.go.jp/article/sicetr1965/5/4/5_4_378/_pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/96/c0/96c06e5788eb76bf503ce989d06d6dfee12f4430.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.9746/sicetr1965.5.378"> <button class="ui left aligned compact blue labeled icon button serp-button"> <i class="external alternate icon"></i> Publisher / doi.org </button> </a>