The Transactions of the Korean Institute of Electrical Engineers

References

1 
J. Kober, J. A. Bagnell, and J. Peters, “Reinforcement learning in robotics: A survey,” The International Journal of Robotics Research, vol. 32, no. 11, pp. 1238-1274, 2013.
2 
K. Shao, Z. Tang, Y. Zhu, N. Li, and D. Zhao, “A survey of deep reinforcement learning in video games,” arXiv preprint arXiv:1912.10944, 2019.
3 
L. Ouyang et al., “Training language models to follow instructions with human feedback,” Advances in Neural Information Processing Systems, vol. 35, 2022.
4 
Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, “Gradient-based learning applied to document recognition,” Proceedings of the IEEE, vol. 86, no. 11, pp. 2278-2324, 1998.
5 
A. Krizhevsky, I. Sutskever, and G. E. Hinton, “ImageNet classification with deep convolutional neural networks,” Advances in Neural Information Processing Systems, vol. 25, 2012.
6 
H. van Hasselt, A. Guez, and D. Silver, “Deep reinforcement learning with double Q-learning,” Proceedings of the AAAI Conference on Artificial Intelligence, vol. 30, no. 1, 2016.
7 
Z. Wang, T. Schaul, M. Hessel, H. van Hasselt, M. Lanctot, and N. de Freitas, “Dueling network architectures for deep reinforcement learning,” Proceedings of Machine Learning Research, vol. 48, 2016.
8 
A. Vaswani et al., “Attention is all you need,” Advances in Neural Information Processing Systems, vol. 30, 2017.
9 
A. Dosovitskiy et al., “An image is worth 16x16 words: Transformers for image recognition at scale,” arXiv preprint arXiv:2010.11929, 2020.
10 
Z. Liu, Y. Lin, Y. Cao, H. Hu, Y. Wei, Z. Zhang, S. Lin, and B. Guo, “Swin transformer: Hierarchical vision transformer using shifted windows,” Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021.
11 
H. Chen, Y. Wang, T. Guo, C. Xu, Y. Deng, Z. Liu, S. Ma, C. Xu, and W. Gao, “Pre-trained image processing transformer,” Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021.
12 
Y. Gong, C. I. J. Lai, Y. A. Chung, and J. Glass, “SSAST: Self-supervised audio spectrogram transformer,” Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, no. 10, 2022.
13 
OpenAI, GPT-3.5: Language Models for Natural Language Understanding, https://openai.com, 2022.
14 
A. Dutech et al., “Reinforcement learning benchmarks and bake-offs II,” Advances in Neural Information Processing Systems, vol. 17, 2005.
15 
A. G. Barto, R. S. Sutton, and C. W. Anderson, “Neuronlike adaptive elements that can solve difficult learning control problems,” IEEE Transactions on Systems, Man, and Cybernetics, vol. 13, no. 5, pp. 834-846, 1983.
16 
R. S. Sutton, and A. G. Barto, Reinforcement learning: An introduction, MIT Press, 2018.
17 
V. R. Konda, and J. N. Tsitsiklis, “Actor-critic algorithms,” Advances in Neural Information Processing Systems, vol. 12, 1999.
18 
C. Dann, Y. Mansour, M. Mohri, A. Sekhari, and K. Sridharan, “Guarantees for epsilon-greedy reinforcement learning with function approximation,” Proceedings of the International Conference on Machine Learning, 2022.
19 
R. M. Schmidt, “Recurrent neural networks (RNNs): A gentle introduction and overview,” arXiv preprint arXiv:1912.05911, 2019.
20 
J. Chung, C. Gulcehre, K. Cho, and Y. Bengio, “Empirical evaluation of gated recurrent neural networks on sequence modeling,” arXiv preprint arXiv:1412.3555, 2014.
21 
S. Hochreiter, and J. Schmidhuber, “Long short-term memory,” Neural Computation, vol. 9, no. 8, pp. 1735-1780, 1997.
22 
G. Brockman, V. Cheung, L. Pettersson, J. Schneider, J. Schulman, J. Tang, and W. Zaremba, “OpenAI Gym,” arXiv preprint arXiv:1606.01540, 2016.
23 
T. Haarnoja, A. Zhou, P. Abbeel, and S. Levine, “Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor,” Proceedings of the International Conference on Machine Learning, 2018.
24 
M. Andrychowicz, F. Wolski, A. Ray, J. Schneider, R. Fong, P. Welinder, B. McGrew, J. Tobin, P. Abbeel, and W. Zaremba, “Hindsight experience replay,” Advances in Neural Information Processing Systems, vol. 30, 2017.