Construction of a Japanese Financial Benchmark for Large Language Model Evaluation in the Financial Domain [in Japanese]

Masanori HIRANO

[Preprint] Dec. 8, 2023

Abstract

With the recent development of large language models (LLMs), the need for models specialized in particular domains and languages has been discussed. There is also a growing need for benchmarks that evaluate the performance of current LLMs in each domain. In this study, we therefore constructed a benchmark consisting of multiple tasks specific to Japanese and the financial domain, and ran it on several major models. The results confirm that GPT-4 currently stands out and that the constructed benchmark functions effectively.

Keywords

Large Language Model; Benchmark; Finance; Japanese

doi

10.51094/jxiv.564

bibtex

@preprint{Hirano2023-pre-finllm,
  title={{Construction of a Japanese Financial Benchmark for Large Language Model Evaluation in the Financial Domain [in Japanese]}},
  author={Masanori HIRANO},
  doi={10.51094/jxiv.564},
  year={2023}
}