The wealth of accelerometric recordings collected by the K-NET and KiK-net networks in Japan since 1996 provides a unique opportunity to address many important seismological research questions. Subsets of these data have been used in many case studies; most of these, however, do not focus specifically on best practices for data selection and pay relatively little attention to the properties and peculiarities directly observable in the data. Yet for many applications, careful data selection and quality assessment are an important prerequisite for successful and reliable analysis. For this reason, we devote this article to the extraction of a large data set of surface and borehole recordings from the K-NET and KiK-net databases, with strong emphasis on data quality and reliability. The final data set available for subsequent work consists of 78,840 records from 2,201 earthquakes covering the Japan Meteorological Agency (JMA) magnitude range 2.7 to 8, observed at 1,681 sites throughout Japan. We explain how this data set was compiled, including automatic phase picking and relocation of events. We also present an overview of the general features of the data set, providing important information for subsequent analysis. Strong amplification effects at high frequencies are immediately visible in the surface recordings. Furthermore, deconvolution of borehole/surface recording pairs indicates a clear presence of downgoing waves in the borehole records.