

“Bakusai.com” is officially launching a service to provide the rich conversational content from its forums to AI development companies.





This service is being launched because many companies developing generative AI models have been training on data from the open web without proper authorization.

Scraping data for training purposes has increased server load and inconvenienced general users, and incidents stemming from it have already occurred.

In the United States, web services have imposed access restrictions to prevent service degradation caused by scraping for training, and some cases have escalated to legal action against AI companies.

米X社(旧ツイッター) / U.S. X Corporation (formerly Twitter, Inc.)

X post by Elon Musk, July 2, 2023, announcing temporary limits to address extreme levels of data scraping and system manipulation. Source link


General user conversations and information have been collected extensively to train generative AI models. In response, X (formerly Twitter) announced viewing restrictions in July 2023.※1

米各新聞社 / U.S. Newspapers

At the end of December 2023, The New York Times sued OpenAI and Microsoft, alleging that its articles had been used without authorization. In April 2024, eight other media outlets, including the Chicago Tribune, filed similar lawsuits.※2

The New York Times sues OpenAI and Microsoft: the lawsuit claims "billions of dollars" in damages for AI copyright infringement. Source link
Eight major newspapers, including the Chicago Tribune and the New York Daily News, join the legal backlash against OpenAI and Microsoft. Source link



Going forward, unauthorized data collection is likely to be treated increasingly as a legal issue, and companies are expected to seek data on a firmer legal footing.

Reddit has reportedly reached an agreement with Google that allows Google to use content posted by Reddit users to train its AI models. The contract is said to be worth approximately $60 million (around 9.7 billion yen) per year. Reddit has also announced a contract with OpenAI under which Reddit content will be integrated into ChatGPT and used to train AI models. Furthermore, Reddit has announced plans to continue selling its latest conversational data to AI development companies.

Reddit・・・A bulletin board-style social news site in the United States. It has 110 million monthly active users (MAU).
OpenAI・・・An AI research and deployment company based in San Francisco, USA, and the developer of "ChatGPT."

爆サイ.comとは? / What is “Bakusai.com”?


Launched in 2000, Bakusai.com is one of Japan’s largest bulletin board services, with approximately 1.1 billion page views per month.

The service has 15 million monthly users and over 4,000 categories covering a wide range of topics, from news, sports, politics, and economy to entertainment, gambling, adult services, and games, with a focus on regional communities.

The user base has a gender ratio of 7:3 (male to female), and 60% of the users are between the ages of 25 and 44.

Users make about 700,000 posts per day, for a total of over 1.1 billion posts since the service's inception. As an anonymous bulletin board, it hosts daily exchanges of information in a rich variety of expressions.

サービスを利用される企業様へ / To companies using our service



We at Bakusai.com are fully committed to supporting improvements in the quality and accuracy of generative AI model services. At the same time, we will continue to take comprehensive measures against any activity that could degrade Bakusai.com’s service.

Companies developing generative AI models that require access to the rich conversational content held by Bakusai.com are invited to contact us via the form below.

引用ニュース / News Sources

※1: BBC News Japan, 「ツイッターが閲覧制限、「データ強奪」への一時的対策とマスク氏」 [Twitter limits viewing; a temporary measure against "data pillaging," says Musk]
※2: BBC News Japan, 「米紙ニューヨーク・タイムズがオープンAIとマイクロソフトを提訴 著作権侵害で」 [The New York Times sues OpenAI and Microsoft over copyright infringement]
