danbooru2020 dataset | request for a link to the dataset

Posted under General

Hey,
I would like to write a master's thesis on the use of deep neural networks to recognize fictional characters in original and fan-art graphics styles. Unfortunately, I couldn't find the rsync server address or torrent address for the danbooru2020 dataset.
I tried searching for it on https://gwern.net/danbooru2021#danbooru2020 and https://huggingface.co/datasets/deepghs/danbooru2024?not-for-all-audiences=true, but I only found danbooru2024.

This is my first attempt at downloading a dataset like this. I've used Kaggle before, and finding the link there was easy: https://www.kaggle.com/datasets/mahmoudreda55/satellite-image-classification

Duke_Axer said:

Hey,
I would like to write a master's thesis on the use of deep neural networks to recognize fictional characters in original and fan-art graphics styles. Unfortunately, I couldn't find the rsync server address or torrent address for the danbooru2020 dataset.
I tried searching for it on https://gwern.net/danbooru2021#danbooru2020 and https://huggingface.co/datasets/deepghs/danbooru2024?not-for-all-audiences=true, but I only found danbooru2024.

This is my first attempt at downloading a dataset like this. I've used Kaggle before, and finding the link there was easy: https://www.kaggle.com/datasets/mahmoudreda55/satellite-image-classification

Maybe ask someone who has used it to train an AI before? They might still have the files.

We do have an official BigQuery dataset as mentioned in topic #12774.
https://github.com/danbooru/danbooru/commit/f235b72b3fd64f7164548dc7632ff01cd2966fef

For reference, the links are:

https://console.cloud.google.com/bigquery?project=danbooru1
https://console.cloud.google.com/storage/browser/danbooru_public
https://storage.googleapis.com/danbooru_public/data/posts.json

Whether you can get technical support for that here is a different question, though, because I think the number of Danbooru users who know how to work with this kind of thing can be counted on two hands. And I'm not one of them.
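For anyone who wants to try the posts.json link above, here is a minimal Python sketch for picking out posts of one character. It rests on assumptions that are not confirmed in this thread: that the export is newline-delimited JSON (one post per line) and that each post has a space-separated `tag_string_character` field like the one in the Danbooru post API. The function names are made up for illustration.

```python
import json
import urllib.request
from typing import Iterable, Iterator


def stream_lines(url: str) -> Iterator[str]:
    """Yield decoded text lines from a URL without loading the whole file."""
    with urllib.request.urlopen(url) as resp:
        for raw in resp:
            yield raw.decode("utf-8")


def iter_posts(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one post dict per non-empty line.

    Assumption: the export is newline-delimited JSON.
    """
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def posts_with_character(posts: Iterable[dict], character: str) -> Iterator[dict]:
    """Keep only posts whose character tags include `character`.

    Assumption: character tags live in `tag_string_character` as a
    space-separated string; posts without the field are skipped.
    """
    for post in posts:
        tags = (post.get("tag_string_character") or "").split()
        if character in tags:
            yield post
```

To use it, pass `stream_lines("https://storage.googleapis.com/danbooru_public/data/posts.json")` into `iter_posts`, then filter with `posts_with_character(posts, "some_character_tag")`. The file is likely very large, so streaming line by line is safer than downloading it in one piece. If the field names turn out to be different, check the table schema in the BigQuery console first.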

Why do you think they have this? It was compiled by external users, and they probably deleted it because it costs money to host.

I don't think this site endorses using its content as AI training data, although I guess any site's API can be scraped and not much can be done to help.

use of deep neural networks to recognize fictional characters in original and fan-art graphics styles

But if you weren't lying here, that sounds... innocuous enough. Still, you are mostly on your own with trying to contact the user(s) who uploaded the original.

Moebits said:

I don't think this site endorses using its content as AI training data, although I guess any site's API can be scraped and not much can be done to help.

It doesn't, as stated on the contact page. But yes, any site can be scraped unless they make it basically unusable to average users.
