Database dump survey
lb54
@byte[]
Holy crepes! That’s some good stuff!
Any chance you could do a write-up of the before and after when you do make the alterations? (I see your patreon write-ups)
Also, coming from a current Drupal dev, hearing the word “normalization” really warms my heart a little. Wish I could do that for some of our stuff <_<
Back to the topic at hand, I'd be down for the dump!
Hereward
We could only concede if the images on the site remain readily accessible despite the operation in question.
barbeque
Yes.
There was a really awesome plugin some years ago, I forgot what it was called; it was basically what Related Images does, except it was insanely accurate* (it did look a little bit at image tags, but also at “other users that favourited x have also favourited y”). I recall the algorithm behind it being quite complex, superlinear and probably exponential. Damn, I hope I can find which one that was again.
* = the current Related Images actually isn’t bad, and it’s better than it used to be; the one I’m talking about was just in a different league and didn’t suggest so much anthro the whole time
DaMagics
Lewd Cuddlehorse :3
I would be interested in image/tag relations and hashing data for local finding/indexing/archiving and deduping purposes.
Especially if it can contain both the original and the Derpibooru file hashes for the optimized/stripped images.
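For the local side of that dedup use case, a minimal sketch (it assumes nothing about the dump format, just files already on disk): hash every file under a directory and group paths by digest. SHA-256 is an arbitrary choice here; matching against any hashes a dump might publish would require using the same algorithm the dump uses.

```python
import hashlib
import os
from collections import defaultdict

def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large images never load fully into memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def find_duplicates(root: str) -> dict[str, list[str]]:
    """Map content hash -> files sharing it; entries with >1 path are dupes."""
    by_hash: dict[str, list[str]] = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            by_hash[sha256_of(path)].append(path)
    return {h: paths for h, paths in by_hash.items() if len(paths) > 1}
```

Note this only finds byte-identical copies; it cannot relate an original upload to its re-encoded/optimized version, which is exactly why having both hashes in the dump would matter.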
xbi
Is there a timestamp for faving a picture? I'm curious to research how fave counts grow over time. So if a pic has not only a list of favoriters but also timestamps, that would be useful for me.
BigBuggyBastage
Go fsck yourself
I’m in much the same boat. I save images with short filenames, so a local database for those, and a database of image numbers and tags associated (relationship) would be extremely useful to me.
The only additional feature I would want is to be able to compare my “personal/Derpibooru account” lists of ‘upvoted’ and ‘favorited’ images to a list of locally-stored images, to be sure nothing is missing from local storage. Is that kind of information even available?
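The comparison described here is just a set difference once both sides are reduced to image IDs. This sketch assumes a hypothetical local naming convention where each filename begins with the numeric image ID (e.g. `123456.png` or `123456_artist.jpg`); the function name and convention are invented for illustration, not a real tool.

```python
def missing_locally(faved_ids: set[int], local_filenames: list[str]) -> set[int]:
    """Return faved image IDs that have no matching local file.

    Assumes the (hypothetical) convention that each local filename starts
    with the numeric image ID, e.g. '123456.png' or '123456_twilight.jpg'.
    """
    local_ids: set[int] = set()
    for name in local_filenames:
        # Take everything before the first '.' and first '_' as the ID.
        stem = name.split(".", 1)[0].split("_", 1)[0]
        if stem.isdigit():
            local_ids.add(int(stem))
    return faved_ids - local_ids
```

For example, `missing_locally({1, 2, 3}, ["1.png", "3_pony.jpg", "notes.txt"])` yields `{2}`. Whether the fave/upvote lists are exportable at all is the open question in the comment above, not something this sketch answers.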
byte[]
Philomena Contributor
@C00lguy
No.
@lb54
Sure, I guess. It's not that complicated, but right now I think it just needs some time to sit, because it's not presently a critical issue, it's going to take a while to recreate the tables (> 20 minutes on my local test), and we would also like to “stack” downtime events as much as possible. That is, we'd prefer to do the Elasticsearch, PostgreSQL, and apt updates all at once, instead of having a separate downtime for each.
@xbi
We don't track timestamp information on faves. We do track data that are timeseries-like (interaction surrogate keys are strictly increasing, and correlate well with image IDs), but not actual time data.
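Since the surrogate keys are strictly increasing but carry no clock, the closest a researcher could get is bracketing: place a fave's key between reference points whose dates are known from elsewhere (e.g. image IDs with public upload dates). This is purely a hypothetical sketch; the `anchors` structure and the whole (key, date) mapping are invented for illustration, not part of any real dump schema.

```python
from bisect import bisect_right

def approx_fave_date(fave_key: int, anchors: list[tuple[int, str]]) -> str:
    """Bracket a fave's surrogate key between known (key, iso_date) anchors.

    anchors must be sorted by key. Returns a coarse lower bound on when the
    fave happened, relying only on keys being strictly increasing.
    """
    keys = [k for k, _ in anchors]
    i = bisect_right(keys, fave_key)
    if i == 0:
        return "before " + anchors[0][1]
    return "on or after " + anchors[i - 1][1]
```

For instance, with anchors at keys 100 (2019-01-01) and 200 (2020-01-01), a fave with key 150 can only be dated to "on or after 2019-01-01"; the growth-dynamics analysis asked about above would be limited to this kind of ordering, not exact times.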
@BigBuggyBastage
Upvotes will not be included in any dump.
@silbasa
No; everything that is present in dumps has to be something that was already public facing, meaning it could already be scraped. We also have no intention of providing dumps earlier than the previous day’s, meaning that requests for information wipes will show up the day after they happen, which is within the timeframe needed to do it. Note we also have a section in our privacy policy dedicated to “Information that may potentially be shared with third parties”; this is such information.
Damaged
Word Bug
This would be a great way to kickstart a DB without having to scrape everything.
Blah blah blah privacy concerns, I’m sure you have that well in hoof.
Given the excitement that releasing even partial dumps caused the other day, this will likely make all the stat junkies very happy.
👍
Background Pony #688B
And I’m Love My Life
BigBuggyBastage
Go fsck yourself
@ArmadilloEater
Ooh, to be able to download our ‘personal/user data’? I like that idea. ;)
Rene_Z
This would be interesting for tag data (tag descriptions, categories, aliases), since these are a pain to access individually with the API.
HMage
I eat tea
I want it.
I also want some convenient way to download all images, also for archiving purposes.