File: altOP.jpg (1.3 MB)
Welcome to the Pony Voice Preservation Project!
youtu.be/730zGRwbQuE
The Pony Preservation Project is a collaborative effort by /mlp/ to build and curate pony datasets for as many applications in AI as possible.
Technology has progressed such that a trained neural network can generate convincing voice clips, drawings and text for any person or character using existing audio recordings, artwork and fanfics as a reference. As you can surely imagine, AI pony voices, drawings and text have endless applications for pony content creation.
AI is incredibly versatile: basically anything that can be boiled down to a simple dataset can be used for training to create more of it. AI-generated images, fanfics, wAIfu chatbots and even animation are possible, and are being worked on here.
Any anon is free to join, and there are many active tasks that would suit any level of technical expertise. If you’re interested in helping out, take a look at the quick start guide linked below and ask in the thread for any further detail you need.
EQG and G5 are not welcome.
>Quick start guide:
docs.google.com/document/d/1PDkSrKKiHzzpUTKzBldZeKngvjeBUjyTtGCOv2GWwa0/edit
Introduction to the PPP, links to text-to-speech tools, and how (You) can help with active tasks.
>The main Doc:
docs.google.com/document/d/1y1pfS0LCrwbbvxdn3ZksH25BKaf0LaO13uYppxIQnac/edit
An in-depth repository of tutorials, resources and archives.
>Online speech generation
haysay.ai
>Active tasks:
Research into animation AI
Research into pony image generation
>Latest developments:
pastebin.com/4p00iUZM
>The PoneAI drive, an archive for AI pony voice content:
drive.google.com/drive/folders/1E21zJQWC5XVQWy2mt42bUiJ_XbqTJXCp
>Clipper’s Master Files, the central location for MLP voice data:
mega.nz/folder/jkwimSTa#_xk0VnR30C8Ljsy4RCGSig
mega.nz/folder/gVYUEZrI#6dQHH3P2cFYWm3UkQveHxQ
drive.google.com/drive/folders/1MuM9Nb_LwnVxInIPFNvzD_hv3zOZhpwx
>Cool, where is the discord/forum/whatever unifying place for this project?
You're looking at it.
Last Thread: https://desuarchive.org/mlp/thread/43127073/#43127073
>>
FAQs:
If your question isn’t listed here, take a look in the quick start guide and main doc to see if it’s already answered there. Use the tabs on the left for easy navigation.
Quick: docs.google.com/document/d/1PDkSrKKiHzzpUTKzBldZeKngvjeBUjyTtGCOv2GWwa0/edit
Main: docs.google.com/document/d/1y1pfS0LCrwbbvxdn3ZksH25BKaf0LaO13uYppxIQnac/edit
>Where can I find the AI text-to-speech tools and how do I use them?
A list of TTS tools: docs.google.com/document/d/1y1pfS0LCrwbbvxdn3ZksH25BKaf0LaO13uYppxIQnac/edit#heading=h.yuhl8zjiwmwq
How to get the best out of them: docs.google.com/document/d/1y1pfS0LCrwbbvxdn3ZksH25BKaf0LaO13uYppxIQnac/edit#heading=h.mnnpknmj1hcy
>Where can I find content made with the voice AI?
In the PoneAI drive: drive.google.com/drive/folders/1E21zJQWC5XVQWy2mt42bUiJ_XbqTJXCp
And the PPP Mega Compilation: docs.google.com/spreadsheets/d/1T2TE3OBs681Vphfas7Jgi5rvugdH6wnXVtUVYiZyJF8/edit
>I want to know more about the PPP, but I can’t be arsed to read the doc.
See the live PPP panel shows presented on /mlp/con for a more condensed overview.
2020 pony.tube/w/5fUkuT3245pL8ZoWXUnXJ4
2021 pony.tube/w/a5yfTV4Ynq7tRveZH7AA8f
2022 pony.tube/w/mV3xgbdtrXqjoPAwEXZCw5
2023 pony.tube/w/fVZShksjBbu6uT51DtvWWz
>How can I help with the PPP?
Build datasets, train AIs, and use the AI to make more pony content. Take a look at the quick start guide for current active tasks, or start your own in the thread if you have an idea. There’s always more data to collect and more AIs to train.
>Did you know that such and such voiced this other thing that could be used for voice data?
It is best to keep to official audio only unless there is very little of it available. If you know of a good source of audio for characters with few (or just fewer) lines, please post it in the thread. 5.1 surround audio is generally required unless you have a source already clean of background noise. Preferably post a sample or link. The easier you make it, the more likely it will be done.
>What about fan-imitations of official voices?
No.
>Will you guys be doing a [insert language here] version of the AI?
Probably not, but you're welcome to. You can, however, get most of the way there by using phonetic transcriptions of other languages as input for the AI.
>What about [insert OC here]'s voice?
It is often quite difficult to find good quality audio data for OCs. If you happen to know any, post them in the thread and we’ll take a look.
>I have an idea!
Great. Post it in the thread and we'll discuss it.
>Do you have a Code of Conduct?
Of course: 15.ai/code
>Is this project open source? Who is in charge of this?
pony.tube/w/mqJyvdgrpbWgZduz2cs1Cm
PPP Redubs:
pony.tube/w/p/aR2dpAFn5KhnqPYiRxFQ97
Stream Premieres:
pony.tube/w/6cKnjJEZSCi3gsvrbATXnC
pony.tube/w/oNeBFMPiQKh93ePqTz1ns8
>>
>>43289213
I don't think there would be such a thing for MLP (otherwise it would have been scraped for data years ago). I guess the closest would be to try ripping Netflix and other TV shows that have Audio Description/Video Description tracks for blind people, and see if that could somehow be used as a dataset in whatever project you are planning to do.
>>
File: coolshit.jpg (1.3 MB)
>>43289238
I doubt it would be detailed enough. I heard multimodal LLMs can take in video, even small ones like the ones I can run on my machine, so one could theoretically tag the entire show like this.
That aside, picrel is a compilation of experiments I did a while ago while playing with show frame compression. I trained a hierarchical VQ autoencoder that encodes 256x384 frames into two maps of codebook indices, 16x24 and 8x12, with 1024 entries in each codebook, and then tried to generate the larger map given the smaller map using discrete diffusion. It's quite undertrained but I got bored of it. Just thought you guys would appreciate the abominations, some of them are even cute.
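For a rough sense of scale, here is what those numbers work out to. This is a back-of-the-envelope sketch, not the anon's code: it assumes 8-bit RGB input frames and fixed-width index storage, and the helper names (`index_map_bits`, `nearest_code`) are made up for illustration. The lookup function shows the basic VQ step of snapping a feature vector to its closest codebook entry. The real encoder, decoder and diffusion model are not reproduced here.

```python
import math

def index_map_bits(rows, cols, codebook_size):
    """Bits needed to store one map of codebook indices at fixed width."""
    return rows * cols * math.ceil(math.log2(codebook_size))

def nearest_code(vector, codebook):
    """Index of the codebook entry closest to `vector` (squared L2 distance)."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((v - c) ** 2 for v, c in zip(vector, codebook[i])),
    )

CODEBOOK_SIZE = 1024                                   # entries per codebook (from the post)
fine_bits = index_map_bits(16, 24, CODEBOOK_SIZE)      # larger map
coarse_bits = index_map_bits(8, 12, CODEBOOK_SIZE)     # smaller map
code_bytes = (fine_bits + coarse_bits) // 8            # 600 bytes per frame

raw_bytes = 256 * 384 * 3                              # assumption: 8-bit RGB frame
ratio = raw_bytes / code_bytes

print(f"{code_bytes} bytes per encoded frame vs {raw_bytes} raw")
print(f"compression ratio: about {ratio:.0f}x")
```

So each frame comes down to 480 ten-bit indices, and the diffusion step only has to predict 384 of them given the other 96.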
>>
>>43289447
I think I saw some models months ago that were able to watch a 10s video and describe what was happening in a scene along with any interaction people had with the setting. I wish I could remember what the name of it was, since that sounds like something you could potentially reuse for your stuff.
Now that I type all of this, I kind of wish there was a program that offered an easy way to create automated audiobook from a fic, but that would require a TTS model to be combined with some LLM to understand which characters are included in the story and automatically swap the voices of character/narrator, as well as add any relevant background music and sound effects.
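The voice-swapping part of that idea can be sketched without any model at all. The snippet below is a minimal illustration, not an existing tool: it splits text into narration and quoted dialogue and guesses the speaker from the nearest known character name after each quote. The function name, the 60-character lookahead window and the "Unknown" fallback are all assumptions. A real pipeline would hand attribution to an LLM and pass each segment to a TTS voice, and music and sound effects are not handled.

```python
import re

# Straight double-quoted spans are treated as dialogue; everything else is narration.
QUOTE_RE = re.compile(r'"([^"]*)"')

def split_segments(text, known_speakers, narrator="Narrator", window=60):
    """Return a list of (voice, text) pairs in reading order."""
    segments = []
    pos = 0
    for match in QUOTE_RE.finditer(text):
        narration = text[pos:match.start()].strip()
        if narration:
            segments.append((narrator, narration))
        # Naive attribution: the known name that appears earliest after the quote.
        tail = text[match.end():match.end() + window]
        hits = [(tail.find(name), name) for name in known_speakers if name in tail]
        speaker = min(hits)[1] if hits else "Unknown"
        segments.append((speaker, match.group(1)))
        pos = match.end()
    rest = text[pos:].strip()
    if rest:
        segments.append((narrator, rest))
    return segments

sample = '"Hello," said Twilight. Rainbow Dash landed nearby. "You are late!" Rainbow Dash shouted.'
for voice, line in split_segments(sample, ["Twilight", "Rainbow Dash"]):
    print(f"[{voice}] {line}")
```

The heuristic breaks as soon as the dialogue tag comes before the quote or the speaker is only implied, which is exactly where the LLM would have to take over.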
>>
>>43289481
>easy way to create automated audiobook from a fic
I made such an app a while ago https://files.catbox.moe/cwj64u.mp4
https://drive.google.com/drive/folders/14zMbURz1SuYNMoewX88EjR8sHEkcaKXa
>>
>>43289508
>if yours only supports cuda 11.x but you still want to run on gpu, run the following inside PVT folder: runtime/python.exe -m pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
Would it be possible to make it work with cu116? My system+GPU can't get Python to work with anything above that, and I'm not going to be upgrading my PC for at least the next two or three years.
>>