>>
Same with search. I often search for song lyrics, and unless I'm very clear in the request that these are lyrics, it gives me advice instead of information about the song.
It's fascinating to see trillion-dollar companies fumbling so much. On the one hand it's magical that this is even possible, on the other, it's clear that they're all scrambling. Both genius and amateur hour.
>>
>>109003332
the underlying architecture has changed. i've been feeding everything from valid sentences, to slang, to utter bullshit and mojibake into translate for years. something has changed about the way it hallucinates. it's much less literal. it clearly engages in chatbot behaviors, such as em-dashes, or even talking about the text in some rare cases instead of attempting to translate it.
>>
File: spongebob_my_eyes.jpg (27.7 KB)
>>109003199
>esta hermoso
>si
>te cacho
you illiterate ngroid
>>
>>109003382
i think so. i couldn't make sense of a certain piece of text when i used yandex translate, so i plugged it into a chatbot and it gave a good translation which was pretty accurate and didn't read like a literal translation
>>
>>109003802
Google's moat is its infrastructure (and the fact that now other services like cloudflare gatekeep most of the internet against bots) but if they break their service too much, someone else can always take the lead in search. Seems unthinkable now, but those are things that happen if a company loses its way too much. The bigger question would be whether most people prefer regular search or those AI summaries.
>>
>>109004055
Yes, and GPT-image-2 or Nano-Banana-2 can likely make the translation fit on the pages in an acceptable way. It is still recommended to first have a better model translate and just use those for text insertion though.
>>
>>109004128
The tool already does text extraction and typesetting well enough, the only thing missing was decent translation. I was hoping LLMs would do better.
The tool probably can't feed images to the translation model though and japanese relies a lot on context so idk if it'd be good.
>>
>>109004151
>>109004162
To continue, most frontier LLMs have vision now. Some local ones do too. You'll want to give good instructions, but you don't even need to extract the text beforehand. If you ask a current LLM to OCR a full page of text, it would likely give subpar results, but the amount of text on a manga page is small enough that it should handle it fine.
>>
>>109004170
That's nice to know. The tool I mentioned (manga-image-translator) hasn't been updated in a year, so it's not likely to have gained any significant features since I last used it.
I would still prefer something fully automatic and local though so I guess I'll just wait lol.
>>
File: gemini.png (502.5 KB)
>>109004190
It is pretty automatic. Here it is with a random image from https://global.discourse-cdn.com/wanikanicommunity/original/4X/d/f/3/df3ad7cc55b6398b6c1b5da53580d946e734e0e7.jpeg
You'd have to build a pipeline and work on the prompt a bit so that it doesn't feel the need to preface with anything, etc., but it seems to work to me.
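A rough sketch of the prompt side of that, assuming you ask for one numbered line per bubble and then throw away anything that doesn't match. `PROMPT`, `BUBBLE_LINE` and `parse_reply` are made-up names for illustration, and the prompt wording is only a guess at what works, not something tested against any particular model.

```python
import re

# Hypothetical instruction text; tune the wording for whichever model you use.
PROMPT = (
    "Translate every speech bubble on this manga page from Japanese to English. "
    "Read panels and bubbles right to left, top to bottom. "
    "Output one line per bubble as '<number>: <translation>' and nothing else."
)

# A valid answer line is a bubble number, a colon, then the translated text.
BUBBLE_LINE = re.compile(r"^\s*(\d+)\s*:\s*(.+?)\s*$")


def parse_reply(reply: str) -> list[str]:
    """Keep only the numbered bubble lines, dropping any preface or sign-off."""
    bubbles = []
    for line in reply.splitlines():
        match = BUBBLE_LINE.match(line)
        if match:
            bubbles.append((int(match.group(1)), match.group(2)))
    # Return the translations sorted by bubble number.
    return [text for _, text in sorted(bubbles)]
```

Filtering the reply like this means a chatty "Sure! Here is the translation:" costs you nothing even when the prompt fails to suppress it.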
>>
File: random_image.jpg (184.1 KB)
>>109004190
>>109004224
Source image
>>
File: random_image_translation.png (1.5 MB)
>>109004190
>>109004224
Translation
>>
>>109004229
>>109004231
>見えないのなら | これではどうだ?
>If you cannot see it... How about *this*? | If you cannot see it
Perfect for gorgeous looks, can push asap.
>>
>>109004235
>>109004237
Might be. I don't speak moon runes. The image models are no good for translation I guess, so what was said in >>109004170 about having a smarter model translate first seems to stand. I still think they're neat for text insertion. The SHWACK and the DA might be dumb, but it's still cool.
>>
File: random_image_translation_two_steps.png (934.3 KB)
>>109004235
>>109004237
Here it is again asking the model to translate it first and only insert the translation after. Don't know if it's better or worse but it's different.
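The two-step flow could be wired up roughly like this. It's only a sketch: `two_step_translate`, `translate` and `typeset` are hypothetical names, and the two callables stand in for whatever text model and image model you actually call.

```python
from typing import Callable


def two_step_translate(
    page: bytes,
    translate: Callable[[bytes], list[str]],
    typeset: Callable[[bytes, list[str]], bytes],
) -> bytes:
    """Translate with one model first, then let another model only place the text."""
    # Step 1: the stronger model reads the page and returns the translated lines.
    lines = translate(page)
    if not lines:
        # Nothing to insert, so hand back the untouched page.
        return page
    # Step 2: the image model receives the finished lines and only does insertion.
    return typeset(page, lines)
```

Keeping the steps separate also lets you review or hand-fix the translated lines before anything gets drawn onto the page.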
>>
>>109004248
You can tell from >>109004237 that the translation doesn't make much sense. As a matter of fact it should be
>見えないのなら/If you cannot see it
>これではどうだ?/How about *this*?
the reason it fucks up here, I happen to know, is because it hates when comics are rtl not ltr
>>109004276
same fundamental fuck up: it does the lines ltr instead of rtl. I was trying to coerce gemma4 to give me the correct panel order for a manga page and it couldn't. A larger model should be able to do it but I can see it struggling here.
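One way around it is to not ask the model for the order at all and sort the bubbles yourself. A minimal sketch, assuming you already have bounding boxes from some detector; `Box`, `manga_order` and `row_tolerance` are made-up names, and the row grouping is a crude heuristic that will break on tall panels spanning several rows.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    label: str
    x: float  # left edge
    y: float  # top edge
    w: float  # width
    h: float  # height


def manga_order(boxes: list[Box], row_tolerance: float = 0.5) -> list[Box]:
    """Group boxes into rows top to bottom, then read each row right to left."""
    rows: list[list[Box]] = []
    for box in sorted(boxes, key=lambda b: b.y):
        # Same row if the top edge starts within the upper part of the row's first box.
        if rows and box.y < rows[-1][0].y + row_tolerance * rows[-1][0].h:
            rows[-1].append(box)
        else:
            rows.append([box])
    ordered: list[Box] = []
    for row in rows:
        # Right to left: the box with the rightmost edge comes first.
        ordered.extend(sorted(row, key=lambda b: b.x + b.w, reverse=True))
    return ordered
```

You would then number the crops in that order before sending them off, so the model never has to guess the reading direction.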
>>
>>109004055
https://github.com/zyddnys/manga-image-translator
This one? I could never get it working myself.
>>
>>109003382
even with bullshit models you can get pretty ok translations, much, much, much better than something like tesseract. ive translated a couple of german and french academic texts with gemmatranslate:12b and they seem pretty good but still require a lot of manual review