Broken Scanner

I have been having technical issues with my scanner recently, mostly since I upgraded my OS. There’s an issue with the driver, and the short of it is that I can’t scan a picture and save the result. This means that I’m not going to be able to show you any paper drawings for the time being. I can still show you things I drew on the computer, but we’ll see how it goes.

Internal Structures 2

I’m afraid that I didn’t say what I meant to say last week.

What I meant to write was an essay concerning the fact that neural net programs create pictures in a fundamentally different way from a human. While a human creates an image by first determining the overall shape of the subjects and then filling in the details, the programs create an image by calculating which pixels are likely to be surrounded by which other pixels, essentially creating the details first. This leaves the program very good at imitating photos, but rather poor at simulating highly stylized art, such as the visual novel character designs that EndlessVN hoped to emulate.

To get around this, I proposed a program that could use a hard-coded skeleton to define the overall shape of the human body, and then draw features over that skeleton. More fully, I was envisioning a program that could use this skeleton to draw the same character in several different art styles, with certain features being held constant, and possibly defined by the user. Hence, the same process that could draw a character in the style of a visual novel could also draw the same character as something from an American comic book, or a Victorian painting.

I also proposed breaking up designing characters and posing them into two separate processes, both of them using the skeleton to keep things consistent between the two. In the same way a character can be made recognizable between art styles, a similar process would be used to keep different images in the same style consistent with each other. I suppose what I’m proposing here is something like the animation industry’s style sheets, and I think that anything like what I’m talking about could be used for creating animations as well as drawings, as long as the processing power is available.

I then attempted to extend the idea of internal structures, such as the skeleton used for drawing, to GPT. The main thing I was thinking of was hard-coding the concept of characters into the software. Keeping characters straight is a pain when working with GPT, so much so that Novel AI started adding descriptions of characters to the training data, so that users could use the same formatting the team used to get the program to keep the details straight. There was also a discussion on NAI’s Reddit about creating a beneath-the-surface record of who said what, rather than having everything be right there in the text.

Regardless, I don’t actually care much about any of this. What I actually want right now is a website where I can upload something I drew and have it redrawn into something good.

Internal Structures

When the page loads, you immediately see a drawing of a girl. She has silver hair, and is wearing something that calls to mind a Japanese serafuku. Or, rather, your brain starts to interpret it as a schoolgirl uniform, until you catch up to your eyes and realize that the flesh-colored blotches on her chest cannot be hands. In fact, she doesn’t have hands; her arms simply fuse into each other, leaving you wondering why a service that bills itself as ‘AI’ would use a picture that makes it so very obvious that its program has no understanding of human anatomy.

This is the experience of opening the webpage for EndlessVN, a service that seeks to do for visual novels what Novel AI does for literature. Interestingly, while most services would stick to one neural net program, EVN seeks to recreate the experience of a visual novel by stapling several programs together, having different programs handle the text, the pictures, and the music. I’ve tried out the free version, in case you’re wondering, but it was too slow for me to do anything. Before I talk about EndlessVN itself, though, I want to talk about picture generators.

Picture generators are much like text predictors, in that they both allow the user to probabilistically generate a kind of data. But while text predictors spit out words, picture generators fill pixels with colors, using the colors of the pixels around each one to determine the RGB value of each individual pixel.

A quirk of this method is that, while a human is good at creating an impression of a person using nothing but lines and space and would be hard pressed to create a photo-realistic face of someone who doesn’t exist, the program is the opposite. The program relies on the fact that a photo will have patterns of texture on a human’s skin, hair, and clothing to tell where a hard edge, like where the face ends in a picture and the wall behind it begins, would be. This isn’t possible when it’s imitating drawings, which are dominated by solid blocks of color, whether those colors are supposed to represent the foreground or the background.

But even putting aside current image generators’ difficulties replicating the anime style, I don’t think EndlessVN needs a block of pixels of a given size for all of its images. That works fine for backgrounds, but for the characters, I feel that you need a completely different paradigm.

The fundamental problem with getting a program to draw a character is that you want individual drawings to be consistent. If a character is blonde and has green eyes, you want them to be blonde and have green eyes in every picture. If a character is wearing clothing for a particular scene, you want them to wear the same clothing for the entire scene. And if you want a character to have a cowlick coming off the back of their head, you want every picture of them to have a cowlick coming off the back of their head.

In other words, there’s a difference between creating a design for a character, and making drawings of that character in various poses. I suspect that you would need different kinds of programs for each. The first would be concerned with generating variations around a set of attributes, such that not every blue-eyed beauty with freckles looks like every other blue-eyed beauty with freckles, and the second would focus on moving the body parts around, and giving the drawings some facade of emotion. I think that both of these would need some understanding of the internal structure of the human body, even if the end result looked like drawings, and I rather suspect that such an internal structure would need to be hard-coded.
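To make the split a little more concrete, here is a rough sketch of what I have in mind, written as plain Python. Everything in it is made up for illustration: the joint names, the CharacterDesign and Pose classes, and the render_request function are my own assumptions, and no actual drawing happens. It only shows the design and the pose living in separate records that share one hard-coded skeleton.

```python
from dataclasses import dataclass

# Hypothetical joint names; a real skeleton would need many more.
JOINTS = ("head", "neck", "shoulder_l", "shoulder_r", "hand_l", "hand_r", "hip")


@dataclass(frozen=True)
class CharacterDesign:
    """Attributes that must stay the same in every drawing of a character."""
    name: str
    hair_color: str
    eye_color: str
    features: tuple = ()  # e.g. ("freckles", "cowlick")


@dataclass
class Pose:
    """Where each joint of the hard-coded skeleton sits, in 2D canvas units."""
    joints: dict  # joint name -> (x, y)
    emotion: str = "neutral"

    def __post_init__(self):
        # Refuse any pose that leaves out part of the skeleton (no missing hands).
        missing = [j for j in JOINTS if j not in self.joints]
        if missing:
            raise ValueError(f"pose is missing joints: {missing}")


def render_request(design: CharacterDesign, pose: Pose, style: str) -> dict:
    """Bundle what a (hypothetical) drawing program would be handed."""
    return {
        "design": design,
        "skeleton": dict(pose.joints),
        "emotion": pose.emotion,
        "style": style,
    }


# Same character, same pose, two art styles.
heroine = CharacterDesign("Example", "blonde", "green", ("cowlick",))
standing = Pose({j: (0.0, float(i)) for i, j in enumerate(JOINTS)})
a = render_request(heroine, standing, "visual novel")
b = render_request(heroine, standing, "American comic book")
```

The design is frozen so that nothing downstream can quietly change the hair color, and the pose is checked against the skeleton so that a character can’t be drawn without hands in the first place.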

But hard-coding internal structures into neural nets isn’t an idea that’s limited to pictures. As it stands, GPT doesn’t really know when someone is talking, just when words are between quotation marks. If it were possible to hard-code some idea of what a character is into it, and to allow it to create personalities in the same way as our first program above, it would bring us so much closer to the dream of a program that can simulate an entire, arbitrary world.

Some GPT services have already started to put character profiles right in the training data, so that when the user goes to describe someone in the form of [name: / appearance: / personality:…], the program has something to latch onto. Even so, text predictors still have difficulty keeping characters straight. And as with understanding the word ‘not’, I suspect that this is for mechanical reasons that no amount of training data can actually overcome.
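For illustration, a profile in that shape could be built with a few lines of Python. This is only a sketch: format_profile is a name I made up, the bracket-and-slash layout is copied from my shorthand above rather than from any service’s documentation, and it covers only the three fields I named.

```python
def format_profile(name: str, appearance: str, personality: str) -> str:
    """Render a character as one bracketed line to put ahead of the story text."""
    parts = [
        f"name: {name}",
        f"appearance: {appearance}",
        f"personality: {personality}",
    ]
    return "[" + " / ".join(parts) + "]"


profile = format_profile("Alice", "blonde, green eyes", "stubborn")
# The profile goes first, then the story the predictor is asked to continue.
prompt = profile + "\n" + "Alice opened the door."
print(profile)
# [name: Alice / appearance: blonde, green eyes / personality: stubborn]
```

Note that this is still just text in the prompt. Nothing here gives the program a real record of who the character is, which is the part I suspect would need to be hard-coded.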

Poetry: To Hell with You

Another round in our battle
is about to begin
And I can see it now
that neither of us will win
But you’re on one side
and to the other I hew
So all the same
to hell with you

Our war started before we were born
and it will continue after we die
No one can make sense of our causes
no matter how hard they try
I spent my life on a conflict
that I’ve started to rue
But all the same
to hell with you

The whole world bears our scars
We’ve been hurting, near and far
Britain, France, and America, too
Both sides are completely hollow
We can do nothing but follow
But even so, to hell with you.