The World's Top, Highest-Tier Image Generation AI (Stable Diffusion 3 8B+) Was Incredible

I'm D̷ELL, External Advocate at Stability AI Japan, the developer of Stable Diffusion.
We recently held a hands-on session for Stable Image Ultra, the top-tier service of the Stable Image API, which runs the 8-billion-parameter (8B) Stable Diffusion 3, the most capable Stable Diffusion model.
This report summarizes how it went.


Overview

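  • Stability AI released the Stable Diffusion 3 2B model, which took the world by storm
  • The Stability AI API offers the top-tier model, Stable Diffusion 3 8B
  • To let people experience its performance, we had a range of users try it, and the results were impressive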

Background

Stability AI recently announced the long-awaited Stable Diffusion 3 Medium (a 2B model), and it drew a great deal of attention. But did you know that the Stability AI API offers the even higher-tier Large and Ultra (8B models)?

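The recently released Stable Diffusion 3 Medium is a 2B model, whereas Stable Image Large is an 8B model with four times as many parameters. Stable Image Ultra is a tuned version of the 8B Large with further improved performance, making it, in name and in fact, the world's top, highest-tier image generation AI.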

Stable Image Ultra is introduced as follows.

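Our most advanced text-to-image generation service, Stable Image Ultra creates the highest-quality images with unprecedented prompt understanding. Ultra excels at typography, complex compositions, dynamic lighting, vibrant hues, and the overall cohesion and structure of an art piece. Made from the most advanced models, including Stable Diffusion 3, Ultra offers the best of the Stable Diffusion ecosystem.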

In other words, it is an API tuned to draw out the full performance of Stable Diffusion 3 8B.

To let people experience that performance, I recently published a Google Colab notebook. You can find it here.

Short URL to the Google Colab notebook
https://j.aicu.ai/SD3UC
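
For readers who would like to see the rough shape of a call before opening the notebook, here is a minimal sketch. It is not taken from the notebook above. It assumes the publicly documented v2beta REST endpoint for Stable Image Ultra and its `prompt`, `negative_prompt`, `aspect_ratio`, and `output_format` form fields; the helper names and the `STABILITY_API_KEY` environment variable are hypothetical, so please check the official API reference before relying on any of it.

```python
import os

# Assumed endpoint for Stable Image Ultra (v2beta REST API); verify against the official reference.
ULTRA_ENDPOINT = "https://api.stability.ai/v2beta/stable-image/generate/ultra"


def build_ultra_request(prompt, api_key, negative_prompt=None,
                        aspect_ratio="1:1", output_format="png"):
    """Assemble the URL, headers, and form fields for one text-to-image call (no network I/O)."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    headers = {
        "authorization": f"Bearer {api_key}",
        "accept": "image/*",  # ask for raw image bytes rather than JSON
    }
    fields = {
        "prompt": prompt,
        "aspect_ratio": aspect_ratio,
        "output_format": output_format,
    }
    if negative_prompt:
        # Only sent when the caller provides one
        fields["negative_prompt"] = negative_prompt
    return {"url": ULTRA_ENDPOINT, "headers": headers, "fields": fields}


def generate_image(prompt, out_path, **kwargs):
    """Send the request and save the returned image. Needs the third-party `requests` package."""
    import requests  # imported lazily so build_ultra_request works without it

    api_key = os.environ["STABILITY_API_KEY"]  # hypothetical environment variable name
    req = build_ultra_request(prompt, api_key, **kwargs)
    # The dummy `files` entry makes requests encode the body as multipart/form-data
    response = requests.post(req["url"], headers=req["headers"],
                             files={"none": ""}, data=req["fields"], timeout=120)
    response.raise_for_status()
    with open(out_path, "wb") as f:
        f.write(response.content)
    return out_path
```

With an API key set, something like `generate_image("A surreal landscape with a giant floating crystal in the sky.", "crystal.png")` would save one image. Note that each call consumes paid credits.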

That said, jumping straight into a paid API may feel like a bit of a hurdle. So this time I asked people I know well to try Stable Image Ultra, the API's top-tier model, and to share their impressions.

Gallery of generated images

The participants shared their generated images, prompts, and comments.
They produced a wide variety of images, so please enjoy.

Many thanks to the Discord server AI声づくり技術研究会 for their cooperation.


うんわ-san

Comment: "Even without much prompt engineering, very high-quality images popped right out, and I felt a tremendous leap forward."

A soft, plush toy shaped like a smiling face with two round black eyes and a simple curved smile. The toy is light purple and appears to be made of a soft, fuzzy material. It is positioned on a blue quilted surface with a light gray background, cute, kawaii, close-up shot, high detail.

a breathtaking underwater photo of a hand underwater touching the surface to create a ripple of bright abstract eye galaxy nebula vortex of beauty and nature, sunlight and chaos

robot girl, android,hanging,female, robot_torso,mechanical parts, cable, masterpiece, in a futuristic robotics lab, deactivated, wires, highly detailed, dynamic lighting, pale skin

aurora, milky way, night, night sky, shooting star, space, starry sky, galaxy, sky, city lights, constellation, light particles, skyscraper, cityscape, a girl, long hair, skyline, city, standing, twilight, looking at viewer, yellow eye


an image of a World War II battle scene. Include soldiers in era-specific uniforms, trenches, barbed wire, and debris. Show infantry, tanks, and military vehicles with smoke and fire. Add an overcast sky for a grim atmosphere. Use a muted, gritty color palette.


へむろっく-san

Comment: "Give it a try. It'll blow you away!"


Girls who play games on a gaming PC with multiple monitors at home, willing, aged 20


Girl taking a selfie with her smartphone in the mirror at home, Young girl dressed in black gothic Lolita fashion, kawaii, anime,


うみせ-san

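Comment: "Using SD3 was fun because it generated better results than the recently released Medium. I've tried SD1, SD2, XL, and Cascade so far, and SD3 feels like it has nicely incorporated the best parts of each. Prompt adherence and output quality were both very good, and the overall experience was excellent. Ultra is still available only through the API, but I'd encourage anyone who is comfortable with APIs to give it a try."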


At dusk, in a polished, beautiful fantasy city where light and darkness intersect, god rays rain down from high in the sky, illuminating the city.


A surreal landscape with a giant floating crystal in the sky.


game screen shot of Open world game with a character in a forest, with game hud


a concept pixel art of star night, sky full of stars, a person standing on a hill, looking at the sky, japanese anime style, 16bit, Title logo write 「hello world」


a concept art of Dark soul style weapons, setting sheet,


1girl, solo, cyberpunk, barcode, black footwear, black jacket, black skirt, boots, braid, brown hair, building, car, character name, crosswalk, full body, green eyes, hand in pocket, high heel boots, high heels, holding, holding umbrella, jacket, long hair, long skirt, motor vehicle, phone, pink umbrella, road, road sign, sign, single braid, skirt, smile, standing, twin braids, umbrella


Glittering neon signs and flying cars are reflected in the dark, stagnant river. Skyscrapers built high in the sky, cyberpunk city, cyberpunk


In a vibrant 1990s-style anime illustration, a young girl strikes a fashionable model pose in the heart of a bustling city. She embodies the essence of cyberpunk, dressed in the latest streetwear trends that blend futuristic elements with retro flair. Her outfit features a sleek jacket with neon accents, high-waisted pants, and chunky sneakers, all glowing under the city’s neon lights. She wears stylish sunglasses, reflecting the colorful, electric atmosphere around her. Her confident stance and playful expression capture the spirit of a fashion icon, seamlessly merging the past's nostalgia with the future's edgy vibe. The background is a lively urban scene, filled with towering skyscrapers, bright billboards, and bustling crowds, perfectly encapsulating the dynamic energy of a cyberpunk metropolis.


yuto-san

Comment: "They say future technology only counts as future technology if you use it now. Stable Image API Ultra is future technology you can use today!!"


realistic, natural light, photo, long hair, portrait, asian and caucasin mixed girl, beautiful model, white shirt, having card "Yes I am"


realistic, natural light, photo, long hair, portrait, asian and caucasin mixed girl, beautiful model, white shirt,


uthree-san

Comment: "All I can say is that the quality is insanely high."



a girl falling in the sky, smile, starry night, white hair, anime, vibrant, high quality,


A detailed anime-style character design, featuring a young girl with long flowing purple hair and bright blue eyes. She is wearing a stylish futuristic outfit with intricate details, including a metallic silver jacket, a neon blue skirt, and knee-high boots. Her expression is cheerful, and she is standing in a vibrant, colorful cityscape with tall buildings and neon signs in the background. The sky is stunningly beautiful, with a gradient of colors from deep blue to vibrant pink, adorned with fluffy white clouds and a glowing sunset. The lighting is dynamic, with a mix of natural and artificial light, giving the scene a lively and energetic atmosphere. The overall style is highly detailed, with a focus on capturing the unique elements of anime art and the breathtaking beauty of the sky.


1girl, solo, cyberpunk, barcode, black footwear, black jacket, black skirt, boots, braid, brown hair, building, car, character name, crosswalk, full body, green eyes, hand in pocket, high heel boots, high heels, holding, holding umbrella, jacket, long hair, long skirt, motor vehicle, phone, pink umbrella, road, road sign, sign, single braid, skirt, smile, standing, twin braids, umbrella


A girl, starry night, anime, vibrant, high quality, pixel art


雫-san

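Comment: "It had been a while since I last used an image generation AI, but prompts seemed to take effect much more readily than before. I had a great time. Thank you for this valuable opportunity!"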


Black long hair, Anime, kawaii, 1girl, black eyes, headphone, white clothing, looking down at viewer, standing, building, city, frombelow, upper body, side shot



Anime, Kawaii, ilustrated, 1 girl, purple long hair, crimson eyes, sunset, building, city, Aurora front view


In an illustration style, Kawaii and animated, it evokes the interior of a Gothic cathedral, with red ambient lighting showing large stained glass windows on either side, and rays of light in the center illuminating the dust in the air, creating a mysterious atmosphere. And a girl with black wings and red eyes floats in the center, looking at us


flyfront-san

Comment: "Even long natural-language prompts are properly reflected in the illustration. Nice!"


illustrated in an anime style with the focus on the upper body, from a slightly angled front view. A Japanese woman wearing a lace trimmed blue evening dress off shoulder style is sitting at the counter of a luxury hotel's top-floor bar. The dress shows a collarbone and the feminine curves of her body. She is wearing a jewely necklace and has her silver hair up and red eyes. With a melancholic expression, she gazes out at the fog and rainy skyscraper cityscape through the window. The woman is holding an envelope in her hand, with the word "Invitation" written on it. The bar is elegantly decorated with dim lighting, cocktail glass on counter table, creating a sophisticated and intimate atmosphere. The city lights and rain outside the window create a reflective and moody ambiance. nega:behind, v-neckline, nsfw


kawaii anime style. A medieval girl with blonde hair is swinging a katana toward front with the katana's blade gleaming in the light. She is dressed in traditional European white armor with intricate patterns and details, wearing frilled skirt. Her expression is determined and focused. The background is a blend of a serene landscape, featuring flowers in full bloom, and an ancient cathedral.photorealism, cartoon, samurai, cherry blossoms,


焼肉Yakiinku-san

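Comment: "The way prompts were reflected felt very natural! It turns the images and scenes in my head into pictures with considerable accuracy (and generation is really fast, too), so beyond 'Wow!' it's more like 'This is so much fun!', and I lose track of time. Thank you for a fun event!"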


Anime, kawaii, depth of field, thick fog, smoke, kisser, cigarette, red and white, monotone, petals fused with body, flowers, glamour, Chinese dress, empty eyes, morbid, hair in a bun, long hair, clock tower, crack in space-time,


Anime, kawaii, girl, solo, depth of field, waves, flat colour, best image quality, symmetrical face, summer, water on dress, water droplets, specular reflection, refracted glass shards, prism, moon celestial body, liquid clothing, long yellow dress, harmony,


Anime, kawaii,depth of field, thick fog, Full smiles, happiness, hope, white wedding dress, church, disquiet, grey world, bursts of blood, despair, cracks in the world, slaughter, incident, stillness,


Anime, kawaii, fantasy, Arabian Nights, lamp witch, dragon, fun, flying, magic carpet, light shards, adventure, boy, girl


代屋モント-san

A scene where a giant octopus-like monster and a fighting humanoid robot shoot pile bunkers into the octopus.


An androgynous elementary school boy with a dark atmosphere wearing a gothic dress


An anime style of a hero wearing a tiger mask standing on a telephone pole.


(((((((anime))))))) depth of field, wave at the edge of dress, masterpiece, flat color, best quality, BRAKE. ((kawaii)), perfect symmetrical face,summer,wave, ((colorful refraction)), ((beautiful detailed sky)), ((dark intense shadows)), ((cinematic lighting)), ((overexposure)), water on the dress, (water sea red dress blending with sea), from side,beautiful detailed glow, ,detailed lighting, detailed water,(beautiful detailed eyes),(smile), standing in the ocean, detailed wet clothes, partially submerged, Refracting glass fragments, prisms, lunar celestial nature, BRAKE. (liquid clothes:1.2) ,a girl solo: {dress<wave>, {{dissolving dress}},A dress in harmony with the sea,dress floating into sea}, (beautiful detailed girl) (long dress blending with ocean), (yellow long dress:1.5), small breasts, skinny 【Negative】 blur, lowres, bad anatomy, bad hands, text error, missing fingers, extra digits, fewer digits, cropped, worst quality, low quality, standard quality, jpeg artifacts, signature, watermark, username, blurry, glow, slippage, blur, bokeh, pink, multiple views,large breasts, large breasts, medium breasts, huge breasts, enormous breasts ,Hair that doesn't fit into the illustration, blush, flat chest q_version, nsfw, lowres, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username, blurry, symmetry, outline(painting),cartoons,sketch,(worst quality:2),(backlight:1.2),bad anatomy,bad hands,double navel,collapsed eyeshadow,multiple eyebrows,freckles,signature,logo,2faces,((3fingers:1.2)),((4fingers:1.2)),((6fingers:1.2)),(laugh line:1.2),


マッキー-san

kawaii, anime, 1girl, solo, Very beautiful glowing skin., blue eyes, long hair, gray hair, elf, Huge breasts, looking at viewer, upper body, camisole, absurdres, highres,Detailed background,Outdoor Lakeside

Create a high-resolution, upper-body image of a cute anime girl with blue eyes and long, flowing gray hair. She has tanned skin and is an elf with delicately pointed ears. She is smiling warmly and looking directly at the viewer, giving a friendly and inviting expression. She is wearing a light, pastel-colored camisole that complements her complexion. Her hair cascades gracefully around her shoulders, with a few loose strands framing her face. The background should depict a sunny lakeside scene with clear blue skies, a sparkling lake, and lush greenery. Ensure the background is detailed yet softly blurred to keep the focus on the character. The style should be kawaii and highly detailed, capturing the charming and whimsical essence of anime art. Ensure the image is high quality and high resolution, with careful attention to the character's features and expression.

In their own words

https://twitter.com/mckey_draw/status/1801990763578093651

Working with ChatGPT

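Many participants said they had ChatGPT write their prompts.
Apparently there are GPTs that can generate Stable Diffusion prompts, so they are worth a look. Stable Diffusion 3 also handles natural-language prompts, so plain English sentences can produce high-quality images.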

Even if you are thinking "I have no idea what incantation to use...", you can comfortably generate images like the ones in this article.

Summary

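What did you think? I hope you could see that a single API, with no need to specify a style, generated a wide variety of images. The participants ranged from image generation veterans to beginners, and the feedback was very positive: they were able to get the images they wanted.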

I have summarized how to use the API in the article below.
Please take this opportunity to try the world's top, highest-tier image generation AI.

My sincere thanks to everyone who took part.
Thank you for reading to the end.

This post was contributed to the AICU media editorial team, based on the original article here.
Contributed on June 16, 2024

Thank you to everyone who cooperated.

Thanks also to D̷ELL, External Advocate at Stability AI Japan.
We are grateful to all the creators who keep expanding the exploration of Stable Diffusion.