ComfyUI: Yolo World, Inpainting, Outpainting (Workflow Tutorial)

May 25, 2024
20,589 views

This tutorial focuses on Yolo World segmentation and advanced inpainting and outpainting techniques in ComfyUI. It covers 7 workflows, including Yolo World instance segmentation, color grading, image processing, object/subject removal using LaMa / MAT, inpainting with refinement, and outpainting.
------------------------
JSON File (YouTube Membership): www.youtube.com/@controlaltai...
Yolo World Efficient Sam S CPU/GPU Jit Download: huggingface.co/camenduru/Yolo...
Fooocus Inpaint Model Download: huggingface.co/lllyasviel/foo...
LaMa Model Download: github.com/Sanster/models/rel...
MAT Model Download: github.com/Sanster/models/rel...
Inference Install (ComfyUI Portable Instructions):
Go to the python_embeded folder. Right click and open a terminal.
Command to Install Inference:
python -m pip install inference==0.9.13
python -m pip install inference-gpu==0.9.13
Command to Uninstall Inference:
python -m pip uninstall inference
python -m pip uninstall inference-gpu
Command to Upgrade Inference to the Latest Version (not compatible with Yolo World):
python -m pip install --upgrade inference
python -m pip install --upgrade inference-gpu
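Command to Check the Installed Inference Version (optional; useful for confirming the 0.9.13 pin before launching ComfyUI):
python -m pip show inference
python -m pip show inference-gpu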
------------------------
Timestamps:
0:00 Intro.
01:01 Requirements.
03:21 Yolo World Installation.
04:32 Models, Files Download.
05:09 Yolo World, Efficient SAM.
12:50 Image Processing, Color Grading.
19:09 Fooocus Inpaint Patch.
21:56 Inpaint & Refinement.
24:17 Subject, Object Removal.
31:58 Combine Workflow.
33:17 Outpainting.

Comments
  • Crazy. Time for me to put my ComfyUI cape back on. Great video!!!

    @AI3EFX • 2 months ago
  • Excellent video. I want to use ComfyUI as much as I can, but inpainting and outpainting have worked better for me in other UIs; hopefully this will help. I also only just realised you can zoom in and out of the canvas in the Mask Editor, from watching you do it while fixing the edge of the mask after adding the polar bear lol.

    @runebinder • 2 months ago
  • i really love your videos, you really explain it very well!

    @LuckRenewal • 2 months ago
  • Brilliant explanations. Thanks for making this video, it is so useful, and you have a great mastery of the subject.

    @brgresearch • 2 months ago
    • My only gripe, as I'm replicating these workflows, is that perhaps the seed numbers you use could be simpler to replicate, or perhaps pasted in the description? That way we could easily get the exact same generations that you did. Right now, not only is the seed long and complicated, but it's not always clear, like in the case of the bear on the street, seed 770669668085503, which even on a 2K monitor (the easiest frame I could find was at 22:16) was really hard to make out due to the 6's looking like 8's. Still replicable, but for ease of following along, an easier seed would be helpful. Thank you again for making this, I'm halfway through replicating the workflows and I'm beginning to understand!

      @brgresearch • 2 months ago
    • @brgresearch The seed number I used is random. Don't use the same seed: it's not CPU generated, so it will still give different results if you are not using the same GPU as mine. Use any random seed and keep randomising it. You are supposed to replicate the workflow's intent rather than the precise output. Meaning, the workflow is supposed to do x with y output; at your end it should still do x with z output. I hope that makes sense.

      @controlaltai • 2 months ago
    • Also, if you need the seed for any video, just send an email or comment on the video and I will post it for you. I prefer not to post it in the description, as someone without a 4090 will get a different output.

      @controlaltai • 2 months ago
    • @@controlaltai thank you for the clarification. I did not know that the hardware will also affect the generation. My thought was to try to follow along as exactly as possible, so that I would get the same results and be able to make the changes you made in a similar manner, especially with the seam correction example, because I did not want to get a different bear! I completely understand that it's okay to get a z output, even if yours is y, as long as the workflow process arrives at the same type of result. I'm practicing with the workflow today, and it's really amazing what can be accomplished with this workflow. Thank you so much again, and really appreciate the work and education you are doing.

      @brgresearch • 2 months ago
  • Great. Very detailed.

    @user-cb1dm8cy1s • 2 months ago
  • Very well explained! ❤

    @freke80 • 2 months ago
    • Thank you!!

      @controlaltai • 2 months ago
  • This is so high level 😮

    @rsunghun • 2 months ago
  • amazing as always, thanks for all details Mali~

    @ericruffy2124 • 2 months ago
  • Results are amazing. But the learning curve to understand (and not just copy/paste) all these workflows seems like a very long journey... Nevertheless, I subscribed immediately 😵‍💫

    @oliviertorres8001 • 2 months ago
    • I had the same reaction. I can't imagine how these workflows were first created, but I'm grateful that eventually, because of these examples, I might understand it.

      @brgresearch • 2 months ago
    • If you put the time in, you will understand. Also, I suggest finding a specific use case. In other words "Why am I in this space, what do I want the AI to help me create?" For me, it was consistent characters, so learning masking and inpainting is great for me so I can ensure likeness and improve my training dataset.

      @blacksage81 • 2 months ago
    • @blacksage81 For me, it's to build my own workflow to sell virtual home staging upfront to my clients. I'm a real estate photographer. Of course, it's worth it to struggle a little bit to nail inpainting at a high level of skill 🧗

      @oliviertorres8001 • 2 months ago
    • @@blacksage81 this is really good advice. For me, I'm trying to create a filter tool for photos with controlnet and to be able to do minor photo repairs using masking and inpainting. ComfyUI is such a flexible tool in that regard, but at the same time, it's amazing to see how some of the workflows are created.

      @brgresearch • 2 months ago
  • Dhanyavad (thank you)

    @sanchitwadehra • 2 months ago
  • The good news is that the ComfyUI Yolo World plugin is great. The bad news is that the author of this plugin has made many plugins and never maintains them.

    @webu509 • 2 months ago
    • That's my least favorite thing about self-run image workflows.

      @haljordan1575 • 1 month ago
  • Thanks for the comment the other day. I had deleted my post already before I saw it so unfortunately the tip wasn't left for others (because of my deletion). The tip (leaving for others here) was to delete the specific custom node folder if you have problems loading an addon - in certain cases anyways. I had an idea for NN model decoders. The idea is simple. It's to pass in a portion of the image that's pre-rendered and that you want unchanged in the final image. So, the decoder would basically do it's magic right after the noise is generated. So, right on top of the noise, the decoder overlays the image you want included (transparent in most cases). It can have some functionality in the NN decoder for shading your image clips - both lighting applied to it as well as shadows. This might even need a new adapter "Type" - but I just haven't gotten deep enough into it yet (sorry if you're reading this as I keep correcting it, it's like 4:48 am... - it's pretty bad writing...) If you all have direct contacts with those at stability ai, you might reach out and suggest something regarding including pre-renders directly into noise at the beginning of the denoise process.

    @jeffg4686 • 2 months ago
  • thank you, great

    @francaleu7777 • 2 months ago
  • Wow, awesome results. Which resolution are the images? Would this work on 4k images as well or would it be necessary to downscale or crop the inpaint region first?

    @geraldwiesinger630 • 2 months ago
    • It's advisable to downscale to near SDXL resolution, then upscale using ComfyUI or Topaz.

      @controlaltai • 2 months ago
  • Awesome

    @liwang-pp7dj • 2 months ago
  • If you have a missing .jit file error, go to SkalskiP's huggingface and find the jit files there. Place in your custom nodes > efficient sam yolo world folder.
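    (For reference, assuming the portable install paths used elsewhere in this thread, the two files end up in the root of the custom node folder, roughly:)
    ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-YoloWorld-EfficientSAM\efficient_sam_s_cpu.jit
    ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-YoloWorld-EfficientSAM\efficient_sam_s_gpu.jit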

    @croxyg6 • 1 month ago
  • Quick question: if I need to add LoRAs to the workflow, should they come before the Self-Attention Guidance + Differential Diffusion nodes or after? Does it make a difference?

    @mikrodizels • 1 month ago
    • I add the LoRA after Self-Attention Guidance and Differential Diffusion. To be honest, I have not tested it in any other order.

      @controlaltai • 1 month ago
  • why are the model files for inpainting not in safetensors format?

    @root6572 • 3 days ago
  • Thank you for the always great lectures. I am leaving a message because I have a question. If I uncheck both mask_combined and mask_extracted in Yoloworld ESAM and run it, I get the error "Error occurred when executing PreviewImage: Cannot handle this data type: (1, 1, 15), |u1". Is there a solution? They run fine when checked separately, but running with both turned off produces the error.

    @moviecartoonworld4459 • 2 months ago
    • Thank you! So basically, if you pass the mask on to another node, that node cannot handle multiple masks. If Yolo, for example, detects more than one mask, you get this error when passing it on. For that, you should select an extracted mask value or combine the masks. Only a singular mask image should be passed on. If you are getting the error without passing it on, then let me know, something else is wrong, as I double-checked now and I don't get that error.
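      (For illustration only - a minimal sketch of what "combine" vs. "extract" means here, assuming the detections come out as a stack of binary masks in a torch tensor of shape [N, H, W]; the actual node implementation may differ.)
      import torch

      # pretend Yolo World returned 3 binary masks for a 512x512 image
      masks = torch.rand(3, 512, 512) > 0.5
      # "combined": union of all detections -> a single [1, 512, 512] mask
      combined = masks.any(dim=0, keepdim=True).float()
      # "extracted": pick one detection by index instead
      extracted = masks[1:2].float()
      print(combined.shape, extracted.shape)  # both are single-mask tensors a downstream node can handle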

      @controlaltai • 2 months ago
    • Thank you for the answer!! @controlaltai

      @moviecartoonworld4459 • 2 months ago
  • Thanks for this incredible tutorial. I have a problem: I want to use the images which come from Yoloworld ESAM, but there are box and text overlays. How can I remove them?

    @ucyuzaltms9324 • 16 days ago
    • Yolo World doesn't do anything but segment the image. You do the further processing by removing objects, and use those images. Why would you want to use the images from Yoloworld ESAM directly?

      @controlaltai • 15 days ago
  • Hello 👋 Can I use the replace method with refinement to inpaint a face, to give a woman shorter hair? I tried it and it looked bad, with blurred hair and a masking line.

    @user-rk3wy7bz8h • 1 month ago
    • Yeah, do one at a time. Don't mask the entire face; just from above the eyebrows down to the chin, keeping the mask within the facial borders. The masking line has to be refined. Why it is blurred I have no idea; that depends on your ksampling and checkpoint. I suggest you look at the face detailer for hair. It has an automated workflow specifically for hair. kzhead.info/sun/ktmaf5uOhqhpeXk/bejne.htmlsi=b_6LSljm0SjLYXvq

      @controlaltai • 1 month ago
    • I appreciate your very fast answer, thank you a lot, I will take your advice. ❤

      @user-rk3wy7bz8h • 1 month ago
  • I checked the link in the description for the Yolo World Efficient Sam S CPU/GPU Jit, and the model there is marked as unsafe by HuggingFace... where can I download it from?

    @martintmv • 1 month ago
    • Please recheck: the .jit files are safe. The other file is the one marked unsafe, yolow-v8_l_clipv2_frozen_t2iv2_bn_o365_goldg_pretrain.pth. You can download it from another source here: huggingface.co/spaces/yunyangx/EfficientSAM/tree/main

      @controlaltai • 1 month ago
    • @@controlaltai thanks

      @martintmv • 1 month ago
  • Does this only work with SDXL models? I only have tried outpainting for now, I want to outpaint my epicrealism_naturalSinRC1VAE created images, everything seems to work in the previews, but in the final image after going through the sampler, the outpainted area is just noise. I included the same Lora and custom VAE I used to previously generate my images into this workflow as well.

    @mikrodizels • 2 months ago
    • The fooocus patch only works with SDXL checkpoints.

      @controlaltai • 2 months ago
    • @@controlaltai Oh ok, got it. Is there an outpaint workflow, that would work like this for SD 1.5?

      @mikrodizels • 2 months ago
    • In Comfy, all you have to do is remove the Fooocus patch. However, you have seen the difference when applying Fooocus. I suggest you switch to any SDXL checkpoint; even a Turbo or Lightning one will give good results.

      @controlaltai • 2 months ago
    • @controlaltai Got it to work without Fooocus for 1.5, seamless outpaint, but the loss of quality with each queue (the image becomes more grainy and red) is unfortunately inescapable no matter what. You are correct, time to try the Lightning Juggernaut, cheers

      @mikrodizels • 2 months ago
  • I'm trying out the Yolo World Mask workflow, but I'm getting this error when I get to the first Mask to Image node: "Error occurred when executing MaskToImage: cannot reshape tensor of 0 elements into shape [-1, 1, 1, 0] because the unspecified dimension size -1 can be any value and is ambiguous". I haven't changed any of the settings, and I'm using a decent image without too much in it (res 1792 x 2304) and the prompt "shirt", which is showing in WD14. Not sure what settings I need to change. I have tried altering the confidence but that hasn't helped, and I've tried both the Yolo L & M models. Any ideas?

    @runebinder • 2 months ago
    • That error occurs when it cannot detect any segment. Try with confidence 0.01 and iou 0.50; if it still cannot detect anything, you need to check your inference version. When you launch Comfy, do you get a message in the command prompt that your inference is on a lower version (the latest version is 0.9.16)? If you get that warning, then all dependencies are good. If you don't get that warning, it means you are on the latest inference, on which this node does not work. The WD14 is not what Yolo sees; that's just there to help you, the two are unrelated. I put it there because I was testing some images with low resolution and I could not see the objects but the AI could. Let me know if 0.01 / iou 0.50 works or not.

      @controlaltai • 2 months ago
    • Thanks. I’ll check in a bit and let you know how I get on.

      @runebinder • 2 months ago
    • @controlaltai Copied everything out of the Command Prompt window into a Word doc so I could use Ctrl+F to search for "inference", and I get the warning that I'm on 0.9.13 and it asks me to update, so it looks good on that front. Tried the same image but used "face" as the prompt this time, as it's a portrait shot and I figured that would be easy for it to find, and it worked. Thanks for your help :)

      @runebinder • 2 months ago
  • How can I contact you for some workflow help?

    @saberkz • 2 months ago
  • First of all, thank you. The question now is which inpaint method is better: to use VAE Encode and then refine with Preview Bridge, or to work directly with VAE Encode & Inpaint Conditioning without any refinement? I want to know how to get the best results :) Appreciate it

    @user-rk3wy7bz8h • 1 month ago
    • Hi, so basically I recommend both. VAE Encode is for replacement and replaces way better than VAE Encode & Inpaint Conditioning. However, during extensive testing I found that in some cases the latter can also replace much better, but in many cases I had to keep regenerating with random seeds. I would go with the first method, then try the second, because the second is not always able to replace an object and its success depends on a variety of factors like the background, object mask, etc. For minor edits go with the second; for major edits like complete replacement, try the first method, then the second.

      @controlaltai • 1 month ago
  • When the workflow reaches the ESAM Model Loader, there is an error: "Error occurred when executing ESAM_ModelLoader_Zho: PytorchStreamReader failed reading zip archive: failed finding central directory"

    @hotmarcnet • 1 month ago
    • I have no idea what this error is. Is this on Comfy portable? Windows OS? Or a different environment?

      @controlaltai • 1 month ago
  • Hi! I'm trying to understand what's the point of preprocessing with Lama if the samplers then use a denoise of 1.0?

    @nkofr • 1 month ago
    • Hi, the sampler checkpoints are not trained to remove objects that well. LaMa is very well trained; however, it's not perfect. The point here is to use LaMa to accurately remove the subject/object and then use Fooocus inpainting to guide and fix the image to perfection.

      @controlaltai • 1 month ago
    • @@controlaltai Yes but my understanding was that setting denoise to 1.0 was like starting from scratch (not using anything from the denoised area), so if the denoise is set to 1 my understanding is that what Lama has done is completely ignored. No??

      @nkofr • 1 month ago
    • @nkofr Not really. We are using the Fooocus inpaint models with the inpaint conditioning method, not the VAE encode method. This method is basically for fine tuning, whereas VAE encode is for subject replacement. A denoise of 1 here is not the same as a denoise of 1 in general sampling; the value comparison is apples to oranges. The denoise value is also not a hard rule and depends on the distortion caused by the LaMa model. So no, a denoise of 1 will not undo the LaMa work; you can actually see in the workflow that it uses the base left by LaMa and reconstructs from that. The thing is, MAT and LaMa work on complicated images and the reconstruction they do is beautiful, but for such complexity we just need to fine tune it. Hence we use the fine tune method.

      @controlaltai • 1 month ago
    • @controlaltai Ok thanks, that makes sense! (What you call "fine tune" is the pass with Fooocus inpaint.) Have you heard about LaMa with a refiner? Any idea on how to activate the refiner for LaMa in ComfyUI? Where do you get all that knowledge from? :)

      @nkofr • 1 month ago
    • No idea on how to activate refiner for lama in comfyui at the moment.

      @controlaltai • 1 month ago
  • Is there a way to make the mask for the mammoth automatically? Like putting a mask where the woman was before, with x padding.

    @alvarocardenas8888 • 2 months ago
    • Yes, you can. Try the rectangular mask creation node from the Masquerade nodes. Use some math nodes to feed the size directly from the image source into the width and height inputs, and just define the x and y coordinates and the mask size.
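      (For illustration only - a minimal sketch of what such a rectangular mask amounts to, assuming masks are torch tensors; the values are hypothetical and would come from the math nodes in the workflow.)
      import torch

      h, w = 1024, 1024                       # image size, taken from the image source in the workflow
      x, y, mask_w, mask_h = 400, 300, 256, 256   # x/y coordinates and mask size you would define
      mask = torch.zeros(1, h, w)
      mask[:, y:y + mask_h, x:x + mask_w] = 1.0   # white rectangle = area to inpaint
      print(int(mask.sum()))                  # 65536 pixels selected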

      @controlaltai • 2 months ago
  • I need help please: I don't see the models in the Load Fooocus Inpaint node. I downloaded all 4 and placed them in models > inpaint models.

    @user-rk3wy7bz8h • 2 months ago
    • The location is ComfyUI_windows_portable\ComfyUI\models\inpaint and not ComfyUI_windows_portable\ComfyUI\models\inpaint\models. After putting the models there, close everything including the browser and restart.
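      (For reference, assuming the four files from the Fooocus inpaint link in the description; exact filenames may vary by version:)
      ComfyUI_windows_portable\ComfyUI\models\inpaint\fooocus_inpaint_head.pth
      ComfyUI_windows_portable\ComfyUI\models\inpaint\inpaint.fooocus.patch
      ComfyUI_windows_portable\ComfyUI\models\inpaint\inpaint_v25.fooocus.patch
      ComfyUI_windows_portable\ComfyUI\models\inpaint\inpaint_v26.fooocus.patch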

      @controlaltai • 2 months ago
    • @controlaltai Thank you. The problem has been solved after I renamed the folder to (inpaint) instead of inpaint models. I appreciate your fast answer ;) Keep going, I like you

      @user-rk3wy7bz8h • 2 months ago
  • The issue with this is when you are trying to inpaint pictures that are large, it cannot inpaint accurately at all. Were you able to figure out how to downscale just the masked region such that its max_width is 768 or 1024 so that it is able to inpaint effectively?

    @subashchandra9557 • 12 days ago
    • Downscaling only the mask area is possible. The workflow is different, however, and depending on the image it may or may not work. The inpainting works because the AI needs the surrounding pixel data, so depending on the mask you have to select enough surrounding pixel data to get correct results.
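      (Not from the video - just a rough sketch of the crop-and-resize idea, assuming the image is a [1, H, W, C] float tensor and the mask a binary [H, W] tensor; node packs offer equivalent crop/stitch nodes, this only illustrates keeping padded context around the mask.)
      import torch
      import torch.nn.functional as F

      def crop_around_mask(image, mask, pad=64, target=1024):
          # find the mask bounding box and expand it by `pad` so the sampler keeps surrounding context
          ys, xs = torch.nonzero(mask, as_tuple=True)
          y0 = max(int(ys.min()) - pad, 0)
          y1 = min(int(ys.max()) + pad, mask.shape[0])
          x0 = max(int(xs.min()) - pad, 0)
          x1 = min(int(xs.max()) + pad, mask.shape[1])
          crop = image[:, y0:y1, x0:x1, :].permute(0, 3, 1, 2)        # NHWC -> NCHW for resizing
          crop = F.interpolate(crop, size=(target, target), mode="bilinear")
          return crop.permute(0, 2, 3, 1), (y0, y1, x0, x1)           # keep coords to paste the result back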

      @controlaltai • 5 days ago
  • It gets stuck on Blur Masked Area. I see issues on GitHub but can't find a clear solution, something about the pytorch version :(

    @stepahinigor • 1 month ago
    • Yeah, I bypassed it by having the code run via GPU... I had to make modifications to the node.

      @controlaltai • 1 month ago
  • Hi, thanks for the video. Is this working for SD1.5?

    @Make_a_Splash • 2 months ago
    • The Fooocus inpaint patch is only for SDXL. Yolo World and color grading don't require any checkpoint.

      @controlaltai • 2 months ago
  • hey, what if my image is more than 4096 pixels for outpainting?

    @ankethajare9176 • 1 month ago
    • Hey, you may run into out-of-memory issues on consumer grade hardware. SDXL cannot handle that resolution. You can outpaint in smaller pixel increments and do more runs rather than going beyond a 1024 outpaint resolution.

      @controlaltai • 1 month ago
  • What if you wanted to replace an object with an existing one, or inpaint it?

    @haljordan1575 • 1 month ago
    • The tutorial covers that extensively. Please check the video.

      @controlaltai • 1 month ago
  • As you wrote, I downloaded the models, but in the node where we select (yolo_world/l), are they supposed to load by themselves? They don't, and I have this error when executing Yoloworld_ModelLoader_Zho: can't get attribute 'WorldModel'.

    @yklandares • 2 months ago
    • Yes, the dev has designed it so that the "yolo_world/l" models load automatically. However, you have to download the .jit files into the custom node folder's root directory. Otherwise you get an error for the models and they do not load automatically.

      @controlaltai • 2 months ago
    • Error occurred when executing Yoloworld_ModelLoader_Zho: Can't get attribute 'WorldModel' on File "G:\NEUROset\ComfyUIPort\ComfyUI\execution.py", line 151, in recursive_execute output_data, output_ui = get_output_data(obj, input_data_all) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "G:\NEUROset\ComfyUIPort\ComfyUI\execution.py", line 81, in get_output_data return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "G:\NEUROset\ComfyUIPort\ComfyUI\execution.py", line 74, in map_node_over_list results.append(getattr(obj, func)(**slice_dict(input_data_all, i))) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "G:\NEUROset\ComfyUIPort\ComfyUI\custom_nodes\ComfyUI-YoloWorld-EfficientSAM\YOLO_WORLD_EfficientSAM.py", line 70, in load_yolo_world_model YOLO_WORLD_MODEL = YOLOWorld(model_id=yolo_world_model) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "G:\NEUROset\ComfyUIPort\python_embeded\Lib\site-packages\inference\models\yolo_world\yolo_world.py", line 36, in __init__ self.model = YOLO(self.cache_file("yolo-world.pt")) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "G:\NEUROset\ComfyUIPort\python_embeded\Lib\site-packages\ultralytics\engine\model.py", line 95, in __init__ self._load(model, task) File "G:\NEUROset\ComfyUIPort\python_embeded\Lib\site-packages\ultralytics\engine\model.py", line 161, in _load self.model, self.ckpt = attempt_load_one_weight(weights) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "G:\NEUROset\ComfyUIPort\python_embeded\Lib\site-packages\ultralytics n\tasks.py", line 700, in attempt_load_one_weight ckpt, weight = torch_safe_load(weight) # load ckpt ^^^^^^^^^^^^^^^^^^^^^^^ File "G:\NEUROset\ComfyUIPort\python_embeded\Lib\site-packages\ultralytics n\tasks.py", line 634, in torch_safe_load return torch.load(file, map_location="cpu"), file # load ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "G:\NEUROset\ComfyUIPort\python_embeded\Lib\site-packages\torch\serialization.py", line 1026, in load return _load(opened_zipfile, ^^^^^^^^^^^^^^^^^^^^^ File "G:\NEUROset\ComfyUIPort\python_embeded\Lib\site-packages\torch\serialization.py", line 1438, in _load result = unpickler.load() ^^^^^^^^^^^^^^^^ File "G:\NEUROset\ComfyUIPort\python_embeded\Lib\site-packages\torch\serialization.py", line 1431, in find_class return super().find_class(mod_name, name) @@controlaltai

      @yklandares • 2 months ago
    • Make sure something is masked. Also ensure that when multiple objects are masked, only one is passed through to the next node. That can be done via mask combine or mask extracted (selection).

      @controlaltai • 1 month ago
    • I didn't sleep for two days and agonized over the process, and eventually placed the two .jit models, but not just in a folder - with the name yolo_world. @controlaltai

      @yklandares • 1 month ago
  • Hi, very informative video. I am getting this error while running it: "AttributeError: type object 'Detections' has no attribute 'from_inference'"

    @baseerfarooqui5897 • 1 month ago
    • Thank you! Is it detecting anything? Try a lower threshold.

      @controlaltai • 1 month ago
    • @@controlaltai already tried but nothing happened

      @baseerfarooqui5897 • 1 month ago
    • Check inference version.

      @controlaltai • 1 month ago
    • Can you please elaborate? Thanks @controlaltai

      @baseerfarooqui5897 • 1 month ago
  • The import failed and the log file says it can't find Supervision. How do I fix this, please?

    @neoz8413 • 2 months ago
    • Go to the ComfyUI python embedded folder, open a terminal and try: python -m pip install supervision. If that does not work, then try: python -m pip install inference==0.9.13

      @controlaltai • 2 months ago
  • Do you have any basic masking / compositing videos??

    @iangregory9569 • 2 months ago
    • Not yet. However, a basics series of 10 to 15 (maybe more) episodes will be on the channel, covering things part by part slowly and explaining every aspect of Comfy and Stable Diffusion. We just don't have an ETA on the first episode.....

      @controlaltai • 2 months ago
    • @controlaltai Sorry, I guess what I mean is: what "mask node" would I use to layer two images together, like in Photoshop, Fusion or AE? So a 3D-rendered apple with a separate alpha channel, comped onto a background of a table. There are so many mask nodes, I don't know which is the most straightforward to use for such a simple job, thanks

      @iangregory9569 • 2 months ago
    • It works differently here. Say an apple and a bg: you use a mask to select the apple, cut the apple, and then paste it on another bg. This can be done via the Masquerade nodes' cut by mask and paste by mask functions. To select the apple, manual is messy; you can use Grounding DINO, CLIPSeg or Yolo World. All three would suffice. In between you can add a grow mask node, feather mask, etc. to refine the mask and selection.

      @controlaltai • 2 months ago
    • Thank you! @controlaltai

      @iangregory9569 • 1 month ago
  • Don't ask me what kind of "yolo world" it is. As you wrote, I downloaded the models, but for yolo_world/l, do they need to go somewhere? In general, they are supposed to download themselves, but they don't. It gives this error when loading Yoloworld_ModelLoader_Zho: can't get 'WorldModel', in File "G:\NEUROset\ComfyUIPort\ComfyUI\execution.py", line 151, in recursive_execute output_data, output_ui = get_output_data(obj, input_data_all)

    @yklandares • 2 months ago
    • I cannot understand your question. Rephrase.

      @controlaltai • 2 months ago
    • As you wrote, I downloaded the models, but in the node where we select (yolo_world/l), are they supposed to load by themselves? They don't, and I get this error when loading Yoloworld_ModelLoader_Zho: it is impossible to get 'WorldModel'. @controlaltai

      @yklandares • 2 months ago
    • Yes models load themselves.

      @controlaltai • 1 month ago
  • May I get the JSON files for this lesson?

    @edba7410 • 1 month ago
    • Everything is explained in the video. There are 6 to 7 workflows, you can build the workflow yourself.

      @controlaltai • 1 month ago
    • @@controlaltai I tried, but I don't get the same results as you. Maybe I can't catch some points and I'm connecting the nodes incorrectly.

      @edba7410 • 1 month ago
  • Great video, very helpful. I just have an issue removing an object. I have a picture of 4 men and I wanted to remove 1. At the end, after the KSampler, I have the issue that the face details of the other people change a bit when I view it in the Image Comparer (rgthree). Can I remove 1 person without changing other details?

    @kikoking5009 • 1 month ago
    • Thanks! This is a bit complicated, so I have to try this. Are you finding this issue after the first or second KSampler? Also, the approach would depend on how the interaction is in the image. If you can send a sample image, I can try and let you know if I'm successful.

      @controlaltai • 1 month ago
    • @controlaltai I find the issue in both KSamplers. I don't know how to send a sample image, and here on YouTube I can only write text.

      @kikoking5009 • 1 month ago
    • @@kikoking5009 send an email to mail @ controlaltai . com (without spaces)

      @controlaltai • 1 month ago
    • I cannot reply to you from whatever email you sent from: "Remote server returned '550 5.4.300 Message expired -> 451 Requested action aborted; Reject due to policy restrictions". I need the photo sample of the 4 people along with your workflow. Send an email from an account where I can reply back to you.

      @controlaltai • 1 month ago
    • @controlaltai I tried and sent it from another email. If it didn't work, I really don't know. By the way, I am thankful that you answered my question and tried to help. Best of luck

      @kikoking5009 • 1 month ago
  • Fantastic node thank you. I am getting this error: Error occurred when executing Yoloworld_ESAM_Zho: 'WorldModel' object has no attribute 'clip_model'

    @mariusvandenberg4250 • 2 months ago
    • Me too... there is already a Ticket open, should be fixed soon

      @eric-rorich • 2 months ago
    • Are you on inference 0.9.13 or the latest 0.9.17?

      @controlaltai • 2 months ago
    • @controlaltai Inference package version 0.9.13

      @eric-rorich • 2 months ago
    • Are the jit models downloaded? When does this error happen? Always or occasionally.

      @controlaltai • 2 months ago
    • @controlaltai Yes, I am. I reran python -m pip uninstall inference and then python -m pip install inference==0.9.13

      @mariusvandenberg4250 • 2 months ago
  • Thank you for your precious tutorial. I follow every steps but I still get the following error: "Error occurred when executing Yoloworld_ESAM_Zho: type object 'Detections' has no attribute 'from_inference' File "D:\AI\ComfyUI_windows_portable\ComfyUI\execution.py", line 152, in recursive_execute output_data, output_ui = get_output_data(obj, input_data_all) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "D:\AI\ComfyUI_windows_portable\ComfyUI\execution.py", line 82, in get_output_data return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "D:\AI\ComfyUI_windows_portable\ComfyUI\execution.py", line 75, in map_node_over_list results.append(getattr(obj, func)(**slice_dict(input_data_all, i))) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "D:\AI\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-YoloWorld-EfficientSAM\YOLO_WORLD_EfficientSAM.py", line 141, in yoloworld_esam_image detections = sv.Detections.from_inference(results) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^" Any suggestions? 🙏

    @Spinaster • 2 months ago
    • Okay, multiple things can be wrong: 1. Check that inference is 0.9.13. 2. Check that the jit models are downloaded correctly. 3. The object may not be detected; for that, select some other keyword or reduce the threshold. 4. Multiple objects are selected and are getting passed on to the mask node; only a single mask can be passed. For this, use mask combined or select a value of the mask from mask extracted.
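      (If all four check out, it may also help to confirm which supervision package version is installed, since the traceback calls sv.Detections.from_inference from the supervision library: python -m pip show supervision)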

      @controlaltai • 2 months ago
  • Can that work with animatediff?

    @andrejlopuchov7972 • 2 months ago
    • Yup. I had planned to showcase it, however I could not fit it in as the video went too long; I had to cut so many concepts, so I thought it would be a separate video altogether. Yolo World works for video real-time detection. Basically, I was able to take a video of a plane lifting off, get the plane mask, make it disappear, and have the whole video without the plane, with only the camera and other elements moving. I still have to iron it out. Other ideas include using these workflow techniques, like color grading a video in Comfy: say you have a short video of a person dancing, use a similar technique to isolate and change the colors of the clothing and re-stitch everything. Everything shown in the video can be applied to AnimateDiff; just the workflows would be slightly different.

      @controlaltai • 2 months ago
  • The workflow is not too hard to learn, but it takes up too much of my machine's resources if I put every process I want into one workflow. I need to separate them to finish the image. 😢

    @arthitchyoutube • 2 months ago
    • Remove the tagger node. The tagger node needs to run on the CPU rather than the GPU; there is a trick to do that. On the GPU it takes minutes, on the CPU seconds. For the rest, it is what it is. Splitting it is a good idea.

      @controlaltai • 2 months ago
  • Luckily it's not this cumbersome! Nice

    @Pauluz_The_Web_Gnome • 2 months ago
  • Where should I put the yolo models?

    @petpo-ev1yd • 2 months ago
    • What yolo models are you talking about....? Check here 4:34

      @controlaltai • 2 months ago
    • @@controlaltai I mean yolo_world/l or yolo_world/m

      @petpo-ev1yd • 2 months ago
    • @@controlaltai Error occurred when executing Yoloworld_ModelLoader_Zho: Could not connect to Roboflow API. here is the error

      @petpo-ev1yd • 2 months ago
    • Did you download the .jit files?

      @controlaltai • 2 months ago
    • @@controlaltai yes,I did everything what you said

      @petpo-ev1yd • 2 months ago
  • This....this rivals Adobe.

    @godpunisher • 2 months ago
  • ✨👌😎😯😯😯😎👍✨

    @manolomaru • 1 month ago
    • Thank you!!

      @controlaltai • 1 month ago
    • @@controlaltai Hello Malihe 👋🙂 ...Yep, I installed Pinokio to avoid dealing with the other way of installation. But unfortunately I'll have to do it that way. Thank you so much for your time, and ultrasuperfast response 👍

      @manolomaru • 1 month ago
  • Please share workflow 🥺

    @sukhpalsukh3511 • 2 months ago
    • It's already shared with members. Also nothing is hidden in the video, you can create it from scratch if you do not wish to be a member.

      @controlaltai • 2 months ago
    • @@controlaltai Thank you, really advanced but simple tutorial, appreciate your work,

      @sukhpalsukh3511 • 2 months ago
  • please reply to the subscriber)

    @yklandares • 2 months ago
  • Avoid using Yolo World: it has outdated dependencies and you will most probably have issues with other nodes. Also, the Segm_Detector from the Impact-Pack detects objects much more accurately.

    @silverspark8175 • 2 months ago
    • agree with you

      @35wangfeng • 2 months ago
    • I know, the dev doesn't respond. I am trying to find a way to update the dependencies myself and will post if I am successful. The techniques I show in the video are not possible via DINO or CLIPSeg. As of now the best solution is to just have another portable Comfy installed, with Yolo and the inpainting inside it. I am finding this is now becoming very common; for example, Comfy 3D is a mess and requires a completely different diffusion setup, same with plenty of other stuff. With Miniconda I can manage different environments instead of separate installs, but I should get around to making a tutorial for that; this way we can still use new stuff without compromising the main go-to workflow.

      @controlaltai • 2 months ago
  • top

    @Mehdi0montahw • 2 months ago
  • error : efficient_sam_s_gpu.jit does not exist

    @viniciuslacerda4577 • 1 month ago
    • Check the requirements section of the video. You need to download the two .jit files in the custom nodes yolo folder.

      @controlaltai • 1 month ago
  • keep the good work 👍 but can you tell me why it dose not mask anything in my workflow please:{ "last_node_id": 8, "last_link_id": 7, "nodes": [ { "id": 3, "type": "Yoloworld_ModelLoader_Zho", "pos": [ -321, 49 ], "size": { "0": 315, "1": 58 }, "flags": {}, "order": 0, "mode": 0, "outputs": [ { "name": "yolo_world_model", "type": "YOLOWORLDMODEL", "links": [ 2 ], "shape": 3, "slot_index": 0 } ], "properties": { "Node name for S&R": "Yoloworld_ModelLoader_Zho" }, "widgets_values": [ "yolo_world/l" ] }, { "id": 1, "type": "LoadImage", "pos": [ -662, 142 ], "size": { "0": 315, "1": 314 }, "flags": {}, "order": 1, "mode": 0, "outputs": [ { "name": "IMAGE", "type": "IMAGE", "links": [ 3, 7 ], "shape": 3, "slot_index": 0 }, { "name": "MASK", "type": "MASK", "links": null, "shape": 3 } ], "properties": { "Node name for S&R": "LoadImage" }, "widgets_values": [ "srg_sdxl_preview_temp_rtriy_00012_ (1).png", "image" ] }, { "id": 4, "type": "ESAM_ModelLoader_Zho", "pos": [ -298, 255 ], "size": { "0": 315, "1": 58 }, "flags": {}, "order": 2, "mode": 0, "outputs": [ { "name": "esam_model", "type": "ESAMMODEL", "links": [ 1 ], "shape": 3 } ], "properties": { "Node name for S&R": "ESAM_ModelLoader_Zho" }, "widgets_values": [ "CUDA" ] }, { "id": 5, "type": "PreviewImage", "pos": [ 653, 50 ], "size": [ 210, 246 ], "flags": {}, "order": 5, "mode": 0, "inputs": [ { "name": "images", "type": "IMAGE", "link": 4 } ], "properties": { "Node name for S&R": "PreviewImage" } }, { "id": 7, "type": "PreviewImage", "pos": [ 1115, 197 ], "size": [ 210, 246 ], "flags": {}, "order": 7, "mode": 0, "inputs": [ { "name": "images", "type": "IMAGE", "link": 6 } ], "properties": { "Node name for S&R": "PreviewImage" } }, { "id": 6, "type": "MaskToImage", "pos": [ 703, 371 ], "size": { "0": 210, "1": 26 }, "flags": {}, "order": 6, "mode": 0, "inputs": [ { "name": "mask", "type": "MASK", "link": 5 } ], "outputs": [ { "name": "IMAGE", "type": "IMAGE", "links": [ 6 ], "shape": 3, "slot_index": 0 } ], "properties": { "Node name for S&R": "MaskToImage" } }, { "id": 2, "type": "Yoloworld_ESAM_Zho", "pos": [ 61, 85 ], "size": { "0": 400, "1": 380 }, "flags": {}, "order": 4, "mode": 0, "inputs": [ { "name": "yolo_world_model", "type": "YOLOWORLDMODEL", "link": 2 }, { "name": "esam_model", "type": "ESAMMODEL", "link": 1, "slot_index": 1 }, { "name": "image", "type": "IMAGE", "link": 3 } ], "outputs": [ { "name": "IMAGE", "type": "IMAGE", "links": [ 4 ], "shape": 3, "slot_index": 0 }, { "name": "MASK", "type": "MASK", "links": [ 5 ], "shape": 3, "slot_index": 1 } ], "properties": { "Node name for S&R": "Yoloworld_ESAM_Zho" }, "widgets_values": [ "fire", 0.1, 0.1, 2, 2, 1, true, false, true, true, true, 0 ] }, { "id": 8, "type": "WD14Tagger|pysssss", "pos": [ -275, 445 ], "size": { "0": 315, "1": 220 }, "flags": {}, "order": 3, "mode": 0, "inputs": [ { "name": "image", "type": "IMAGE", "link": 7 } ], "outputs": [ { "name": "STRING", "type": "STRING", "links": null, "shape": 6 } ], "properties": { "Node name for S&R": "WD14Tagger|pysssss" }, "widgets_values": [ "wd-v1-4-convnext-tagger", 0.35, 0.85, false, false, "", "solo, food, indoors, no_humans, window, fire, plant, potted_plant, food_focus, pizza, tomato, rug, stove, fireplace" ] } ], "links": [ [ 1, 4, 0, 2, 1, "ESAMMODEL" ], [ 2, 3, 0, 2, 0, "YOLOWORLDMODEL" ], [ 3, 1, 0, 2, 2, "IMAGE" ], [ 4, 2, 0, 5, 0, "IMAGE" ], [ 5, 2, 1, 6, 0, "MASK" ], [ 6, 6, 0, 7, 0, "IMAGE" ], [ 7, 1, 0, 8, 0, "IMAGE" ] ], "groups": [], "config": {}, "extra": {}, "version": 0.4 }

    @l_Majed_l • 1 month ago
    • Thanks! Please link the json file on google drive or something.....will check it out for you.

      @controlaltai • 1 month ago
  • Awesome

    @barrenwardo • 2 months ago