LLM Configuration Types
BlackCat supports several kinds of model capability and automatically calls the appropriate model type for each feature. The plugin currently recognizes and uses the following four model types:
Chat Model (Chat / LLM)
Used for everyday conversation, webpage content understanding, text generation, and reasoning. These are the core interactive capabilities.
Document Understanding Model (Document / File Understanding)
Used to parse and understand long texts on webpages, structured documents, or user-uploaded files.
Image Generation Model
Used to generate images from text instructions for creative, design, and content-enhancement scenarios.
Vision / Multimodal Model
Used to understand images, screenshots, or video frames on webpages, enabling multimodal analysis.
Because the plugin matches each feature to its own model type, we recommend configuring all four model types for the most complete and powerful AI browser experience.
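The feature-to-model matching described above can be pictured with a minimal sketch. It is not BlackCat's actual implementation: the feature names, the model names, and the helper functions resolve_model and missing_model_types are all hypothetical and exist only to illustrate why a feature has nothing to call when its model type is left unconfigured.

```python
from enum import Enum


class ModelType(Enum):
    """The four model types listed in this section."""
    CHAT = "chat"
    DOCUMENT = "document"
    IMAGE_GENERATION = "image_generation"
    VISION = "vision"


# Hypothetical feature names, chosen only for illustration.
FEATURE_TO_MODEL_TYPE = {
    "page_chat": ModelType.CHAT,
    "file_summary": ModelType.DOCUMENT,
    "text_to_image": ModelType.IMAGE_GENERATION,
    "screenshot_analysis": ModelType.VISION,
}


def resolve_model(feature, configured):
    """Return the model configured for the type this feature needs, or None."""
    model_type = FEATURE_TO_MODEL_TYPE[feature]
    return configured.get(model_type)


def missing_model_types(configured):
    """List the model types that have no model configured."""
    return [t for t in ModelType if t not in configured]


# Assumed setup: only two of the four types are configured (placeholder names).
configured = {
    ModelType.CHAT: "example-chat-model",
    ModelType.VISION: "example-vision-model",
}

print(resolve_model("page_chat", configured))
print(resolve_model("text_to_image", configured))
print([t.value for t in missing_model_types(configured)])
```

In this sketch the chat feature resolves to a model, while the image generation feature resolves to nothing because its type was never configured. How the real plugin behaves in that situation is not stated here.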