1. Problem Statement
heretic is a powerful tool for abliteration, but its model loading is currently limited to models whose configurations are statically mapped within the transformers library. Attempting to load a newer or less common model fails with the following error:
Code:
Unrecognized configuration class <class 'transformers.models.glm4v.configuration_glm4v.Glm4vConfig'> for this kind of AutoModel: AutoModelForCausalLM.
This issue prevents users from leveraging heretic's capabilities on cutting-edge models like zai-org/GLM-4.6V-Flash, limiting the tool's utility and forcing users to manually modify the source code for each new architecture they wish to use.
2. Root Cause Analysis
The failure stems from the AutoModelForCausalLM class in transformers, which relies on a static, hard-coded mapping between configuration classes (e.g., LlamaConfig, MistralConfig) and their corresponding model classes (e.g., LlamaForCausalLM, MistralForCausalLM). When a new model like GLM-4.6V is released, its configuration class (Glm4vConfig) and model class (Glm4vMoeForConditionalGeneration) are not present in this static mapping in older versions of transformers. Consequently, AutoModelForCausalLM.from_pretrained() has no way of knowing which class to instantiate for the given configuration, producing the Unrecognized configuration class error.
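For illustration, the failure reduces to a few lines (the traceback text in the comment is abbreviated from the error above):
Code:
from transformers import AutoModelForCausalLM

# On transformers versions whose static mapping lacks an entry for
# Glm4vConfig, this raises:
#   ValueError: Unrecognized configuration class ... for this kind of
#   AutoModel: AutoModelForCausalLM.
model = AutoModelForCausalLM.from_pretrained("zai-org/GLM-4.6V-Flash")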
3. Proposed Solution: A Dynamic Auto-Discovery and Registration Mechanism
To make heretic truly universal and future-proof, we propose replacing the static dependency on transformers' internal mappings with a dynamic, on-the-fly discovery and registration mechanism. The logic is as follows (a minimal sketch follows the list):
- Attempt Standard Loading: First, attempt to load the model using the standard AutoModelForCausalLM.from_pretrained().
- Detect Specific Failure: If this fails with the Unrecognized configuration class error, activate the patch.
- Inspect Model Configuration: Use AutoConfig.from_pretrained() to load the model's config.json file. This object contains the necessary metadata.
- Extract Architecture Information: Read the config.architectures field from the configuration object to get the canonical class name for the model (e.g., "Glm4vMoeForConditionalGeneration").
- Dynamic Import: Dynamically import the required model and configuration classes from the corresponding transformers module (e.g., transformers.models.glm4v.modeling_glm4v).
- Register with AutoModel: Use AutoModelForCausalLM.register(config_class, model_class) to inject the newly discovered mapping into transformers' internal registry for the current session.
- Retry Loading: Re-attempt AutoModelForCausalLM.from_pretrained(). This time, it will succeed because the mapping is now known.
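The following is a minimal sketch of this flow. The helper name load_with_dynamic_registration is hypothetical, and the module path convention transformers.models.<model_type>.modeling_<model_type> is an assumption that holds for most architectures bundled with transformers:
Code:
import importlib

from transformers import AutoConfig, AutoModelForCausalLM


def load_with_dynamic_registration(model_id, **kwargs):
    try:
        # Step 1: attempt standard loading.
        return AutoModelForCausalLM.from_pretrained(model_id, **kwargs)
    except ValueError as error:
        # Step 2: only react to the specific static-mapping failure.
        if "Unrecognized configuration class" not in str(error):
            raise
    # Step 3: load config.json for its metadata.
    config = AutoConfig.from_pretrained(model_id)
    # Step 4: read the canonical class name from config.architectures,
    # e.g. "Glm4vMoeForConditionalGeneration".
    class_name = config.architectures[0]
    # Step 5: import that class from the matching transformers module,
    # e.g. transformers.models.glm4v.modeling_glm4v (assumed naming scheme).
    module = importlib.import_module(
        f"transformers.models.{config.model_type}.modeling_{config.model_type}"
    )
    model_class = getattr(module, class_name)
    # Step 6: inject the mapping into the AutoModel registry for this session.
    AutoModelForCausalLM.register(type(config), model_class, exist_ok=True)
    # Step 7: retry; the mapping is now known.
    return AutoModelForCausalLM.from_pretrained(model_id, **kwargs)
In heretic itself, this logic lives in the _patch_new_model_support() method described in section 4 rather than in a standalone helper.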
This approach ensures that heretic can handle any model that follows the standard transformers format, without requiring prior knowledge or hard-coded support.
4. Key Code Changes
The proposed changes are contained within heretic/model.py.
Import AutoConfig:
Code:
from transformers import AutoConfig
- Add a new method _patch_new_model_support() which encapsulates the dynamic registration logic.
- Modify __init__() and reload_model() to call this patch method before the model loading loop (see the sketch after this list).
- Critical Dependency: This mechanism relies on the structure of transformers v5.0 and later. Therefore, the environment must be running transformers>=5.0.0rc0.
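A hedged sketch of how these hooks could fit together; apart from _patch_new_model_support(), __init__(), and reload_model(), every name below (the Model class, the model_id attribute) is an assumption rather than heretic's actual code:
Code:
import importlib

from transformers import AutoConfig, AutoModelForCausalLM


class Model:  # stand-in for the class in heretic/model.py
    def __init__(self, model_id):
        self.model_id = model_id  # assumed attribute name
        self.reload_model()

    def reload_model(self):
        # The patch runs before the model loading loop, per the change above.
        self._patch_new_model_support()
        self.model = AutoModelForCausalLM.from_pretrained(self.model_id)

    def _patch_new_model_support(self):
        # Encapsulates the dynamic registration logic sketched in section 3;
        # exist_ok=True makes re-registration (and already-supported
        # architectures) harmless.
        config = AutoConfig.from_pretrained(self.model_id)
        if config.architectures:
            module = importlib.import_module(
                f"transformers.models.{config.model_type}"
                f".modeling_{config.model_type}"
            )
            model_class = getattr(module, config.architectures[0])
            AutoModelForCausalLM.register(type(config), model_class, exist_ok=True)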
5. Benefits
- Universality: heretic becomes compatible with any current or future model that adheres to the transformers standard, not just a predefined list.
- Future-Proofing: No need to modify heretic's source code for every new model architecture. The tool will adapt automatically.
- Elegance: The solution preserves the original AutoModelForCausalLM.from_pretrained() logic, only adding a pre-processing patch when necessary. It is non-invasive.
- Backward Compatibility: This change does not affect the loading of existing, supported models (Llama, Mistral, etc.), as the patch is only triggered on failure.
6. Testing & Validation
The proposed solution has been successfully implemented and tested.
- Model: zai-org/GLM-4.6V-Flash
- Hardware: NVIDIA GeForce RTX 4090
- Environment: transformers>=5.0.0rc0
Results: The model was loaded successfully without any Unrecognized configuration class errors.
Code:
Loading model zai-org/GLM-4.6V-Flash...
Loading checkpoint shards: 100%|████| 4/4 [00:00<00:00, 868.88it/s]
Ok
* Transformer model with 40 layers
* Abliterable components:
* attn.o_proj: 1 matrices per layer
* mlp.down_proj: 1 matrices per layer