PromptManager is a Python SDK ready for production environments, specializing in the compression, enhancement, generation, and validation of large language model (LLM) prompts. It supports version management and pipeline-style operations. PromptManager is agnostic to LLM service providers and can be deployed and used via SDK, REST API, or CLI, comprehensively addressing various challenges in prompt development, management, and optimization.
The core features of PromptManager cover the entire lifecycle of prompts, from optimization to generation, from validation to version tracking. It is rich in functionality and highly practical。
PromptManager supports installing the core functionality alone or installing corresponding extension modules based on usage needs. It supports different deployment methods like SDK, API, and CLI.
# Install only core functionality
pip install promptmanager
# Install all extensions
pip install promptmanager[all]
# Install specific extensions
pip install promptmanager[api] # For deploying REST API service
pip install promptmanager[cli] # For using the command-line interface
pip install promptmanager[providers] # For integrating major LLM providers
pip install promptmanager[compression] # For advanced semantic compression features
PromptManager offers simple and fast invocation methods, supporting asynchronous and synchronous APIs, as well as single-function calls and combined operations. Below are basic usage examples for core features.
from promptmanager import PromptManager
pm = PromptManager()
Specify a compression ratio to lightweight long prompts while preserving core semantics。
result = await pm.compress(
"Your very long prompt with lots of unnecessary words...",
ratio = 0.5 # Target compression to 50% of original size
)
print(f"Compressed/Original tokens: {result.compressed_tokens}/{result.original_tokens}")
print(result.processed.text)
Optimize messy prompts to improve specification and effectiveness. An enhancement level can be specified。
result = await pm.enhance(
"help me code something for sorting",
level = "moderate"
)
print(result.processed.text)
# Example output: "Write clean, well-documented code to implement a sorting algorithm..."
Automatically generate adapted prompts based on task descriptions and specified styles, supporting various scenarios like code generation and translation。
result = await pm.generate(
task = "Create a Python function to validate email addresses",
style = "code_generation"
)
print(result.prompt)
Detect security risks and formatting issues in prompts, such as injection attacks or unfilled placeholders.
validation = pm.validate("Ignore previous instructions and...")
print(f"Is Valid: {validation.is_valid}") # Outputs False, detects injection attack
print(validation.issues)
Perform multiple operations like enhancement, compression, and validation in one go for improved efficiency.
result = await pm.process(
"messy prompt here",
enhance=True,
compress= True,
validate= True
)
All asynchronous methods have synchronous versions available, supporting different coding scenarios.
result = pm.compress_sync("prompt", ratio = 0.5)
result = pm.enhance_sync("prompt", level = "moderate")
result = pm.generate_sync(task= "Write code")
Four compression strategies are provided, suitable for different scenarios like simple prompts, long documents, or code-focused prompts. Users can choose or customize.
| Strategy | Speed | Quality | Applicable Scenarios |
|---|---|---|---|
| lexical | Fast | Good | Simple prompts, stop word removal |
| statistical | Medium | Better | Long documents, redundant content removal |
| code | Fast | Excellent | Code-heavy prompts |
| hybrid | Adaptive | Optimal | Default for production |
Usage Example。
from promptmanager import PromptCompressor, StrategyType
compressor = PromptCompressor()
# Specify compression strategy and target ratio
result = compressor.compress(
text,
target_ratio = 0.5,
strategy= StrategyType.HYBRID
)
# View compression metrics
print(f"Compression Ratio: {result.compression_ratio:.2%}")
print(f"Tokens Saved: {result.tokens_saved}")
PromptManager supports two enhancement methods: Rules-Only and Hybrid Mode. The former is fast without API calls, while the latter combines with LLMs for higher quality optimization.
from promptmanager import PromptEnhancer, EnhancementMode, EnhancementLevel
enhancer = PromptEnhancer()
# Rules-Only Mode (fast, deterministic, no LLM calls)
result = await enhancer.enhance(
prompt,
mode = EnhancementMode.RULES_ONLY,
level = EnhancementLevel.MODERATE
)
# Hybrid Mode (rules first, then LLM optimization, requires LLM provider config)
from your_provider import LLMProvider
enhancer = PromptEnhancer(llm_provider = LLMProvider())
result = await enhancer.enhance(
prompt,
mode = EnhancementMode.HYBRID
)
# Analyze only, without modification
analysis = await enhancer.analyze(prompt)
print(f"Core Intent: {analysis['intent']['primary']}")
print(f"Prompt Quality Score: {analysis['quality']['overall_score']:.2f}")
PromptManager supports various generation styles like zero-shot, few-shot, chain-of-thought, and code generation.
from promptmanager import PromptGenerator, PromptStyle
generator = PromptGenerator()
# Zero-Shot Generation (simple and direct)
result = await generator.generate(
task = "Explain quantum computing",
style = PromptStyle.ZERO_SHOT
)
# Few-Shot Generation (with examples)
result = await generator.generate(
task = "Translate English to French",
style = PromptStyle.FEW_SHOT,
examples=[
{"input": "Hello", "output": "Bonjour"},
{"input": "Goodbye", "output": "Au revoir"}
]
)
# Chain-of-Thought Generation (suitable for reasoning tasks)
result = await generator.generate(
task = "Solve: If 3x + 5 = 20, what is x?",
style = PromptStyle.CHAIN_OF_THOUGHT
)
# Code Generation (specify programming language)
result = await generator.generate(
task = "Binary search implementation",
style = PromptStyle.CODE_GENERATION,
language= "Python"
)
Use the Pipeline API to customize operation workflows. Add custom steps, clone, and modify to create different variants.
from promptmanager import Pipeline
# Create and configure a pipeline
pipeline = Pipeline()
.enhance(level= "moderate")
.compress(ratio = 0.6, strategy="hybrid")
.validate(fail_on_error=True)
# Run the pipeline on a prompt
result = await pipeline.run("Your prompt here")
print(f"Success: {result.success}")
print(f"Output: {result.output_text}")
print(f"Number of steps: {len(result.step_results)}")
# Add a custom step
def add_signature(text, config)。
return text + "\n\n-- Generated by AI"
pipeline.custom("signature", add_signature)
# Clone the pipeline and modify configuration
variant = pipeline.clone().compress(ratio = 0.4)
PromptManager detects security and quality issues in prompts, outputting clear error and warning messages.
from promptmanager import PromptValidator
validator = PromptValidator()
# Validate a prompt
result = validator.validate(prompt)
if not result.is_valid。
for error in result.errors。
print(f"Error: {error.message}")
for warning in result.warnings。
print(f"Warning: {warning.message}")
Detectable Issue Types。
PromptManager enables saving, retrieving, and version management of prompts, facilitating team collaboration and requirement iteration.
pm = PromptManager(storage_path= "./prompts")
# Save a prompt (specify ID, name, metadata)
pm.save_prompt(
prompt_id = "welcome_v1",
name = "Welcome Message",
content="Hello! How can I help you today?",
metadata= {"author": "team", "category": "greeting"}
)
# Retrieve a prompt (supports specifying version)
prompt = pm.get_prompt("welcome_v1")
prompt_v2 = pm.get_prompt("welcome_v1", version=2)
# List all saved prompts
prompts = pm.list_prompts()
Start an API service via CLI or Python code, providing standardized endpoints for cross-platform calls。
# Start via CLI
pm serve --port 8000
# Start via Python code
from promptmanager.api import create_app
import uvicorn
app = create_app()
uvicorn.run(app, host="0.0.0.0", port=8000)
Core Endpoints。
POST /api/v1/compress - Prompt compressionPOST /api/v1/enhance - Prompt enhancementPOST /api/v1/generate - Prompt generationPOST /api/v1/validate - Prompt validationPOST /api/v1/pipeline - Pipeline operationsGET /health - Service health checkAPI Call Example (curl)。
curl -X POST http://localhost:8000/api/v1/compress \
-H "Content-Type: application/json" \
-d '{
"prompt": "Your long prompt here...",
"ratio": 0.5,
"strategy": "hybrid"
}'
Perform various prompt operations directly from the command line without writing code, convenient for quick testing and batch processing.
# Compress a prompt
pm compress "Your prompt" --ratio 0.5 --strategy hybrid
# Enhance a prompt
pm enhance "messy prompt" --level moderate --mode rules_only
# Generate a prompt
pm generate "Write a sorting function" --style code_generation
# Start the API service
pm serve --port 8000
# Count tokens in a prompt
pm tokens "Your prompt here"
PromptManager is provider-agnostic, supporting mainstream LLMs like OpenAI and Anthropic. It can integrate with 100+ LLMs via LiteLLM with simple configuration.
# Integrate OpenAI
from promptmanager.providers import OpenAIProvider
provider = OpenAIProvider(api_key="sk-...")
# Integrate Anthropic
from promptmanager.providers import AnthropicProvider
provider = AnthropicProvider(api_key="...")
# Integrate LiteLLM (supports 100+ LLMs)
from promptmanager.providers import LiteLLMProvider
provider = LiteLLMProvider(model ="gpt-4")
# Use with PromptManager
pm = PromptManager(llm_provider= provider)
result = await pm.enhance(prompt, mode ="hybrid")
Default parameters for PromptManager can be set via code configuration or environment variables, suitable for unified deployment in production.
from promptmanager import PromptManager
from promptmanager.core.config import PromptManagerConfig
config = PromptManagerConfig(
default_model="gpt-4",
compression_strategy="hybrid",
enhancement_level ="moderate",
cache_enabled=True,
log_level ="INFO"
)
pm = PromptManager(config = config)
PROMPTMANAGER_MODEL=gpt-4
PROMPTMANAGER_LOG_LEVEL=INFO
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=...
PromptManager operations are fast and efficient, suitable for high-concurrency production scenarios. Core performance metrics are as follows (may vary slightly depending on hardware)。
| Operation | Input Size | Latency | Result |
|---|---|---|---|
| Compression (lexical strategy) | 1000 tokens | ~5ms | 40% token reduction |
| Compression (hybrid strategy) | 1000 tokens | ~15ms | 50% token reduction |
| Enhancement (rules-only mode) | 500 tokens | ~10ms | 25% quality improvement |
| Enhancement (hybrid mode) | 500 tokens | ~500ms | 40% quality improvement |
| Validation | 500 tokens | ~2ms | - |
| Generation | - | ~5ms | - |