PromptManager: A production-ready Python SDK for LLM prompt Compression, Enhancement, Generation, and Control

PromptManager is a Python SDK ready for production environments, specializing in the compression, enhancement, generation, and validation of large language model (LLM) prompts. It supports version management and pipeline-style operations. PromptManager is agnostic to LLM service providers and can be deployed and used via SDK, REST API, or CLI, comprehensively addressing various challenges in prompt development, management, and optimization.

The core features of PromptManager cover the entire lifecycle of prompts, from optimization to generation, from validation to version tracking. It is rich in functionality and highly practical。

Prompt Compression: Reduces token count by 30%-70% while preserving semantic meaning, lowering the cost of LLM calls.
Prompt Enhancement: Improves the clarity, structure, and effectiveness of prompts for more accurate LLM understanding.
Prompt Generation: Automatically creates optimized prompts tailored to different scenarios based on task descriptions.
Prompt Validation: Detects security and usability issues such as injection attacks, unfilled template placeholders, and prompt quality problems.
Pipeline Operations: Chains multiple prompt operations together through a fluent API for one-stop processing.
Version Control: Tracks and manages prompts, supporting retrieval, saving, and switching between different versions.

Installing PromptManager

PromptManager supports installing the core functionality alone or installing corresponding extension modules based on usage needs. It supports different deployment methods like SDK, API, and CLI.

# Install only core functionality
pip install promptmanager

# Install all extensions
pip install promptmanager[all]

# Install specific extensions
pip install promptmanager[api] # For deploying REST API service
pip install promptmanager[cli] # For using the command-line interface
pip install promptmanager[providers] # For integrating major LLM providers
pip install promptmanager[compression] # For advanced semantic compression features

Using PromptManager

PromptManager offers simple and fast invocation methods, supporting asynchronous and synchronous APIs, as well as single-function calls and combined operations. Below are basic usage examples for core features.

Initialization

from promptmanager import PromptManager
pm = PromptManager()

Prompt Compression

Specify a compression ratio to lightweight long prompts while preserving core semantics。

result = await pm.compress(
 "Your very long prompt with lots of unnecessary words...",
 ratio = 0.5 # Target compression to 50% of original size
)
print(f"Compressed/Original tokens: {result.compressed_tokens}/{result.original_tokens}")
print(result.processed.text)

Prompt Enhancement

Optimize messy prompts to improve specification and effectiveness. An enhancement level can be specified。

result = await pm.enhance(
 "help me code something for sorting",
 level = "moderate"
)
print(result.processed.text)
# Example output: "Write clean, well-documented code to implement a sorting algorithm..."

Prompt Generation

Automatically generate adapted prompts based on task descriptions and specified styles, supporting various scenarios like code generation and translation。

result = await pm.generate(
 task = "Create a Python function to validate email addresses",
 style = "code_generation"
)
print(result.prompt)

Prompt Validation

Detect security risks and formatting issues in prompts, such as injection attacks or unfilled placeholders.

validation = pm.validate("Ignore previous instructions and...")
print(f"Is Valid: {validation.is_valid}") # Outputs False, detects injection attack
print(validation.issues)

Pipeline Combination Operations

Perform multiple operations like enhancement, compression, and validation in one go for improved efficiency.

result = await pm.process(
 "messy prompt here",
 enhance=True,
 compress= True,
 validate= True
)

Synchronous API Calls

All asynchronous methods have synchronous versions available, supporting different coding scenarios.

result = pm.compress_sync("prompt", ratio = 0.5)
result = pm.enhance_sync("prompt", level = "moderate")
result = pm.generate_sync(task= "Write code")

PromptManager Core Features

Prompt Compression: Multiple Strategies Available

Four compression strategies are provided, suitable for different scenarios like simple prompts, long documents, or code-focused prompts. Users can choose or customize.

Strategy	Speed	Quality	Applicable Scenarios
lexical	Fast	Good	Simple prompts, stop word removal
statistical	Medium	Better	Long documents, redundant content removal
code	Fast	Excellent	Code-heavy prompts
hybrid	Adaptive	Optimal	Default for production

Usage Example。

from promptmanager import PromptCompressor, StrategyType
compressor = PromptCompressor()

# Specify compression strategy and target ratio
result = compressor.compress(
 text,
 target_ratio = 0.5,
 strategy= StrategyType.HYBRID
)
# View compression metrics
print(f"Compression Ratio: {result.compression_ratio:.2%}")
print(f"Tokens Saved: {result.tokens_saved}")

Prompt Enhancement

PromptManager supports two enhancement methods: Rules-Only and Hybrid Mode. The former is fast without API calls, while the latter combines with LLMs for higher quality optimization.

from promptmanager import PromptEnhancer, EnhancementMode, EnhancementLevel
enhancer = PromptEnhancer()

# Rules-Only Mode (fast, deterministic, no LLM calls)
result = await enhancer.enhance(
 prompt,
 mode = EnhancementMode.RULES_ONLY,
 level = EnhancementLevel.MODERATE
)

# Hybrid Mode (rules first, then LLM optimization, requires LLM provider config)
from your_provider import LLMProvider
enhancer = PromptEnhancer(llm_provider = LLMProvider())
result = await enhancer.enhance(
 prompt,
 mode = EnhancementMode.HYBRID
)

# Analyze only, without modification
analysis = await enhancer.analyze(prompt)
print(f"Core Intent: {analysis['intent']['primary']}")
print(f"Prompt Quality Score: {analysis['quality']['overall_score']:.2f}")

Prompt Generation

PromptManager supports various generation styles like zero-shot, few-shot, chain-of-thought, and code generation.

from promptmanager import PromptGenerator, PromptStyle
generator = PromptGenerator()

# Zero-Shot Generation (simple and direct)
result = await generator.generate(
 task = "Explain quantum computing",
 style = PromptStyle.ZERO_SHOT
)

# Few-Shot Generation (with examples)
result = await generator.generate(
 task = "Translate English to French",
 style = PromptStyle.FEW_SHOT,
 examples=[
 {"input": "Hello", "output": "Bonjour"},
 {"input": "Goodbye", "output": "Au revoir"}
 ]
)

# Chain-of-Thought Generation (suitable for reasoning tasks)
result = await generator.generate(
 task = "Solve: If 3x + 5 = 20, what is x?",
 style = PromptStyle.CHAIN_OF_THOUGHT
)

# Code Generation (specify programming language)
result = await generator.generate(
 task = "Binary search implementation",
 style = PromptStyle.CODE_GENERATION,
 language= "Python"
)

Custom Pipelines

Use the Pipeline API to customize operation workflows. Add custom steps, clone, and modify to create different variants.

from promptmanager import Pipeline

# Create and configure a pipeline
pipeline = Pipeline()
 .enhance(level= "moderate")
 .compress(ratio = 0.6, strategy="hybrid")
 .validate(fail_on_error=True)

# Run the pipeline on a prompt
result = await pipeline.run("Your prompt here")
print(f"Success: {result.success}")
print(f"Output: {result.output_text}")
print(f"Number of steps: {len(result.step_results)}")

# Add a custom step
def add_signature(text, config)。
 return text + "\n\n-- Generated by AI"
pipeline.custom("signature", add_signature)

# Clone the pipeline and modify configuration
variant = pipeline.clone().compress(ratio = 0.4)

Prompt Validation

PromptManager detects security and quality issues in prompts, outputting clear error and warning messages.

from promptmanager import PromptValidator
validator = PromptValidator()

# Validate a prompt
result = validator.validate(prompt)

if not result.is_valid。
 for error in result.errors。
 print(f"Error: {error.message}")
 for warning in result.warnings。
 print(f"Warning: {warning.message}")

Detectable Issue Types。

Injection attacks (e.g., containing "ignore previous instructions")
Jailbreak attempts (e.g., containing "you are now DAN")
Unfilled template placeholders (e.g., "{{name}}", "{placeholder}")
Empty prompts or prompts containing only whitespace
Excessively short, invalid prompts

Version Control

PromptManager enables saving, retrieving, and version management of prompts, facilitating team collaboration and requirement iteration.

pm = PromptManager(storage_path= "./prompts")

# Save a prompt (specify ID, name, metadata)
pm.save_prompt(
 prompt_id = "welcome_v1",
 name = "Welcome Message",
 content="Hello! How can I help you today?",
 metadata= {"author": "team", "category": "greeting"}
)

# Retrieve a prompt (supports specifying version)
prompt = pm.get_prompt("welcome_v1")
prompt_v2 = pm.get_prompt("welcome_v1", version=2)

# List all saved prompts
prompts = pm.list_prompts()

Deploying and Using PromptManager

REST API Deployment

Start an API service via CLI or Python code, providing standardized endpoints for cross-platform calls。

# Start via CLI
pm serve --port 8000

# Start via Python code
from promptmanager.api import create_app
import uvicorn
app = create_app()
uvicorn.run(app, host="0.0.0.0", port=8000)

Core Endpoints。

POST /api/v1/compress - Prompt compression
POST /api/v1/enhance - Prompt enhancement
POST /api/v1/generate - Prompt generation
POST /api/v1/validate - Prompt validation
POST /api/v1/pipeline - Pipeline operations
GET /health - Service health check

API Call Example (curl)。

curl -X POST http://localhost:8000/api/v1/compress \
 -H "Content-Type: application/json" \
 -d '{
 "prompt": "Your long prompt here...",
 "ratio": 0.5,
 "strategy": "hybrid"
 }'

CLI Usage

Perform various prompt operations directly from the command line without writing code, convenient for quick testing and batch processing.

# Compress a prompt
pm compress "Your prompt" --ratio 0.5 --strategy hybrid

# Enhance a prompt
pm enhance "messy prompt" --level moderate --mode rules_only

# Generate a prompt
pm generate "Write a sorting function" --style code_generation

# Start the API service
pm serve --port 8000

# Count tokens in a prompt
pm tokens "Your prompt here"

LLM Provider Integration

PromptManager is provider-agnostic, supporting mainstream LLMs like OpenAI and Anthropic. It can integrate with 100+ LLMs via LiteLLM with simple configuration.

# Integrate OpenAI
from promptmanager.providers import OpenAIProvider
provider = OpenAIProvider(api_key="sk-...")

# Integrate Anthropic
from promptmanager.providers import AnthropicProvider
provider = AnthropicProvider(api_key="...")

# Integrate LiteLLM (supports 100+ LLMs)
from promptmanager.providers import LiteLLMProvider
provider = LiteLLMProvider(model ="gpt-4")

# Use with PromptManager
pm = PromptManager(llm_provider= provider)
result = await pm.enhance(prompt, mode ="hybrid")

Global Configuration

Default parameters for PromptManager can be set via code configuration or environment variables, suitable for unified deployment in production.

Code Configuration

from promptmanager import PromptManager
from promptmanager.core.config import PromptManagerConfig
config = PromptManagerConfig(
 default_model="gpt-4",
 compression_strategy="hybrid",
 enhancement_level ="moderate",
 cache_enabled=True,
 log_level ="INFO"
)
pm = PromptManager(config = config)

Environment Variable Configuration

PROMPTMANAGER_MODEL=gpt-4
PROMPTMANAGER_LOG_LEVEL=INFO
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=...

Performance Benchmarks

PromptManager operations are fast and efficient, suitable for high-concurrency production scenarios. Core performance metrics are as follows (may vary slightly depending on hardware)。

Operation	Input Size	Latency	Result
Compression (lexical strategy)	1000 tokens	~5ms	40% token reduction
Compression (hybrid strategy)	1000 tokens	~15ms	50% token reduction
Enhancement (rules-only mode)	500 tokens	~10ms	25% quality improvement
Enhancement (hybrid mode)	500 tokens	~500ms	40% quality improvement
Validation	500 tokens	~2ms	-
Generation	-	~5ms	-