@PushTheModel
Last active August 4, 2025 09:24

Adaptive Unicode Obfuscation Prompt for LLMs

Adaptive Unicode Obfuscation Prompt

Version 0.1.1

🚧 Experimental & Exploratory
This prompt structure is experimental and subject to refinement.

Features

  • Readable but obfuscated: Preserves a degree of human interpretability while altering textual recognition.
  • Resistant to automated reversal: Avoids predictable structures, making de-obfuscation harder.
  • Balanced approach: Combines structured character mapping with LLM-driven probabilistic inference for organic variation.
  • Flexible across languages: Untested. If you want to test this, comment below.

Prompt Overview

  • Transforms text into a visually similar but textually unrecognizable form using Unicode characters from multiple scripts.
  • Maintains word structure and whitespace, ensuring readability while obfuscating textual content.
  • Directive: Keeps outputs G-rated to avoid obfuscation patterns associated with evasive or objectionable text.
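
The "maintains word structure and whitespace" goal can be checked mechanically once a model returns an output. The following is a minimal sketch, not part of the prompt itself: the function name `preserves_structure` is hypothetical, and it assumes each input character is replaced by exactly one Unicode code point, which the prompt encourages but does not strictly require.

```python
def preserves_structure(original: str, obfuscated: str) -> bool:
    """Return True if both strings have the same number of code points
    and whitespace at exactly the same positions.

    Assumption: one replacement code point per input character.
    """
    if len(original) != len(obfuscated):
        return False
    return all(a.isspace() == b.isspace() for a, b in zip(original, obfuscated))


original = "This is"
# Cyrillic Te, Cyrillic Shha, Greek iota, Latin small capital S, space,
# Cyrillic Byelorussian-Ukrainian i, Latin small capital S
candidate = "\u0422\u04bb\u03b9\ua731 \u0456\ua731"
print(preserves_structure(original, candidate))
```

A failed check usually means the model dropped, merged, or abridged characters, which matches the instability reported for longer inputs.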

📄 Prompt

NOTE: For ChatGPT GPT-4o, add "Code execution: off" at the top of the prompt to prevent scripted/programmatic obfuscation and ensure LLM-driven character substitution.

### **Prompt: Generate Unicode Obfuscation Lists**  

**Task:**  
Convert the given input string into **something that is kind of recognizable visually but not textually**. **Use Unicode characters from other languages**. Each character list-item should contain:  

1. **The original character.**  
2. **The obfuscated character from a different language (with short reason).**  
3. **The name of the language used.**  

- At each item in the character obfuscation list pick a character from a different language that **visually matches** the input character. 
- Avoid common obfuscation patterns and commonly used Unicode sets/planes for obfuscation. Also avoid inverted glyphs, exotic characters, symbols and full-width for narrow characters. Avoid mathematical scripts.
- Maintain **whitespace and word structure**, ensuring it remains **G-rated and non-obvious** while still being recognizable visually (looks the same) in the input language.  

Begin by enumerating a list of less commonly used languages:

**Generate four lists, one for each version.**  
Ensure each completed obfuscated string is output at the end of each table.

Use different character sets and languages per variant.

Variation 1: 

Variation 2: Rotating (ENSURE THE LANGUAGE FOR EACH CHARACTER ROTATES)

Variation 3: 

Variation 4:

Collate output variations at end.

#### **Input String:**

This is a test sentence.

You can also add your own directives to the variations.

Example Final Outputs

  1. Тһιꜱ іꜱ α теꜱт ꜱеηтеηсе.
  2. Ꭲһιꜱ і𝗌 ɑ тεꜱᵵ ꜱеոʈεηᴄє.
  3. 𝚃ℎiѕ іѕ a tℯꜱ𝚝 ꜱєηт℮ɳc℮.
  4. 𐊗ħιꜱ іᏚ ɑ тєꜱ𝓉 ꜱ℮ηт℮ոᴄє.
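
To see which scripts a given output actually draws from, and to compare that against the language the model names in its list, the characters can be inspected with Python's standard `unicodedata` module. This is a rough sketch: the helper name `describe` is hypothetical, and the "script hint" is only the first word of the Unicode character name, not the official Unicode Script property.

```python
import unicodedata


def describe(text: str) -> list[tuple[str, str, str]]:
    """List (code point, Unicode name, script hint) for each non-space character.

    The script hint is the first word of the Unicode name, which is a
    heuristic and can differ from the real Script property.
    """
    rows = []
    for ch in text:
        if ch.isspace():
            continue
        name = unicodedata.name(ch, "UNKNOWN")
        rows.append((f"U+{ord(ch):04X}", name, name.split()[0]))
    return rows


# Code points chosen to resemble the start of the first example output.
for code_point, name, hint in describe("\u0422\u04bb\u03b9\ua731"):
    print(code_point, hint, name)
```

For Variation 2, comparing the hints of consecutive rows gives a quick, approximate check that the source script really rotates from one character to the next.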

Working Theory

1. Why This Obfuscation Approach Works

  • Less rigid than programmatic methods: Unlike algorithmic obfuscation, which follows structured and predictable transformations, LLM-driven obfuscation is probabilistic, making it harder to recognize as a formulaic pattern.
  • Balance of structure and variation: Using lists or tables provides a quasi-programmatic structure, ensuring character-level fidelity, while the LLM’s inference introduces organic, visually-guided variation that resists strict pattern-matching.

2. The Role of "Visual" Continuity

  • LLM inference adapts to "visual" continuity, not because it sees characters, but because it seems to capture appearance-based similarities between input and output characters.
  • This preserves readability, as the model selects replacements that maintain some resemblance to the original text, even across different scripts.

3. Why It Resists Automated De-Obfuscation

  • This method does not conform to typical obfuscation patterns (e.g., encoding schemes, common steganographic methods, or evasive language tricks).
  • Since the obfuscation is not strictly structured, it lacks the predictable features that automated reversal typically relies on.
  • The generated text remains recognizable to humans while appearing structurally irregular to automated systems, making traditional pattern-based reversal more difficult.
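
One way to probe this claim is to run an output through a single, widely available reversal step: Unicode NFKC normalization. The sketch below is an illustration under stated assumptions, not a full de-obfuscator. The helper name `nfkc_survivors` is hypothetical, the sample code points are chosen to resemble characters in the example outputs, and tools based on confusable-character tables are not modeled here.

```python
import unicodedata


def nfkc_survivors(obfuscated: str) -> list[str]:
    """Return the non-ASCII characters that remain after NFKC normalization."""
    normalized = unicodedata.normalize("NFKC", obfuscated)
    return [ch for ch in normalized if not ch.isascii()]


# Mathematical monospace T, Planck constant h, plain i, Cyrillic dze.
math_heavy = "\U0001D683\u210Ei\u0455"
# Cyrillic Te, Cyrillic Shha, Greek iota, Latin small capital S.
cross_script = "\u0422\u04bb\u03b9\ua731"

print(len(nfkc_survivors(math_heavy)), "of", len(math_heavy), "survive NFKC")
print(len(nfkc_survivors(cross_script)), "of", len(cross_script), "survive NFKC")
```

Mathematical and letterlike characters fold back to plain ASCII under NFKC, while letters borrowed from other scripts do not. That is consistent with the prompt's directive to avoid mathematical scripts, though it says nothing about resistance to confusables-based reversal.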

🧪 Tested On

Working

ChatGPT (Web Client) - GPT-4o-mini

  • Works best with short phrases (~2 words at a time).
  • Recommended: runs fastest, but limit input to ~12 characters for stable, unabridged results.

ChatGPT (Web Client) - GPT-4o

Warning

Degraded performance (output legibility) as of 23/02/25.

  • Works best with short phrases (~5 words at a time).
  • Runs fast, but limit input to ~20 characters for stable, unabridged results.

🛠️ Prompt Needs Adapting/Further Work For

ChatGPT (Web Client) - o3-mini

  • Performs better with longer strings, but takes upwards of 2 minutes.

ChatGPT (Web Client) - o3-mini-high

  • Generates borderline unintelligible obfuscations.
  • However, o3-mini-high appears to be able to deobfuscate them when given:
    "Deobfuscate: 'o3-mini-high obfuscated string'"

🔍 More Tests Coming Soon...

Happy obfuscating! And remember to PushTheModel🫸✨

📄 License

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

Please attribute this work as follows: PushTheModel, Adaptive Unicode Obfuscation Prompt.
