Chat Memory & Context Management

💬 Context Limits: Why the Bot Starts Forgetting Stuff


JLLM (like all LLMs) has a limited “context window.” That’s just a fancy way of saying: there’s a max amount of text it can read at once. Around 8k-9k tokens, give or take. If you overload it, stuff gets pushed out of memory or starts degrading. This is sometimes called “context rot,” and yeah, it’s as annoying as it sounds.


So if you're wondering why the bot forgets what happened five replies ago, or randomly misremembers who’s who... it’s probably because the memory's overflowing. When the model runs out of room, it loses track of who you are, what you were doing, or even the tone of the conversation.


Instead of asking “how much can I stuff in?”, the real question is: "How can I make every token count?"


Memory Template: Clean, Clear, and Structured


JLLM works way better when you give it clean, structured memory. You can use this simple format to keep things organized:


Environment: 
Relationship Dynamic:
Current Plot Points:
{{char}} notes:
{{user}} notes:
Important Past Events:


Use bullet points. Be short. Be specific. Don’t write a novel in here; it’ll just clog the system. Each section serves a purpose:


Environment: This section is for describing where you are right now and what the setting is like.

Example:

- rain

- in parking lot

✅ Do: Use short phrases

❌ Don’t: Write a whole scene


Relationship Dynamic: Here, you explain the key aspects of what’s going on between you and {{char}}.

Example:

- {{user}} and {{char}} are rivals with unresolved feelings

✅ Do: Focus on meaningful facts

❌ Don’t: Ramble about moods


Current Plot Points: Use this to highlight what’s actively happening in the story at the moment.

Example:

- looking for a missing prince

✅ Do: Use action verbs

❌ Don’t: Drift into backstory


{{char}} notes: This section contains facts about {{char}} that aren’t part of their personality traits, such as inventory or other concrete details.

Example:

- inventory: gameboy

✅ Do: List concrete traits

❌ Don’t: Describe their actions


{{user}} notes: Here you list facts about your own persona that don’t fit under your personality, such as appearance or gear.

Example:

- outfit: wears old army jacket

✅ Do: Mention specific, stable traits

❌ Don’t: Try to control how the bot plays you


Important Past Events: This is for big events that have already happened but still have an impact on the story right now.

Example:

- {{char}} betrayed {{user}} a year ago

✅ Do: Stick to what matters long-term

❌ Don’t: Include tiny, one-off events
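

Putting it all together, a filled-in memory might look something like this (the scenario below just reuses the examples above, plus a few made-up details, purely as an illustration):

Environment:
- night, heavy rain
- abandoned castle courtyard

Relationship Dynamic:
- {{user}} and {{char}} are rivals with unresolved feelings

Current Plot Points:
- looking for a missing prince
- {{char}} is injured and moving slowly

{{char}} notes:
- inventory: gameboy, old map

{{user}} notes:
- outfit: wears old army jacket

Important Past Events:
- {{char}} betrayed {{user}} a year ago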


Some tips:

  • Stick to ONE verb tense.

Pick past or present, and use it consistently. Switching tenses confuses the model’s understanding of time.


  • List facts, not scenes.

Instead of writing out full paragraphs, use short statements:

✅ - {{user}} was injured in the last fight

❌ - {{user}} limped through the door, clutching their side...


  • Avoid vague fluff.

Don’t write: {{user}} is complex and emotional.

Instead, write: - soft-spoken, avoids confrontation; hides anger easily


  • Don't try to script the bot.

Say what {{char}} is, not what they do.

✅ - protective of {{user}}

❌ - always comforts {{user}} when they’re sad
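

To see these tips working together, here’s a quick before-and-after (the scenario is invented purely as an illustration):

❌ {{user}} stumbled into the tavern, soaked and exhausted, and {{char}} always rushes over to comfort them.

✅ - {{user}} arrived at the tavern exhausted; {{char}} is protective of {{user}}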

More Isn’t Better


Cramming too much into memory can cause the AI to forget or misinterpret key stuff. Memory should be treated like your best set of post-it notes, not a giant wiki.


If you're ever wondering, “Why did the AI forget this thing I definitely told it?”, ask yourself:


  • Did I phrase it clearly?
  • Did I bury it under too much fluff?
  • Is it even still in the context window?


If the answer is “maybe” to any of those, go back and clean it up.


Prompting Summaries


One of the best ways to keep your chat coherent is by creating summaries as you go. This helps keep the important stuff in memory, even if your chat is hundreds of messages long.


You can prompt the model to generate a recap every so often, and then use that summary as the new memory. Think of it like saving your progress in a game. You can even ask for specifics:


  • <system>task: pause chat|roleplay, answer query(outside of roleplay, succinct summary(starting from tavern fight onward, list of all impactful events(+their effects)))</system>
  • <system>task: pause chat|roleplay, answer query(outside of roleplay, succinct summary(include relationship changes between {{user}} and {{char}}))</system>
  • <system>task: pause chat|roleplay, answer query(outside of roleplay, succinct summary(list of all impactful events in roleplay(for each event listed: +their effects)))</system>


You might feel a bit lost when you first see prompts written like this. They’re not just weird code—they're a more reliable way to talk to the model out of character. You can still write OOC messages the way you're used to, and they’ll usually work just fine. But this format gives you more control.


The <system> tag marks the message as a system-level instruction. That alone doesn’t stop the roleplay. The part that actually pauses the roleplay is pause chat|roleplay. Without that, the model might still respond in character, even if your question is meant to be OOC.


Inside the answer query(...) part, you’re telling the model exactly what kind of response you want. Putting the request in parentheses helps the model parse it more clearly. For example, a prompt like answer query(outside of roleplay, succinct summary(list of all impactful events in roleplay(for each event listed: +their effects))) tells it to step outside the story, summarize the key events, and include the consequences of each one.
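

Once you’re comfortable with the structure, you can build your own queries on the same pattern. As an illustration (this exact wording isn’t one of the stock examples above), a query asking what the model currently has established about your persona could look like:

  • <system>task: pause chat|roleplay, answer query(outside of roleplay, succinct summary(everything currently established about {{user}}(appearance, inventory, goals)))</system>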


Using this format helps reduce confusion and gets you more accurate answers. It's especially useful when managing memory or checking on the state of the story.


Next Up: Writing Style & Talking to the Bot

Updated on: 31/07/2025
