AI UX Empty States in 2026: Loading, Waiting, and Thinking Patterns
LLM responses take seconds, not milliseconds. A practical guide to loading states, thinking indicators, and empty state design for AI products.
TL;DR
- AI responses take 2–30 seconds, not 200ms—traditional loading patterns don’t work.
- Two-stage model: Processing (model receives prompt) → Generation (tokens streaming).
- Show “thinking” states explicitly—users tolerate waiting when they understand why.
- Stream responses as they generate—the typing effect creates perception of speed.
- Use progressive fidelity: show low-quality results first, refine as generation completes.
- Empty states should suggest prompts, not force users to compose complex input.
- Avoid fake progress bars—they break trust when they don’t match reality.
Why AI UX Is Different
Traditional web applications respond in 100–500ms. AI applications take 2–30 seconds. This creates unique UX challenges:
| Aspect | Traditional App | AI Application |
|---|---|---|
| Response time | 100–500ms | 2–30s |
| Loading indicator | Spinner sufficient | Need engagement |
| Progress | Deterministic | Unpredictable |
| Output | Complete response | Streaming tokens |
| User expectation | Instant | Tolerant if explained |
Users will wait for AI—but they need to understand what’s happening.
The Two-Stage Loading Model
AI loading has distinct phases:
Stage 1: Processing
The model has received the prompt but hasn’t started generating:
```
User submits prompt
         │
         ▼
┌─────────────────┐
│   PROCESSING    │
│  "Thinking..."  │ ← Model is reasoning
│  (2-10 seconds) │
└────────┬────────┘
         │
         ▼
  Start streaming
```
What’s happening: Context processing, retrieval (for RAG), reasoning, planning.
What to show: Thinking indicator, active animation, status text.
Stage 2: Generation
The model is actively producing tokens:
```
Generation started
         │
         ▼
┌─────────────────┐
│   GENERATING    │
│  "Writing..."   │ ← Tokens streaming
│  (5-20 seconds) │
└────────┬────────┘
         │
         ▼
 Complete response
```
What’s happening: Token-by-token output.
What to show: Streaming text, typing animation, partial results.
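The two stages above map naturally onto a small state machine. A minimal sketch (the state and event names are illustrative, not from any particular library): the first token is what moves the UI from "thinking" to "writing".

```typescript
// Illustrative state machine for the two-stage loading model.
type AIState =
  | { stage: 'idle' }
  | { stage: 'processing'; startedAt: number }
  | { stage: 'generating'; text: string }
  | { stage: 'complete'; text: string };

type AIEvent =
  | { type: 'submit'; now: number }
  | { type: 'first_token'; token: string }
  | { type: 'token'; token: string }
  | { type: 'done' };

function transition(state: AIState, event: AIEvent): AIState {
  switch (event.type) {
    case 'submit':
      return { stage: 'processing', startedAt: event.now };
    case 'first_token':
      // The first token marks the processing → generating boundary
      return { stage: 'generating', text: event.token };
    case 'token':
      return state.stage === 'generating'
        ? { stage: 'generating', text: state.text + event.token }
        : state;
    case 'done':
      return state.stage === 'generating'
        ? { stage: 'complete', text: state.text }
        : state;
  }
}
```

Keeping the transition pure makes it easy to drive both the status text and the streaming view from one place.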
Thinking Indicators
Design Principles
| Principle | Implementation |
|---|---|
| Show activity | Continuous smooth animation |
| Explain state | “Thinking…”, “Analyzing…”, “Researching…” |
| Set expectations | “This usually takes 10-15 seconds” |
| Allow interruption | Cancel/stop button visible |
Animation Patterns
```css
/* Thinking indicator animation */
.thinking-indicator {
  display: flex;
  gap: 4px;
  align-items: center;
}

.thinking-dot {
  width: 8px;
  height: 8px;
  border-radius: 50%;
  background: var(--color-primary);
  animation: thinking-pulse 1.4s ease-in-out infinite;
}

.thinking-dot:nth-child(2) {
  animation-delay: 0.2s;
}

.thinking-dot:nth-child(3) {
  animation-delay: 0.4s;
}

@keyframes thinking-pulse {
  0%, 80%, 100% {
    transform: scale(0.6);
    opacity: 0.4;
  }
  40% {
    transform: scale(1);
    opacity: 1;
  }
}
```
Context-Aware Status Text
```typescript
function getStatusText(stage: AIStage, context: Context): string {
  if (stage === 'processing') {
    if (context.hasDocuments) {
      return "Searching your documents...";
    }
    if (context.isComplex) {
      return "Analyzing your request...";
    }
    return "Thinking...";
  }

  if (stage === 'generating') {
    if (context.outputType === 'code') {
      return "Writing code...";
    }
    if (context.outputType === 'long-form') {
      return "Composing response...";
    }
    return "Generating...";
  }

  return "Processing...";
}
```
Streaming Response Patterns
Why Streaming Matters
| Approach | Perceived Wait | User Experience |
|---|---|---|
| Wait for complete response | 15 seconds | Frustrating |
| Stream as generated | ~1-2 seconds to first token | Engaging |
Streaming creates the perception of instant response, even when total time is the same.
Implementation
```tsx
import { useState, useEffect } from 'react';

// React component that renders tokens as they arrive
function StreamingResponse({ stream }: { stream: AsyncIterable<string> }) {
  const [text, setText] = useState('');
  const [isComplete, setIsComplete] = useState(false);

  useEffect(() => {
    let mounted = true;
    // Reset state when a new stream is passed in
    setText('');
    setIsComplete(false);

    async function consume() {
      for await (const chunk of stream) {
        if (!mounted) break;
        setText(prev => prev + chunk);
      }
      if (mounted) setIsComplete(true);
    }
    consume();

    return () => { mounted = false; };
  }, [stream]);

  return (
    <div className="response">
      <div className="response-text">
        {text}
        {!isComplete && <span className="cursor">|</span>}
      </div>
      {isComplete && <ResponseActions />}
    </div>
  );
}
```
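The `stream` prop above can come from any source; a common one is a streaming HTTP response body. A minimal sketch of adapting a `ReadableStream` of bytes into an `AsyncIterable<string>` (the `/api/chat` endpoint in the usage comment is a placeholder):

```typescript
// Turn a ReadableStream of bytes into an AsyncIterable<string>.
async function* textChunks(body: ReadableStream<Uint8Array>): AsyncIterable<string> {
  const reader = body.getReader();
  const decoder = new TextDecoder();
  try {
    while (true) {
      const { done, value } = await reader.read();
      if (done) break;
      // stream: true keeps multi-byte characters split across chunks intact
      yield decoder.decode(value, { stream: true });
    }
  } finally {
    reader.releaseLock();
  }
}

// Usage (illustrative endpoint):
// const res = await fetch('/api/chat', { method: 'POST', body: prompt });
// <StreamingResponse stream={textChunks(res.body!)} />
```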
Typing Effect CSS
```css
.cursor {
  display: inline-block;
  width: 2px;
  height: 1.2em;
  background: currentColor;
  animation: blink 1s step-end infinite;
  margin-left: 2px;
}

@keyframes blink {
  50% {
    opacity: 0;
  }
}
```
Empty State Design
The Problem with Blank Prompts
Empty text areas create two problems:
- Users don’t know what to ask
- Composing good prompts is hard
Solution: Suggested Prompts
```tsx
function EmptyState({ context }: { context: AppContext }) {
  const suggestions = getSuggestions(context);

  return (
    <div className="empty-state">
      <h3>What can I help you with?</h3>
      <div className="suggestions">
        {suggestions.map(suggestion => (
          <button
            key={suggestion.id}
            onClick={() => submitPrompt(suggestion.prompt)}
            className="suggestion-chip"
          >
            <span className="suggestion-icon">{suggestion.icon}</span>
            <span className="suggestion-text">{suggestion.label}</span>
          </button>
        ))}
      </div>
      <div className="input-hint">
        Or type your own question...
      </div>
    </div>
  );
}
```
Context-Aware Suggestions
| Context | Suggested Prompts |
|---|---|
| Code editor | “Explain this function”, “Find bugs”, “Add tests” |
| Document editor | “Summarize this”, “Make it shorter”, “Improve clarity” |
| Data analysis | “What are the trends?”, “Find anomalies”, “Create visualization” |
| Support chat | “How do I…”, “What’s the status of…”, “I need help with…” |
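The `getSuggestions` call used in the empty-state component can be as simple as a lookup keyed by the surface the user is in. A sketch mirroring the table above; the surface names, icons, and prompts are illustrative:

```typescript
interface Suggestion {
  id: string;
  icon: string;
  label: string;
  prompt: string;
}

type Surface = 'code-editor' | 'document-editor' | 'data-analysis' | 'support-chat';

// Illustrative lookup table; a real app's context would carry the surface.
const SUGGESTIONS: Record<Surface, Suggestion[]> = {
  'code-editor': [
    { id: 'explain', icon: '💡', label: 'Explain this function', prompt: 'Explain what this function does.' },
    { id: 'bugs', icon: '🐛', label: 'Find bugs', prompt: 'Review this code for bugs.' },
    { id: 'tests', icon: '✅', label: 'Add tests', prompt: 'Write unit tests for this code.' },
  ],
  'document-editor': [
    { id: 'summarize', icon: '📝', label: 'Summarize this', prompt: 'Summarize this document.' },
    { id: 'shorten', icon: '✂️', label: 'Make it shorter', prompt: 'Shorten this text, keeping the key points.' },
  ],
  'data-analysis': [
    { id: 'trends', icon: '📈', label: 'What are the trends?', prompt: 'Describe the main trends in this data.' },
  ],
  'support-chat': [
    { id: 'howto', icon: '❓', label: 'How do I…', prompt: 'How do I ' },
  ],
};

function getSuggestions(surface: Surface): Suggestion[] {
  return SUGGESTIONS[surface] ?? [];
}
```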
Progressive Fidelity
The Pattern
Show lower-quality results immediately, refine as generation completes:
```
Immediate:  Skeleton/placeholder
               ↓
Early:      Rough outline or structure
               ↓
Mid:        Partially complete content
               ↓
Final:      Complete, polished response
```
Implementation for Images
```tsx
function ProgressiveImage({ generation }: { generation: ImageGeneration }) {
  return (
    <div className="image-container">
      {/* Low-res preview during generation */}
      {generation.preview && (
        <img
          src={generation.preview}
          className="image-preview blur-sm"
          alt="Generating..."
        />
      )}

      {/* Progress indicator */}
      {!generation.complete && (
        <div className="progress-overlay">
          <div className="progress-bar" style={{ width: `${generation.progress}%` }} />
          <span className="progress-text">{generation.progress}% complete</span>
        </div>
      )}

      {/* Final image */}
      {generation.complete && (
        <img
          src={generation.final}
          className="image-final"
          alt={generation.alt}
        />
      )}
    </div>
  );
}
```
What to Avoid
Fake Progress Bars
```tsx
import { useState, useEffect } from 'react';

// ❌ BAD: Fake progress that doesn't match reality
function FakeProgress() {
  const [progress, setProgress] = useState(0);

  useEffect(() => {
    const interval = setInterval(() => {
      // This is lying to users
      setProgress(p => Math.min(p + 5, 95));
    }, 500);
    return () => clearInterval(interval);
  }, []);

  return <ProgressBar value={progress} />;
}

// ✅ GOOD: Honest state indicator
function HonestState({ stage }: { stage: string }) {
  return (
    <div className="state-indicator">
      <ThinkingAnimation />
      <span>{stage}</span>
    </div>
  );
}
```
Misleading Time Estimates
| ❌ Bad | ✅ Good |
|---|---|
| “Almost done!” (after 10 seconds) | “This usually takes 10-20 seconds” |
| “Just a moment” (takes 30 seconds) | “Analyzing your documents…” |
| Progress bar stuck at 99% | Thinking animation with context |
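Instead of a fake percentage, status text can honestly track elapsed time. A small sketch; the thresholds and wording are illustrative assumptions:

```typescript
// Honest, time-aware status text: never claims progress it cannot know.
function getWaitMessage(elapsedMs: number): string {
  if (elapsedMs < 10_000) {
    return 'Thinking... this usually takes 10-20 seconds.';
  }
  if (elapsedMs < 30_000) {
    return 'Still working on it...';
  }
  // Past the typical range: acknowledge it and keep the user in control.
  return 'This is taking longer than usual. You can keep waiting or cancel.';
}
```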
Silent Waiting
| ❌ Bad | ✅ Good |
|---|---|
| Blank screen | “Thinking about your question…” |
| Static spinner | Animated state with status text |
| No feedback | Cancel button visible |
Re-stating Pattern
Why It Matters
LLMs synthesize from context. Users benefit from seeing what the AI understood:
```tsx
function RestatementConfirmation({
  userInput,
  aiUnderstanding,
  onProceed,
  onClarify
}: {
  userInput: string;
  aiUnderstanding: string;
  onProceed: () => void;
  onClarify: () => void;
}) {
  return (
    <div className="restatement">
      <div className="user-input">
        <span className="label">You asked:</span>
        <span className="content">{userInput}</span>
      </div>
      <div className="ai-understanding">
        <span className="label">I understood this as:</span>
        <span className="content">{aiUnderstanding}</span>
      </div>
      <div className="actions">
        <button onClick={onProceed}>Yes, continue</button>
        <button onClick={onClarify}>No, let me clarify</button>
      </div>
    </div>
  );
}
```
When to Use
- Complex multi-part requests
- Ambiguous queries
- High-stakes actions (deletions, payments)
- When misunderstanding is costly
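A simple gate for when to show the confirmation can follow the criteria above. The thresholds and conjunction list here are illustrative assumptions, not a vetted heuristic:

```typescript
interface RequestInfo {
  prompt: string;
  isHighStakes: boolean; // deletions, payments, irreversible actions
}

function needsRestatement(req: RequestInfo): boolean {
  if (req.isHighStakes) return true;

  const prompt = req.prompt.trim();
  // Multi-part request: several sentences...
  const sentences = prompt.split(/[.?!]+/).filter(s => s.trim().length > 0);
  if (sentences.length >= 3) return true;
  // ...or steps chained with explicit conjunctions.
  return /\b(and then|after that|as well as)\b/i.test(prompt);
}
```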
Accessibility Considerations
Screen Reader Support
```tsx
<div
  role="status"
  aria-live="polite"
  aria-busy="true"
  aria-label="AI is thinking about your question"
>
  <span className="visually-hidden">
    Processing your request. This usually takes 10-20 seconds.
  </span>
  <ThinkingAnimation aria-hidden="true" />
</div>
```
Keyboard Navigation
```tsx
function AIPrompt() {
  return (
    <div>
      <textarea
        aria-label="Ask AI a question"
        placeholder="Ask me anything..."
        onKeyDown={(e) => {
          if (e.key === 'Enter' && !e.shiftKey) {
            e.preventDefault();
            submit();
          }
          if (e.key === 'Escape') {
            cancel();
          }
        }}
      />
      <button aria-label="Cancel AI request" onClick={cancel}>
        Cancel
      </button>
    </div>
  );
}
```
Implementation Checklist
Loading States
- Distinct processing vs. generating states
- Smooth, continuous animations
- Context-aware status text
- Time expectations when known
- Cancel/stop button visible
Streaming
- Stream responses as they generate
- Cursor/typing indicator during stream
- Smooth text appearance
- Actions appear on completion
Empty States
- Suggested prompts for common tasks
- Context-aware suggestions
- Easy prompt composition
- Clear call to action
Accessibility
- Screen reader announcements
- Keyboard navigation
- Focus management
- Motion preferences respected
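The last checklist item can be handled directly in CSS: honor `prefers-reduced-motion` by turning off the pulsing dots and blinking cursor while keeping a static, visible indicator.

```css
/* Disable decorative animations for users who opt out of motion */
@media (prefers-reduced-motion: reduce) {
  .thinking-dot,
  .cursor {
    animation: none;
  }
  .thinking-dot {
    opacity: 0.7; /* static but still visible */
  }
}
```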
FAQ
How long is too long for a loading state?
Users tolerate 30+ seconds if they understand why. The key is setting expectations and showing progress. After 60 seconds, offer a “still working” message or option to continue in background.
Should I always stream responses?
For text output, yes. For structured data (JSON, tables), consider showing a skeleton first, then revealing complete data. Streaming partial structured data can cause layout shifts.
How do I handle errors during generation?
Show inline error with retry option. Preserve any partial response if useful. Don’t make users re-type their prompt.
What about slow connections?
Streaming helps here too—users see something immediately. Consider lower-quality initial responses that upgrade when bandwidth allows.
Should I show exact token counts or timing?
Only for power users or developer tools. Regular users don’t care about tokens—they care about getting helpful responses.
Sources & Further Reading
- AI Loading States Patterns — Comprehensive pattern library
- LLM Design Patterns — Re-stating and other patterns
- In The Pocket AI Interactions — Interaction guidelines
- Cloudscape GenAI Loading — AWS design system patterns
- Loading States Guide — General loading best practices
- Microinteractions — Related: animation timing