
AI UX Empty States in 2026: Loading, Waiting, and Thinking Patterns

LLM responses take seconds, not milliseconds. A practical guide to loading states, thinking indicators, and empty state design for AI products.

14 min · January 6, 2026 · Updated January 27, 2026

TL;DR

  • AI responses take 2–30 seconds, not 200ms—traditional loading patterns don’t work.
  • Two-stage model: Processing (model receives prompt) → Generation (tokens streaming).
  • Show “thinking” states explicitly—users tolerate waiting when they understand why.
  • Stream responses as they generate—the typing effect creates perception of speed.
  • Use progressive fidelity: show low-quality results first, refine as generation completes.
  • Empty states should suggest prompts, not force users to compose complex input.
  • Avoid fake progress bars—they break trust when they don’t match reality.

Why AI UX Is Different

Traditional web applications respond in 100–500ms. AI applications take 2–30 seconds. This creates unique UX challenges:

| Aspect | Traditional App | AI Application |
|---|---|---|
| Response time | 100–500ms | 2–30s |
| Loading indicator | Spinner sufficient | Needs engagement |
| Progress | Deterministic | Unpredictable |
| Output | Complete response | Streaming tokens |
| User expectation | Instant | Tolerant if explained |

Users will wait for AI—but they need to understand what’s happening.

The Two-Stage Loading Model

AI loading has distinct phases:

Stage 1: Processing

The model has received the prompt but hasn’t started generating:

User submits prompt
         ↓
┌─────────────────┐
│   PROCESSING    │
│  "Thinking..."  │ ← Model is reasoning
│  (2-10 seconds) │
└────────┬────────┘
         ↓
  Start streaming

What’s happening: Context processing, retrieval (for RAG), reasoning, planning.

What to show: Thinking indicator, active animation, status text.

Stage 2: Generation

The model is actively producing tokens:

Generation started
         ↓
┌─────────────────┐
│   GENERATING    │
│  "Writing..."   │ ← Tokens streaming
│  (5-20 seconds) │
└────────┬────────┘
         ↓
 Complete response

What’s happening: Token-by-token output.

What to show: Streaming text, typing animation, partial results.
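The two stages above map naturally onto a discriminated union, so rendering code can switch on the current phase. This is an illustrative model, not a standard API:

```typescript
// Illustrative state model for the two-stage loading flow.
type AILoadingState =
  | { stage: 'idle' }
  | { stage: 'processing'; startedAt: number } // thinking, no tokens yet
  | { stage: 'generating'; text: string }      // tokens streaming in
  | { stage: 'complete'; text: string };

// Pick the status label to render for a given state.
function statusLabel(state: AILoadingState): string {
  switch (state.stage) {
    case 'processing': return 'Thinking...';
    case 'generating': return 'Writing...';
    case 'complete':   return 'Done';
    default:           return '';
  }
}
```

Because the union is exhaustive, TypeScript forces every UI branch to handle every phase — there is no way to forget the "processing" state.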

Thinking Indicators

Design Principles

| Principle | Implementation |
|---|---|
| Show activity | Continuous, smooth animation |
| Explain state | "Thinking...", "Analyzing...", "Researching..." |
| Set expectations | "This usually takes 10–15 seconds" |
| Allow interruption | Visible cancel/stop button |

Animation Patterns

/* Thinking indicator animation */
.thinking-indicator {
  display: flex;
  gap: 4px;
  align-items: center;
}

.thinking-dot {
  width: 8px;
  height: 8px;
  border-radius: 50%;
  background: var(--color-primary);
  animation: thinking-pulse 1.4s ease-in-out infinite;
}

.thinking-dot:nth-child(2) {
  animation-delay: 0.2s;
}

.thinking-dot:nth-child(3) {
  animation-delay: 0.4s;
}

@keyframes thinking-pulse {
  0%, 80%, 100% {
    transform: scale(0.6);
    opacity: 0.4;
  }
  40% {
    transform: scale(1);
    opacity: 1;
  }
}

Context-Aware Status Text

type AIStage = 'processing' | 'generating';

interface Context {
  hasDocuments?: boolean;
  isComplex?: boolean;
  outputType?: 'code' | 'long-form' | 'text';
}

function getStatusText(stage: AIStage, context: Context): string {
  if (stage === 'processing') {
    if (context.hasDocuments) {
      return "Searching your documents...";
    }
    if (context.isComplex) {
      return "Analyzing your request...";
    }
    return "Thinking...";
  }
  
  if (stage === 'generating') {
    if (context.outputType === 'code') {
      return "Writing code...";
    }
    if (context.outputType === 'long-form') {
      return "Composing response...";
    }
    return "Generating...";
  }
  
  return "Processing...";
}

Streaming Response Patterns

Why Streaming Matters

| Approach | Perceived Wait | User Experience |
|---|---|---|
| Wait for complete response | 15 seconds | Frustrating |
| Stream as generated | ~0 seconds to first token | Engaging |

Streaming creates the perception of instant response, even when total time is the same.

Implementation

import { useState, useEffect } from 'react';

// React component for streaming text
function StreamingResponse({ stream }: { stream: AsyncIterable<string> }) {
  const [text, setText] = useState('');
  const [isComplete, setIsComplete] = useState(false);
  
  useEffect(() => {
    let mounted = true;
    
    async function consume() {
      for await (const chunk of stream) {
        if (!mounted) break;
        setText(prev => prev + chunk);
      }
      if (mounted) setIsComplete(true);
    }
    
    consume();
    return () => { mounted = false; };
  }, [stream]);
  
  return (
    <div className="response">
      <div className="response-text">
        {text}
        {!isComplete && <span className="cursor">|</span>}
      </div>
      {isComplete && <ResponseActions />}
    </div>
  );
}
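The `stream` prop above has to come from somewhere. A common source is a streaming fetch response; here is one sketch that adapts a `ReadableStream` body into the `AsyncIterable<string>` the component consumes. The `/api/chat` endpoint is hypothetical:

```typescript
// Sketch: adapt a streaming fetch body into an AsyncIterable<string>.
// Assumes a hypothetical /api/chat endpoint that streams plain-text chunks.
async function* streamCompletion(prompt: string): AsyncIterable<string> {
  const res = await fetch('/api/chat', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ prompt }),
  });
  if (!res.body) throw new Error('Streaming not supported');

  const reader = res.body.getReader();
  const decoder = new TextDecoder();
  try {
    while (true) {
      const { done, value } = await reader.read();
      if (done) break;
      // stream: true keeps multi-byte characters intact across chunk boundaries
      yield decoder.decode(value, { stream: true });
    }
  } finally {
    reader.releaseLock();
  }
}
```

Real APIs often wrap chunks in SSE frames or JSON lines; the decoding loop is the same, with a parsing step added per chunk.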

Typing Effect CSS

.cursor {
  display: inline-block;
  width: 2px;
  height: 1.2em;
  background: currentColor;
  animation: blink 1s step-end infinite;
  margin-left: 2px;
}

@keyframes blink {
  50% {
    opacity: 0;
  }
}

Empty State Design

The Problem with Blank Prompts

Empty text areas create two problems:

  1. Users don’t know what to ask
  2. Composing good prompts is hard

Solution: Suggested Prompts

function EmptyState({ context }: { context: AppContext }) {
  // getSuggestions and submitPrompt are app-specific helpers
  const suggestions = getSuggestions(context);
  
  return (
    <div className="empty-state">
      <h3>What can I help you with?</h3>
      
      <div className="suggestions">
        {suggestions.map(suggestion => (
          <button
            key={suggestion.id}
            onClick={() => submitPrompt(suggestion.prompt)}
            className="suggestion-chip"
          >
            <span className="suggestion-icon">{suggestion.icon}</span>
            <span className="suggestion-text">{suggestion.label}</span>
          </button>
        ))}
      </div>
      
      <div className="input-hint">
        Or type your own question...
      </div>
    </div>
  );
}

Context-Aware Suggestions

| Context | Suggested Prompts |
|---|---|
| Code editor | "Explain this function", "Find bugs", "Add tests" |
| Document editor | "Summarize this", "Make it shorter", "Improve clarity" |
| Data analysis | "What are the trends?", "Find anomalies", "Create visualization" |
| Support chat | "How do I...", "What's the status of...", "I need help with..." |
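One way to back `getSuggestions` is a static lookup keyed by the surface the user is in, mirroring the table above. The shapes and keys here are assumptions for illustration:

```typescript
// Hypothetical suggestion data model; keys mirror the contexts above.
interface Suggestion { id: string; icon: string; label: string; prompt: string }

type SurfaceKind = 'code-editor' | 'document-editor' | 'data-analysis' | 'support-chat';

const SUGGESTIONS: Record<SurfaceKind, Suggestion[]> = {
  'code-editor': [
    { id: 'explain', icon: '💡', label: 'Explain this function', prompt: 'Explain what this function does.' },
    { id: 'bugs',    icon: '🐛', label: 'Find bugs',             prompt: 'Review this code for bugs.' },
  ],
  'document-editor': [
    { id: 'summarize', icon: '📝', label: 'Summarize this', prompt: 'Summarize this document.' },
  ],
  'data-analysis': [
    { id: 'trends', icon: '📈', label: 'What are the trends?', prompt: 'What trends do you see in this data?' },
  ],
  'support-chat': [
    { id: 'howto', icon: '❓', label: 'How do I...', prompt: 'How do I ' },
  ],
};

function getSuggestions(surface: SurfaceKind): Suggestion[] {
  return SUGGESTIONS[surface] ?? [];
}
```

A static table is a fine starting point; teams often later rank suggestions by recent user activity or selection frequency.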

Progressive Fidelity

The Pattern

Show lower-quality results immediately, refine as generation completes:

  1. Immediate: Skeleton/placeholder
  2. Early: Rough outline or structure
  3. Mid: Partially complete content
  4. Final: Complete, polished response

Implementation for Images

function ProgressiveImage({ generation }: { generation: ImageGeneration }) {
  return (
    <div className="image-container">
      {/* Low-res preview during generation */}
      {generation.preview && (
        <img 
          src={generation.preview} 
          className="image-preview blur-sm"
          alt="Generating..."
        />
      )}
      
      {/* Progress indicator */}
      {!generation.complete && (
        <div className="progress-overlay">
          <div className="progress-bar" style={{ width: `${generation.progress}%` }} />
          <span className="progress-text">{generation.progress}% complete</span>
        </div>
      )}
      
      {/* Final image */}
      {generation.complete && (
        <img 
          src={generation.final} 
          className="image-final"
          alt={generation.alt}
        />
      )}
    </div>
  );
}

What to Avoid

Fake Progress Bars

// ❌ BAD: Fake progress that doesn't match reality
function FakeProgress() {
  const [progress, setProgress] = useState(0);
  
  useEffect(() => {
    const interval = setInterval(() => {
      // This is lying to users
      setProgress(p => Math.min(p + 5, 95));
    }, 500);
    return () => clearInterval(interval);
  }, []);
  
  return <ProgressBar value={progress} />;
}

// ✅ GOOD: Honest state indicator
function HonestState({ stage }: { stage: string }) {
  return (
    <div className="state-indicator">
      <ThinkingAnimation />
      <span>{stage}</span>
    </div>
  );
}

Misleading Time Estimates

| ❌ Bad | ✅ Good |
|---|---|
| "Almost done!" (after 10 seconds) | "This usually takes 10–20 seconds" |
| "Just a moment" (takes 30 seconds) | "Analyzing your documents..." |
| Progress bar stuck at 99% | Thinking animation with context |

Silent Waiting

| ❌ Bad | ✅ Good |
|---|---|
| Blank screen | "Thinking about your question..." |
| Static spinner | Animated state with status text |
| No feedback | Visible cancel button |

Re-stating Pattern

Why It Matters

LLMs interpret prompts, and their reading can drift from the user's intent. Restating what the AI understood lets users catch misreadings before any work begins:

function RestatementConfirmation({
  userInput,
  aiUnderstanding,
  onConfirm,
  onClarify,
}: {
  userInput: string;
  aiUnderstanding: string;
  onConfirm: () => void;
  onClarify: () => void;
}) {
  return (
    <div className="restatement">
      <div className="user-input">
        <span className="label">You asked:</span>
        <span className="content">{userInput}</span>
      </div>
      
      <div className="ai-understanding">
        <span className="label">I understood this as:</span>
        <span className="content">{aiUnderstanding}</span>
      </div>
      
      <div className="actions">
        <button onClick={onConfirm}>Yes, continue</button>
        <button onClick={onClarify}>No, let me clarify</button>
      </div>
    </div>
  );
}

When to Use

  • Complex multi-part requests
  • Ambiguous queries
  • High-stakes actions (deletions, payments)
  • When misunderstanding is costly
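Deciding when to restate can be encoded as a simple heuristic over the request; the fields and thresholds below are illustrative assumptions, not a real API:

```typescript
// Illustrative heuristic for when to show a restatement step.
// Fields and thresholds are assumptions for the sketch.
interface RequestInfo {
  parts: number;          // distinct asks detected in the prompt
  ambiguityScore: number; // 0..1, e.g. from a lightweight classifier
  highStakes: boolean;    // deletion, payment, or other irreversible action
}

function shouldRestate(req: RequestInfo): boolean {
  // Always confirm high-stakes actions; otherwise confirm only
  // multi-part or clearly ambiguous requests.
  return req.highStakes || req.parts > 2 || req.ambiguityScore > 0.6;
}
```

Keeping the rule explicit makes it easy to tune: lower the ambiguity threshold if users frequently hit "No, let me clarify."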

Accessibility Considerations

Screen Reader Support

<div 
  role="status" 
  aria-live="polite"
  aria-busy="true"
  aria-label="AI is thinking about your question"
>
  <span class="visually-hidden">
    Processing your request. This usually takes 10-20 seconds.
  </span>
  <ThinkingAnimation aria-hidden="true" />
</div>

Keyboard Navigation

function AIPrompt() {
  // submit() and cancel() are the surrounding app's handlers
  return (
    <div>
      <textarea
        aria-label="Ask AI a question"
        placeholder="Ask me anything..."
        onKeyDown={(e) => {
          if (e.key === 'Enter' && !e.shiftKey) {
            e.preventDefault();
            submit();
          }
          if (e.key === 'Escape') {
            cancel();
          }
        }}
      />
      <button aria-label="Cancel AI request" onClick={cancel}>
        Cancel
      </button>
    </div>
  );
}

Implementation Checklist

Loading States

  • Distinct processing vs. generating states
  • Smooth, continuous animations
  • Context-aware status text
  • Time expectations when known
  • Cancel/stop button visible

Streaming

  • Stream responses as they generate
  • Cursor/typing indicator during stream
  • Smooth text appearance
  • Actions appear on completion

Empty States

  • Suggested prompts for common tasks
  • Context-aware suggestions
  • Easy prompt composition
  • Clear call to action

Accessibility

  • Screen reader announcements
  • Keyboard navigation
  • Focus management
  • Motion preferences respected
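The last checklist item, respecting motion preferences, can be handled in script as well as in CSS: query the media feature and swap the animated indicator for a static one. A minimal sketch:

```typescript
// Returns true when the user has asked the OS/browser for reduced motion.
// Guarded so it is safe during server-side rendering, where window is absent.
function prefersReducedMotion(): boolean {
  return typeof window !== 'undefined' &&
    window.matchMedia('(prefers-reduced-motion: reduce)').matches;
}
```

In rendering code this becomes `prefersReducedMotion() ? <StaticIndicator /> : <ThinkingAnimation />`; the pure-CSS equivalent is a `@media (prefers-reduced-motion: reduce)` block that disables the keyframe animations.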

FAQ

How long is too long for a loading state?

Users tolerate 30+ seconds if they understand why. The key is setting expectations and showing progress. After 60 seconds, offer a “still working” message or option to continue in background.
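The escalation timeline above can be made concrete as a mapping from elapsed time to message; the durations are the article's suggestions, not fixed rules:

```typescript
// Escalating status messages as the wait grows.
// Thresholds follow the guidance above and are tunable.
function waitMessage(elapsedMs: number): string {
  if (elapsedMs < 20_000) return 'Thinking...';
  if (elapsedMs < 60_000) return 'Still working on it...';
  return 'This is taking longer than usual. Keep waiting or continue in background?';
}
```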

Should I always stream responses?

For text output, yes. For structured data (JSON, tables), consider showing a skeleton first, then revealing complete data. Streaming partial structured data can cause layout shifts.

How do I handle errors during generation?

Show inline error with retry option. Preserve any partial response if useful. Don’t make users re-type their prompt.
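One way to make "preserve the partial response, don't lose the prompt" concrete is to capture both in the error result so a retry can be one click. The shape below is illustrative:

```typescript
// Sketch: keep the partial text and the original prompt on failure
// so the user can retry without re-typing. Names are illustrative.
interface GenerationResult {
  ok: boolean;
  text: string;         // partial output is preserved on error
  error?: string;
  retryPrompt?: string; // original prompt, ready to resubmit
}

function onGenerationError(prompt: string, partial: string, err: Error): GenerationResult {
  return { ok: false, text: partial, error: err.message, retryPrompt: prompt };
}
```

The UI can then render the partial text with an inline error banner and a "Retry" button wired to `retryPrompt`.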

What about slow connections?

Streaming helps here too—users see something immediately. Consider lower-quality initial responses that upgrade when bandwidth allows.

Should I show exact token counts or timing?

Only for power users or developer tools. Regular users don’t care about tokens—they care about getting helpful responses.
