Generic AI personas repeat the same cultural clichés. The ethnography pipeline is what prevents that in Boses. Before generating any persona, Boses crawls public sources in your target market, extracts structured consumer signals using an LLM, and stores a cultural context snapshot. That snapshot is injected into every persona’s generation prompt. The result is personas that carry real attitudes — specific brands they trust or distrust, real digital habits, current concerns — rather than assumed ones.

Why it matters

Imagine you are testing a new buy-now-pay-later product in the Philippines. A persona grounded in current signals will know that GCash dominates mobile payments, that certain banks are distrusted by younger consumers, and that there is active discourse on Reddit about hidden fees in credit products. A generic persona will not. That difference determines whether your simulation surfaces insights you can act on.

Data sources

The pipeline pulls from three public sources per market:

| Source | Markets | What it captures |
| --- | --- | --- |
| Reddit | r/Philippines, r/indonesia, r/VietNam | Current consumer discourse: trending concerns, product sentiment, and cultural moments as they emerge in community discussion |
| Shopee reviews | PH, ID, VN | Real purchase opinions from active shoppers: what they love, what disappoints them, price sensitivity, and brand comparisons at the moment of transaction |
| Google Play Store | GCash (PH), Gojek (ID), MoMo (VN) | Digital service trust, UX frustrations, and attitudes toward fintech and super-app ecosystems, the dominant digital infrastructure of each market |

Supported markets are PH (Philippines), ID (Indonesia), and VN (Vietnam). Snapshots are shared across all projects within the same market. If your company has five active projects targeting the Philippines, they all draw from the same PH snapshot.
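To make the sharing rule concrete, here is a minimal sketch of how a per-market source registry and a shared snapshot key could look. The names `MARKET_SOURCES` and `snapshot_key`, the key format, and the project IDs are hypothetical illustrations and are not part of the Boses API; only the market codes, subreddits, and app names come from the table above.

```python
# Hypothetical registry mirroring the data-sources table; not Boses's actual config.
MARKET_SOURCES = {
    "PH": {"reddit": "r/Philippines", "shopee": "PH", "google_play": "GCash"},
    "ID": {"reddit": "r/indonesia", "shopee": "ID", "google_play": "Gojek"},
    "VN": {"reddit": "r/VietNam", "shopee": "VN", "google_play": "MoMo"},
}


def snapshot_key(market: str) -> str:
    """Return the snapshot key for a market. The key ignores the project,
    which is what makes the snapshot shared across projects."""
    code = market.strip().upper()
    if code not in MARKET_SOURCES:
        raise ValueError(f"Unsupported market: {market!r}")
    return f"snapshot:{code}"  # assumed key format, for illustration only


# Five illustrative projects targeting the Philippines resolve to one snapshot.
projects = [(f"project-{i}", "PH") for i in range(1, 6)]
shared_keys = {snapshot_key(market) for _, market in projects}
print(shared_keys)
```

Because the key depends only on the market code, adding a sixth PH project would not create a new snapshot in this sketch.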

Signal extraction

Raw content from all sources passes through an LLM that extracts structured signals. Each snapshot stores the following signal types:

| Signal | Description |
| --- | --- |
| trusted_brands | Brands that consumers speak positively about in current discourse |
| distrusted_brands | Brands with active negative sentiment: distrust, disappointment, or avoidance |
| digital_habits | How people use apps, payment methods, social platforms, and delivery services day to day |
| price_sensitivity_signals | Expressions of value-consciousness, bargain-hunting behaviour, or aspirational spending patterns |
| cultural_moments | Current events, trends, or shared experiences shaping consumer mood and priorities right now |

These signals become part of every persona’s generation context. A persona generated from a PH snapshot will reflect the specific brands and digital behaviours that Filipinos are talking about today — not a generalised summary of Filipino consumer behaviour.
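As an illustration of how these signals might feed a generation prompt, the sketch below models a snapshot as a dataclass and appends its non-empty signal lists to a base prompt. The five field names match the signal types listed above; the class name `CulturalSnapshot`, the function `build_persona_prompt`, the prompt layout, and the placeholder brand values are all assumptions made for this example, not the actual Boses implementation.

```python
from dataclasses import dataclass, field


@dataclass
class CulturalSnapshot:
    """Hypothetical container for the five signal types stored per snapshot."""
    market: str
    trusted_brands: list[str] = field(default_factory=list)
    distrusted_brands: list[str] = field(default_factory=list)
    digital_habits: list[str] = field(default_factory=list)
    price_sensitivity_signals: list[str] = field(default_factory=list)
    cultural_moments: list[str] = field(default_factory=list)


def build_persona_prompt(base_prompt: str, snapshot: CulturalSnapshot) -> str:
    """Append each non-empty signal list to the base prompt as a context line."""
    sections = [
        ("Trusted brands", snapshot.trusted_brands),
        ("Distrusted brands", snapshot.distrusted_brands),
        ("Digital habits", snapshot.digital_habits),
        ("Price sensitivity", snapshot.price_sensitivity_signals),
        ("Cultural moments", snapshot.cultural_moments),
    ]
    lines = [base_prompt, "", f"Cultural context ({snapshot.market}):"]
    for title, items in sections:
        if items:  # skip signal types with no extracted entries
            lines.append(f"- {title}: " + "; ".join(items))
    return "\n".join(lines)


# Placeholder values only; real snapshots would hold extracted signals.
example_snapshot = CulturalSnapshot(
    market="PH",
    trusted_brands=["ExampleWallet"],
    distrusted_brands=["ExampleBank"],
    digital_habits=["pays for deliveries with a mobile wallet"],
)
example_prompt = build_persona_prompt("Generate a persona.", example_snapshot)
print(example_prompt)
```

Empty signal lists are skipped in this sketch so the prompt never asserts a category for which nothing was extracted.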

Quality gate

Each snapshot receives a quality_score between 0.0 and 1.0 based on the volume and consistency of extracted signals. Only snapshots scoring above 0.5 are activated. If all data sources are unavailable during a crawl, the pipeline will not overwrite a healthy existing snapshot. Your personas continue to use the last valid data rather than degraded output.
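The gate described above can be sketched as a small selection function. This is an assumption-laden illustration: the dictionary shape, the name `select_active_snapshot`, and the use of `None` to represent a crawl in which every source was unavailable are hypothetical. The 0.5 threshold and the "never overwrite a healthy snapshot" rule come from the text; keeping the existing snapshot when a new one scores at or below the threshold is an inference from that rule.

```python
from typing import Optional

QUALITY_THRESHOLD = 0.5  # from the text: only scores above 0.5 are activated


def select_active_snapshot(
    current: Optional[dict], candidate: Optional[dict]
) -> Optional[dict]:
    """Return the snapshot that should be active after a crawl.

    `candidate` is None when every data source was unavailable (assumed
    representation). In that case the existing snapshot is kept.
    """
    if candidate is None:
        return current
    if candidate["quality_score"] > QUALITY_THRESHOLD:
        return candidate
    # Assumption: a low-scoring candidate never replaces the current snapshot.
    return current


healthy = {"market": "PH", "quality_score": 0.82}
weak = {"market": "PH", "quality_score": 0.5}
fresh = {"market": "PH", "quality_score": 0.91}

print(select_active_snapshot(healthy, None) is healthy)   # failed crawl
print(select_active_snapshot(healthy, weak) is healthy)   # score not above 0.5
print(select_active_snapshot(healthy, fresh) is fresh)    # passes the gate
```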

Auto-refresh

You do not need to manage snapshot freshness manually. The pipeline triggers automatically in two situations:
  1. New persona group: when you create a new persona group for a market, Boses queues a background refresh alongside generation.
  2. Staleness check: if the active snapshot for your target market is older than 7 days at the time of persona group creation, a refresh runs before generation proceeds.
The fresher the snapshot, the more current the consumer signals baked into your personas. If you are running research on a time-sensitive topic — a product launch, a cultural event, a competitor move — create a new persona group to trigger a fresh crawl rather than reusing one generated weeks earlier.
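The staleness rule above amounts to a simple age comparison. The following is a minimal sketch, assuming the snapshot carries a timezone-aware creation timestamp; the function name `needs_blocking_refresh` and the choice to treat a missing snapshot as stale are hypothetical. Only the 7-day limit comes from the text.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

MAX_SNAPSHOT_AGE = timedelta(days=7)  # from the text: older than 7 days is stale


def needs_blocking_refresh(
    snapshot_created_at: Optional[datetime], now: Optional[datetime] = None
) -> bool:
    """Return True if a refresh should run before persona generation."""
    if now is None:
        now = datetime.now(timezone.utc)
    if snapshot_created_at is None:
        return True  # assumption: no active snapshot is treated as stale
    return now - snapshot_created_at > MAX_SNAPSHOT_AGE


reference_time = datetime(2025, 1, 15, tzinfo=timezone.utc)
print(needs_blocking_refresh(datetime(2025, 1, 10, tzinfo=timezone.utc), reference_time))
print(needs_blocking_refresh(datetime(2025, 1, 1, tzinfo=timezone.utc), reference_time))
```

A snapshot that is exactly 7 days old is not refreshed in this sketch, since the text says "older than 7 days".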

Vertical targeting

By default, the pipeline crawls general top posts and bestseller categories. If your research targets a specific industry, you can request a vertical-focused snapshot by contacting your account team. Supported verticals include fintech, beauty, food delivery, telco, and gaming, and you can request any other category relevant to your research. A vertical snapshot concentrates signals on the topic area, so personas generated from it carry more specific attitudes and brand awareness for that space.
Request the targeted refresh before you generate personas so that they reflect current discourse in that category.