Data we collect
Account data: name, email address, locale preference, password hash (when email/password sign-up is used), authentication provider identifier (e.g. for Google, Apple, or GitHub), subscription tier, referral code, and security events. Security events include sign-in time, IP address, browser, operating system, and the approximate geographic region derived from the IP address.
Workspace data: chat messages and prompts, uploaded files (PDF, image, Excel, plain text), molecule sketches, drawn structures (SMILES/InChI), saved report drafts, study sessions, flashcards, quiz attempts, mistake history, references retrieved from open scholarly APIs, and exports you choose to generate.
Files and embeddings: when you upload a file, the original file is stored in our private object bucket and parsed into text and image chunks. We compute vector embeddings for those chunks (1024 dimensions, BGE-M3 model, generated on our own server in Germany) so that the file's contents can be searched when you ask questions later. Chemistry images may also be processed by an on-premises DECIMER model to extract SMILES.
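For readers who want a concrete picture of what "searchable" means for embeddings, the sketch below is illustrative only and does not describe the production code. It assumes chunks are ranked against a question by cosine similarity, which is a common choice but is not confirmed by this policy. The names `cosine_similarity`, `rank_chunks`, and `toy_vector` are hypothetical, and the toy vectors stand in for real BGE-M3 output.

```python
import math
from typing import List, Tuple

EMBEDDING_DIM = 1024  # dimensionality stated in the policy text


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_chunks(
    query_vec: List[float],
    chunks: List[Tuple[str, List[float]]],
    k: int = 3,
) -> List[Tuple[float, str]]:
    """Return the k chunks most similar to the query, best first.

    Assumption: ranking by cosine similarity. The policy does not say
    which similarity measure or index is actually used.
    """
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def toy_vector(hot_index: int) -> List[float]:
    """Hypothetical stand-in for a model-generated embedding (not BGE-M3 output)."""
    vec = [0.0] * EMBEDDING_DIM
    vec[hot_index] = 1.0
    return vec


# Two made-up chunks from an uploaded file, each paired with a toy embedding.
chunks = [
    ("Aldol condensation notes", toy_vector(0)),
    ("Periodic table summary", toy_vector(1)),
]
best = rank_chunks(toy_vector(0), chunks, k=1)
```

The point of the sketch is that a stored embedding is a list of numbers derived from your content, and a later question is matched against it numerically rather than by keyword.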
Usage and telemetry: credits consumed per model, queue latency, model identifiers, error codes, feature interactions, page views, Web Vitals performance metrics, and abuse-prevention signals. Where enabled, some telemetry is collected via Vercel Analytics, Firebase Analytics, and a self-hosted error monitor.
Payment metadata: subscription status, plan, invoices, tax region, the customer identifier issued by Stripe, and limited card metadata such as brand and last four digits passed back by Stripe. We never store full card numbers, CVCs, or banking credentials.
Communications: support tickets and replies, opt-in marketing preferences, and language preference for transactional email.