Building a crypto card aggregator on a public CSV instead of a database
Most comparison sites I read in 2025 had stale data. The Binance Visa was being recommended in posts published a year after it shut down. Crypto.com cashback rates were stuck on 2021 numbers. So I built sweepbase.net from a different angle: the entire dataset is one CSV file, public, auditable, no DB layer.
This post is the technical why.
Why CSV beats a DB for read-heavy comparison data
The dataset is small (139 cards × 35 columns) and updates rarely (a few rows per week). The reads dwarf the writes by maybe four orders of magnitude. A DB buys nothing here, and costs:
A schema migration every time I add a column
An admin UI nobody is going to use
An ORM layer between me and the truth
An "is the data current?" question that's harder for readers to answer
A CSV in a public repo gives me one source of truth, version control for free, diff-able commits when I change a number, and a data.csv file anyone can download and verify. Every commit is an audit log. When somebody emails me "why did you change Crypto.com's APY?" I just point to the commit.
The runtime stack
Next.js 15.1 with the App Router. The CSV gets read server-side via Node fs, parsed with PapaParse, and validated with Zod. Validation is the thing that earns its keep. Zod's schema gives me a TypeScript type for free (via z.infer), and any malformed row throws at boot, not in the user's browser.
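import { z } from "zod";

// One CSV row. z.number() expects real numbers, so this assumes PapaParse
// runs with dynamicTyping: true (otherwise use z.coerce.number()).
const CardSchema = z.object({
  service: z.string().min(1), // card name, must be non-empty
  fxMargin: z.number().min(0).max(10), // bounded to 0..10
  atmFee: z.number().min(0), // cannot be negative
  // ...
});

// Static type derived from the schema, so the two cannot drift apart.
export type Card = z.infer<typeof CardSchema>;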
Reads are cached with React.cache() so the parse runs once per request, not once per component. Pages are rendered with ISR (revalidate: 3600, so a page is regenerated at most once an hour) since cards change rarely. Lighthouse stays at 100 across the catalog.
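To make the fail-at-load idea concrete, here is a minimal sketch that does not depend on any library. It is a stand-in for the PapaParse + Zod step, not the site's actual code: parseCsv, toNumber and loadCards are hypothetical names, the CSV splitting is deliberately naive, and only the three fields from the schema above are checked.

```typescript
// Dependency-free stand-in for the PapaParse + Zod step (all names hypothetical).
type Card = { service: string; fxMargin: number; atmFee: number };

// Naive CSV split: no quoted fields or embedded commas. A real parser handles those.
function parseCsv(text: string): Record<string, string>[] {
  const lines = text.trim().split(/\r?\n/);
  const headers = (lines[0] ?? "").split(",").map((h) => h.trim());
  return lines.slice(1).map((line) => {
    const cells = line.split(",");
    const row: Record<string, string> = {};
    headers.forEach((h, i) => {
      row[h] = (cells[i] ?? "").trim();
    });
    return row;
  });
}

// Convert one cell to a number and enforce the same bounds as the schema.
function toNumber(raw: string, field: string, rowNo: number, min: number, max = Infinity): number {
  const n = Number(raw);
  if (raw === "" || !Number.isFinite(n) || n < min || n > max) {
    throw new Error(`row ${rowNo}: invalid ${field} "${raw}"`);
  }
  return n;
}

// Validate every row up front; the first malformed row aborts the load.
function loadCards(csvText: string): Card[] {
  return parseCsv(csvText).map((row, i) => {
    const rowNo = i + 2; // +1 for the header line, +1 for 1-based numbering
    const service = row["service"] ?? "";
    if (service.length === 0) throw new Error(`row ${rowNo}: missing service`);
    return {
      service,
      fxMargin: toNumber(row["fxMargin"] ?? "", "fxMargin", rowNo, 0, 10),
      atmFee: toNumber(row["atmFee"] ?? "", "atmFee", rowNo, 0),
    };
  });
}
```

The point of the design is that loadCards either returns a fully typed Card[] or throws with a row number, so a bad edit to the CSV surfaces on the server rather than as a broken table in the browser.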
The data shape
Each row has 35 columns. Some are obvious (issuance fee, FX margin, ATM limits). Some are messier and not in any other comparison site's table:
kyc: enum, none | basic | full
nonKyc: boolean override for products that publicly advertise it
blockchainTopup: array of supported chains for self-custody loading
ibanSwiftSepa: boolean for fiat rails
welcomeBonus: free-text where there is no numeric model
These are the fields that decide whether a card is actually orderable in your country, not the ones marketing puts on the landing page.
Filters as predicates, not SQL
Every category page (USA, no-KYC, self-custody, travel, and so on) is one Server Component. The filter logic lives in lib/filters.ts as plain functions of type (card: Card) => boolean. Composing them is a .filter() chain. No query language, no indexes to maintain, no migration when I add the 38th category. Adding a new region page is a 6-line PR.
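A short sketch of what predicate-style filters can look like. The field names come from the data shape above, but the Card type here is a trimmed-down assumption, and the predicate names, the allOf helper, the sample cards, and the reading of nonKyc as "counts as no-KYC even if kyc says otherwise" are all hypothetical.

```typescript
// Trimmed-down card type; the real rows have 35 columns.
type Card = {
  service: string;
  kyc: "none" | "basic" | "full";
  nonKyc: boolean;
  blockchainTopup: string[];
  ibanSwiftSepa: boolean;
};

type CardFilter = (card: Card) => boolean;

// Each category is one small predicate (hypothetical definitions).
const noKyc: CardFilter = (card) => card.kyc === "none" || card.nonKyc;
const selfCustody: CardFilter = (card) => card.blockchainTopup.length > 0;
const hasFiatRails: CardFilter = (card) => card.ibanSwiftSepa;

// Combine predicates with logical AND; equivalent to chaining .filter() calls.
const allOf = (...filters: CardFilter[]): CardFilter =>
  (card) => filters.every((f) => f(card));

// Made-up sample rows, purely for illustration.
const cards: Card[] = [
  { service: "CardA", kyc: "none", nonKyc: false, blockchainTopup: ["ethereum"], ibanSwiftSepa: false },
  { service: "CardB", kyc: "full", nonKyc: false, blockchainTopup: ["bitcoin"], ibanSwiftSepa: true },
  { service: "CardC", kyc: "basic", nonKyc: true, blockchainTopup: [], ibanSwiftSepa: true },
];

// A category page then boils down to one filter call.
const noKycSelfCustody = cards.filter(allOf(noKyc, selfCustody));
const noKycWithFiat = cards.filter(noKyc).filter(hasFiatRails);
```

With 139 rows, a linear scan per page render is negligible, which is why no index is needed.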
Why this matters for a young site
The CSV-first approach also doubles as an SEO signal that comparison sites usually do not have. Each card page has a date stamp tied to the most recent commit that touched its row. When a reader checks "is this current?", the answer is verifiable, not implied. The same goes for Google's freshness signals.
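One way a per-row date stamp could be derived, sketched under loud assumptions: the history is already available as a list of (commit date, file contents) pairs ordered oldest first, the service name is the first CSV column, and rows contain no quoted commas. Snapshot, rowFor and lastChanged are hypothetical names, and this is not necessarily how the site computes its stamps.

```typescript
// One historical version of the CSV (hypothetical shape), ordered oldest first.
type Snapshot = { committedAt: string; csv: string };

// Find the raw line for a service, assuming the name is the first column.
function rowFor(csv: string, service: string): string | undefined {
  return csv.split(/\r?\n/).find((line) => line.split(",")[0] === service);
}

// Date of the latest snapshot in which the row differs from the previous one.
// Returns undefined if the service never appears in the history.
function lastChanged(snapshots: Snapshot[], service: string): string | undefined {
  let previous: string | undefined;
  let stamp: string | undefined;
  for (const snap of snapshots) {
    const current = rowFor(snap.csv, service);
    if (current !== previous) stamp = snap.committedAt;
    previous = current;
  }
  return stamp;
}
```

Because the comparison is on the whole row, a commit that only edits other cards leaves this card's stamp untouched.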
Catalog: sweepbase.net/cards. Calculator: sweepbase.net/calculator. Dataset: /datasets/data.csv. If you find a wrong number, the report-error button on every card page goes straight to my inbox.
