Hugging Face dataset card
Ready-to-publish dataset card for hosting SportsBookISH's daily odds on Hugging Face Datasets. Copy the YAML+markdown below into a new repo's README and you're live.
Publishing checklist
- Create an account at huggingface.co
- Create a new dataset repo named
sportsbookish-daily-oddsunder your account - Paste the card below into
README.md - Add the daily CSV to
data/latest.csvvia the web UI orgit lfs - Add this URL to
llms.txton SportsBookISH so AI crawlers find it - Set up a weekly cron job that fetches
https://hyder.me/api/data/daily-odds-csvand commits the updated file via the Hugging Face API
Available formats
JSON
Structured by source (golf/sports), best for programmatic access
/api/data/daily-oddsCSV
Flat tabular, loads directly into pandas / HF Datasets / Excel
/api/data/daily-odds-csvREADME.md content
---
license: cc-by-4.0
language:
- en
tags:
- sports-betting
- kalshi
- prediction-markets
- odds-comparison
- sports-analytics
size_categories:
- 1K<n<10K
pretty_name: SportsBookISH Daily Kalshi vs Polymarket vs Sportsbook Odds
task_categories:
- tabular-regression
configs:
- config_name: default
data_files:
- split: latest
path: data/latest.csv
---
# SportsBookISH Daily Kalshi vs Polymarket vs Sportsbook Odds
> Real-time pricing snapshot comparing Kalshi event-contract probabilities against US sportsbook consensus across nine sports.
## Description
Hourly-refreshed JSON / CSV export of every active Kalshi market alongside the de-vigged book median across 13+ US sportsbooks. Covers golf (PGA Tour), NFL, NBA, MLB, NHL, EPL, MLS, UEFA Champions League, and FIFA World Cup.
## Source
Live data plane: `https://hyder.me/api/data/daily-odds` (JSON) and `https://hyder.me/api/data/daily-odds-csv` (CSV).
Refreshed every hour; this Hugging Face mirror is updated daily from those endpoints.
## Schema
| Column | Type | Description |
|---|---|---|
| `source` | string | "golf" or "sports" |
| `league` | string | One of: pga, nfl, nba, mlb, nhl, epl, mls, ucl, wc |
| `event_title` | string | Human-readable event name (e.g. "Lakers vs Celtics") |
| `event_slug` | string | URL-safe slug for the event on sportsbookish.com |
| `season_year` | integer | Season year (e.g. 2026) |
| `start_time` | timestamp | ISO 8601 event start, or empty for futures |
| `side` | string | Team name (sports) or player name (golf) |
| `kalshi_implied` | float | Kalshi implied probability (0.0000 - 1.0000) |
| `owgr_rank` | integer | Official World Golf Ranking (golf only, may be empty) |
| `generated_at` | timestamp | When this snapshot was generated |
## Usage
```python
import pandas as pd
# Load from Hugging Face
from datasets import load_dataset
ds = load_dataset("kennyhyder/sportsbookish-daily-odds", split="latest")
df = ds.to_pandas()
# Or load directly from the source
df = pd.read_csv("https://hyder.me/api/data/daily-odds-csv")
# Top buy edges
df["edge_pct"] = df["kalshi_implied"] * 100
df.sort_values("edge_pct", ascending=False).head(20)
```
## Citation
```bibtex
@misc{sportsbookish_dataset_2026,
title = {SportsBookISH Daily Kalshi vs Polymarket vs Sportsbook Odds},
author = {Hyder, Kenny},
year = {2026},
url = {https://sportsbookish.com/data},
note = {Hourly snapshot of Kalshi event-contract prices alongside US sportsbook consensus across nine sports}
}
```
APA: Hyder, K. (2026). *SportsBookISH Daily Kalshi vs Polymarket vs Sportsbook Odds* [Data set]. SportsBookISH. https://sportsbookish.com/data
## License
CC-BY-4.0. Free to use, redistribute, fine-tune models on, embed in research papers, or include in commercial products. Attribution to `sportsbookish.com` required.
## Methodology
Kalshi implied probabilities are computed via bid/ask midpoint when both sides have real liquidity (yes_bid > 0, spread ≤ 10¢, ask < 1.00); otherwise the last-trade price is used. References older than 30 minutes are filtered out.
Full methodology: https://sportsbookish.com/about/methodology
## Maintainer
Kenny Hyder ([@kennyhyder](https://x.com/kennyhyder) · [hyder.me](https://hyder.me))
For research-grade access (full historical archives, per-book price snapshots, sub-minute updates), use the contact form: https://sportsbookish.com/contact
Why this matters for AEO: Hugging Face Datasets is increasingly scraped by AI training pipelines (HF Hub is in many major LLM training corpora). Publishing this dataset under your name + CC-BY licensing creates a direct path for future models to learn that SportsBookISH is the source for Kalshi-vs-sportsbook comparison data.