A circle of translucent consumer electronics — phones, cameras, drones, headphones, automobiles, watches — each illuminated by the blue glow of an internal printed circuit board.

01 RESEARCH · DATASETS · 2026

The open dataset layer for AI-native electronics design.

CommonCircuits is building the first large-scale normalized corpus of PCB schematics, layouts, and manufacturing artifacts — a CommonCrawl for circuits.

Repos: 50K–150K
Paired: 20K–60K
Benchmark: 500–2K

§ 01

The premise

Modern AI can write code because the internet contains billions of examples of software. Electronics has no equivalent.

PCB designs exist across GitHub, GitLab, OSHWHub, EasyEDA, KiCad repos, Eagle archives, and manufacturer exports — but they are fragmented, duplicated, format-incompatible, and mostly unusable for ML.

CommonCircuits turns that mess into structured training data.

§ 02

What we are building

A normalized dataset of real-world PCB projects.

Input sources

  • KiCad projects
  • Eagle .sch / .brd files
  • EasyEDA / LCSC projects
  • Altium-derived exports
  • Open hardware repos
  • Fabrication artifacts

Normalized outputs

  • schematic graph
  • layout geometry
  • component–pin–net graph
  • footprint / pad / via geometry
  • routing topology
  • board constraints & manufacturability signals

fig.02 — machine-learning-ready representation

schematic   → components, pins, nets, values, hierarchy
layout      → board outline, layers, pads, vias, tracks, zones
constraints → stackup, clearances, net classes, keepouts
metadata    → domain, license, quality score, dedup family
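
The four groups in fig.02 could map onto a record type along these lines. This is a minimal sketch, not the project's schema: every class and field name (`DesignRecord`, `Net`, `dedup_family`, and so on) is hypothetical, the example values are made up, and only a few fields per group are shown.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Pin:
    component: str  # reference designator, e.g. "R1"
    number: str     # pin number or name, e.g. "2"


@dataclass
class Net:
    name: str
    pins: list[Pin] = field(default_factory=list)


@dataclass
class Schematic:
    components: dict[str, str]  # reference designator -> value
    nets: list[Net] = field(default_factory=list)


@dataclass
class Layout:
    layer_count: int
    outline_mm: list[tuple[float, float]] = field(default_factory=list)


@dataclass
class Metadata:
    license: str
    quality_score: float
    dedup_family: str


@dataclass
class DesignRecord:
    schematic: Schematic
    layout: Layout
    constraints: dict[str, float]  # e.g. {"min_clearance_mm": 0.2}
    metadata: Metadata


# A toy RC filter on a 20 mm x 10 mm two-layer board (illustrative values only).
record = DesignRecord(
    schematic=Schematic(
        components={"R1": "10k", "C1": "100n"},
        nets=[
            Net("VCC", [Pin("R1", "1")]),
            Net("OUT", [Pin("R1", "2"), Pin("C1", "1")]),
            Net("GND", [Pin("C1", "2")]),
        ],
    ),
    layout=Layout(layer_count=2, outline_mm=[(0, 0), (20, 0), (20, 10), (0, 10)]),
    constraints={"min_clearance_mm": 0.2},
    metadata=Metadata(license="CERN-OHL-P-2.0", quality_score=0.9, dedup_family="fam-0001"),
)
print(len(record.schematic.nets))  # 3
```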

§ 03

Why now

PCB design is one of the last major engineering workflows without a public foundation dataset.

AI EDA models need examples of how real engineers turn intent into manufacturable hardware:

input  → schematic graph + constraints
model  → placement + routing
output → fabrication package

Today, every team trying to build AI PCB tools has to start by scraping, cleaning, parsing, and deduplicating the same chaotic public data.

CommonCircuits makes the dataset layer reusable.

§ 04

Initial corpus targets

Mostly 2-layer boards — ideal for early layout models, design-rule reasoning, and schematic-to-board automation.

  • 50K–150K raw repositories with PCB artifacts
  • 20K–60K raw designs with paired schematic + layout
  • 5K–15K unique paired designs, clean after dedup
  • 50K+ artifacts in a noisy pretraining corpus
  • 500–2K high-quality benchmark designs

Harder domains — RF, high-speed digital, dense robotics boards, instrumentation, power electronics — are rarer in public data. That scarcity is part of the opportunity.

§ 05

Why it matters

CommonCircuits enables a new generation of AI for hardware.

a. AI placement & routing

Train models that understand full-board context, not just local routing heuristics.

b. EDA agents

Evaluate whether an agent can go from schematic to a valid, manufacturable board layout.

c. Design-rule learning

Decoupling, power paths, connector orientation, via usage, zone fills, trace topology: all learned from real designs.

d. Synthetic augmentation

Generate controlled variants of real designs for reinforcement learning and simulation.

e. Public benchmarks

Open evals for schematic parsing, netlist matching, placement, routing, DRC repair, and manufacturability.
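
To make one of these evals concrete, here is a sketch of how netlist matching might be scored. It is an assumption for illustration, not a published CommonCircuits metric: the function names and the choice of Jaccard similarity over nets are hypothetical. Each net is reduced to an unordered set of (reference designator, pin) pairs, so net names do not affect the score.

```python
from __future__ import annotations

Pin = tuple[str, str]  # (reference designator, pin)


def net_set(nets: dict[str, list[Pin]]) -> set[frozenset[Pin]]:
    """Drop net names; keep each net as an unordered set of pins."""
    return {frozenset(pins) for pins in nets.values()}


def netlist_match_score(
    schematic: dict[str, list[Pin]], layout: dict[str, list[Pin]]
) -> float:
    """Jaccard similarity between the two net sets (1.0 means identical)."""
    a, b = net_set(schematic), net_set(layout)
    if not a and not b:
        return 1.0  # two empty netlists trivially match
    return len(a & b) / len(a | b)


schematic = {
    "VCC": [("R1", "1"), ("U1", "8")],
    "GND": [("C1", "2"), ("U1", "4")],
    "OUT": [("R1", "2"), ("U1", "3")],
}
# The layout is missing the U1 pin 3 connection on the OUT net.
layout = {
    "VCC": [("U1", "8"), ("R1", "1")],
    "GND": [("U1", "4"), ("C1", "2")],
    "OUT": [("R1", "2")],
}

print(netlist_match_score(schematic, schematic))  # 1.0
print(netlist_match_score(schematic, layout))     # 0.5
```

Two of the three nets agree, and the union holds four distinct nets, so the second score is 2 / 4. A stricter eval could score at pin level instead, so that a nearly correct net earns partial credit.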

§ 06

Deduplication & quality

Open-source hardware data is heavily duplicated: forks, templates, tutorials, reference designs, and versioned commits all reproduce the same designs. A single board can appear hundreds of times.

// fingerprint

Electrical & geometric hashing

  • netlist hash
  • component multiset hash
  • footprint multiset hash
  • board outline hash
  • placement graph hash
  • routing topology hash
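
Two of the fingerprints above can be sketched as follows. This is one possible construction, not the project's implementation: the function names, the SHA-256 choice, and the input shapes are assumptions. The netlist hash ignores net names and ordering, so a fork that only renames nets gets the same fingerprint. It still depends on reference designators, which a production version would likely need to canonicalize as well.

```python
from __future__ import annotations

import hashlib
import json
from collections import Counter


def _digest(obj: object) -> str:
    """SHA-256 over a JSON serialization of an already-canonical object."""
    return hashlib.sha256(json.dumps(obj).encode("utf-8")).hexdigest()


def netlist_hash(nets: dict[str, list[tuple[str, str]]]) -> str:
    """Hash connectivity only: net names and ordering are ignored.

    `nets` maps a net name to its (reference designator, pin) pairs.
    """
    canonical = sorted(sorted(pins) for pins in nets.values())
    return _digest(canonical)


def component_multiset_hash(values: list[str]) -> str:
    """Hash the multiset of component values, ignoring order."""
    return _digest(sorted(Counter(values).items()))


original = {
    "VCC": [("R1", "1"), ("U1", "8")],
    "GND": [("C1", "2"), ("U1", "4")],
}
# A fork with renamed nets and shuffled pins, but identical connectivity.
fork = {
    "GROUND": [("U1", "4"), ("C1", "2")],
    "+5V": [("U1", "8"), ("R1", "1")],
}
# A real electrical change: C1 pin 2 moved from ground to the supply net.
modified = {
    "VCC": [("R1", "1"), ("U1", "8"), ("C1", "2")],
    "GND": [("U1", "4")],
}

print(netlist_hash(original) == netlist_hash(fork))      # True
print(netlist_hash(original) == netlist_hash(modified))  # False
```

Designs that share a fingerprint can then be grouped into one dedup family, with a single representative kept for the clean corpus.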

// quality flags

Every design is labeled

  • parses cleanly
  • schematic / layout netlist matches
  • libraries resolved
  • DRC status known
  • fabrication files present
  • license usable
  • manufacturing metadata present
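
These labels lend themselves to a flag set that downstream users can filter on. A minimal sketch, assuming hypothetical names (`QualityFlag`, `BENCHMARK_REQUIRED`) and an illustrative rule for what a benchmark-grade design must satisfy; the actual criteria are not specified here.

```python
from enum import Flag, auto


class QualityFlag(Flag):
    PARSES_CLEANLY = auto()
    NETLIST_MATCHES = auto()       # schematic / layout netlists agree
    LIBRARIES_RESOLVED = auto()
    DRC_STATUS_KNOWN = auto()
    FAB_FILES_PRESENT = auto()
    LICENSE_USABLE = auto()
    MFG_METADATA_PRESENT = auto()


# Illustrative assumption: a benchmark-grade design must set all of these.
BENCHMARK_REQUIRED = (
    QualityFlag.PARSES_CLEANLY
    | QualityFlag.NETLIST_MATCHES
    | QualityFlag.LIBRARIES_RESOLVED
    | QualityFlag.LICENSE_USABLE
)


def is_benchmark_candidate(flags: QualityFlag) -> bool:
    """True when every required flag is set; extra flags are allowed."""
    return (flags & BENCHMARK_REQUIRED) == BENCHMARK_REQUIRED


# Made-up designs keyed by a hypothetical identifier.
designs = {
    "blinky": BENCHMARK_REQUIRED | QualityFlag.FAB_FILES_PRESENT,
    "scraped-fragment": QualityFlag.PARSES_CLEANLY | QualityFlag.LICENSE_USABLE,
}

candidates = [name for name, flags in designs.items() if is_benchmark_candidate(flags)]
print(candidates)  # ['blinky']
```

Keeping the flags independent, rather than collapsing them into one score, lets each consumer choose its own bar: a pretraining run might accept anything that parses, while a benchmark demands the full set.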

§ 07 · positioning

CommonCircuits is the CommonCrawl for circuits.

A public and proprietary dataset layer for training, evaluating, and deploying AI systems that understand electronics design.

§ 08

We're looking for

Collaborators across the open hardware, EDA, and AI ecosystems.

  • data: open hardware repositories & PCB datasets
  • eng: KiCad / Eagle / EasyEDA parser contributors
  • research: EDA & ML researchers
  • industry: hardware companies willing to share anonymized design artifacts
  • capital: investors interested in AI-native engineering infrastructure

Contact the team

sava@dayworkx.com

/ one-line pitch

CommonCircuits turns the world's fragmented open PCB designs into the training corpus for AI-native electronics design.