smar.tr Beta
Smarter datasets. Custom training data in minutes.
Ship Custom Datasets
in Minutes, Not Weeks
The first truly self-service synthetic data platform.
Pixel-perfect labels. Ready to train. No annotation.
Founding members: 40% off lifetime
- Generate or augment
- Pixel-perfect boxes & masks
- Ready in minutes
- Multi-class labels
- YOLO, COCO & more
Best Results So Far

Full Test Metrics
Precision:92.9%
Recall:94.5%
mAP@50:97.5%
mAP@50-95:94.3%
F1 Score:94%
Conf thresh:0.588
pack_complete:97.7%
pack_missing:97.3%
Base Model: YOLOv8s · 32 epochs
98% mAP@50
94% mAP@50-95
94% F1 Score
Internal testing on pharmaceutical defect detection. Getting better with every update.
Built for: ML engineers • Researchers • Academics • Hobbyists • Vision startups • Anyone with a camera
* Beta launches with bounding boxes. Masks & polygons coming soon.
Performance varies by use case.
Frequently Asked Questions
What is synthetic data for computer vision?
Synthetic data is artificially generated training data that mimics real-world images. It comes pre-labeled with pixel-perfect bounding boxes, masks, and polygons, eliminating manual annotation.
Is synthetic data as good as real labeled data for training?
In many cases, yes. Our internal tests show hybrid approaches (synthetic + small real dataset) often outperform large manually-labeled datasets. Synthetic data provides perfect labels and unlimited variety, while real images add domain-specific nuance.
How does smar.tr compare to manual labeling?
smar.tr generates labeled datasets in minutes instead of weeks. Internal tests show 90%+ mAP@50 accuracy (up to 95% in some cases), and hybrid approaches (synthetic + a small manual dataset) often outperform large manually-labeled datasets.
What export formats does smar.tr support?
smar.tr supports industry-standard formats including COCO and YOLO. Bounding boxes are available at beta launch, with segmentation masks, polygons, binary masks, and additional formats being added during beta.
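The two formats encode boxes differently: COCO stores pixel-space [x_min, y_min, width, height] in a JSON file, while YOLO uses one plain-text line per object with center coordinates normalized to the image size. A minimal sketch of the conversion (the function name is illustrative, not a smar.tr API):

```python
def coco_bbox_to_yolo(bbox, img_w, img_h):
    """Convert a COCO bbox [x_min, y_min, width, height] in pixels
    to a YOLO bbox (x_center, y_center, width, height), normalized to 0-1."""
    x_min, y_min, w, h = bbox
    return (
        (x_min + w / 2) / img_w,  # x center, normalized
        (y_min + h / 2) / img_h,  # y center, normalized
        w / img_w,                # width, normalized
        h / img_h,                # height, normalized
    )

# A YOLO .txt label file has one line per object: "<class_id> <xc> <yc> <w> <h>"
xc, yc, w, h = coco_bbox_to_yolo([100, 200, 50, 80], img_w=640, img_h=480)
line = f"0 {xc:.6f} {yc:.6f} {w:.6f} {h:.6f}"
```

Exporting in the right format up front avoids writing this kind of glue code yourself.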
Can I augment my existing dataset with synthetic data?
Yes. You can upload your own images and smar.tr will generate variations with different settings, defects, backgrounds, lighting, angles, and augmentations. All pre-labeled and ready to train.
How much does synthetic training data cost?
Most personal-use datasets cost $5-10 depending on size and complexity. Founding members get 40% off lifetime pricing. For commercial use, pricing scales with volume and features; the best-case example on our site costs roughly $20-50. No enterprise contracts, sales calls, or demo requests required. Fully self-service!
How do I create a custom YOLO dataset without labeling?
With smar.tr, you decide what to detect: describe it or upload reference images, preview the results, and the platform generates synthetic images with pre-labeled YOLO annotations in minutes. No manual bounding-box drawing required; export directly in YOLO format.
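A YOLO-style export typically pairs each image with a per-image label file and a small YAML config that training tools read. A minimal sketch (paths are illustrative; the class names follow the example metrics above):

```yaml
# data.yaml — minimal YOLO dataset config (illustrative layout)
path: datasets/my_export
train: images/train
val: images/val
names:
  0: pack_complete
  1: pack_missing
```

Each image in images/train has a matching .txt file in labels/train containing its annotations.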
What use cases work best with synthetic data?
Synthetic data excels at object detection, defect detection, inventory counting, and any scenario where you need many labeled examples quickly. Industrial inspection, retail, medical imaging, and agricultural applications are common use cases.
Does smar.tr support multi-label datasets?
Yes. You can create datasets with multiple labels and classes. Define as many objects as you need, and the generator will create complex scenes with multiple annotated objects per image.
How many images can my dataset contain?
That is up to you and your credit balance. The platform imposes no hard limits, so you can generate as many images as you need.
Does smar.tr offer an API for synthetic data generation?
API access is currently in development and scheduled for release approximately three months after our public launch. This will enable developers to programmatically generate labeled datasets and integrate smar.tr directly into CI/CD and ML pipelines.
Does smar.tr support 3D synthetic data generation?
Not yet. We have successfully conducted internal proofs of concept for 3D generation with promising results, but the complexity requires additional infrastructure. We plan to introduce 3D capabilities as soon as possible following the stabilization of our core 2D features after the public launch.