How to Use Yomai: Step-by-Step AI Data Pipeline Tutorial

By Yomai Team | 13 Aug 2025 | 6 min read | Tutorial
[Image: Yomai interface showing a pipeline transforming different input formats into structured data]

Yomai is an AI-first data transformation engine that takes input in many formats (XML, JSON, PDF, Excel, or even free text) and converts it into clean, structured output that follows your schema.
In this tutorial, we’ll build a fully functional pipeline from input to output without diving too deep into technical internals. By the end, you’ll have a working setup that you can adapt for your own workflows.

Follow the step-by-step guide below to build your own pipeline.


Step 1: Create a New Pipeline in Yomai

  1. Log in to your Yomai account.
  2. From the dashboard, click "New Pipeline".
  3. Give your pipeline a descriptive name (e.g., Invoice Data Extractor).
# Example: Naming your pipeline
Pipeline Name: Invoice Data Extractor

Step 2: Define the Schema / Dictionary

Your schema tells Yomai what the output should look like.

  1. Go to the Schema tab in your pipeline.
  2. Add the output fields you need, such as:
{
  "invoice_number": "string",
  "total_amount": "number",
  "customer_name": "string",
  "issue_date": "date"
}
  3. Optionally, upload a JSON Schema file if you already have a format defined.
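
If you would rather prepare a JSON Schema file before uploading, the field list above can be translated into one with a few lines of Python. This is a minimal sketch, not Yomai's own converter: the helper name `to_json_schema`, the choice to mark every field as required, and the mapping of "date" to a string with `"format": "date"` are all assumptions made for illustration.

```python
import json

# Field map copied from the schema example in this step.
FIELDS = {
    "invoice_number": "string",
    "total_amount": "number",
    "customer_name": "string",
    "issue_date": "date",
}


def to_json_schema(fields):
    """Translate a simple name -> type map into a JSON Schema document.

    Assumptions (not confirmed by Yomai): "date" becomes a string with
    format "date", and every listed field is treated as required.
    """
    properties = {}
    for name, kind in fields.items():
        if kind == "date":
            properties[name] = {"type": "string", "format": "date"}
        else:
            properties[name] = {"type": kind}
    return {
        "$schema": "https://json-schema.org/draft/2020-12/schema",
        "type": "object",
        "properties": properties,
        "required": list(fields),
    }


if __name__ == "__main__":
    # Print the schema so it can be saved to a file and uploaded.
    print(json.dumps(to_json_schema(FIELDS), indent=2))
```

Save the printed output as a `.json` file and check it against whatever schema dialect your Yomai workspace expects before uploading.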

Step 3: Configure the Destination

Decide where your structured data will be sent after processing.

  • API Endpoint: Send the data via POST request.
  • CSV File: Save results as downloadable CSV.
  • Database: Directly insert into MySQL, PostgreSQL, etc.

Example API destination configuration:

{
  "type": "api",
  "endpoint": "https://example.com/api/data",
  "method": "POST",
  "auth_token": "YOUR_API_KEY"
}
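
To get a feel for what your endpoint might receive, the sketch below builds (but does not send) a request from the configuration above using Python's standard library. The `build_request` helper is hypothetical, and sending the token as a `Bearer` value in the `Authorization` header is an assumption; Yomai may deliver the token differently.

```python
import json
import urllib.request

# Destination settings copied from the example configuration above.
DESTINATION = {
    "type": "api",
    "endpoint": "https://example.com/api/data",
    "method": "POST",
    "auth_token": "YOUR_API_KEY",
}


def build_request(destination, record):
    """Build the HTTP request a destination like this might receive."""
    body = json.dumps(record).encode("utf-8")
    return urllib.request.Request(
        destination["endpoint"],
        data=body,
        method=destination["method"],
        headers={
            "Content-Type": "application/json",
            # Assumption: bearer-token auth; the real header may differ.
            "Authorization": f"Bearer {destination['auth_token']}",
        },
    )


if __name__ == "__main__":
    sample = {"invoice_number": "INV-001", "total_amount": 125.5}
    request = build_request(DESTINATION, sample)
    # urllib.request.urlopen(request) would send it; omitted in this sketch.
    print(request.get_method(), request.full_url)
```

Building the request without sending it lets you test your endpoint's parsing and authentication logic before any real data flows.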

Step 4: Add and Test Inputs

  1. Click "Add Input" in your pipeline editor.

  2. Upload or connect your source:

    • XML
    • JSON
    • CSV
    • PDF
    • Free-text
  3. Yomai will automatically recognize the format and parse the input.
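
For intuition about what format recognition involves, here is a deliberately crude sketch in Python. It is not how Yomai detects formats (the tutorial does not describe that); `guess_format` is a hypothetical helper built on simple heuristics, and the CSV check ignores quoted commas.

```python
import json
import xml.etree.ElementTree as ET


def guess_format(raw: bytes) -> str:
    """Guess the format of an input from its content (rough heuristics only)."""
    stripped = raw.strip()
    # PDF files begin with the "%PDF-" signature.
    if stripped.startswith(b"%PDF-"):
        return "pdf"
    text = stripped.decode("utf-8", errors="replace")
    # JSON documents start with an object or an array.
    if text[:1] in ("{", "["):
        try:
            json.loads(text)
            return "json"
        except ValueError:
            pass
    # XML must parse as a well-formed document.
    if text.startswith("<"):
        try:
            ET.fromstring(stripped)
            return "xml"
        except ET.ParseError:
            pass
    # CSV heuristic: several lines that all share the same comma count.
    lines = [line for line in text.splitlines() if line.strip()]
    if len(lines) > 1 and lines[0].count(",") > 0:
        if all(line.count(",") == lines[0].count(",") for line in lines):
            return "csv"
    return "free-text"
```

A heuristic like this is handy for sorting files before upload, for example to send PDFs and CSVs to different pipelines.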


Step 5: Preview and Validate the Output

Before deploying, make sure Yomai transforms your data as expected.

  • Use the Preview button to see sample output.
  • Check that each field matches your schema (e.g., no missing or misformatted values).
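
If you copy a preview record out of Yomai, the same checks can be scripted. The sketch below is one possible approach, assuming the preview can be exported as JSON and that dates arrive as ISO strings such as 2025-08-13; `find_problems` and the type names are illustrative, not part of Yomai.

```python
from datetime import date

# Field map copied from the schema defined in Step 2.
FIELDS = {
    "invoice_number": "string",
    "total_amount": "number",
    "customer_name": "string",
    "issue_date": "date",
}


def _is_number(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def _is_iso_date(value):
    # Assumption: dates are ISO 8601 strings (YYYY-MM-DD).
    if not isinstance(value, str):
        return False
    try:
        date.fromisoformat(value)
        return True
    except ValueError:
        return False


CHECKS = {
    "string": lambda value: isinstance(value, str),
    "number": _is_number,
    "date": _is_iso_date,
}


def find_problems(record, fields):
    """Return a list of missing or misformatted fields in one record."""
    problems = []
    for name, kind in fields.items():
        if name not in record or record[name] is None:
            problems.append(f"{name}: missing")
        elif not CHECKS[kind](record[name]):
            problems.append(f"{name}: expected {kind}, got {record[name]!r}")
    return problems
```

An empty list means the record matches the schema. Anything else tells you which field to revisit before deploying.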

Step 6: Deploy the Pipeline

  1. Click "Deploy" to make your pipeline live.
  2. Run the pipeline with your real input files or data streams.
  3. Watch as Yomai automatically transforms your inputs into clean, structured outputs.

Common Use Cases

Yomai pipelines can be adapted for many industries and workflows:

  • Invoices & Receipts — Extract totals, dates, and client names.
  • Contracts — Capture key clauses, parties, and dates.
  • Healthcare Forms — Standardize patient data for EMR systems.
  • Surveys — Convert messy CSV responses into normalized JSON.
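
To make the survey case concrete, here is a small hand-written sketch of the cleanup being described, using only Python's standard library. It illustrates the idea rather than Yomai's behavior: `normalize_csv` is a hypothetical helper, and the sample column names are invented.

```python
import csv
import io
import json


def normalize_csv(text):
    """Turn loosely formatted CSV text into a list of tidy dictionaries."""
    reader = csv.DictReader(io.StringIO(text), skipinitialspace=True)
    rows = []
    for row in reader:
        rows.append({
            # Normalize headers to snake_case and trim stray whitespace.
            key.strip().lower().replace(" ", "_"): (value or "").strip()
            for key, value in row.items()
            if key is not None  # skip overflow cells beyond the header
        })
    return rows


if __name__ == "__main__":
    messy = "Respondent Name, Rating \n  Ada , 5\nGrace,4 \n"
    print(json.dumps(normalize_csv(messy), indent=2))
```

Values stay as strings here. Converting them to numbers or dates is the kind of step a schema, like the one from Step 2, would drive.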

Next Steps

Once your pipeline is live, you can:

  • Integrate it into existing apps via API calls.
  • Schedule automated runs using cron jobs or workflow tools.
  • Chain multiple Yomai pipelines together for complex ETL workflows.
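
Chaining simply means the output of one stage becomes the input of the next. The sketch below shows that pattern with plain Python functions standing in for pipelines; the stage functions, the `run_chain` helper, and the sample values are all hypothetical, and in practice each stage would be a call to a deployed Yomai pipeline.

```python
from functools import reduce


def extract_invoice(text):
    # Stand-in for a first pipeline: free text -> structured record.
    number, total = text.split(";")
    return {"invoice_number": number.strip(), "total_amount": float(total)}


def add_currency(record):
    # Stand-in for a second pipeline that enriches the record.
    return {**record, "currency": "USD"}


def run_chain(stages, payload):
    """Feed the payload through each stage in order."""
    return reduce(lambda data, stage: stage(data), stages, payload)


if __name__ == "__main__":
    print(run_chain([extract_invoice, add_currency], "INV-001; 125.5"))
```

Keeping each stage small makes it easier to preview and validate on its own before wiring the chain together.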

Start building your own AI-powered pipeline today at Yomai.io.

Tags: data pipeline, AI data transformation, multi-format ingestion, Yomai.io, automation tutorial