
# Historical Data Migration

> Migrating historical performance data to address the cold start problem during platform transition

export const LastUpdated = ({date, lang = "en"}) => {
  const translations = {
    en: "Last updated:",
    es: "Última actualización:",
    pt: "Última atualização:",
    fr: "Dernière mise à jour:",
    de: "Zuletzt aktualisiert:"
  };
  const label = translations[lang] || translations.en;
  return <>
      <style>{`
        .last-updated-component {
          display: inline-flex;
          align-items: center;
          gap: 8px;
          padding: 10px 16px;
          border-radius: 8px;
          margin-top: 12px;
          margin-bottom: 16px;
          font-size: 14px;
          background-color: rgba(0, 0, 0, 0.05);
          border: 1px solid rgba(0, 0, 0, 0.12);
          color: rgba(0, 0, 0, 0.75);
          line-height: 1;
        }

        .last-updated-component svg {
          flex-shrink: 0;
          vertical-align: middle;
        }

        .last-updated-component span {
          display: inline-flex !important;
          align-items: center !important;
          line-height: 1 !important;
        }

        [data-theme="dark"] .last-updated-component {
          background-color: #3a3a3a;
          border: 2px solid #888888;
          color: #ffffff;
        }

        [data-theme="dark"] .last-updated-component svg {
          stroke: #ffffff;
        }
      `}</style>
      <div className="last-updated-component">
        <svg width="16" height="16" viewBox="0 0 24 24" fill="none" stroke="currentColor" strokeWidth="2" strokeLinecap="round" strokeLinejoin="round">
          <circle cx="12" cy="12" r="10" />
          <polyline points="12 6 12 12 16 14" />
        </svg>
        <span>
          <strong style={{fontWeight: 600}}>{label}</strong>
          <time dateTime={date}>{date}</time>
        </span>
      </div>
    </>;
};

<div style={{textAlign: 'justify', marginBottom: '1.5rem'}}>
  Historical data migration transfers performance metrics and event data from a client's previous ad platform into Topsort. The imported data gives Topsort's machine learning models a head start on training and shortens the initial learning period during the platform transition.
</div>

## Problem

<div style={{textAlign: 'justify', marginBottom: '1.5rem'}}>
  When clients migrate to Topsort, their campaigns face a **cold start problem** where:
</div>

<div style={{textAlign: 'justify', marginBottom: '1.5rem'}}>
  * **No Performance History**: New campaigns start without any historical performance data
  * **Learning Period**: Machine learning models require 1-4 weeks to accumulate sufficient data for optimization
  * **Suboptimal Performance**: During cold start, campaigns may underperform due to lack of training data
  * **Advertiser Frustration**: Advertisers may experience reduced campaign effectiveness in the initial weeks
</div>

<Note>
  While [Campaign Migration](/en/knowledge-base/ad-platform/campaign-migration/)
  handles campaign structure and settings, historical data migration
  specifically addresses performance data to accelerate model training and
  optimization.
</Note>

## Solution

<div style={{textAlign: 'justify', marginBottom: '1.5rem'}}>
  We provide a **historical data ingestion solution** that imports performance metrics and event data from the client's previous platform. This data serves as initial training material for Topsort's machine learning models, significantly reducing the cold start period.
</div>

### How Historical Data Helps

<div style={{textAlign: 'justify', marginBottom: '1.5rem'}}>
  **Model Training Acceleration:**
</div>

<div style={{textAlign: 'justify', marginBottom: '1.5rem'}}>
  * Provides immediate training data for machine learning algorithms
  * Reduces the cold start period from up to 4 weeks to 1-2 weeks
  * Enables faster campaign optimization and bidding decisions
</div>

<div style={{textAlign: 'justify', marginBottom: '1.5rem'}}>
  **Performance Continuity:**
</div>

<div style={{textAlign: 'justify', marginBottom: '1.5rem'}}>
  * Campaigns can leverage historical performance patterns
  * Better initial bid recommendations based on past data
  * Improved targeting decisions from historical user behavior
</div>

<div style={{textAlign: 'justify', marginBottom: '1.5rem'}}>
  **Risk Reduction:**
</div>

<div style={{textAlign: 'justify', marginBottom: '1.5rem'}}>
  * Minimizes performance dip during platform transition
  * Maintains advertiser confidence with familiar performance levels
  * Provides baseline metrics for comparison and optimization
</div>

### Technical Implementation

<div style={{textAlign: 'justify', marginBottom: '1.5rem'}}>
  Our historical data integration:
</div>

<div style={{textAlign: 'justify', marginBottom: '1.5rem'}}>
  * **Ingests event data** including organic impressions, clicks, and purchases
  * **Processes performance metrics** at campaign, product, and user levels
  * **Trains initial models** using imported historical data before go-live
  * **Calibrates algorithms** during initial operation for optimal performance
  * **Updates embeddings** for users, products, and placements based on historical patterns
</div>

## Migration Process

<Steps>
  <Step title="Data Assessment and Scope Definition">
    <div style={{textAlign: 'justify', marginBottom: '1.5rem'}}>
      **Evaluate Historical Data Availability**
    </div>

    * Assess what performance data is available from the previous platform
    * Determine data quality and completeness
    * Define time range for historical data (typically 3-6 months)
    * Identify key metrics that align with Topsort's tracking

  </Step>

  <Step title="Data Export and Preparation">
    <div style={{textAlign: 'justify', marginBottom: '1.5rem'}}>
      **Required Historical Data Types:**
    </div>

    * Campaign performance metrics (impressions, clicks, conversions, spend)
    * Product-level performance data (click-through rates, conversion rates)
    * User behavior events (searches, views, purchases)
    * Organic traffic patterns and seasonal trends
    * Bidding and budget utilization history

    <Note>
      All historical data must comply with privacy regulations. User-level data
      should be anonymized or aggregated where required by local privacy laws.
    </Note>
  </Step>


  <Step title="Data Validation and Processing">
    <div style={{textAlign: 'justify', marginBottom: '1.5rem'}}>
      **Quality Assurance Steps:**
    </div>

    * Validate data completeness and accuracy
    * Normalize metrics to match Topsort's data schema
    * Clean and process data for model training
    * Identify and handle data anomalies or outliers

  </Step>

  <Step title="Model Training and Calibration">
    <div style={{textAlign: 'justify', marginBottom: '1.5rem'}}>
      **Initial Training Process:**
    </div>

    * Import historical data into Topsort's training pipeline
    * Train initial machine learning models using historical patterns
    * Calibrate algorithms for optimal performance
    * Validate model accuracy against known historical outcomes

  </Step>

  <Step title="Production Deployment and Monitoring">
    <div style={{textAlign: 'justify', marginBottom: '1.5rem'}}>
      **Go-Live Process:**
    </div>

    * Deploy trained models to the production environment
    * Monitor initial performance against historical baselines
    * Fine-tune algorithms based on new real-time data
    * Gradually shift from historical to real-time data optimization
  </Step>
</Steps>
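
<div style={{textAlign: 'justify', marginBottom: '1.5rem'}}>
  The quality assurance checks in the validation step can be sketched in a few lines of Python. This is a minimal illustration, not Topsort's ingestion code: the sample rows, the function names, and the two rules shown (clicks must not exceed impressions, and each campaign should have one row per day between its first and last date) are assumptions chosen for the example.
</div>

```python
import csv
import io
from datetime import date, timedelta

# Hypothetical export using the campaign performance columns from this page.
SAMPLE = """campaign_id,date,impressions,clicks,conversions,spend
campaign-123,2024-01-15,1000,50,5,25.00
campaign-123,2024-01-17,900,45,4,22.50
campaign-124,2024-01-15,800,900,3,20.00
"""


def load_rows(text):
    """Parse CSV text into typed dictionaries."""
    rows = []
    for raw in csv.DictReader(io.StringIO(text)):
        rows.append({
            "campaign_id": raw["campaign_id"],
            "date": date.fromisoformat(raw["date"]),
            "impressions": int(raw["impressions"]),
            "clicks": int(raw["clicks"]),
            "conversions": int(raw["conversions"]),
            "spend": float(raw["spend"]),
        })
    return rows


def find_anomalies(rows):
    """Return (campaign_id, date, reason) for rows that break basic sanity rules."""
    issues = []
    for r in rows:
        if min(r["impressions"], r["clicks"], r["conversions"], r["spend"]) < 0:
            issues.append((r["campaign_id"], r["date"], "negative value"))
        if r["clicks"] > r["impressions"]:
            issues.append((r["campaign_id"], r["date"], "clicks exceed impressions"))
    return issues


def find_missing_dates(rows):
    """Return, per campaign, the dates missing between its first and last row."""
    by_campaign = {}
    for r in rows:
        by_campaign.setdefault(r["campaign_id"], set()).add(r["date"])
    gaps = {}
    for campaign_id, days in by_campaign.items():
        start, end = min(days), max(days)
        expected = {start + timedelta(days=i) for i in range((end - start).days + 1)}
        missing = sorted(expected - days)
        if missing:
            gaps[campaign_id] = missing
    return gaps


rows = load_rows(SAMPLE)
anomalies = find_anomalies(rows)
gaps = find_missing_dates(rows)
print(anomalies)
print(gaps)
```

<div style={{textAlign: 'justify', marginBottom: '1.5rem'}}>
  Flagged rows are reported rather than dropped, so the decision to correct, exclude, or keep them stays with the team that owns the source data.
</div>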

## Data Requirements

### Required Performance Metrics

| Metric Category          | Required Fields                                                             | Example Format                                              |
| ------------------------ | --------------------------------------------------------------------------- | ----------------------------------------------------------- |
| **Campaign Performance** | campaign\_id, date, impressions, clicks, conversions, spend                 | `campaign-123, 2024-01-15, 1000, 50, 5, 25.00`              |
| **Product Performance**  | product\_id, campaign\_id, date, impressions, clicks, ctr, conversion\_rate | `prod-456, campaign-123, 2024-01-15, 100, 10, 0.10, 0.20`   |
| **User Events**          | user\_id (anonymized), event\_type, product\_id, timestamp, value           | `user-789, purchase, prod-456, 2024-01-15T10:30:00Z, 49.99` |
| **Organic Traffic**      | product\_id, date, organic\_impressions, organic\_clicks, search\_terms     | `prod-456, 2024-01-15, 500, 25, "summer shoes"`             |

### CSV Format Examples

<div style={{textAlign: 'justify', marginBottom: '1.5rem'}}>
  **Campaign Performance Data:**
</div>

```csv theme={null}
campaign_id,date,impressions,clicks,conversions,spend,ctr,conversion_rate
campaign-123,2024-01-15,1000,50,5,25.00,0.05,0.10
campaign-124,2024-01-15,800,40,3,20.00,0.05,0.075
```
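
<div style={{textAlign: 'justify', marginBottom: '1.5rem'}}>
  In the sample rows above, `ctr` equals clicks divided by impressions and `conversion_rate` equals conversions divided by clicks. A short consistency check can catch exports where the reported rates drift from the raw counts. The sketch below assumes those per-click definitions apply to your export; the function name and tolerance are illustrative.
</div>

```python
import csv
import io
import math

CAMPAIGN_CSV = """campaign_id,date,impressions,clicks,conversions,spend,ctr,conversion_rate
campaign-123,2024-01-15,1000,50,5,25.00,0.05,0.10
campaign-124,2024-01-15,800,40,3,20.00,0.05,0.075
"""


def check_rates(text, tolerance=1e-6):
    """Return the campaign_ids whose reported rates disagree with the raw counts."""
    mismatched = []
    for row in csv.DictReader(io.StringIO(text)):
        impressions = int(row["impressions"])
        clicks = int(row["clicks"])
        conversions = int(row["conversions"])
        # Guard against division by zero on days with no traffic.
        expected_ctr = clicks / impressions if impressions else 0.0
        expected_cvr = conversions / clicks if clicks else 0.0
        ctr_ok = math.isclose(float(row["ctr"]), expected_ctr, abs_tol=tolerance)
        cvr_ok = math.isclose(float(row["conversion_rate"]), expected_cvr, abs_tol=tolerance)
        if not (ctr_ok and cvr_ok):
            mismatched.append(row["campaign_id"])
    return mismatched


print(check_rates(CAMPAIGN_CSV))  # prints []
```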

<div style={{textAlign: 'justify', marginBottom: '1.5rem'}}>
  **Product Performance Data:**
</div>

```csv theme={null}
product_id,campaign_id,date,impressions,clicks,conversions,revenue
prod-456,campaign-123,2024-01-15,100,10,2,49.98
prod-457,campaign-123,2024-01-15,150,8,1,24.99
```
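
<div style={{textAlign: 'justify', marginBottom: '1.5rem'}}>
  The product sample above carries raw `conversions` and `revenue`, while the requirements table lists `ctr` and `conversion_rate`. If your export only has raw counts, the rates can be derived before upload. This is a sketch under the assumption that `conversion_rate` means conversions per click, matching the campaign sample; confirm the definition with your Topsort contact before relying on it.
</div>

```python
import csv
import io

PRODUCT_CSV = """product_id,campaign_id,date,impressions,clicks,conversions,revenue
prod-456,campaign-123,2024-01-15,100,10,2,49.98
prod-457,campaign-123,2024-01-15,150,8,1,24.99
"""


def derive_rates(text):
    """Add ctr and per-click conversion_rate to each product row."""
    enriched = []
    for row in csv.DictReader(io.StringIO(text)):
        impressions = int(row["impressions"])
        clicks = int(row["clicks"])
        conversions = int(row["conversions"])
        enriched.append({
            "product_id": row["product_id"],
            "campaign_id": row["campaign_id"],
            "date": row["date"],
            "impressions": impressions,
            "clicks": clicks,
            # Rates default to 0.0 when the denominator is zero.
            "ctr": round(clicks / impressions, 4) if impressions else 0.0,
            "conversion_rate": round(conversions / clicks, 4) if clicks else 0.0,
        })
    return enriched


for product in derive_rates(PRODUCT_CSV):
    print(product["product_id"], product["ctr"], product["conversion_rate"])
```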

<div style={{textAlign: 'justify', marginBottom: '1.5rem'}}>
  **User Event Data:**
</div>

```csv theme={null}
user_id,event_type,product_id,timestamp,value,campaign_id
user-789,view,prod-456,2024-01-15T10:00:00Z,,
user-789,click,prod-456,2024-01-15T10:05:00Z,,campaign-123
user-789,purchase,prod-456,2024-01-15T10:30:00Z,49.99,campaign-123
```
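
<div style={{textAlign: 'justify', marginBottom: '1.5rem'}}>
  Because user-level data should be anonymized where privacy law requires it, raw user IDs are usually replaced before export. One common approach, shown here as an assumption rather than a Topsort requirement, is a keyed hash (HMAC-SHA256): the same user always maps to the same token, so event sequences stay intact, but the token cannot be reversed without the secret. Note that keyed hashing is pseudonymization; whether it satisfies your local regulations is a legal question this sketch does not answer.
</div>

```python
import csv
import hashlib
import hmac
import io

EVENTS_CSV = """user_id,event_type,product_id,timestamp,value,campaign_id
user-789,view,prod-456,2024-01-15T10:00:00Z,,
user-789,click,prod-456,2024-01-15T10:05:00Z,,campaign-123
user-789,purchase,prod-456,2024-01-15T10:30:00Z,49.99,campaign-123
"""


def pseudonymize(user_id, secret):
    """Map a raw user ID to a stable keyed-hash token."""
    digest = hmac.new(secret, user_id.encode("utf-8"), hashlib.sha256).hexdigest()
    return "anon-" + digest[:16]


def anonymize_events(text, secret):
    """Rewrite the user_id column, leaving every other field unchanged."""
    out = io.StringIO()
    reader = csv.DictReader(io.StringIO(text))
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        row["user_id"] = pseudonymize(row["user_id"], secret)
        writer.writerow(row)
    return out.getvalue()


# The secret is a placeholder; keep a real one out of source control.
print(anonymize_events(EVENTS_CSV, b"demo-secret"))
```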

## Model Training Process

### Onboarding Training

<div style={{textAlign: 'justify', marginBottom: '1.5rem'}}>
  **Initial Data Processing:**
</div>

<div style={{textAlign: 'justify', marginBottom: '1.5rem'}}>
  * Historical event data is integrated into training pipelines
  * Models are trained using 3-6 months of historical performance data
  * Initial embeddings are created for users, products, and campaigns
  * Baseline performance predictions are established
</div>

### Ongoing Optimization

<div style={{textAlign: 'justify', marginBottom: '1.5rem'}}>
  **Continuous Learning:**
</div>

<div style={{textAlign: 'justify', marginBottom: '1.5rem'}}>
  * **Daily Updates**: ID lookup embeddings updated with new data
  * **Weekly Retraining**: Full model retraining incorporating both historical and new data
  * **Real-time Adaptation**: User behavior embeddings updated continuously
  * **Performance Monitoring**: Historical vs. current performance comparison
</div>

<Tip>
  The combination of historical data and real-time learning typically achieves
  optimal performance within 2-3 weeks, compared to 4-6 weeks with cold start
  alone.
</Tip>
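
<div style={{textAlign: 'justify', marginBottom: '1.5rem'}}>
  This page does not specify how the shift from historical to real-time data is weighted. Purely as a mental model, the sketch below blends a historical estimate with a live one using an exponentially decaying weight. The half-life, the function names, and the use of CTR as the blended quantity are all hypothetical and do not describe Topsort's actual models.
</div>

```python
def historical_weight(days_since_go_live, half_life_days=14.0):
    """Weight given to imported history; it halves every `half_life_days`."""
    if days_since_go_live < 0:
        raise ValueError("days_since_go_live must be non-negative")
    return 0.5 ** (days_since_go_live / half_life_days)


def blended_ctr(historical_ctr, live_ctr, days_since_go_live, half_life_days=14.0):
    """Weighted average of the historical and live CTR estimates."""
    w = historical_weight(days_since_go_live, half_life_days)
    return w * historical_ctr + (1.0 - w) * live_ctr


# At go-live the estimate is entirely historical; it then leans toward live data.
for day in (0, 7, 14, 28):
    print(day, round(blended_ctr(0.05, 0.03, day), 4))
```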

## Success Metrics

<div style={{textAlign: 'justify', marginBottom: '1.5rem'}}>
  Historical data migration success is measured by:
</div>

<div style={{textAlign: 'justify', marginBottom: '1.5rem'}}>
  * **Reduced Cold Start Period**: Learning time decreased from up to 4 weeks to 1-2 weeks
  * **Performance Continuity**: Campaign performance within 10-15% of historical levels from day one
  * **Model Accuracy**: Prediction accuracy improved by 20-30% compared to cold start scenarios
  * **Advertiser Satisfaction**: Maintained or improved advertiser confidence during transition
</div>
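
<div style={{textAlign: 'justify', marginBottom: '1.5rem'}}>
  The performance continuity target can be checked with simple arithmetic: compute the relative change of a post-migration metric against its historical baseline and compare it with the band. The sketch below assumes the band applies symmetrically (above or below the baseline) and uses 15% as the default; both are illustrative choices.
</div>

```python
def relative_deviation(historical, current):
    """Signed fractional change of `current` relative to the historical baseline."""
    if historical == 0:
        raise ValueError("historical baseline must be non-zero")
    return (current - historical) / historical


def within_band(historical, current, band=0.15):
    """True when `current` is within +/- `band` of the historical baseline."""
    return abs(relative_deviation(historical, current)) <= band


# A CTR of 4.5% against a 5% baseline is a 10% drop, inside a 15% band.
print(within_band(0.05, 0.045))
# A CTR of 4% against the same baseline is a 20% drop, outside the band.
print(within_band(0.05, 0.04))
```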

## Integration with Campaign Migration

### Complementary Processes

<div style={{textAlign: 'justify', marginBottom: '1.5rem'}}>
  Historical data migration works alongside [Campaign Migration](/en/knowledge-base/ad-platform/campaign-migration/):
</div>

1. **Campaign Structure**: Basic campaign migration handles settings, budgets, and targeting
2. **Performance Data**: Historical data migration provides the performance foundation
3. **Combined Benefit**: Together, they ensure both functional campaigns and optimized performance from day one

### Recommended Sequence

<div style={{textAlign: 'justify', marginBottom: '1.5rem'}}>
  1. Complete [Campaign Migration](/en/knowledge-base/ad-platform/campaign-migration/) first to establish campaign structure
  2. Run historical data migration in parallel during the testing phase
  3. Deploy both campaign structure and trained models simultaneously
  4. Monitor performance against historical baselines
</div>

<Warning>
  Historical data migration requires additional technical coordination and may
  extend the overall migration timeline by 1-2 weeks for model training and
  validation.
</Warning>

## Next Steps

<div style={{textAlign: 'justify', marginBottom: '1.5rem'}}>
  For clients interested in historical data migration:
</div>

<div style={{textAlign: 'justify', marginBottom: '1.5rem'}}>
  1. **Assess data availability** from your current platform
  2. **Align technical teams** on historical data requirements
  3. **Plan data extraction** alongside the campaign migration timeline
  4. **Coordinate with the machine learning team** on model training requirements
</div>

***

<LastUpdated date="2025-11-18" />
