# Complete Optimization Problem and Solution: school_bus

## 1. Problem Context and Goals

### Context  
The business problem is to assign drivers to schools so that the total travel distance is minimized while every school is covered. Each driver resides in a specific home city, and each school has its own location. The travel distance between a driver's home city and a school's location is described as Euclidean, that is, the straight-line distance between two points based on their coordinates. The schema in Section 3 stores only city and location names, not coordinates, so the model takes the precomputed values in the `distance_matrix` table as given.
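
For reference, the straight-line distance described above can be written as follows. The coordinate symbols $(x_i, y_i)$ for the home city of driver $i$ and $(u_j, v_j)$ for the location of school $j$ are assumptions introduced for illustration; the data in this document does not include coordinate columns.

```latex
d_{ij} = \sqrt{(x_i - u_j)^2 + (y_i - v_j)^2}
```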

The decision to assign a driver to a school is represented as a binary choice: a driver is either assigned to a school (1) or not (0). The business configuration includes two key parameters:  
1. **Number of drivers per school**: Each school must be assigned exactly one driver to ensure efficient operations.  
2. **Maximum number of schools per driver**: Each driver can be assigned to at most one school to avoid overburdening drivers.  

The goal is to choose assignments that minimize the total travel distance over all assigned driver-school pairs while satisfying both constraints. Because the objective and the constraints are linear in binary decision variables, this is a 0-1 integer linear program, specifically the classic assignment problem; no products or ratios of variables are needed.

### Goals  
The primary optimization goal is to minimize the total travel distance incurred by assigning drivers to schools. This total is the sum of the `distance` values in the distance matrix over all driver-school pairs that are assigned. Success is measured by this total: the optimal solution is the set of assignments that satisfies the constraints with the lowest total travel distance.
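
Written as a formula, the objective can be sketched as below. The symbols are chosen here for illustration and do not appear in the source data: $D$ is the set of drivers, $S$ the set of schools, $d_{ij}$ the distance between driver $i$ and school $j$, and $x_{ij} \in \{0, 1\}$ the assignment decision.

```latex
\min \; z = \sum_{i \in D} \sum_{j \in S} d_{ij} \, x_{ij}
```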

## 2. Constraints  

The problem is subject to the following constraints:  
1. **Each school must be assigned exactly one driver**: This ensures that every school has a dedicated driver for its operations.  
2. **Each driver can be assigned to at most one school**: This prevents drivers from being overburdened by multiple assignments.  

These constraints are expressed in terms of the binary assignment decisions. Both are linear: each is a sum of binary variables compared against the constant 1, as an equality for schools and as an upper bound for drivers.
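
Using illustrative notation (an assumption, not part of the source data: $x_{ij} = 1$ if driver $i$ is assigned to school $j$, $D$ the set of drivers, $S$ the set of schools), the two constraints can be sketched as:

```latex
\begin{aligned}
\sum_{i \in D} x_{ij} &= 1 && \forall j \in S \quad \text{(exactly one driver per school)} \\
\sum_{j \in S} x_{ij} &\le 1 && \forall i \in D \quad \text{(at most one school per driver)} \\
x_{ij} &\in \{0, 1\} && \forall i \in D,\ j \in S
\end{aligned}
```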

## 3. Available Data  

### Database Schema  
```sql
-- Schema for the school_bus assignment model.
-- Tables: drivers, schools, and distance_matrix (one row per driver-school pair,
-- holding the travel distance and the binary assignment decision).

CREATE TABLE drivers (
  driver_id INTEGER,
  home_city STRING
);

CREATE TABLE schools (
  school_id INTEGER,
  location STRING
);

CREATE TABLE distance_matrix (
  driver_id INTEGER,
  school_id INTEGER,
  distance FLOAT,
  assign BOOLEAN
);
```
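
To experiment with this schema, one option is to load it into an in-memory SQLite database from Python. This is a minimal sketch: the choice of SQLite is an assumption (the document does not name a database engine), and the sample rows are copied from the retrieved values listed in this document.

```python
import sqlite3

# In-memory database; the DDL mirrors the schema shown above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE drivers (driver_id INTEGER, home_city STRING);
CREATE TABLE schools (school_id INTEGER, location STRING);
CREATE TABLE distance_matrix (
  driver_id INTEGER, school_id INTEGER, distance FLOAT, assign BOOLEAN
);
""")

# Sample rows taken from the query results in this document.
conn.executemany("INSERT INTO drivers VALUES (?, ?)",
                 [(1, "CityA"), (2, "CityB"), (3, "CityC")])
conn.executemany("INSERT INTO schools VALUES (?, ?)",
                 [(1, "LocationX"), (2, "LocationY"), (3, "LocationZ")])
# Assumption: `assign` is left NULL because nothing is assigned before solving.
conn.executemany(
    "INSERT INTO distance_matrix (driver_id, school_id, distance) VALUES (?, ?, ?)",
    [(1, 1, 10.5), (1, 2, 15.3), (1, 3, 20.1),
     (2, 1, 18.2), (2, 2, 8.7), (2, 3, 22.4),
     (3, 1, 19.8), (3, 2, 21.3), (3, 3, 7.5)])

row_count = conn.execute("SELECT COUNT(*) FROM distance_matrix").fetchone()[0]
min_distance, max_distance = conn.execute(
    "SELECT MIN(distance), MAX(distance) FROM distance_matrix").fetchone()
print(row_count, min_distance, max_distance)
```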

### Data Dictionary  
- **Drivers Table**: Contains information about drivers, including their unique identifiers and home cities.  
  - `driver_id`: Unique identifier for each driver, used to identify drivers in the optimization model.  
  - `home_city`: City where the driver resides, used to calculate the distance to schools.  

- **Schools Table**: Contains information about schools, including their unique identifiers and locations.  
  - `school_id`: Unique identifier for each school, used to identify schools in the optimization model.  
  - `location`: Location of the school, used to calculate the distance from drivers' home cities.  

- **Distance Matrix Table**: Contains the travel distances between drivers' home cities and schools' locations, as well as the binary assignment decisions.  
  - `driver_id`: Unique identifier for each driver, used to identify drivers in the optimization model.  
  - `school_id`: Unique identifier for each school, used to identify schools in the optimization model.  
  - `distance`: Travel distance between a driver's home city and a school's location, used as a coefficient in the objective function.  
  - `assign`: Binary decision variable indicating whether a driver is assigned to a school, used as a decision variable in the optimization model.  


### Retrieved Values

**Query 1: Distance for every driver-school pair; these are the objective-function coefficients.**

```sql
SELECT driver_id, school_id, distance FROM distance_matrix;
```

**Results (CSV format):**
```csv
driver_id,school_id,distance
1,1,10.5
1,2,15.3
1,3,20.1
2,1,18.2
2,2,8.7
2,3,22.4
3,1,19.8
3,2,21.3
3,3,7.5
```
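
The rows above can be turned into a lookup table of objective coefficients. The sketch below parses the CSV text with the standard library; the names `QUERY1_CSV`, `load_costs`, and `costs` are hypothetical helpers introduced for illustration.

```python
import csv
import io

# CSV text copied from the Query 1 results.
QUERY1_CSV = """driver_id,school_id,distance
1,1,10.5
1,2,15.3
1,3,20.1
2,1,18.2
2,2,8.7
2,3,22.4
3,1,19.8
3,2,21.3
3,3,7.5
"""

def load_costs(csv_text):
    """Return {(driver_id, school_id): distance} parsed from CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {(int(r["driver_id"]), int(r["school_id"])): float(r["distance"])
            for r in reader}

costs = load_costs(QUERY1_CSV)
drivers = sorted({d for d, _ in costs})   # distinct driver ids
schools = sorted({s for _, s in costs})   # distinct school ids
print(drivers, schools, costs[(2, 3)])
```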

**Query 2: Home city of each driver, the starting point for distance calculations.**

```sql
SELECT driver_id, home_city FROM drivers;
```

**Results (CSV format):**
```csv
driver_id,home_city
1,CityA
2,CityB
3,CityC
```

**Query 3: Location of each school, the destination for distance calculations.**

```sql
SELECT school_id, location FROM schools;
```

**Results (CSV format):**
```csv
school_id,location
1,LocationX
2,LocationY
3,LocationZ
```

**Query 4: Driver and school counts, used to check that there are enough drivers to cover every school.**

```sql
SELECT (SELECT COUNT(*) FROM drivers) AS total_drivers, (SELECT COUNT(*) FROM schools) AS total_schools;
```

**Results (CSV format):**
```csv
total_drivers,total_schools
3,3
```
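
The counts matter because every school needs its own driver and a driver can serve at most one school, so a feasible assignment requires at least as many drivers as schools. When every driver-school pair is allowed, as in the complete distance matrix here, this condition is also sufficient. A minimal sketch of the check, with `enough_drivers` as a hypothetical helper name:

```python
def enough_drivers(total_drivers, total_schools):
    """Return True if there are at least as many drivers as schools."""
    return total_drivers >= total_schools

feasible = enough_drivers(3, 3)  # counts from Query 4
print(feasible)
```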

**Query 5: Minimum and maximum pairwise distance, useful for sanity-checking the data and bounding the objective value.**

```sql
SELECT MIN(distance) AS min_distance, MAX(distance) AS max_distance FROM distance_matrix;
```

**Results (CSV format):**
```csv
min_distance,max_distance
7.5,22.4
```
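
Because exactly one driver is assigned to each of the 3 schools (Query 4), the optimal total is a sum of three distances, each lying between the minimum and maximum above. Writing $z^*$ for the optimal total (a symbol introduced here for illustration), this gives a loose sanity range:

```latex
3 \times 7.5 = 22.5 \;\le\; z^* \;\le\; 3 \times 22.4 = 67.2
```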

**Query 6: Average distance per driver, indicating the typical travel distance a driver might incur.**

```sql
SELECT driver_id, AVG(distance) AS avg_distance FROM distance_matrix GROUP BY driver_id;
```

**Results (CSV format):**
```csv
driver_id,avg_distance
1,15.300000000000002
2,16.433333333333334
3,16.2
```

**Query 7: Average distance per school, indicating the typical travel distance to a school.**

```sql
SELECT school_id, AVG(distance) AS avg_distance FROM distance_matrix GROUP BY school_id;
```

**Results (CSV format):**
```csv
school_id,avg_distance
1,16.166666666666668
2,15.1
3,16.666666666666668
```

**Query 8: Current assignments, which can be used to validate constraints. No rows are returned, so no pair currently has `assign = TRUE`.**

```sql
SELECT driver_id, school_id, distance FROM distance_matrix WHERE assign = TRUE;
```

**Results (CSV format):**
```csv
driver_id,school_id,distance
```

**Query 9: Number of schools assigned to each driver, to check the at-most-one-school-per-driver constraint.**

```sql
SELECT driver_id, COUNT(school_id) AS num_schools_assigned FROM distance_matrix WHERE assign = TRUE GROUP BY driver_id;
```

**Results (CSV format):**
```csv
driver_id,num_schools_assigned
```

**Query 10: Number of drivers assigned to each school, to check the exactly-one-driver-per-school constraint.**

```sql
SELECT school_id, COUNT(driver_id) AS num_drivers_assigned FROM distance_matrix WHERE assign = TRUE GROUP BY school_id;
```

**Results (CSV format):**
```csv
school_id,num_drivers_assigned
```
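
The same per-driver and per-school counts can be checked outside the database once a set of assignments exists. The sketch below is one possible validator; the function name `check_assignment` and the example assignments are hypothetical and serve only to illustrate the two constraints.

```python
from collections import Counter

def check_assignment(pairs, school_ids):
    """Return a list of constraint violations for (driver_id, school_id) pairs."""
    per_driver = Counter(d for d, _ in pairs)
    per_school = Counter(s for _, s in pairs)
    problems = []
    # Each driver may serve at most one school.
    for d, n in sorted(per_driver.items()):
        if n > 1:
            problems.append(f"driver {d} has {n} schools (max 1)")
    # Each school must have exactly one driver.
    for s in school_ids:
        if per_school[s] != 1:
            problems.append(f"school {s} has {per_school[s]} drivers (need 1)")
    return problems

ok = check_assignment([(1, 1), (2, 2), (3, 3)], [1, 2, 3])
bad = check_assignment([(1, 1), (1, 2)], [1, 2, 3])
print(ok)
print(bad)
```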

**Query 11: Current total travel distance, the value to be minimized. The result is empty (SQL `NULL`) because no rows have `assign = TRUE`.**

```sql
SELECT SUM(distance) AS total_distance FROM distance_matrix WHERE assign = TRUE;
```

**Results (CSV format):**
```csv
total_distance
""
```
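
Once assignments exist, the same total can be computed directly from the distance values. The sketch below evaluates one candidate assignment; the candidate and the names `DIST` and `total_distance` are hypothetical, chosen only to illustrate the calculation.

```python
# Distances copied from the Query 1 results, keyed by (driver_id, school_id).
DIST = {
    (1, 1): 10.5, (1, 2): 15.3, (1, 3): 20.1,
    (2, 1): 18.2, (2, 2): 8.7, (2, 3): 22.4,
    (3, 1): 19.8, (3, 2): 21.3, (3, 3): 7.5,
}

def total_distance(pairs, dist=DIST):
    """Sum the distances of the assigned (driver_id, school_id) pairs."""
    return sum(dist[p] for p in pairs)

candidate = [(1, 1), (2, 2), (3, 3)]  # hypothetical assignment for illustration
total = total_distance(candidate)
print(round(total, 1))
```

For this candidate the total is 10.5 + 8.7 + 7.5, or about 26.7.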

**Query 12: Pairs sorted by ascending distance, to identify candidate optimal assignments.**

```sql
SELECT driver_id, school_id, distance FROM distance_matrix ORDER BY distance ASC LIMIT 10;
```

**Results (CSV format):**
```csv
driver_id,school_id,distance
3,3,7.5
2,2,8.7
1,1,10.5
1,2,15.3
2,1,18.2
3,1,19.8
1,3,20.1
3,2,21.3
2,3,22.4
```
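
The three shortest pairs above involve distinct drivers and distinct schools, which suggests they form a feasible assignment. For an instance this small, the optimum can be confirmed by enumerating every feasible assignment. The sketch below is a brute-force check, not a method prescribed by the source; its cost grows factorially, so a dedicated assignment algorithm would be preferable for larger instances. The names `DIST` and `best_assignment` are hypothetical.

```python
from itertools import permutations

DRIVERS = [1, 2, 3]
SCHOOLS = [1, 2, 3]
# Distances copied from the Query 1 results, keyed by (driver_id, school_id).
DIST = {
    (1, 1): 10.5, (1, 2): 15.3, (1, 3): 20.1,
    (2, 1): 18.2, (2, 2): 8.7, (2, 3): 22.4,
    (3, 1): 19.8, (3, 2): 21.3, (3, 3): 7.5,
}

def best_assignment(drivers, schools, dist):
    """Try every way of giving each school a distinct driver; keep the cheapest."""
    best_pairs, best_total = None, float("inf")
    # Each permutation picks one distinct driver per school, so both
    # constraints hold by construction (requires len(drivers) >= len(schools)).
    for chosen in permutations(drivers, len(schools)):
        pairs = list(zip(chosen, schools))
        total = sum(dist[p] for p in pairs)
        if total < best_total:
            best_pairs, best_total = pairs, total
    return best_pairs, best_total

pairs, total = best_assignment(DRIVERS, SCHOOLS, DIST)
print(pairs, round(total, 1))
```

On this data the enumeration selects driver 1 for school 1, driver 2 for school 2, and driver 3 for school 3, with a total of about 26.7.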

**Query 13: Pairs sorted by descending distance, to identify assignments that should be avoided.**

```sql
SELECT driver_id, school_id, distance FROM distance_matrix ORDER BY distance DESC LIMIT 10;
```

**Results (CSV format):**
```csv
driver_id,school_id,distance
2,3,22.4
3,2,21.3
1,3,20.1
3,1,19.8
2,1,18.2
1,2,15.3
1,1,10.5
2,2,8.7
3,3,7.5
```

