# Complete Optimization Problem and Solution: university_basketball

## 1. Problem Context and Goals

### Context  
The business problem is to select basketball teams for a tournament so that the combined win percentage of the selected teams is as high as possible. Teams are chosen based on their performance, conference affiliation, and geographical location. Each team has a unique identifier, and the decision for each team is binary: the team is either selected or not.  

The operational parameters guiding this decision include:  
- **Win Percentage**: The win percentage of each team, which serves as the primary metric for evaluating team performance.  
- **Total Teams Selected**: The total number of teams to be selected for the tournament, ensuring a manageable and competitive tournament size.  
- **Conference Diversity**: Minimum requirements for the number of teams selected from the East, West, and South conferences to ensure fair representation across conferences.  
- **Geographical Distribution**: Maximum limits on the number of teams selected from specific locations (New York, Los Angeles, and Chicago) to promote geographical diversity.  

These parameters come from the business configuration, which supplies scalar values: the total number of teams to select, the minimum number of teams per conference, and the maximum number of teams per location. The formulation is linear: the objective and every constraint are sums over the binary selection variables, with no products or ratios of variables, so the problem can be handled by a standard integer linear programming solver.  

### Goals  
The primary goal is to maximize the total win percentage of the teams selected for the tournament. Success is measured by the sum of the selected teams' win percentages, subject to the constraints on tournament size, conference diversity, and geographical distribution.  

Each team's win percentage is the coefficient of that team's binary decision variable in the objective, so the objective is a linear function of the selection.  
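
The objective can be written compactly as follows. This is a sketch in notation introduced here for illustration, not taken from the source: $T$ is the set of candidate teams, $w_i$ is the `Win_Percent` of team $i$, and $x_i$ is the binary `Team_Selection` variable for team $i$.

$$
\max \; \sum_{i \in T} w_i \, x_i, \qquad x_i \in \{0, 1\} \quad \text{for all } i \in T
$$

For example, with the win percentages listed in Section 3, selecting teams 1, 2, and 3 gives an objective value of $85.0 + 78.5 + 72.0 = 235.5$. Whether that selection is feasible depends on the configured bounds.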

## 2. Constraints  

The optimization problem is subject to the following constraints, which ensure that the selected teams meet the business requirements for diversity and distribution:  

1. **Total Teams Selected**: The total number of teams selected for the tournament must equal the predefined value specified in the business configuration.  
2. **East Conference Diversity**: The number of teams selected from the East conference must meet or exceed the minimum requirement set in the business configuration.  
3. **West Conference Diversity**: The number of teams selected from the West conference must meet or exceed the minimum requirement set in the business configuration.  
4. **South Conference Diversity**: The number of teams selected from the South conference must meet or exceed the minimum requirement set in the business configuration.  
5. **New York Geographical Limit**: The number of teams selected from New York must not exceed the maximum limit specified in the business configuration.  
6. **Los Angeles Geographical Limit**: The number of teams selected from Los Angeles must not exceed the maximum limit specified in the business configuration.  
7. **Chicago Geographical Limit**: The number of teams selected from Chicago must not exceed the maximum limit specified in the business configuration.  

These constraints are designed to ensure that the final selection of teams is balanced and representative, aligning with the business goals for diversity and geographical distribution.  
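
The seven constraints above can be sketched in symbols. The notation is an assumption made for illustration: $x_i \in \{0,1\}$ indicates whether team $i$ is selected, $T$ is the set of all teams, $C_c$ is the set of teams in conference $c$, $L_\ell$ is the set of teams in location $\ell$, and $N$, $m_c$, and $u_\ell$ stand for the configured total, conference minimums, and location maximums.

$$
\begin{aligned}
\sum_{i \in T} x_i &= N \\
\sum_{i \in C_c} x_i &\ge m_c, && c \in \{\text{East}, \text{West}, \text{South}\} \\
\sum_{i \in L_\ell} x_i &\le u_\ell, && \ell \in \{\text{New York}, \text{Los Angeles}, \text{Chicago}\} \\
x_i &\in \{0, 1\}, && i \in T
\end{aligned}
$$

The first row is constraint 1, the second row covers constraints 2 through 4, and the third row covers constraints 5 through 7.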

## 3. Available Data  

### Database Schema  
```sql
-- Iteration 3 Database Schema
-- Objective: Added missing constraint bounds to business configuration logic and updated data dictionary to reflect these changes. No new tables were created as the missing requirements were better suited for configuration logic.

CREATE TABLE win_percentage (
  Team_ID INTEGER,
  Win_Percent FLOAT
);

CREATE TABLE team_selection (
  Team_ID INTEGER,
  Team_Selection BOOLEAN
);

CREATE TABLE conference_indicator (
  Team_ID INTEGER,
  Conference_Indicator STRING
);

CREATE TABLE location_indicator (
  Team_ID INTEGER,
  Location_Indicator STRING
);
```

### Data Dictionary  
The following tables and columns are used in the optimization problem, with their business purposes and optimization roles clearly defined:  

- **win_percentage**:  
  - **Team_ID**: Unique identifier for each team.  
  - **Win_Percent**: The win percentage of the team, used as the coefficient in the objective function to maximize the total win percentage of selected teams.  

- **team_selection**:  
  - **Team_ID**: Unique identifier for each team.  
  - **Team_Selection**: Binary indicator of whether a team is selected for the tournament, serving as the decision variable in the optimization model.  

- **conference_indicator**:  
  - **Team_ID**: Unique identifier for each team.  
  - **Conference_Indicator**: The conference affiliation of the team, used to enforce constraints on conference diversity.  

- **location_indicator**:  
  - **Team_ID**: Unique identifier for each team.  
  - **Location_Indicator**: The location of the team, used to enforce constraints on geographical distribution.  


### Retrieved Values

**Query 1: This data is crucial for the objective function to maximize the total win percentage of selected teams.**

```sql
SELECT Team_ID, Win_Percent FROM win_percentage;
```

**Results (CSV format):**
```csv
Team_ID,Win_Percent
1,85.0
2,78.5
3,72.0
4,80.5
5,75.0
```

**Query 2: This data holds the decision variables of the optimization model. Every `Team_Selection` value is 0, i.e., no team is currently marked as selected.**

```sql
SELECT Team_ID, Team_Selection FROM team_selection;
```

**Results (CSV format):**
```csv
Team_ID,Team_Selection
1,0
2,0
3,0
4,0
5,0
```

**Query 3: This data is necessary to enforce constraints on conference diversity (East, West, South).**

```sql
SELECT Team_ID, Conference_Indicator FROM conference_indicator;
```

**Results (CSV format):**
```csv
Team_ID,Conference_Indicator
1,East
2,West
3,South
4,East
5,West
```

**Query 4: This data is necessary to enforce constraints on geographical distribution (New York, Los Angeles, Chicago).**

```sql
SELECT Team_ID, Location_Indicator FROM location_indicator;
```

**Results (CSV format):**
```csv
Team_ID,Location_Indicator
1,New York
2,Los Angeles
3,Chicago
4,New York
5,Los Angeles
```

**Query 5: This join is useful for evaluating team performance within specific conferences.**

```sql
SELECT wp.Team_ID, wp.Win_Percent, ci.Conference_Indicator FROM win_percentage wp JOIN conference_indicator ci ON wp.Team_ID = ci.Team_ID;
```

**Results (CSV format):**
```csv
Team_ID,Win_Percent,Conference_Indicator
1,85.0,East
2,78.5,West
3,72.0,South
4,80.5,East
5,75.0,West
```

**Query 6: This join is useful for evaluating team performance within specific locations.**

```sql
SELECT wp.Team_ID, wp.Win_Percent, li.Location_Indicator FROM win_percentage wp JOIN location_indicator li ON wp.Team_ID = li.Team_ID;
```

**Results (CSV format):**
```csv
Team_ID,Win_Percent,Location_Indicator
1,85.0,New York
2,78.5,Los Angeles
3,72.0,Chicago
4,80.5,New York
5,75.0,Los Angeles
```

**Query 7: This comprehensive join is useful for evaluating team performance across both conferences and locations.**

```sql
SELECT wp.Team_ID, wp.Win_Percent, ci.Conference_Indicator, li.Location_Indicator FROM win_percentage wp JOIN conference_indicator ci ON wp.Team_ID = ci.Team_ID JOIN location_indicator li ON wp.Team_ID = li.Team_ID;
```

**Results (CSV format):**
```csv
Team_ID,Win_Percent,Conference_Indicator,Location_Indicator
1,85.0,East,New York
2,78.5,West,Los Angeles
3,72.0,South,Chicago
4,80.5,East,New York
5,75.0,West,Los Angeles
```
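
The tables and the Query 7 join can be reproduced locally. The following is a minimal sketch, assuming an in-memory SQLite database via Python's standard `sqlite3` module; the helper name `load_teams` and the `ORDER BY` clause are additions made here for illustration and are not part of the source.

```python
import sqlite3

# Rows copied from the Query 7 results:
# (Team_ID, Win_Percent, Conference_Indicator, Location_Indicator).
ROWS = [
    (1, 85.0, "East", "New York"),
    (2, 78.5, "West", "Los Angeles"),
    (3, 72.0, "South", "Chicago"),
    (4, 80.5, "East", "New York"),
    (5, 75.0, "West", "Los Angeles"),
]

# The three tables needed for the join, as declared in the schema.
SCHEMA = """
CREATE TABLE win_percentage (Team_ID INTEGER, Win_Percent FLOAT);
CREATE TABLE conference_indicator (Team_ID INTEGER, Conference_Indicator STRING);
CREATE TABLE location_indicator (Team_ID INTEGER, Location_Indicator STRING);
"""

QUERY_7 = """
SELECT wp.Team_ID, wp.Win_Percent, ci.Conference_Indicator, li.Location_Indicator
FROM win_percentage wp
JOIN conference_indicator ci ON wp.Team_ID = ci.Team_ID
JOIN location_indicator li ON wp.Team_ID = li.Team_ID
ORDER BY wp.Team_ID  -- added here so the row order is deterministic
"""


def load_teams():
    """Populate an in-memory database and return the Query 7 join as tuples."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(SCHEMA)
        conn.executemany(
            "INSERT INTO win_percentage VALUES (?, ?)",
            [(r[0], r[1]) for r in ROWS],
        )
        conn.executemany(
            "INSERT INTO conference_indicator VALUES (?, ?)",
            [(r[0], r[2]) for r in ROWS],
        )
        conn.executemany(
            "INSERT INTO location_indicator VALUES (?, ?)",
            [(r[0], r[3]) for r in ROWS],
        )
        return conn.execute(QUERY_7).fetchall()
    finally:
        conn.close()
```

Calling `load_teams()` returns one tuple per team, which is a convenient input shape for building the objective coefficients and the conference and location groupings.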

**Query 8: This summary statistic is useful for understanding the distribution of teams across conferences.**

```sql
SELECT Conference_Indicator, COUNT(Team_ID) AS Team_Count FROM conference_indicator GROUP BY Conference_Indicator;
```

**Results (CSV format):**
```csv
Conference_Indicator,Team_Count
East,2
South,1
West,2
```

**Query 9: This summary statistic is useful for understanding the distribution of teams across locations.**

```sql
SELECT Location_Indicator, COUNT(Team_ID) AS Team_Count FROM location_indicator GROUP BY Location_Indicator;
```

**Results (CSV format):**
```csv
Location_Indicator,Team_Count
Chicago,1
Los Angeles,2
New York,2
```

**Query 10: This data is useful for evaluating the performance of teams in the East conference.**

```sql
SELECT wp.Team_ID, wp.Win_Percent FROM win_percentage wp JOIN conference_indicator ci ON wp.Team_ID = ci.Team_ID WHERE ci.Conference_Indicator = 'East';
```

**Results (CSV format):**
```csv
Team_ID,Win_Percent
1,85.0
4,80.5
```

**Query 11: This data is useful for evaluating the performance of teams in the West conference.**

```sql
SELECT wp.Team_ID, wp.Win_Percent FROM win_percentage wp JOIN conference_indicator ci ON wp.Team_ID = ci.Team_ID WHERE ci.Conference_Indicator = 'West';
```

**Results (CSV format):**
```csv
Team_ID,Win_Percent
2,78.5
5,75.0
```

**Query 12: This data is useful for evaluating the performance of teams in the South conference.**

```sql
SELECT wp.Team_ID, wp.Win_Percent FROM win_percentage wp JOIN conference_indicator ci ON wp.Team_ID = ci.Team_ID WHERE ci.Conference_Indicator = 'South';
```

**Results (CSV format):**
```csv
Team_ID,Win_Percent
3,72.0
```

**Query 13: This data is useful for evaluating the performance of teams in New York.**

```sql
SELECT wp.Team_ID, wp.Win_Percent FROM win_percentage wp JOIN location_indicator li ON wp.Team_ID = li.Team_ID WHERE li.Location_Indicator = 'New York';
```

**Results (CSV format):**
```csv
Team_ID,Win_Percent
1,85.0
4,80.5
```

**Query 14: This data is useful for evaluating the performance of teams in Los Angeles.**

```sql
SELECT wp.Team_ID, wp.Win_Percent FROM win_percentage wp JOIN location_indicator li ON wp.Team_ID = li.Team_ID WHERE li.Location_Indicator = 'Los Angeles';
```

**Results (CSV format):**
```csv
Team_ID,Win_Percent
2,78.5
5,75.0
```

**Query 15: This data is useful for evaluating the performance of teams in Chicago.**

```sql
SELECT wp.Team_ID, wp.Win_Percent FROM win_percentage wp JOIN location_indicator li ON wp.Team_ID = li.Team_ID WHERE li.Location_Indicator = 'Chicago';
```

**Results (CSV format):**
```csv
Team_ID,Win_Percent
3,72.0
```
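
With only five teams, the whole problem can be solved by enumerating every subset of the required size. The following is a minimal brute-force sketch, not the source's solution method. The `CONFIG` values (3 teams in total, at least 1 per conference, at most 2 per location) are hypothetical placeholders, because the actual business configuration values are not listed here, and the function names are introduced for illustration.

```python
from itertools import combinations

# Team data copied from the Query 7 results: id -> (win %, conference, location).
TEAMS = {
    1: (85.0, "East", "New York"),
    2: (78.5, "West", "Los Angeles"),
    3: (72.0, "South", "Chicago"),
    4: (80.5, "East", "New York"),
    5: (75.0, "West", "Los Angeles"),
}

# Hypothetical configuration values, assumed for illustration only.
CONFIG = {
    "total": 3,
    "min_conf": {"East": 1, "West": 1, "South": 1},
    "max_loc": {"New York": 2, "Los Angeles": 2, "Chicago": 2},
}


def is_feasible(selection, teams, config):
    """Check the conference minimums and location maximums for a selection."""
    for conf, minimum in config["min_conf"].items():
        if sum(1 for t in selection if teams[t][1] == conf) < minimum:
            return False
    for loc, maximum in config["max_loc"].items():
        if sum(1 for t in selection if teams[t][2] == loc) > maximum:
            return False
    return True


def best_selection(teams, config):
    """Return (team ids, total win %) of the best feasible selection.

    Returns (None, None) when no selection satisfies the constraints.
    """
    best, best_score = None, None
    # Enumerating subsets of exactly `total` teams enforces the size constraint.
    for selection in combinations(sorted(teams), config["total"]):
        if not is_feasible(selection, teams, config):
            continue
        score = sum(teams[t][0] for t in selection)
        if best_score is None or score > best_score:
            best, best_score = selection, score
    return best, best_score
```

Under these assumed values, the South minimum forces team 3 into the selection, and the best East and West teams fill the remaining slots, so the result is teams 1, 2, and 3 with a total of 235.5. Enumeration grows combinatorially with the number of teams, so a larger instance would call for an integer linear programming solver instead.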

