# Complete Optimization Problem and Solution: icfp_1

## 1. Problem Context and Goals

### Context  
A research institution is focused on optimizing the allocation of its researchers across various projects to maximize the overall research output. The output is measured by the number of papers published, with a particular emphasis on the order of authorship to reflect the varying contributions of researchers. The institution must ensure that the allocation respects operational constraints, such as the maximum number of researchers that can be assigned to a single institution and the maximum number of papers a researcher can be involved in.  

The decision at hand is whether to assign a specific researcher to a particular institution for a given paper. This decision is represented by a binary indicator, where a value of 1 signifies an assignment and 0 signifies no assignment. The institution aims to maximize the total weighted research output, where the weight is determined by the order of authorship. For example, first authors are given a higher weight than subsequent authors, reflecting their greater contribution to the paper.  

The business configuration includes a weight for the order of authorship, which is used to prioritize papers based on the researcher's role. Additionally, there are constraints on the maximum number of researchers per institution and the maximum number of papers per researcher. These constraints ensure that the allocation remains feasible and aligns with the institution's operational capacity.  

### Goals  
The primary goal of this optimization problem is to maximize the total weighted research output. This is achieved by assigning researchers to institutions and papers in a way that respects the operational constraints. The weight for each paper is determined by the order of authorship, with higher weights assigned to papers where the researcher is a primary contributor. Success is measured by the total sum of these weighted assignments, ensuring that the institution's research output is optimized while adhering to the defined constraints.  
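
One way to write this goal compactly is shown below. It is a sketch rather than a definitive model: the sets $A$ (researchers), $I$ (institutions), and $P$ (papers), the variable $x_{a,i,p}$, and the weight $w_{a,p}$ are notation introduced here for illustration. The source does not state how the weight depends on authorship order, only that first authors receive more weight than later authors.

$$
\max \; \sum_{a \in A} \sum_{i \in I} \sum_{p \in P} w_{a,p} \, x_{a,i,p},
\qquad x_{a,i,p} \in \{0, 1\}
$$

Here $x_{a,i,p} = 1$ means researcher $a$ is assigned to institution $i$ for paper $p$, and $w_{a,p}$ is assumed to decrease as the authorship order of $a$ on $p$ increases.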

## 2. Constraints  

The optimization problem must adhere to the following constraints:  
1. **Maximum Researchers per Institution**: The total number of researchers assigned to any single institution cannot exceed the predefined maximum limit. This ensures that institutions are not overburdened with too many researchers.  
2. **Maximum Papers per Researcher**: The total number of papers assigned to any single researcher cannot exceed the predefined maximum limit. This ensures that researchers are not overcommitted and can maintain a balanced workload.  
3. **Single Assignment per Paper**: Each paper must be assigned to exactly one researcher-institution pair. This ensures that every paper is accounted for and that there are no overlaps or gaps in the allocation.  

These constraints are designed to ensure that the allocation of researchers to institutions and papers is both feasible and aligned with the institution's operational capacity.  
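
The three constraints can be checked mechanically for any candidate allocation. The following is a minimal sketch, not the institution's actual validation logic: the function name `check_assignment` and its parameter names are hypothetical, and the limit values used in the example are illustrative only, since the source does not give them. Researchers per institution are counted as distinct researchers.

```python
from collections import defaultdict


def check_assignment(assignments, paper_ids,
                     max_researchers_per_institution,
                     max_papers_per_researcher):
    """Return a list of violation messages (empty if the allocation is feasible).

    assignments: iterable of (auth_id, inst_id, paper_id) triples chosen (x = 1).
    paper_ids:   every paper that must receive exactly one assignment.
    """
    researchers_at = defaultdict(set)  # inst_id -> distinct researchers
    papers_of = defaultdict(set)       # auth_id -> distinct papers
    pairs_for = defaultdict(set)       # paper_id -> (auth_id, inst_id) pairs

    for auth_id, inst_id, paper_id in assignments:
        researchers_at[inst_id].add(auth_id)
        papers_of[auth_id].add(paper_id)
        pairs_for[paper_id].add((auth_id, inst_id))

    violations = []
    # Constraint 1: maximum researchers per institution.
    for inst_id, researchers in sorted(researchers_at.items()):
        if len(researchers) > max_researchers_per_institution:
            violations.append(
                f"institution {inst_id}: {len(researchers)} researchers")
    # Constraint 2: maximum papers per researcher.
    for auth_id, papers in sorted(papers_of.items()):
        if len(papers) > max_papers_per_researcher:
            violations.append(f"researcher {auth_id}: {len(papers)} papers")
    # Constraint 3: exactly one researcher-institution pair per paper.
    for paper_id in sorted(paper_ids):
        count = len(pairs_for[paper_id])
        if count != 1:
            violations.append(f"paper {paper_id}: {count} assignments")
    return violations


# The three (authID, instID, paperID) triples listed in the Authorship table,
# checked against illustrative limits of 1 (assumed values, not from the source).
observed = [(1, 101, 201), (2, 102, 202), (3, 103, 203)]
violations = check_assignment(observed, {201, 202, 203},
                              max_researchers_per_institution=1,
                              max_papers_per_researcher=1)
print(violations)  # prints []
```

Returning a list of messages instead of a single boolean makes it easier to see which constraint a rejected allocation breaks.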

## 3. Available Data  

### Database Schema  
```sql
-- Iteration 1 Database Schema
-- Objective: Schema adjustments include creating new tables for missing optimization data, modifying existing tables to better align with OR expert's mapping, and moving scalar parameters and formulas to business_configuration_logic.json.

CREATE TABLE Authorship (
  authID INTEGER,
  instID INTEGER,
  paperID INTEGER,
  authOrder INTEGER
);
```

### Data Dictionary  
The **Authorship** table represents the assignment of researchers to institutions and papers. It plays a critical role in the optimization problem by providing the necessary data to make allocation decisions.  

- **authID**: Represents the unique identifier for each researcher. This is used to track which researcher is being assigned to a paper and institution.  
- **instID**: Represents the unique identifier for each institution. This is used to track which institution the researcher is being assigned to.  
- **paperID**: Represents the unique identifier for each paper. This is used to track which paper the researcher is being assigned to.  
- **authOrder**: Represents the order of authorship for the researcher on the paper. This is used to determine the weight of the paper in the objective function, with lower values indicating a greater contribution (1 denotes the first author).  


### Retrieved Values

**Query 1: Retrieves every authorship record, giving the current assignments and the authorship order that determines each weight in the objective function.**

```sql
SELECT authID, instID, paperID, authOrder FROM Authorship;
```

**Results (CSV format):**
```csv
authID,instID,paperID,authOrder
1,101,201,1
2,102,202,2
3,103,203,3
```
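
To reproduce these results locally, the schema and the three rows above can be loaded into an in-memory SQLite database. This is a sketch that assumes SQLite as the engine, which the source does not specify. The helper name `run_query` is hypothetical, and the `ORDER BY` clauses are added here only to make the row order deterministic.

```python
import sqlite3

# In-memory database holding the Authorship schema from this document.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Authorship ("
    "authID INTEGER, instID INTEGER, paperID INTEGER, authOrder INTEGER)"
)

# The three rows shown in the Query 1 results.
rows = [(1, 101, 201, 1), (2, 102, 202, 2), (3, 103, 203, 3)]
conn.executemany("INSERT INTO Authorship VALUES (?, ?, ?, ?)", rows)


def run_query(sql):
    """Run a read-only query and return all rows as a list of tuples."""
    return conn.execute(sql).fetchall()


all_rows = run_query(
    "SELECT authID, instID, paperID, authOrder FROM Authorship ORDER BY authID"
)
per_institution = run_query(
    "SELECT instID, COUNT(DISTINCT authID) AS totalResearchers "
    "FROM Authorship GROUP BY instID ORDER BY instID"
)
print(all_rows)
print(per_institution)
```

The same helper can run the remaining queries in this section unchanged.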

**Query 2: Counts the distinct researchers currently at each institution, for comparison against the "Maximum Researchers per Institution" limit.**

```sql
SELECT instID, COUNT(DISTINCT authID) AS totalResearchers FROM Authorship GROUP BY instID;
```

**Results (CSV format):**
```csv
instID,totalResearchers
101,1
102,1
103,1
```

**Query 3: Counts the distinct papers currently assigned to each researcher, for comparison against the "Maximum Papers per Researcher" limit.**

```sql
SELECT authID, COUNT(DISTINCT paperID) AS totalPapers FROM Authorship GROUP BY authID;
```

**Results (CSV format):**
```csv
authID,totalPapers
1,1
2,1
3,1
```

**Query 4: Counts the distinct researchers on each paper as a check on the "Single Assignment per Paper" constraint. The query does not count institutions, so a researcher listed under two institutions for the same paper would still be counted once.**

```sql
SELECT paperID, COUNT(DISTINCT authID) AS totalResearchers FROM Authorship GROUP BY paperID;
```

**Results (CSV format):**
```csv
paperID,totalResearchers
201,1
202,1
203,1
```

**Query 5: Computes each researcher's average authorship order. A lower average indicates a researcher who typically contributes more, which can be used for prioritization in the optimization process.**

```sql
SELECT authID, AVG(authOrder) AS avgAuthOrder FROM Authorship GROUP BY authID;
```

**Results (CSV format):**
```csv
authID,avgAuthOrder
1,1.0
2,2.0
3,3.0
```

**Query 6: Computes the average authorship order per institution. A lower average indicates an institution whose researchers typically contribute more, which can be used for prioritization in the optimization process.**

```sql
SELECT instID, AVG(authOrder) AS avgAuthOrder FROM Authorship GROUP BY instID;
```

**Results (CSV format):**
```csv
instID,avgAuthOrder
101,1.0
102,2.0
103,3.0
```

**Query 7: Lists the authors of each paper sorted by authorship order, the input needed to calculate the weighted research output.**

```sql
SELECT paperID, authID, authOrder FROM Authorship ORDER BY paperID, authOrder;
```

**Results (CSV format):**
```csv
paperID,authID,authOrder
201,1,1
202,2,2
203,3,3
```
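
The weighted research output for the current assignments can be computed directly from these rows. The sketch below assumes a weight of `1 / authOrder`, which is a hypothetical choice: the source says only that first authors receive a higher weight than subsequent authors and does not give the formula. The function name `authorship_weight` is likewise illustrative.

```python
from fractions import Fraction


def authorship_weight(auth_order):
    """Hypothetical weight: 1 / authOrder, so first authors weigh the most."""
    if auth_order < 1:
        raise ValueError("authOrder must be a positive integer")
    return Fraction(1, auth_order)


# (paperID, authID, authOrder) rows from the Query 7 results.
rows = [(201, 1, 1), (202, 2, 2), (203, 3, 3)]

# Weight of each current (paper, researcher) assignment.
weights = {
    (paper_id, auth_id): authorship_weight(auth_order)
    for paper_id, auth_id, auth_order in rows
}

# Total weighted output: 1 + 1/2 + 1/3 under the assumed weight.
total = sum(weights.values())
print(total)  # prints 11/6
```

`Fraction` keeps the arithmetic exact; any other decreasing function of `authOrder` could be swapped in without changing the rest of the calculation.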

**Query 8: Lists the distinct researcher-institution pairs in the current allocation, the starting point for reallocation decisions.**

```sql
SELECT DISTINCT authID, instID FROM Authorship;
```

**Results (CSV format):**
```csv
authID,instID
1,101
2,102
3,103
```

**Query 9: Lists the distinct paper-institution pairs in the current allocation, the starting point for reallocation decisions.**

```sql
SELECT DISTINCT paperID, instID FROM Authorship;
```

**Results (CSV format):**
```csv
paperID,instID
201,101
202,102
203,103
```

**Query 10: Selects first-author records (authOrder = 1), where the researcher is the primary contributor and the assignment should carry the highest weight in the objective function.**

```sql
SELECT authID, instID, paperID FROM Authorship WHERE authOrder = 1;
```

**Results (CSV format):**
```csv
authID,instID,paperID
1,101,201
```

