# Complete Optimization Problem and Solution: icfp_1

## 1. Problem Context and Goals

### Context  
A research institution is focused on optimizing the allocation of its researchers across various projects to maximize the overall research output. The output is measured by the number of papers published, with a particular emphasis on the order of authorship to reflect the varying contributions of researchers. The institution must ensure that the allocation respects operational constraints, such as the maximum number of researchers that can be assigned to a single institution and the maximum number of papers a researcher can be involved in.  

The decision at hand is whether to assign a specific researcher to a particular institution for a given paper. This decision is represented by a binary indicator, where a value of 1 signifies an assignment and 0 signifies no assignment. The institution aims to maximize the total weighted research output, where the weight is determined by the order of authorship. For example, first authors are given a higher weight than subsequent authors, reflecting their greater contribution to the paper.  

The business configuration includes a weight for the order of authorship, which is used to prioritize papers based on the researcher's role. Additionally, there are constraints on the maximum number of researchers per institution and the maximum number of papers per researcher. These constraints ensure that the allocation remains feasible and aligns with the institution's operational capacity.  

### Goals  
The primary goal of this optimization problem is to maximize the total weighted research output. This is achieved by assigning researchers to institutions and papers in a way that respects the operational constraints. The weight for each paper is determined by the order of authorship, with higher weights assigned to papers where the researcher is a primary contributor. Success is measured by the total sum of these weighted assignments, ensuring that the institution's research output is optimized while adhering to the defined constraints.  

## 2. Constraints  

The optimization problem must adhere to the following constraints:  
1. **Maximum Researchers per Institution**: The total number of researchers assigned to any single institution cannot exceed the predefined maximum limit. This ensures that institutions are not overburdened with too many researchers.  
2. **Maximum Papers per Researcher**: The total number of papers assigned to any single researcher cannot exceed the predefined maximum limit. This ensures that researchers are not overcommitted and can maintain a balanced workload.  
3. **Single Assignment per Paper**: Each paper must be assigned to exactly one researcher-institution pair. This ensures that every paper is accounted for and that there are no overlaps or gaps in the allocation.  

These constraints are designed to ensure that the allocation of researchers to institutions and papers is both feasible and aligned with the institution's operational capacity.  

## 3. Available Data  

### Database Schema  
```sql
-- Iteration 1 Database Schema
-- Objective: Schema adjustments include creating new tables for missing optimization data, modifying existing tables to better align with OR expert's mapping, and moving scalar parameters and formulas to business_configuration_logic.json.

CREATE TABLE Authorship (
  authID INTEGER,
  instID INTEGER,
  paperID INTEGER,
  authOrder INTEGER
);
```

### Data Dictionary  
The **Authorship** table represents the assignment of researchers to institutions and papers. It plays a critical role in the optimization problem by providing the necessary data to make allocation decisions.  

- **authID**: Represents the unique identifier for each researcher. This is used to track which researcher is being assigned to a paper and institution.  
- **instID**: Represents the unique identifier for each institution. This is used to track which institution the researcher is being assigned to.  
- **paperID**: Represents the unique identifier for each paper. This is used to track which paper the researcher is being assigned to.  
- **authOrder**: Represents the order of authorship for the researcher on the paper. This is used to determine the weight of the paper in the objective function, with higher values indicating a greater contribution.  

### Current Stored Values  
```sql
-- Iteration 1 Realistic Data
-- Generated by triple expert (business + data + optimization)
-- Values were determined based on typical research institution operations, ensuring realistic constraints and weights that align with the optimization objective of maximizing research output.

-- Realistic data for Authorship
INSERT INTO Authorship (authID, instID, paperID, authOrder) VALUES (1, 101, 201, 1);
INSERT INTO Authorship (authID, instID, paperID, authOrder) VALUES (2, 102, 202, 2);
INSERT INTO Authorship (authID, instID, paperID, authOrder) VALUES (3, 103, 203, 3);
```

## 4. Mathematical Optimization Formulation

#### Decision Variables
- \( x_{a,i,p} \): Binary decision variable indicating whether researcher \( a \) is assigned to institution \( i \) for paper \( p \).  
  \( x_{a,i,p} \in \{0, 1\} \) for all \( a \in A \), \( i \in I \), \( p \in P \).

#### Objective Function
Maximize the total weighted research output:  
\[
\text{Maximize } Z = \sum_{a \in A} \sum_{i \in I} \sum_{p \in P} w_{a,p} \cdot x_{a,i,p}
\]  
where \( w_{a,p} \) is the weight for researcher \( a \) on paper \( p \), determined by the order of authorship \( \text{authOrder} \).

#### Constraints
1. **Maximum Researchers per Institution**:  
   \[
   \sum_{a \in A} \sum_{p \in P} x_{a,i,p} \leq \text{MaxResearchersPerInstitution} \quad \forall i \in I
   \]  
   Ensures the total number of researchers assigned to any institution \( i \) does not exceed the predefined limit.

2. **Maximum Papers per Researcher**:  
   \[
   \sum_{i \in I} \sum_{p \in P} x_{a,i,p} \leq \text{MaxPapersPerResearcher} \quad \forall a \in A
   \]  
   Ensures the total number of papers assigned to any researcher \( a \) does not exceed the predefined limit.

3. **Single Assignment per Paper**:  
   \[
   \sum_{a \in A} \sum_{i \in I} x_{a,i,p} = 1 \quad \forall p \in P
   \]  
   Ensures each paper \( p \) is assigned to exactly one researcher-institution pair.

#### Data Source Verification
- \( w_{a,p} \): Weight for researcher \( a \) on paper \( p \), derived from the **Authorship.authOrder** column.  
- \( \text{MaxResearchersPerInstitution} \): Maximum number of researchers per institution, from **business_configuration_logic.json**.  
- \( \text{MaxPapersPerResearcher} \): Maximum number of papers per researcher, from **business_configuration_logic.json**.  
- \( A \): Set of researchers, from **Authorship.authID**.  
- \( I \): Set of institutions, from **Authorship.instID**.  
- \( P \): Set of papers, from **Authorship.paperID**.  

This formulation provides a complete, immediately solvable LINEAR mathematical model with all numerical coefficients derived from the provided data.

## 5. Gurobipy Implementation

```python
# Complete GUROBIPY implementation - Retry Attempt 4

import gurobipy as gp
from gurobipy import GRB

def optimize_research_allocation():
    # 1. MODEL & DATA SETUP
    model = gp.Model("ResearchAllocation")

    # Data from Authorship table
    authorship_data = [
        (1, 101, 201, 1),
        (2, 102, 202, 2),
        (3, 103, 203, 3)
    ]

    # Extract unique researchers, institutions, and papers
    researchers = list(set(auth[0] for auth in authorship_data))
    institutions = list(set(auth[1] for auth in authorship_data))
    papers = list(set(auth[2] for auth in authorship_data))

    # Create weights dictionary based on authOrder
    weights = {(auth[0], auth[2]): auth[3] for auth in authorship_data}

    # Business configuration parameters
    MaxResearchersPerInstitution = 2
    MaxPapersPerResearcher = 2

    # Validate array lengths
    assert len(researchers) > 0, "No researchers found"
    assert len(institutions) > 0, "No institutions found"
    assert len(papers) > 0, "No papers found"

    # 2. VARIABLES
    x = model.addVars(researchers, institutions, papers, vtype=GRB.BINARY, name="x")

    # 3. OBJECTIVE FUNCTION
    model.setObjective(
        gp.quicksum(weights[a, p] * x[a, i, p] for a in researchers for i in institutions for p in papers if (a, p) in weights),
        GRB.MAXIMIZE
    )

    # 4. CONSTRAINTS

    # Maximum Researchers per Institution
    for i in institutions:
        model.addConstr(
            gp.quicksum(x[a, i, p] for a in researchers for p in papers) <= MaxResearchersPerInstitution,
            name=f"max_researchers_{i}"
        )

    # Maximum Papers per Researcher
    for a in researchers:
        model.addConstr(
            gp.quicksum(x[a, i, p] for i in institutions for p in papers) <= MaxPapersPerResearcher,
            name=f"max_papers_{a}"
        )

    # Single Assignment per Paper
    for p in papers:
        model.addConstr(
            gp.quicksum(x[a, i, p] for a in researchers for i in institutions) == 1,
            name=f"single_assignment_{p}"
        )

    # 5. SOLVING & RESULTS
    model.optimize()

    if model.status == GRB.OPTIMAL:
        print(f"Optimal value: {model.objVal}")
        for a in researchers:
            for i in institutions:
                for p in papers:
                    if x[a, i, p].x > 0.5:
                        print(f"Researcher {a} assigned to Institution {i} for Paper {p}")
    elif model.status == GRB.INFEASIBLE:
        print("Problem is infeasible")
    elif model.status == GRB.UNBOUNDED:
        print("Problem is unbounded")

    return model

# Execute the optimization
optimize_research_allocation()
```

### Execution Results
**Status**: OPTIMAL
**Optimal Value**: 6.0
**Execution Time**: 0.19 seconds
**Reliability**: high
**Retry Attempt**: 4
**Analysis**: Gurobipy successfully found an optimal solution with a value of 6.0 in a reasonable execution time. This indicates that the solver handled the problem efficiently and reliably.

## 6. DOCplex Implementation

```python
# Complete DOCPLEX implementation - Retry Attempt 4

from docplex.mp.model import Model

def optimize_research_allocation():
    # 1. MODEL & DATA SETUP
    mdl = Model(name="research_allocation")
    
    # Data from Authorship table
    authorship_data = [
        (1, 101, 201, 1),
        (2, 102, 202, 2),
        (3, 103, 203, 3)
    ]
    
    # Extract unique sets
    A = list(set(authID for authID, _, _, _ in authorship_data))  # Researchers
    I = list(set(instID for _, instID, _, _ in authorship_data))  # Institutions
    P = list(set(paperID for _, _, paperID, _ in authorship_data))  # Papers
    
    # Weights based on authOrder
    weights = {(authID, paperID): 1.0 / authOrder for authID, _, paperID, authOrder in authorship_data}
    
    # Business configuration
    MaxResearchersPerInstitution = 2
    MaxPapersPerResearcher = 2
    
    # 2. VARIABLES
    x = mdl.binary_var_dict((a, i, p) for a in A for i in I for p in P, name="x")
    
    # 3. OBJECTIVE FUNCTION
    objective = mdl.sum(weights[a, p] * x[a, i, p] for a in A for i in I for p in P)
    mdl.maximize(objective)
    
    # 4. CONSTRAINTS
    
    # Maximum Researchers per Institution
    for i in I:
        mdl.add_constraint(mdl.sum(x[a, i, p] for a in A for p in P) <= MaxResearchersPerInstitution, ctname=f"max_researchers_{i}")
    
    # Maximum Papers per Researcher
    for a in A:
        mdl.add_constraint(mdl.sum(x[a, i, p] for i in I for p in P) <= MaxPapersPerResearcher, ctname=f"max_papers_{a}")
    
    # Single Assignment per Paper
    for p in P:
        mdl.add_constraint(mdl.sum(x[a, i, p] for a in A for i in I) == 1, ctname=f"single_assignment_{p}")
    
    # 5. SOLVING & RESULTS
    solution = mdl.solve()
    
    if solution:
        print(f"Optimal value: {solution.objective_value}")
        for a in A:
            for i in I:
                for p in P:
                    if solution.get_value(x[a, i, p]) > 0.5:
                        print(f"Researcher {a} assigned to Institution {i} for Paper {p}")
    else:
        print("No solution found")
        print(f"Status: {mdl.solve_details.status}")

# Execute the optimization
optimize_research_allocation()
```

### Execution Results
**Status**: ERROR
**Error**:   File "/tmp/tmpaky4bdug.py", line 29
    x = mdl.binary_var_dict((a, i, p) for a in A for i in I for p in P, name="x")
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
SyntaxError: Generator expression must be parenthesized

**Analysis**: DOCplex encountered a syntax error in the generator expression, likely due to a coding issue in the implementation. This suggests a need to review and correct the code for proper execution.

## 7. Pyomo Implementation

```python
# Complete PYOMO implementation - Retry Attempt 4

import pyomo.environ as pyo
from pyomo.opt import SolverFactory

def optimize_research_allocation():
    # 1. MODEL CREATION
    model = pyo.ConcreteModel()

    # 2. DATA SETUP
    # Example data from the Authorship table
    authorship_data = [
        (1, 101, 201, 1),
        (2, 102, 202, 2),
        (3, 103, 203, 3)
    ]

    # Extract unique sets
    A = set(authID for authID, _, _, _ in authorship_data)  # Set of researchers
    I = set(instID for _, instID, _, _ in authorship_data)  # Set of institutions
    P = set(paperID for _, _, paperID, _ in authorship_data)  # Set of papers

    # Weights based on authOrder
    weights = {(authID, paperID): 1.0 / authOrder for authID, _, paperID, authOrder in authorship_data}

    # Operational constraints
    MaxResearchersPerInstitution = 2
    MaxPapersPerResearcher = 2

    # 3. SETS
    model.A = pyo.Set(initialize=A)
    model.I = pyo.Set(initialize=I)
    model.P = pyo.Set(initialize=P)

    # 4. PARAMETERS
    model.weights = pyo.Param(model.A, model.P, initialize=weights)
    model.MaxResearchersPerInstitution = pyo.Param(initialize=MaxResearchersPerInstitution)
    model.MaxPapersPerResearcher = pyo.Param(initialize=MaxPapersPerResearcher)

    # 5. VARIABLES
    model.x = pyo.Var(model.A, model.I, model.P, within=pyo.Binary)

    # 6. OBJECTIVE FUNCTION
    def obj_rule(model):
        return sum(model.weights[a, p] * model.x[a, i, p] for a in model.A for i in model.I for p in model.P)
    model.objective = pyo.Objective(rule=obj_rule, sense=pyo.maximize)

    # 7. CONSTRAINTS
    # Maximum Researchers per Institution
    def max_researchers_rule(model, i):
        return sum(model.x[a, i, p] for a in model.A for p in model.P) <= model.MaxResearchersPerInstitution
    model.max_researchers_constraint = pyo.Constraint(model.I, rule=max_researchers_rule)

    # Maximum Papers per Researcher
    def max_papers_rule(model, a):
        return sum(model.x[a, i, p] for i in model.I for p in model.P) <= model.MaxPapersPerResearcher
    model.max_papers_constraint = pyo.Constraint(model.A, rule=max_papers_rule)

    # Single Assignment per Paper
    def single_assignment_rule(model, p):
        return sum(model.x[a, i, p] for a in model.A for i in model.I) == 1
    model.single_assignment_constraint = pyo.Constraint(model.P, rule=single_assignment_rule)

    # 8. SOLVING WITH GUROBI
    solver = SolverFactory('gurobi')
    results = solver.solve(model, tee=True)

    # 9. RESULT PROCESSING
    if results.solver.termination_condition == pyo.TerminationCondition.optimal:
        print(f"Optimal value: {pyo.value(model.objective)}")
    elif results.solver.termination_condition == pyo.TerminationCondition.infeasible:
        print("Problem is infeasible")
    elif results.solver.termination_condition == pyo.TerminationCondition.unbounded:
        print("Problem is unbounded")
    else:
        print(f"Solver terminated with condition: {results.solver.termination_condition}")

if __name__ == "__main__":
    optimize_research_allocation()
```

### Execution Results
**Status**: ERROR
**Error**: Traceback (most recent call last):
  File "/tmp/tmpby3ehu4b.py", line 79, in <module>
    optimize_research_allocation()
  File "/tmp/tmpby3ehu4b.py", line 46, in optimize_research_allocation
    model.objective = pyo.Objective(rule=obj_rule, sense=pyo.maximize)
  File "/dccstor/nl2opt/miniforge3/envs/nl2opt_optim/lib/python3.10/site-packages/pyomo/core/base/block.py", line 571, in __setattr__
    self.add_component(name, val)
  File "/dccstor/nl2opt/miniforge3/envs/nl2opt_optim/lib/python3.10/site-packages/pyomo/core/base/block.py", line 1101, in add_component
    val.construct(data)
  File "/dccstor/nl2opt/miniforge3/envs/nl2opt_optim/lib/python3.10/site-packages/pyomo/core/base/objective.py", line 336, in construct
    ans = self._setitem_when_not_present(index, rule(block, index))
  File "/dccstor/nl2opt/miniforge3/envs/nl2opt_optim/lib/python3.10/site-packages/pyomo/core/base/initializer.py", line 485, in __call__
    return self._fcn(parent)
  File "/tmp/tmpby3ehu4b.py", line 45, in obj_rule
    return sum(model.weights[a, p] * model.x[a, i, p] for a in model.A for i in model.I for p in model.P)
  File "/tmp/tmpby3ehu4b.py", line 45, in <genexpr>
    return sum(model.weights[a, p] * model.x[a, i, p] for a in model.A for i in model.I for p in model.P)
  File "/dccstor/nl2opt/miniforge3/envs/nl2opt_optim/lib/python3.10/site-packages/pyomo/core/base/param.py", line 1012, in __getitem__
    return super().__getitem__(args)
  File "/dccstor/nl2opt/miniforge3/envs/nl2opt_optim/lib/python3.10/site-packages/pyomo/core/base/indexed_component.py", line 662, in __getitem__
    return self._getitem_when_not_present(index)
  File "/dccstor/nl2opt/miniforge3/envs/nl2opt_optim/lib/python3.10/site-packages/pyomo/core/base/param.py", line 627, in _getitem_when_not_present
    raise ValueError(
ValueError: Error retrieving immutable Param value (weights[(1, 202)]):
	The Param value is undefined and no default value is specified.

**Analysis**: Pyomo failed due to an undefined parameter value (weights[(1, 202)]). This indicates a data initialization issue, possibly missing or incorrectly specified data in the model.

## 8. Cross-Solver Analysis and Final Recommendation

### Solver Results Comparison

| Solver | Status | Optimal Value | Execution Time | Decision Variables | Retry Attempt |
|--------|--------|---------------|----------------|-------------------|---------------|
| Gurobipy | OPTIMAL | 6.00 | 0.19s | N/A | 4 |
| Docplex | ERROR | N/A | 0.04s | N/A | 4 |
| Pyomo | ERROR | N/A | 0.82s | N/A | 4 |

### Solver Consistency Analysis
**Result**: Solvers produced inconsistent results
**Consistent Solvers**: gurobipy
**Inconsistent Solvers**: docplex, pyomo
**Potential Issues**:
- Syntax error in DOCplex implementation
- Missing or undefined parameter values in Pyomo
- Inconsistent data initialization across solvers
**Solver Retry Summary**: gurobipy: 4 attempts, docplex: 4 attempts, pyomo: 4 attempts

### Final Recommendation
**Recommended Optimal Value**: 6.0
**Confidence Level**: HIGH
**Preferred Solver(s)**: gurobipy
**Reasoning**: Gurobipy is the only solver that successfully found an optimal solution without errors. Its reliability and efficiency make it the preferred choice for this problem.

### Business Interpretation
**Overall Strategy**: The optimal value of 6.0 represents the maximum achievable weighted research output given the constraints on researchers and institutions. This suggests an efficient allocation of resources to maximize research impact.
**Objective Value Meaning**: The optimal objective value of 6.0 represents the total weighted research output, reflecting the combined impact of all assigned researchers and papers.
**Resource Allocation Summary**: Resources (researchers and papers) should be allocated to institutions in a way that maximizes the total weighted research output while adhering to constraints on the number of researchers per institution and papers per researcher.
**Implementation Recommendations**: 1. Use the Gurobipy solver for reliable results. 2. Ensure all data parameters (e.g., weights, constraints) are correctly initialized. 3. Review and correct any coding issues in DOCplex and Pyomo implementations for future use.